πŸ€–AI

https://www.futuretools.io/ https://huggingface.co/

Language models

Machine learning models (neural networks)

Two phases of machine learning models

1. Training stage - the model is fed huge amounts of data to train it to "understand" whatever it is being trained for

2. Inference stage - the model is interrogated and gives a response (infers an answer) based on the input - a toy sketch of both phases follows the examples below

  • Is this a picture of a cat? yes or no

  • Translate this audio into text

  • Generate text (GPT) based on input

Training

  • Single-user tasks, while the model is being trained it isn't required to do anything else, just learn

  • Training requires a crazy amount of hardware. >285k CPU cores, >10k GPU cores (NVIDIA), 400 gb per second internet

  • takes days and days of computing

Inference stage

  • Mult user task - lots of "queries"

  • Lots of hardware because of scale

LLMs (Large Language Models)

  • ChatGPT-3.5, ChatGPT-4 are examples of LLMs

  • ChatGPT has 175 billion parameters

  • Parameters including weights and biases

  • The number of parameters in a neural network is directly related to the number of neurons and the number of connections between them

  • Needs supercomputers for training and interference

LLMs are a type of artificial intelligence (AI) algorithm that uses deep learning techniques and massively large datasets to understand, summarize, generate, and predict new content.

LLMs have billions of weights and are trained on large quantities of data with self-supervised and semi-supervised learning.

Supervised learning - (labeled) models learn to map the input to the target output (e.g. images labeled as "cat" or "fish")

Self-supervised learning - can learn from "lower quality" data without labels, e.g. raw audio or text. "Self-supervised" means the model derives a label from the data itself, rather than a person applying it.

Weak supervision - combines a small amount of labeled data with a large amount of unlabeled data during training.
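
To make the self-supervised idea above concrete: in next-word prediction the "label" for each position is just the next word, derived from the raw text itself, with no person involved:

```python
# Self-supervision sketch: labels come from the data, not from a human.
text = "the cat sat on the mat"
tokens = text.split()

# Each training pair is (context so far, next word to predict).
pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
for context, target in pairs:
    print(context, "->", target)
# ['the'] -> cat
# ['the', 'cat'] -> sat   ...and so on
```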

Parameters - Variables in an AI system (model) whose values are adjusted during learning

Weights - neurons & edges have a weight that adjusts as learning proceeds. A weight increases or decreases the strength of the signal at a connection.

Edges - the connections between neurons

Artificial neuron - Mathematical function which models biological neurons in a neural network.

https://en.wikipedia.org/wiki/Artificial_neuron

Biases - as parameters, these are constant offsets added to a neuron's weighted input and adjusted during learning (the sense in "weights and biases" above). Not to be confused with algorithmic bias - the phenomenon where an algorithm produces systemically prejudiced results due to erroneous assumptions in the machine learning process.
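
Putting the last few definitions together - a minimal artificial neuron in Python. The numbers are arbitrary, chosen only to show the mechanics:

```python
import math

# One artificial neuron: weighted sum of the inputs arriving over its edges,
# plus a bias (the parameter sense of "bias"), squashed by an activation.
def neuron(inputs, weights, bias):
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))  # sigmoid activation

print(neuron([0.5, 0.9], [0.4, -0.2], bias=0.1))  # ~0.53
```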

Neural network (Artificial neural network)

Neural networks are found in animal brains (i.e. biological neural networks). An ANN is based on a collection of connected units/nodes called "artificial neurons", which loosely model the neurons in a biological brain. "Edges" are the connections between neurons; neurons and edges have a "weight" that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection.

Neurons are aggregated into "layers" where the first layer is the input, the middle layers are for computing, and the last layer is the "output".

Neural circuits make up the connections between neurons, which allow for "thought".
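
A minimal sketch of those layers as matrix math, assuming NumPy; the layer sizes and random weights are placeholders for what training would actually learn:

```python
import numpy as np

# Forward pass: input layer -> hidden (computing) layer -> output layer.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)  # input (4) -> hidden (3)
W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)  # hidden (3) -> output (2)

x = np.array([0.1, 0.2, 0.3, 0.4])  # the input layer
hidden = np.tanh(x @ W1 + b1)       # middle layer does the computing
output = hidden @ W2 + b2           # the output layer
print(output)
```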

Training

Neural networks (and by association AI models) learn by processing examples (data). Each piece of data has an "input" & "result". This forms "probability-weighted associations" between the input and the result.

Let's use an image recognition model as an example (a runnable toy version follows the list):

  1. Input - Give a piece of data to the model (ie an image)

  2. Output - The model outputs information about the image (ie image is a dog)

  3. Result - compare the output to the expected result

  4. Adjust - Adjust the weighted associations according to the error from your results

  5. Ongoing - continue this process until the results are correct
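
The toy version promised above - the five steps as a runnable loop, assuming NumPy. The "images" are made-up 2-number feature vectors and the model is a single neuron trained by gradient descent:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))              # made-up "images" (2 features each)
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # made-up labels ("dog" = 1)

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(200):                       # 5. Ongoing: repeat until correct
    z = X @ w + b                          # 1. Input: feed the data in
    pred = 1 / (1 + np.exp(-z))            # 2. Output: probability of "dog"
    error = pred - y                       # 3. Result: compare to expectation
    w -= lr * X.T @ error / len(X)         # 4. Adjust the weights...
    b -= lr * error.mean()                 #    ...and the bias, by the error

print(((pred > 0.5) == y).mean())          # accuracy climbs toward 1.0
```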

Local vs global optimum - training descends the error surface step by step, so depending on where it starts it can settle into a local optimum instead of the global one (see the sketch below). This matters during training; inference just runs the already-trained model forward.
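
A tiny demonstration of that: gradient descent on a curve with two minima settles in a different one depending on the starting point (the function is chosen arbitrarily for the example):

```python
# f(x) = x^4 - 3x^2 + x has a local minimum and a lower, global minimum.
def grad(x):
    return 4 * x**3 - 6 * x + 1  # derivative of f

for start in (2.0, -2.0):        # two different starting points
    x = start
    for _ in range(1000):
        x -= 0.01 * grad(x)      # step downhill
    print(start, "->", round(x, 3))
# 2.0 -> ~1.13  (stuck in the local minimum)
# -2.0 -> ~-1.3 (found the global minimum)
```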

Pretraining datasets

LLMs pre-train on datasets. The most commonly used textual (text-based) datasets are Common Crawl, The Pile, MassiveText, Wikipedia, and GitHub.
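
A minimal sketch of peeking at this kind of corpus, assuming the Hugging Face `datasets` library; `wikitext` here is just a small stand-in for the big corpora above:

```python
from datasets import load_dataset  # pip install datasets

# Raw, unlabeled text -- exactly the kind of data LLMs pre-train on.
ds = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
print(ds[10]["text"])
```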

Types of AI

Current AI models:

Reactive machines - able to react to stimuli, but unable to remember or learn. Example: IBM's Deep Blue chess AI.

Limited memory - almost all present-day AI applications fall under this category (e.g. ChatGPT). In addition to having the capabilities of reactive machines, limited-memory systems are able to learn. E.g. ChatGPT is a large language model that uses big data to learn, and can then use that data to react to stimuli (question in, answer out).

Artificial Narrow Intelligence (ANI) - represents all existing AI: systems that only perform a specific task autonomously using human-like capabilities.

Artificial General Intelligence (AGI) - does not currently exist. The ability of an AI agent to learn, perceive, understand, and function completely like a human being.

Artificial Superintelligence (ASI) - in addition to replicating the multi-faceted intelligence of human beings, it would be exceedingly better at everything it does because of overwhelmingly greater memory.

Concepts / work in progress

Theory of mind - machines that are able to understand emotions, i.e. "understand" humans

Self-aware - the final stage of AI: AI that is akin to the human brain and recognises itself as a real being.

GPT - stands for Generative Pre-trained Transformer, which is a type of large language model (LLM)

WebAssembly

  • WASM

Federated learning

  • we all have data on our phone

  • conversational data

Edge device - edge computing means doing computation as close as possible to where the data is generated
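
A minimal sketch of the federated idea (federated averaging), assuming NumPy. `local_update` is a hypothetical placeholder for real on-device training:

```python
import numpy as np

# Each phone computes an update from its own data; only the resulting weights
# travel to the server -- the raw (e.g. conversational) data never leaves.
def local_update(weights, local_data):
    return weights - 0.1 * local_data.mean(axis=0)  # hypothetical local step

server_weights = np.zeros(3)
phone_data = [np.random.default_rng(i).normal(size=(50, 3)) for i in range(5)]

updates = [local_update(server_weights, d) for d in phone_data]
server_weights = np.mean(updates, axis=0)  # server just averages the updates
print(server_weights)
```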

Alpaca and LLaMA - Alpaca is Stanford's language model, fine-tuned from Meta's LLaMA https://www.alpacaml.com/

https://ai.facebook.com/blog/large-language-model-llama-meta-ai/

Hugging Face - open-source central for AI stuff

Parameters - not the training data itself, but the variables the model learns from it (see the definition above)

Autonomous AI - AI that acts on its own toward a goal (see Auto-GPT below)

Auto-GPT - ChatGPT that just runs by itself, chaining its own outputs toward a goal (see the sketch below)

Auto-GPT is written in Python

  • technical white paper

  • code this
