🤖 AI
https://www.futuretools.io/ https://huggingface.co/
Language models
Machine learning models (neural networks)
Two phases of machine learning models
1. Training stage - the model is fed huge amounts of data so that it learns ("understands") whatever task it is being trained for
2. Inference stage - the model is interrogated and gives a response (infers an answer) based on the input. A toy sketch of both phases follows the examples below
Is this a picture of a cat? yes or no
Translate this audio into text
Generate text (GPT) based on input
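A minimal sketch of the two phases using scikit-learn (all data here is made up; real models and datasets are vastly larger):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training stage: the model is fed labeled examples and does nothing else.
X_train = np.array([[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]])  # toy "images"
y_train = np.array([1, 1, 0, 0])                                      # 1 = cat, 0 = not cat
model = LogisticRegression().fit(X_train, y_train)

# Inference stage: the trained model is interrogated and infers an answer.
print(model.predict([[0.85, 0.75]]))  # -> [1], i.e. "yes, this is a cat"
```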
Training
Single-user task - while the model is being trained it isn't required to do anything else, just learn
Training requires a huge amount of hardware: >285k CPU cores, >10k NVIDIA GPUs, 400 Gb/s network connectivity
Takes days and days of computing
Inference stage
Multi-user task - lots of "queries"
Lots of hardware because of scale
LLM (Large Language Models)
GPT-3.5 and GPT-4 (the models behind ChatGPT) are examples of LLMs
GPT-3 has 175 billion parameters
Parameters include weights and biases
The number of parameters in a neural network is directly related to the number of neurons and the number of connections between them (worked example after this list)
Needs supercomputers for training and inference
LLMs are a type of artificial intelligence (AI) algorithm that uses deep learning techniques and massively large data sets to understand, summarize, generate, and predict new content.
LLMs have billions of weights and are trained on large quantities of data with self-supervised and semi-supervised learning.
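A rough worked example of why parameter counts track neurons and connections: in a fully connected layer, every input-to-neuron edge carries one weight and every neuron adds one bias. The layer sizes below are hypothetical:

```python
# Each dense layer with n_in inputs and n_out neurons has
# n_in * n_out weights (one per edge) plus n_out biases.
def dense_params(n_in, n_out):
    return n_in * n_out + n_out

# Hypothetical 3-layer network: 784 inputs -> 128 -> 64 -> 10 outputs.
layers = [(784, 128), (128, 64), (64, 10)]
print(sum(dense_params(i, o) for i, o in layers))  # 109386 -- tiny next to GPT-3's 175 billion
```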
Supervised learning - (labeled) models learn to map the input to the target output (e.g. images labeled as a "cat" or "fish")
Self-supervised learning - Can learn from lower-quality data without labels, e.g. raw audio. "Self-supervised" means the model derives labels from the data itself, not a person (see the sketch after this list).
Weak supervision - (mostly unlabeled) Combines a small amount of labeled data with a large amount of unlabeled data during training.
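A toy sketch of how the labels differ between the paradigms (made-up filenames; next-word prediction stands in for how LLMs self-supervise):

```python
# Supervised: a person supplies the target labels.
images = ["img1.png", "img2.png"]
labels = ["cat", "fish"]                      # human-provided
supervised_pairs = list(zip(images, labels))

# Self-supervised: the label is derived from the data itself.
# Classic example: predict the next word from the words before it.
text = "the cat sat on the mat".split()
self_supervised_pairs = [(text[:i], text[i]) for i in range(1, len(text))]
print(self_supervised_pairs[1])  # (['the', 'cat'], 'sat') -- no human labeling needed
```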
Parameters - Variables in an AI system (model) whose values are adjusted during learning
Weights - Neurons & edges have a weight that adjusts as learning proceeds. Weight increases or decreases the strength of the signal at a connection.
Edges - the connections between neurons
Artificial neuron - Mathematical function which models biological neurons in a neural network.
https://en.wikipedia.org/wiki/Artificial_neuron
Biases - two distinct meanings: (1) in a neuron, the bias is an extra parameter added to the weighted sum that shifts when the neuron activates (this is the "biases" in "weights and biases"); (2) algorithmic bias, a phenomenon where an algorithm produces systemically prejudiced results due to erroneous assumptions in the machine learning process. A sketch of a single neuron with weights and a bias follows.
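A minimal sketch of one artificial neuron, showing where the weights and the bias sit (values chosen arbitrarily):

```python
import numpy as np

# One neuron: output = activation(sum(inputs * weights) + bias).
# Weights scale the signal on each incoming edge; the bias shifts
# the weighted sum before the activation function is applied.
def neuron(inputs, weights, bias):
    return max(0.0, float(np.dot(inputs, weights)) + bias)  # ReLU activation

print(neuron(np.array([1.0, 2.0]), np.array([0.5, 0.3]), bias=0.1))  # 1.2
```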
Neural network (Artificial neural network)
Neural networks can be found in animal brains (i.e. biological neural networks). An ANN is based on a collection of connected units/nodes called "artificial neurons", which loosely model the neurons in a biological brain. "Edges" are the connections between neurons; neurons and edges have a "weight" that adjusts as learning proceeds. Weight increases or decreases the strength of the signal at a connection.
Neurons are aggregated into "layers" where the first layer is the input, the middle layers are for computing, and the last layer is the "output".
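A toy forward pass through such layers (sizes and random weights are arbitrary, just to show the input -> middle -> output flow):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny network: input layer (4 features) -> middle layer (5 neurons)
# -> output layer (2 neurons). Each weight matrix has one entry per edge.
W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 2)), np.zeros(2)

def forward(x):
    hidden = np.maximum(0, x @ W1 + b1)  # middle layer does the computing
    return hidden @ W2 + b2              # last layer is the output

print(forward(rng.normal(size=4)))
```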
Neural circuits make up the connections between neurons, which allow for "thought".
Training
Neural networks (and by association AI models) learn by processing examples (data). Each piece of data has an "input" & "result". This forms "probability-weighted associations" between the input and the result.
Let's use an image recognition model as an example (a runnable sketch follows this list):
Input - Give a piece of data to the model (ie an image)
Output - The model outputs information about the image (ie image is a dog)
Result - compare the output to the expected result
Adjust - Adjust the weighted associations according to the error from your results
Ongoing - continue this process until the results are correct
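The sketch promised above: each loop iteration runs the input/output/result/adjust steps, repeated until the outputs match (here learning a trivial OR rule rather than real images):

```python
import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # Input
y = np.array([0., 1., 1., 1.])                          # Expected result (OR)
w, b = np.zeros(2), 0.0

for _ in range(2000):                        # Ongoing: repeat until correct
    output = 1 / (1 + np.exp(-(X @ w + b)))  # Output: the model's prediction
    error = output - y                       # Result: compare to expectation
    w -= 0.5 * X.T @ error / len(y)          # Adjust: weights against the error
    b -= 0.5 * error.mean()

print(np.round(1 / (1 + np.exp(-(X @ w + b)))))  # [0. 1. 1. 1.]
```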
Local vs global maximum (or minimum, when minimizing a loss) - training can settle at a local optimum instead of the global one
training and inference
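A toy illustration of the local-vs-global problem: gradient descent on a curve with two valleys ends up in a different one depending on where it starts (function chosen arbitrarily):

```python
# f(x) = x**4 - 3*x**2 + x has a local minimum near x = 1.13
# and a global minimum near x = -1.30.
def grad(x):
    return 4 * x**3 - 6 * x + 1  # derivative of f

for start in (2.0, -2.0):
    x = start
    for _ in range(1000):
        x -= 0.01 * grad(x)  # step downhill
    print(f"start {start:+.1f} -> settles at x = {x:.2f}")
# start +2.0 -> settles at x = 1.13  (local minimum)
# start -2.0 -> settles at x = -1.30 (global minimum)
```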
Pretraining datasets
LLMs pre-train on datasets. The most commonly used textual datasets are Common Crawl, The Pile, MassiveText, Wikipedia, and GitHub.
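For a feel of these corpora, Hugging Face's `datasets` library can stream one without downloading it in full (dataset names and configs on the Hub change over time; `wikimedia/wikipedia` with the `20231101.en` snapshot is one that exists at the time of writing):

```python
from datasets import load_dataset  # pip install datasets

# Streaming avoids downloading the whole multi-gigabyte dump up front.
wiki = load_dataset("wikimedia/wikipedia", "20231101.en",
                    split="train", streaming=True)
for article in wiki.take(1):
    print(article["title"], article["text"][:200])
```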
Types of AI
Current AI models
Reactive machines - Able to react to stimuli; unable to remember or learn. Example: IBM's Deep Blue chess AI
Limited memory - Almost all present-day AI applications fall under this category (e.g. ChatGPT). In addition to having the capabilities of reactive machines, limited-memory machines are able to learn. E.g. ChatGPT is a large language model that learns from big data, then uses what it learned to react to stimuli (question in, answer out)
Artificial Narrow Intelligence (ANI) - Represents all existing AI: systems that autonomously perform only a specific task using human-like capabilities.
Artificial General Intelligence (AGI) - Does not currently exist. Ability of an AI agent to learn, perceive, understand, and function completely like a human being
Artificial Superintelligence (ASI) - In addition to replicating the multi-faceted intelligence of human beings, would be exceedingly better at everything it does because of overwhelmingly greater memory
Concept or work in progress
Theory of mind - Machines that are able to understand emotions, i.e. "understand" humans
Self-aware - Final stage of AI. AI that is akin to the human brain; it recognises itself as a real being.
GPT - stands for generative pre-trained transformer, which is a type of large language model (LLM)
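A quick taste of a GPT-family model via the `transformers` library; GPT-2 is used here only because it is small and freely downloadable, not because it matches ChatGPT's quality:

```python
from transformers import pipeline  # pip install transformers

generator = pipeline("text-generation", model="gpt2")
result = generator("Large language models are", max_new_tokens=30)
print(result[0]["generated_text"])
```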
WebAssembly (WASM)
Federated learning
We all have data on our phones
Conversational data
Edge device - edge computing: doing computation as close as possible to where the data is generated (a toy federated-averaging sketch follows)
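A toy sketch of federated averaging, the core idea behind federated learning: each device trains locally on its private data and only the updated weights are sent back and averaged (all data and sizes here are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
true_w = np.array([1.0, -2.0, 0.5])  # the pattern hidden in everyone's data

# Three "phones", each holding private data that never leaves the device.
devices = []
for _ in range(3):
    X = rng.normal(size=(20, 3))
    devices.append((X, X @ true_w + 0.01 * rng.normal(size=20)))

def local_update(w, X, y, lr=0.1, steps=50):
    w = w.copy()
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)  # least-squares gradient step
    return w

global_w = np.zeros(3)
for _ in range(10):  # each round: local training, then average the weights
    updates = [local_update(global_w, X, y) for X, y in devices]
    global_w = np.mean(updates, axis=0)

print(np.round(global_w, 2))  # ~ [ 1. -2.  0.5] -- learned without pooling raw data
```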
Alpaca and LLaMA - Alpaca is a Stanford language model fine-tuned from Meta's LLaMA. https://www.alpacaml.com/
https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
Hugging Face - open-source hub for AI models, datasets, and tools
Parameters - the values the model learns (weights and biases), not the training data
Autonomous AI - AI that acts toward a goal without step-by-step human direction
Auto-GPT - an experimental agent that runs GPT by itself, chaining its own prompts to pursue a goal
Auto-GPT is written in Python
technical white paper
code this