Machine Learning is a sub-field of artificial intelligence that uses data to train predictive models.
- Supervised learning - learns from labeled training data.
- Unsupervised learning - learns from unlabeled training data. Examples include principal component analysis and clustering.
- Reinforcement learning - maximizes a reward. An agent interacts with an environment and learns which actions to take by maximizing a cumulative reward.
- Transfer learning - storing knowledge gained while solving one problem and applying it to a different but related problem.
- Semi-supervised learning - uses a mix of mostly unlabeled data with a small labeled subset.
- Self-supervised learning - a form of unsupervised learning where the model is trained on a task that uses the data itself to generate supervisory signals, rather than relying on externally provided labels. (Examples: predicting the next word, as in LLM pretraining, or predicting masked parts of an image, as in MAE, DINO, and iBOT.)
- Regression - predicting a continuous value. (Example: house prices)
- Classification - predicting a discrete value. (Example: pass or fail, hot dog/not hot dog)
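To make the regression/classification distinction concrete, here is a minimal sketch in pure Python (the data and threshold are made up for illustration): a least-squares line fit produces a continuous output, while a classifier maps a score to a discrete label.

```python
# Toy illustration of regression vs. classification (hypothetical data,
# no ML library needed).

def fit_line(xs, ys):
    """Least-squares fit y = a*x + b (regression: continuous output)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

def classify(score, threshold=0.5):
    """Binary classification: map a continuous score to a discrete label."""
    return "pass" if score >= threshold else "fail"

# Regression: house size (scaled sqft) vs. price (arbitrary units)
sizes = [1.0, 2.0, 3.0, 4.0]
prices = [2.1, 3.9, 6.0, 8.0]
a, b = fit_line(sizes, prices)   # slope a is close to 2

# Classification: discrete pass/fail from a continuous exam score
label = classify(0.72)           # "pass"
```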
Features - the inputs to a machine learning model; the measurable properties being observed. Examples of features are pixel brightness in computer vision tasks or the square footage of a house in home price prediction.
Feature selection - the process of choosing which features to use. It is important to pick features that correlate with the output.
Dimensionality Reduction - reducing the number of features while preserving the important information. A simple example is using the area of a house as a feature rather than using width and length separately. Other examples include singular value decomposition, variational auto-encoders, t-SNE (for visualization), and max pooling layers in CNNs.
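A minimal sketch of SVD-based dimensionality reduction (a bare-bones PCA), on synthetic data where one feature is nearly redundant, echoing the width/length/area example. The dataset and dimensions are made up for illustration.

```python
import numpy as np

def pca_project(X, k):
    """Project rows of X onto the top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)                    # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                       # coordinates in the reduced space

rng = np.random.default_rng(0)
# 100 samples, 3 features; the third is almost a sum of the first two,
# so 2 components capture nearly all the variance
X = rng.normal(size=(100, 2))
X = np.hstack([X, X[:, :1] + X[:, 1:2] + 0.01 * rng.normal(size=(100, 1))])

Z = pca_project(X, k=2)                        # 3 features reduced to 2
```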
In supervised learning, data is typically split into training, validation, and test sets.
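A simple shuffle-based split can be sketched as follows (the 80/10/10 ratios and seed are arbitrary choices, not a standard):

```python
import random

def split_dataset(examples, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle indices, then carve off test and validation sets."""
    idx = list(range(len(examples)))
    random.Random(seed).shuffle(idx)           # deterministic shuffle
    n_test = int(len(examples) * test_frac)
    n_val = int(len(examples) * val_frac)
    test = [examples[i] for i in idx[:n_test]]
    val = [examples[i] for i in idx[n_test:n_test + n_val]]
    train = [examples[i] for i in idx[n_test + n_val:]]
    return train, val, test

train, val, test = split_dataset(list(range(100)))
```

The validation set is used to tune hyperparameters; the test set is held out until the very end to estimate generalization.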
An example is a single instance from your dataset.
Neural Networks - neural networks are a suitable model when the input is a fixed-length vector of features.
Transformers - Transformers are a neural network architecture designed to process sequences (text, images, audio, video) using a mechanism called attention, largely replacing recurrent neural networks. Originally described in the 2017 paper Attention Is All You Need.
Transformers are the architecture behind
- LLMs such as ChatGPT and nanoGPT
- Vision Transformers (ViT)
- Multimodal foundation models (CLIP, SigLIP, OpenAI’s Vision models, Gemini, etc.)
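The attention mechanism at the core of the transformer can be sketched as scaled dot-product attention, softmax(QKᵀ/√d_k)V. The shapes and random inputs below are illustrative only:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)    # similarity of each query to each key
    # numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights        # each output is a weighted mix of values

rng = np.random.default_rng(0)
seq_len, d_k = 4, 8
Q = rng.normal(size=(seq_len, d_k))
K = rng.normal(size=(seq_len, d_k))
V = rng.normal(size=(seq_len, d_k))

out, w = attention(Q, K, V)            # out: (4, 8), w rows sum to 1
```

Each position's output is a weighted average of all value vectors, with weights determined by query-key similarity; this is what lets the model relate any two positions in a sequence directly.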
These are common computer vision tasks and the (now largely superseded) methods for solving them.
Convolutional neural networks (CNNs) are suitable models for computer vision problems.
- Image classification: ResNet, Inception-v4, DenseNet
- Object detection (with bounding boxes): YOLOv4 (still used for realtime object detection)
- Instance segmentation: Mask R-CNN
- Semantic segmentation: U-Net
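The building block all of these CNNs share is the convolution. A minimal sketch (technically cross-correlation, as in deep learning frameworks) on a tiny synthetic image, using a Sobel kernel to detect a vertical edge:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.zeros((5, 5))
image[:, 2] = 1.0                        # a vertical bright line
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], float)  # responds to horizontal intensity change

edges = conv2d(image, sobel_x)           # strong response on either side of the line
```

In a real CNN the kernel weights are learned rather than hand-designed, and many such filters are stacked with nonlinearities and pooling.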