Supervised learning. Each training example is a pair (x, y), and the model learns a function that maps x to y. The two main task types are classification (discrete y) and regression (continuous y). Most "applied ML" problems start here because labels, though expensive, can usually be obtained.
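As a minimal sketch of the (x, y) setting, the snippet below fits a one-variable regression by ordinary least squares. The function name `fit_line` and the toy data are illustrative assumptions, not part of any particular library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy labelled data (an assumption for illustration): every x has a target y,
# generated here from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)

# The learned function can now map an unseen x to a predicted y.
prediction = slope * 4.0 + intercept
```

Swapping the continuous targets for discrete class labels, with a suitable model, would turn the same setup into classification.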
Unsupervised learning. Only x — no labels. The model has to find structure on its own: clusters, lower-dimensional manifolds, densities, anomalies. Useful when labels are unavailable or when you want to understand the data before predicting.
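One way to picture "finding structure without labels" is clustering. The sketch below runs a bare-bones k-means on one-dimensional points. The helper name `kmeans_1d`, the starting centroids, and the data are assumptions chosen for illustration.

```python
def kmeans_1d(points, initial_centroids, iterations=10):
    """Plain k-means on scalars: assign to nearest centroid, then re-average."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Unlabelled toy data: only x values, no targets.
points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centroids, clusters = kmeans_1d(points, initial_centroids=[0.0, 10.0])
```

No label ever enters the computation. The two groups emerge from the distances between points alone.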
Self-supervised learning. Labels are derived from the input itself: predict the next token from past tokens, predict the masked patch from the visible ones, contrast augmented views of the same image. This approach powers most modern foundation models: labels are essentially free, and the resulting representations transfer well to downstream tasks.
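To make "labels from the input itself" concrete, the sketch below turns a raw token sequence into training pairs with no human annotation. The helper names `next_token_pairs` and `mask_position`, the whitespace tokenisation, and the `[MASK]` placeholder are simplifying assumptions for illustration.

```python
def next_token_pairs(tokens, context_size):
    """Build (context, target) pairs: the target is simply the following token."""
    pairs = []
    for i in range(context_size, len(tokens)):
        context = tokens[i - context_size:i]
        target = tokens[i]
        pairs.append((context, target))
    return pairs


def mask_position(tokens, position, mask_token="[MASK]"):
    """Hide one token; the hidden token becomes the prediction target."""
    masked = list(tokens)
    target = masked[position]
    masked[position] = mask_token
    return masked, target


# Raw, unlabelled text split on whitespace (a toy stand-in for a real tokenizer).
tokens = "the cat sat on the mat".split()

pairs = next_token_pairs(tokens, context_size=2)
masked_tokens, masked_target = mask_position(tokens, position=2)
```

Both helpers produce supervised-style (input, target) pairs, but every target is copied from the data rather than supplied by an annotator.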
Semi-supervised learning. Mostly unlabelled x, with a small labelled subset. Often the realistic setting in industry: labels are expensive, unlabelled data is everywhere. Pseudo-labelling, consistency training, and pre-training-then-fine-tuning are the dominant strategies.
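Pseudo-labelling, the first strategy named above, might look roughly like the sketch below. It uses a nearest-class-mean classifier on scalars and accepts a pseudo-label only when the nearest class wins by a clear margin. The function names, the margin rule, and the data are illustrative assumptions, and the sketch assumes at least two classes.

```python
def class_means(labelled):
    """Mean x value per class, computed from (x, y) pairs."""
    sums, counts = {}, {}
    for x, y in labelled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def pseudo_label(labelled, unlabelled, margin):
    """Add confidently classified unlabelled points to the training set."""
    means = class_means(labelled)
    accepted = []
    for x in unlabelled:
        # Rank classes by distance from x to the class mean.
        distances = sorted((abs(x - m), y) for y, m in means.items())
        (best_dist, best_label), (second_dist, _) = distances[0], distances[1]
        # Confidence proxy: the runner-up class must be clearly farther away.
        if second_dist - best_dist >= margin:
            accepted.append((x, best_label))
    return labelled + accepted


# A small labelled subset and a few unlabelled points (toy data).
labelled = [(0.0, "a"), (1.0, "a"), (9.0, "b"), (10.0, "b")]
unlabelled = [0.4, 5.1, 9.8]

augmented = pseudo_label(labelled, unlabelled, margin=3.0)
# Refit on the enlarged set; 5.1 was ambiguous and stays unlabelled.
refit_means = class_means(augmented)
```

The margin threshold is the key design choice here: set too low, it lets wrong pseudo-labels reinforce the model's own mistakes.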
Reinforcement learning. An agent acts in an environment and receives rewards. There are no labels telling it the correct action, only feedback on how well its chosen actions are working. Used for control (robotics), strategy (games), and increasingly for aligning language models to human preferences.
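The act-then-receive-reward loop can be sketched with the simplest reinforcement learning setting, a two-armed bandit solved by an epsilon-greedy agent. The function name `run_bandit`, the reward probabilities, and the hyperparameters below are assumptions for illustration. Real control or game problems also involve states and delayed rewards, which this sketch omits.

```python
import random


def run_bandit(reward_probs, steps, epsilon, seed=0):
    """Epsilon-greedy agent: mostly exploit the best estimate, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0] * len(reward_probs)  # running mean reward per action
    counts = [0] * len(reward_probs)       # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an action uniformly at random.
            arm = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action with the highest estimated reward.
            arm = max(range(len(reward_probs)), key=lambda a: estimates[a])
        # The environment answers with a reward, never with the "right" action.
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean for the chosen action.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


# Hypothetical environment: arm 1 pays off far more often than arm 0.
estimates, counts = run_bandit([0.2, 0.8], steps=5000, epsilon=0.1)
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
```

The agent is never told which arm is better. It infers that from the rewards its own actions produce.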