Supervised & Unsupervised Learning

Overview

Two of the main categories of machine learning are supervised and unsupervised learning (others, such as reinforcement learning, are outside the scope of this page). The two approaches differ in whether the training data comes with known answers and in the kinds of problems they are best suited for.

Supervised Learning

Supervised learning is a type of machine learning where the algorithm learns from labeled training data. The goal is to learn a mapping from inputs to outputs.

Key Characteristics

  • Uses labeled training data
  • Learns to predict outputs from inputs
  • Can be used for classification and regression
  • Requires labels, which are often produced by human annotators
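
The Code Example section covers classification; regression follows the same fit-then-predict pattern but outputs a number rather than a category. The sketch below is only an illustration: the house sizes and prices are synthetic, generated on the spot, and the choice of scikit-learn's LinearRegression is an assumption rather than something this page prescribes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: house size in square metres -> price.
# The numbers are synthetic and exist only to illustrate the workflow.
rng = np.random.default_rng(0)
size = rng.uniform(50, 200, size=100).reshape(-1, 1)          # inputs (features)
price = 1000 * size[:, 0] + 50000 + rng.normal(0, 5000, 100)  # outputs (labels)

# Learn a mapping from inputs to outputs using the labeled pairs
reg = LinearRegression()
reg.fit(size, price)

# Predict the price of a house the model has not seen
predicted = reg.predict(np.array([[120.0]]))
print("Learned slope:", reg.coef_[0])
print("Predicted price for 120 m^2:", predicted[0])
```

Because the synthetic prices were generated with a slope of 1000 per square metre, the learned slope should land close to that value.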

Common Applications

  • Image classification
  • Spam detection
  • Price prediction
  • Medical diagnosis

Example: Email Spam Classification

In spam detection, the algorithm learns from emails that are labeled as "spam" or "not spam" to predict the category of new emails.
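
A minimal sketch of this idea, assuming a bag-of-words representation and a naive Bayes classifier from scikit-learn (one common choice among many). The six emails and their labels are made up for illustration; a real filter would need far more data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled training emails (invented for this example)
emails = [
    "Win a free prize now",
    "Limited offer, claim your free money",
    "Meeting rescheduled to Monday",
    "Please review the attached report",
    "Free money, click now to win",
    "Lunch tomorrow with the project team",
]
labels = ["spam", "spam", "not spam", "not spam", "spam", "not spam"]

# Convert each email to word counts, then fit a naive Bayes classifier
spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(emails, labels)

# Predict the category of a new, unseen email
prediction = spam_filter.predict(["Claim your free prize"])
print(prediction[0])
```

The pipeline bundles the text-to-counts step with the classifier, so new emails pass through the same vocabulary that was learned during training.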

Unsupervised Learning

Unsupervised learning is a type of machine learning where the algorithm learns patterns from unlabeled data without explicit supervision.

Key Characteristics

  • Uses unlabeled data
  • Discovers hidden patterns and structures
  • Used for clustering and dimensionality reduction
  • Requires no labeled outputs, although people still choose the algorithm and interpret the results
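
Dimensionality reduction, mentioned in the list above, can be sketched with principal component analysis (PCA). PCA is an assumed choice here, one of several techniques for this task. The example projects the four iris features down to two without ever looking at the species labels.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Use only the features; the labels in load_iris().target are ignored
X = load_iris().data  # shape (150, 4)

# Project the 4-dimensional data onto its 2 main directions of variance
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print("Reduced shape:", X_2d.shape)
print("Variance explained per component:", pca.explained_variance_ratio_)
```

The explained variance ratio indicates how much of the spread in the original data each retained component preserves, which helps in deciding how many components to keep.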

Common Applications

  • Customer segmentation
  • Anomaly detection
  • Topic modeling
  • Image compression

Example: Customer Segmentation

In customer segmentation, the algorithm groups customers based on their behavior patterns without any predefined categories.
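
As a rough sketch of this example, the code below clusters synthetic customers with k-means. The two features (annual spend and visits per month), the three generated groups, and the choice of three clusters are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer data: [annual spend, visits per month].
# Three loose groups are generated, but no group labels are kept.
rng = np.random.default_rng(42)
occasional = rng.normal(loc=[200, 1], scale=[50, 0.5], size=(50, 2))
regular = rng.normal(loc=[1000, 4], scale=[150, 1.0], size=(50, 2))
frequent = rng.normal(loc=[3000, 12], scale=[300, 2.0], size=(50, 2))
customers = np.vstack([occasional, regular, frequent])

# Scale the features so spend does not dominate the distance calculation
scaled = StandardScaler().fit_transform(customers)

# Group the customers into 3 segments based on behavior alone
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
segments = kmeans.fit_predict(scaled)

# Summarize each discovered segment in the original units
for segment_id in range(3):
    members = customers[segments == segment_id]
    print(segment_id, len(members), members.mean(axis=0).round(1))
```

The segment numbers are arbitrary identifiers; giving them meaning (for example "frequent shoppers") is left to whoever interprets the summary.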

Comparison

Feature            Supervised Learning           Unsupervised Learning
-----------------  ----------------------------  ------------------------------------
Training Data      Labeled                       Unlabeled
Learning Process   Learning from examples        Finding patterns
Output             Predictions                   Patterns/Structures
Applications       Classification, Regression    Clustering, Dimensionality Reduction

Code Example

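# Supervised Learning Example (Classification)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Load data: 150 flowers, 4 features each, 3 species labels
iris = load_iris()

# Hold out 20% of the labeled samples for evaluation;
# random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42, stratify=iris.target
)

# Train model on the labeled training set
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

# Evaluate on samples the model has not seen
print("Test accuracy:", model.score(X_test, y_test))

# Unsupervised Learning Example (Clustering)
from sklearn.cluster import KMeans

# Perform clustering on the features only; iris.target is never used
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
clusters = kmeans.fit_predict(iris.data)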
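
Because iris happens to come with labels, it is possible to check how closely the clusters found without labels line up with the true species. The sketch below uses the adjusted Rand index for that comparison. This is an assumed choice of metric, and such a check is only possible when labels exist, which is usually not the case in unsupervised settings.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score

iris = load_iris()

# Cluster using the features only
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(iris.data)

# Compare cluster assignments with the true species labels.
# The score ignores how clusters are numbered: 1.0 means identical
# groupings, values near 0 mean agreement no better than chance.
ari = adjusted_rand_score(iris.target, clusters)
print(f"Adjusted Rand index: {ari:.2f}")
```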

Further Reading