Support Vector Machines (SVM)

Support Vector Machines (SVMs) are supervised learning algorithms that find a decision boundary by maximizing the margin between classes. They are particularly effective in high-dimensional spaces and can handle both linear and non-linear classification problems.
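To make "maximizing the margin" concrete, here is a minimal sketch using scikit-learn. The four toy points and the very large C (used to approximate a hard margin) are assumptions chosen for illustration. For a linear SVM with weight vector w, the margin width is 2 / ||w||.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (hypothetical): two classes separated along the first feature.
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([-1, -1, 1, 1])

# A very large C approximates a hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)

print("w =", w, "b =", b)
print("margin width =", margin_width)
print("support vectors:\n", clf.support_vectors_)
```

For this toy data the widest possible gap lies along the line x1 = 1, so the computed width should come out close to 2.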

Core Concepts

SVMs can be used for both classification and regression tasks. Two ideas account for much of their usefulness:

  • Maximum Margin Classification

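    SVMs find the hyperplane that maximizes the margin, that is, the distance between the decision boundary and the closest training points of each class (the support vectors). This approach:

    • Tends to generalize well to unseen data
    • Tolerates some outliers when a soft margin is used (controlled by C)
    • Works well in high-dimensional spaces
    • Is supported by margin-based generalization bounds from statistical learning theory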

  • Kernel Methods

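    SVMs can handle non-linear classification through the kernel trick, which lets them:

    • Implicitly map data to a higher-dimensional feature space
    • Learn non-linear decision boundaries in the original input space
    • Stay computationally efficient, because only pairwise kernel values are computed and the mapping is never built explicitly
    • Use a range of kernel functions, such as polynomial, RBF, and sigmoid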
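
The kernel trick can be checked numerically. The sketch below is an illustration with assumed names (phi, A) and random toy points. It compares the homogeneous degree-2 polynomial kernel k(x, z) = (x · z)^2, evaluated directly in 2-D, with ordinary dot products after an explicit mapping to 3-D.

```python
import numpy as np

def phi(x):
    """Explicit feature map for the kernel k(x, z) = (x . z)^2 on 2-D inputs."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 2))   # five random 2-D points (toy data)

# Kernel route: work only with dot products in the original 2-D space.
K_kernel = (A @ A.T) ** 2

# Explicit route: map every point to 3-D first, then take dot products.
Phi = np.array([phi(a) for a in A])
K_explicit = Phi @ Phi.T

print(np.allclose(K_kernel, K_explicit))
```

Both routes give the same matrix of pairwise similarities, which is why an SVM never needs the mapped features themselves. In scikit-learn this particular kernel corresponds to SVC(kernel='poly', degree=2, gamma=1, coef0=0).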

Aspect            Linear SVM                 Kernel SVM
----------------  -------------------------  ------------------------
Complexity        Lower                      Higher
Training Speed    Faster                     Slower
Memory Usage      Lower                      Higher
Use Cases         Linearly separable data    Non-linear patterns
Hyperparameters   Mainly C                   C and kernel parameters
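
The "Use Cases" row can be illustrated with a small experiment. This is a sketch under assumptions: the synthetic make_circles data, the split sizes, and the default-like hyperparameters are chosen only for demonstration. One class forms a ring around the other, so no straight line separates them.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data (assumption for illustration): concentric classes.
X, y = make_circles(n_samples=400, noise=0.05, factor=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

scores = {}
for kernel in ("linear", "rbf"):
    # gamma is ignored by the linear kernel and used by the RBF kernel
    clf = SVC(kernel=kernel, C=1.0, gamma="scale")
    clf.fit(X_train, y_train)
    scores[kernel] = clf.score(X_test, y_test)
    print(f"{kernel:>6}: test accuracy = {scores[kernel]:.2f}, "
          f"support vectors = {clf.n_support_.sum()}")
```

On data like this the linear kernel should perform close to chance while the RBF kernel separates the classes well. On data that is already close to linearly separable, the cheaper linear model is usually the better starting point.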

Implementation Examples


# Example 1: linear SVM

# Import necessary libraries
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Prepare your data
# NOTE: load_breast_cancer is a stand-in dataset (an assumption, not part of
# the original example) so the script runs end to end; substitute your own X and y.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    random_state=42
)

# Scale the features: fit the scaler on the training set only,
# then apply the same transformation to the test set
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Initialize and train the model
svm = SVC(
    kernel='linear',
    C=1.0,              # Regularization parameter: smaller C gives a wider, softer margin
    random_state=42     # Only used when probability=True; ignored otherwise
)
svm.fit(X_train_scaled, y_train)

# Make predictions
y_pred = svm.predict(X_test_scaled)

# Model evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

# Get support vectors (the training points that define the margin)
support_vectors = svm.support_vectors_
print(f"\nNumber of support vectors: {len(support_vectors)}")
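
To see how the regularization parameter C influences the fitted model, the following sketch counts support vectors for several values of C. The synthetic dataset from make_classification and the particular C values are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, partly overlapping classes (hypothetical stand-in for real data).
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0,
    class_sep=0.8, random_state=0
)
X_scaled = StandardScaler().fit_transform(X)

n_support = {}
for C in (0.01, 0.1, 1.0, 10.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X_scaled, y)
    n_support[C] = len(clf.support_vectors_)
    print(f"C={C:<6} support vectors: {n_support[C]}")
```

A small C typically yields a wide margin with many support vectors, while a large C penalizes margin violations more heavily and tends to keep fewer. The exact counts depend on the data.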
                                                


# Example 2: kernel SVM with hyperparameter search
# Assumes X_train, X_test, y_train, y_test come from the train/test split
# in the previous example

# Import necessary libraries
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Scale the features (fit on the training set only)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Define parameter grid for grid search
# gamma is used by the rbf, poly, and sigmoid kernels;
# 'scale' and 'auto' are scikit-learn's built-in heuristics
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': ['scale', 'auto', 0.1, 0.01],
    'kernel': ['rbf', 'poly', 'sigmoid']
}

# Initialize and train the model with grid search
svm = SVC(random_state=42)
grid_search = GridSearchCV(
    svm,
    param_grid,
    cv=5,                # 5-fold cross-validation on the training set
    scoring='accuracy',
    n_jobs=-1            # Use all available CPU cores
)
grid_search.fit(X_train_scaled, y_train)

# Get best parameters
print("Best parameters:", grid_search.best_params_)

# Make predictions with best model (refit on the full training set by default)
best_svm = grid_search.best_estimator_
y_pred = best_svm.predict(X_test_scaled)

# Model evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f"\nAccuracy: {accuracy:.2f}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

# Get support vectors
support_vectors = best_svm.support_vectors_
print(f"\nNumber of support vectors: {len(support_vectors)}")
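
One possible refinement of the grid search above is to put the scaler inside a Pipeline, so that it is re-fit on the training portion of every cross-validation fold instead of once on the whole training set. The sketch below is an alternative arrangement, not the original example's method. The stand-in dataset, the step names "scaler" and "svc", and the reduced grid are assumptions chosen to keep it short.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in dataset (an assumption); replace with your own X and y.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The scaler is a pipeline step, so each CV fold fits it on its own training part.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("svc", SVC()),
])

# The "svc__" prefix routes each parameter to the SVC step of the pipeline.
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01],
    "svc__kernel": ["rbf"],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)   # raw features: scaling happens inside the pipeline

print("Best parameters:", search.best_params_)
print(f"Cross-validated accuracy: {search.best_score_:.2f}")
print(f"Test accuracy: {search.score(X_test, y_test):.2f}")

# The fitted SVC is reachable through the pipeline's named steps.
best_svc = search.best_estimator_.named_steps["svc"]
print("Number of support vectors:", len(best_svc.support_vectors_))
```

With this arrangement the unscaled X_test is passed directly to the fitted search object, because the best pipeline applies its own scaler before predicting.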