Convolutional Neural Networks (CNNs) are specialized neural networks designed for processing grid-like data such as images. They use convolutional layers to automatically learn spatial hierarchies of features, making them particularly effective for computer vision tasks. CNNs have revolutionized image recognition, object detection, and other visual processing tasks. The best visual introduction I have found is here.

Convolutional Neural Networks (CNN)

Typical computer vision tasks for CNNs include image classification, object detection, and image segmentation.

Core Concepts

CNNs are built on several key concepts that enable them to effectively process visual data.

  • Network Architecture

    The structure of a CNN consists of:

    • Convolutional layers for feature extraction
    • Pooling layers for spatial downsampling (reducing feature-map height and width)
    • Fully connected layers for classification
    • Activation functions for non-linearity

  • Key Operations

    The main operations in CNNs include:

    • Convolution for feature detection
    • Pooling for tolerance to small spatial shifts
    • Activation for non-linear transformation
    • Backpropagation for learning
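
The forward operations listed above (convolution, activation, pooling) can be sketched in plain Python. This is a minimal illustration, not how a framework implements them: the 6x6 image and the vertical-edge kernel are made-up values, and the helper names conv2d, relu, and max_pool2d are hypothetical. It assumes no padding, stride 1 for the convolution, and non-overlapping pooling windows.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 sliding dot product, as CNN layers compute it."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Element-wise non-linearity: negative responses become zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 6x6 image: a bright vertical stripe (1) on a dark background (0)
image = [[0, 0, 1, 1, 0, 0] for _ in range(6)]

# Hypothetical kernel that responds to dark-to-bright transitions from left to right
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = conv2d(image, kernel)   # 4x4 map, each row is [3, 3, -3, -3]
activated = relu(features)         # each row becomes [3, 3, 0, 0]
pooled = max_pool2d(activated)     # 2x2 map: [[3, 0], [3, 0]]
print(pooled)
```

The left edge of the stripe produces a positive response, the right edge a negative one that ReLU discards, and pooling keeps the strongest response in each 2x2 window. In a trained CNN the kernel values are learned by backpropagation rather than written by hand.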

Key Components

  • Convolutional layers
  • Pooling layers
  • Fully connected layers
  • Activation functions
  • Batch normalization
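
Batch normalization is the one component in this list that the examples below do not use. As a rough sketch of the training-time computation for a single channel, assuming a made-up batch of four activations and a hypothetical helper named batch_norm: the activations are shifted to zero mean and scaled to unit variance using the batch statistics, then rescaled by the learnable parameters gamma and beta.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel's activations across a batch, then scale and shift."""
    mean = sum(values) / len(values)
    # Biased variance (divide by N), computed from the current batch
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


activations = [2.0, 4.0, 6.0, 8.0]  # hypothetical activations for one channel
normalized = batch_norm(activations)

norm_mean = sum(normalized) / len(normalized)
norm_var = sum((v - norm_mean) ** 2 for v in normalized) / len(normalized)
print(norm_mean, norm_var)  # close to 0 and 1 respectively
```

Framework layers such as layers.BatchNormalization in Keras and nn.BatchNorm2d in PyTorch additionally track running statistics for use at inference time, which this sketch leaves out.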

Implementation Examples

CNN with TensorFlow/Keras

import tensorflow as tf
from tensorflow.keras import layers, models

def create_cnn_model(input_shape, num_classes):
    model = models.Sequential([
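        # Declare the input shape explicitly; passing input_shape to the
        # first layer is deprecated in recent Keras versions
        layers.Input(shape=input_shape),

        # First Convolutional Block
        layers.Conv2D(32, (3, 3), activation='relu'),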
        layers.MaxPooling2D((2, 2)),
        
        # Second Convolutional Block
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        
        # Third Convolutional Block
        layers.Conv2D(64, (3, 3), activation='relu'),
        
        # Dense Layers
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(num_classes, activation='softmax')
    ])
    
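    # categorical_crossentropy expects one-hot encoded labels;
    # use 'sparse_categorical_crossentropy' for integer class labels
    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )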
    
    return model

# Example usage
input_shape = (32, 32, 3)  # For CIFAR-10 like data
num_classes = 10
model = create_cnn_model(input_shape, num_classes)
model.summary()
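
model.summary() reports a parameter count per layer. The sketch below shows where those numbers should come from for the example above, assuming the (32, 32, 3) input and 10 classes used there. The helper names conv2d_params and dense_params are hypothetical, and the 4 x 4 x 64 size of the flattened feature map is taken as given here (it follows from three unpadded 3x3 convolutions and two 2x2 poolings on a 32x32 input).

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights of every filter plus one bias per output channel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def dense_params(in_features, out_features):
    """Full weight matrix plus one bias per output unit."""
    return in_features * out_features + out_features


layer_params = {
    "conv2d (3 -> 32)": conv2d_params(3, 32, 3),
    "conv2d (32 -> 64)": conv2d_params(32, 64, 3),
    "conv2d (64 -> 64)": conv2d_params(64, 64, 3),
    "dense (1024 -> 64)": dense_params(4 * 4 * 64, 64),
    "dense (64 -> 10)": dense_params(64, 10),
}
total = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name:20s} {count:>8,d}")
print(f"{'total':20s} {total:>8,d}")
```

Pooling and flatten layers have no trainable parameters, so they do not appear in the table. Note that the first dense layer holds more than half of the weights even though the convolutional layers do most of the feature extraction.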

CNN with PyTorch

import torch
import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # First Convolutional Block
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        
        # Second Convolutional Block
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        
        # Third Convolutional Block
        self.conv3 = nn.Conv2d(64, 64, kernel_size=3)
        
        # Dense Layers
        self.fc1 = nn.Linear(64 * 4 * 4, 64)  # 64 channels x 4 x 4 feature map for a 32x32 input; adjust for other sizes
        self.fc2 = nn.Linear(64, num_classes)
    
    def forward(self, x):
        # First Block
        x = self.pool1(F.relu(self.conv1(x)))
        
        # Second Block
        x = self.pool2(F.relu(self.conv2(x)))
        
        # Third Block
        x = F.relu(self.conv3(x))
        
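        # Flatten and Dense Layers
        x = torch.flatten(x, 1)  # Keep the batch dimension; 64 * 4 * 4 features for a 32x32 input
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        
        # Return raw logits: nn.CrossEntropyLoss applies log-softmax internally.
        # Apply F.softmax(logits, dim=1) at inference time if probabilities are needed.
        return x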

# Example usage
num_classes = 10
model = CNN(num_classes)

# Print the layer structure (unlike Keras model.summary(), this does not show output shapes or parameter counts)
print(model)
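
The hard-coded 64 * 4 * 4 in the first linear layer only holds for 32x32 inputs. As a sketch of how to adjust it, the helpers below (hypothetical names conv_out, pool_out, and flattened_features) apply the standard output-size formula for convolution and pooling to the layer sequence used in the model above, assuming square inputs, 3x3 convolutions without padding, and 2x2 pooling with stride 2.

```python
def conv_out(size, kernel_size, stride=1, padding=0):
    """Output height/width of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


def pool_out(size, kernel_size=2, stride=2):
    """Output height/width of a pooling layer along one spatial dimension."""
    return (size - kernel_size) // stride + 1


def flattened_features(input_size, channels=64):
    """Number of features entering the first linear layer for a square input."""
    s = conv_out(input_size, 3)  # conv1
    s = pool_out(s)              # pool1
    s = conv_out(s, 3)           # conv2
    s = pool_out(s)              # pool2
    s = conv_out(s, 3)           # conv3
    return channels * s * s


print(flattened_features(32))  # 32 -> 30 -> 15 -> 13 -> 6 -> 4, so 64 * 4 * 4 = 1024
print(flattened_features(28))  # 28 -> 26 -> 13 -> 11 -> 5 -> 3, so 64 * 3 * 3 = 576
print(flattened_features(64))  # 64 -> 62 -> 31 -> 29 -> 14 -> 12, so 64 * 12 * 12 = 9216
```

The result is the value to pass as the first argument of nn.Linear when the input resolution changes.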