Deep Neural Networks (DNN)

Deep Neural Networks are powerful models that can learn complex patterns through multiple layers of interconnected neurons. They form the foundation of modern deep learning and are used in various applications from image recognition to natural language processing. Understanding DNNs is crucial for working with more specialized architectures like CNNs or GNNs. I highly recommend looking at the MLU Explain website for a visual explanation of how DNNs work.

Core Concepts

Deep Neural Networks are built on several fundamental concepts that work together to enable powerful learning capabilities. It's good to know that there are several Python packages that implement DNNs: TensorFlow on its own feels a bit dated these days, Keras (which runs on top of TensorFlow) is quite easy to learn, and PyTorch has grown to become the most popular package for deep learning.

  • Layers and Neurons

    Neural networks are composed of layers of interconnected neurons:

    • Input Layer: Receives raw data and represents features
    • Hidden Layers: Process information through weighted connections
    • Output Layer: Produces final predictions or classifications
    • Each neuron computes a weighted sum of its inputs and applies an activation function, which introduces non-linearity into the network (see the single-neuron sketch just after this list)

  • Training Process

    The learning process involves the following steps, shown end to end in the tiny gradient-descent sketch after this list:

    • Forward propagation of inputs through the network
    • Backpropagation of errors to update weights
    • Gradient descent optimization
    • Loss function minimization
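
To make the "weighted sum plus activation" point concrete, here is a minimal NumPy sketch of a single neuron; the inputs, weights, and bias are made-up numbers rather than learned values.

import numpy as np

# Made-up input features and parameters for one neuron
x = np.array([0.5, -1.2, 3.0])   # inputs
w = np.array([0.8, 0.1, 0.4])    # weights
b = 0.2                          # bias

# Weighted sum of the inputs plus a bias
z = np.dot(w, x) + b

# ReLU activation introduces the non-linearity
a = np.maximum(0.0, z)

print(f"pre-activation z = {z:.3f}, activation a = {a:.3f}")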

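The whole training cycle can be seen in a deliberately tiny sketch: one weight, a mean squared error loss, and plain gradient descent on made-up 1-D data. Real networks delegate the gradient computation (backpropagation) to a framework, but the loop has the same shape.

import numpy as np

# Made-up 1-D regression data: y is roughly 2 * x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.0])

w = 0.0                 # single weight to learn
learning_rate = 0.01

for epoch in range(100):
    # Forward propagation: predictions from the current weight
    y_pred = w * x

    # Loss function: mean squared error
    loss = np.mean((y_pred - y) ** 2)

    # Backpropagation (here just one hand-derived gradient)
    grad_w = np.mean(2 * (y_pred - y) * x)

    # Gradient descent: step against the gradient to reduce the loss
    w -= learning_rate * grad_w

print(f"learned w = {w:.3f}, final loss = {loss:.4f}")
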
Key Components

  • Loss functions - Visual explanation of common loss functions
  • Optimizers - Comprehensive guide to gradient descent optimizers
  • Learning rate - Deep Learning Book chapter on optimization
  • Batch size - Deep Learning Book chapter on optimization
  • Epochs - Deep Learning Book chapter on optimization
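
To see where these components appear in practice, the sketch below labels them in a Keras compile/fit call; the layer sizes, synthetic data, and hyperparameter values (Adam, learning rate 1e-3, batch size 32, 5 epochs) are arbitrary placeholders, not recommendations.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Synthetic data: 1000 samples, 20 features, 3 classes
x_train = np.random.rand(1000, 20).astype("float32")
y_train = np.random.randint(0, 3, size=(1000,))

model = models.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(3, activation='softmax'),
])

model.compile(
    # Optimizer and learning rate
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    # Loss function (integer class labels, hence the sparse variant)
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)

# Batch size and number of epochs
model.fit(x_train, y_train, batch_size=32, epochs=5)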

Implementation Examples

Simple DNN with TensorFlow/Keras

import tensorflow as tf
from tensorflow.keras import layers, models

# Create a simple DNN model
def create_dnn_model(input_shape, num_classes):
    model = models.Sequential([
        layers.Input(shape=input_shape),                  # input features
        layers.Dense(128, activation='relu'),
        layers.Dropout(0.2),                              # dropout for regularization
        layers.Dense(64, activation='relu'),
        layers.Dropout(0.2),
        layers.Dense(32, activation='relu'),
        layers.Dense(num_classes, activation='softmax')   # class probabilities
    ])

    model.compile(
        optimizer='adam',                         # Adam with its default learning rate
        loss='categorical_crossentropy',          # expects one-hot encoded labels
        metrics=['accuracy']
    )
    
    return model

# Example usage
input_shape = (784,)  # For MNIST-like data
num_classes = 10
model = create_dnn_model(input_shape, num_classes)
model.summary()
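
As a usage sketch with random stand-in data, the call below trains the model for a few epochs. Because the model was compiled with categorical_crossentropy, the integer labels are one-hot encoded first; the batch size, epoch count, and validation split are arbitrary choices.

import numpy as np

# Random stand-in data shaped like flattened 28x28 images
x_train = np.random.rand(1000, 784).astype("float32")
y_train = tf.keras.utils.to_categorical(
    np.random.randint(0, num_classes, size=(1000,)), num_classes
)

model.fit(x_train, y_train, batch_size=32, epochs=5, validation_split=0.1)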

Simple DNN with PyTorch

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleDNN(nn.Module):
    def __init__(self, input_size, num_classes):
        super().__init__()
        self.fc1 = nn.Linear(input_size, 128)
        self.dropout1 = nn.Dropout(0.2)
        self.fc2 = nn.Linear(128, 64)
        self.dropout2 = nn.Dropout(0.2)
        self.fc3 = nn.Linear(64, 32)
        self.fc4 = nn.Linear(32, num_classes)
    
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.dropout1(x)
        x = F.relu(self.fc2(x))
        x = self.dropout2(x)
        x = F.relu(self.fc3(x))
        # Return raw logits: nn.CrossEntropyLoss applies log-softmax internally.
        # Apply F.softmax(x, dim=1) at inference time if probabilities are needed.
        return self.fc4(x)

# Example usage
input_size = 784  # For MNIST-like data
num_classes = 10
model = SimpleDNN(input_size, num_classes)

# Print model summary
print(model)
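
As a usage sketch with random stand-in data, the loop below trains SimpleDNN for a few epochs; the batch of 256 samples, the Adam optimizer, and the learning rate are arbitrary choices. nn.CrossEntropyLoss expects raw logits, which is why the forward pass above returns them instead of softmax probabilities.

import torch.optim as optim

# Random stand-in data: 256 samples with 784 features, 10 classes
inputs = torch.randn(256, input_size)
labels = torch.randint(0, num_classes, (256,))

criterion = nn.CrossEntropyLoss()              # expects raw logits
optimizer = optim.Adam(model.parameters(), lr=1e-3)

model.train()                                  # enable dropout during training
for epoch in range(5):
    optimizer.zero_grad()                      # reset accumulated gradients
    logits = model(inputs)                     # forward propagation
    loss = criterion(logits, labels)           # loss on logits vs. integer labels
    loss.backward()                            # backpropagation
    optimizer.step()                           # gradient descent update
    print(f"epoch {epoch}: loss = {loss.item():.4f}")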