Apparel Image Classifier (Computer Vision Project)

Project Overview & Use Case

The Use Case: Imagine you work for a massive e-commerce clothing retailer. Thousands of third-party sellers upload photos of their products every day, but they often forget to tag them correctly. You need an AI that can look at a 2D image of an item and automatically categorize it as a “Shirt,” “Sneaker,” “Bag,” etc.

The Output: This script downloads the FashionMNIST dataset (70,000 grayscale images of clothing: 60,000 for training and 10,000 for testing). It defines a neural network from scratch as a PyTorch class, trains it with a hand-written training loop, and tests it on a random test image to predict what type of clothing it shows.

System Workflow (How It Works)

Data Loading (DataLoader): Rather than pushing the whole dataset through the network at once, PyTorch uses DataLoaders to feed images to the AI in small batches (e.g., 64 images at a time), optionally reshuffling them every epoch.
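To see the batching in isolation, here is a minimal sketch that avoids the download. The random tensors and the sample count of 200 are assumptions made for illustration; only the shapes mimic FashionMNIST.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data (an assumption): 200 random "images" shaped like FashionMNIST
# samples, [1, 28, 28], with random integer labels in 0..9.
fake_images = torch.rand(200, 1, 28, 28)
fake_labels = torch.randint(0, 10, (200,))
dataset = TensorDataset(fake_images, fake_labels)

# Same settings as the training loader in this project
loader = DataLoader(dataset, batch_size=64, shuffle=True)

batch_shapes = [tuple(images.shape) for images, _ in loader]
print(len(loader))       # 4 batches: 64 + 64 + 64 + 8
print(batch_shapes[0])   # (64, 1, 28, 28)
print(batch_shapes[-1])  # (8, 1, 28, 28)
```

The last batch is smaller because 200 is not a multiple of 64. The same happens with the real training set (60,000 images), whose final batch holds 32 images.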

Defining the Network (nn.Module): We create a Python class that inherits from PyTorch’s base neural network class. We define the layers in the __init__ method and dictate how data flows through them in the forward method.
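The sketch below shows how a batch changes shape on its way through such a class. TinyClassifier is a hypothetical name used only here; its layer sizes copy the ones in this project, and the input batch is random data.

```python
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        # Layers are created once, here
        self.flatten = nn.Flatten()
        self.network = nn.Sequential(
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 10),
        )

    def forward(self, x):
        # The order in which data visits the layers
        return self.network(self.flatten(x))

model = TinyClassifier()
batch = torch.rand(64, 1, 28, 28)   # random stand-in for 64 images

flat = model.flatten(batch)
logits = model(batch)               # calling the model runs forward()
n_params = sum(p.numel() for p in model.parameters())

print(flat.shape)    # torch.Size([64, 784])
print(logits.shape)  # torch.Size([64, 10])
print(n_params)      # 109386
```

The parameter count follows from the layer sizes: (784 × 128 + 128) + (128 × 64 + 64) + (64 × 10 + 10) = 109,386 weights and biases.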

The Training Loop: Unlike Keras, where you just call model.fit(), PyTorch requires you to write the training loop yourself. For every batch of images, the AI:

  • Makes a guess (Forward Pass).

  • Calculates how wrong it was (Loss).

  • Calculates the mathematical gradients (Backward Pass / Backpropagation).

  • Adjusts its weights to reduce the loss (Optimizer Step).
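The four steps above can be sketched as a single update on one batch. This is a minimal illustration, not the project's script: the single-layer model, the SGD optimizer, and the random data are all assumptions chosen to keep it short.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

model = nn.Linear(784, 10)              # stand-in model (assumption)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

images = torch.rand(64, 784)            # one fake batch, already flattened
labels = torch.randint(0, 10, (64,))

weights_before = model.weight.detach().clone()

optimizer.zero_grad()                   # start from clean gradients
outputs = model(images)                 # forward pass: make a guess
loss = criterion(outputs, labels)       # loss: how wrong was the guess?
loss.backward()                         # backward pass: compute gradients
optimizer.step()                        # optimizer step: adjust the weights

weights_changed = not torch.equal(weights_before, model.weight.detach())
print(f"loss: {loss.item():.4f}, weights changed: {weights_changed}")
```

A full training run repeats these lines for every batch in the DataLoader, and then again for every epoch.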

Live Prediction: The script picks a random image from the test set (already converted to a PyTorch Tensor when the dataset was loaded), adds a batch dimension, asks the model for a prediction, and displays the image using Matplotlib.
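The model's raw output is a row of 10 scores (logits), one per class. The sketch below shows how such a row becomes a label. The logit values are hand-written for illustration and are not real model output. The softmax step is an optional extra that yields a confidence value; it never changes which class wins.

```python
import torch

CLASSES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Illustrative logits for a single image (assumption, not model output)
logits = torch.tensor([[0.1, -1.2, 0.3, 0.0, 0.5, 1.0, 0.2, 3.0, -0.5, 1.5]])

# Softmax turns the scores into probabilities that sum to 1
probs = torch.softmax(logits, dim=1)

# The highest probability (equivalently, the highest logit) is the prediction
confidence, predicted_idx = torch.max(probs, dim=1)
label = CLASSES[predicted_idx.item()]

print(label)  # Sneaker
print(f"confidence: {confidence.item():.2f}")
```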

Source Code

pytorch_classifier.py

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import random

# Define the classes (labels) for the FashionMNIST dataset
CLASSES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
         'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# 1. Define the Neural Network Architecture
class ApparelClassifier(nn.Module):
  def __init__(self):
      super().__init__()
      # Flatten takes a 28x28 pixel image and turns it into a 1D array of 784 pixels
      self.flatten = nn.Flatten()
      
      # A sequence of Dense (Linear) layers and ReLU activations
      self.network = nn.Sequential(
          nn.Linear(28 * 28, 128),
          nn.ReLU(),
          nn.Linear(128, 64),
          nn.ReLU(),
          nn.Linear(64, 10) # 10 output nodes for our 10 clothing categories
      )

  def forward(self, x):
      # This defines how the data flows through the network
      x = self.flatten(x)
      logits = self.network(x)
      return logits

def train_model():
  print("📥 Downloading FashionMNIST Dataset...")
  # Convert images to PyTorch Tensors
  transform = transforms.ToTensor()
  
  # Load Training and Testing Data
  train_data = torchvision.datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)
  test_data = torchvision.datasets.FashionMNIST(root='./data', train=False, download=True, transform=transform)
  
  # DataLoaders batch the data; shuffle=True reshuffles the training set every epoch
  train_loader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True)
  test_loader = torch.utils.data.DataLoader(test_data, batch_size=64, shuffle=False)

  print("🧠 Initializing PyTorch Neural Network...")
  model = ApparelClassifier()
  
  # Define the Loss Function (Cross Entropy) and Optimizer (Adam)
  criterion = nn.CrossEntropyLoss()
  optimizer = optim.Adam(model.parameters(), lr=0.001)

  # 2. The PyTorch Training Loop
  epochs = 3
  print(f"\n🏋️‍♂️ Training the AI for {epochs} epochs...")
  
  for epoch in range(epochs):
      running_loss = 0.0
      # Iterate through batches of 64 images
      for images, labels in train_loader:
          
          # Step A: Clear old gradients
          optimizer.zero_grad()
          
          # Step B: Forward pass (Ask the AI to guess)
          outputs = model(images)
          
          # Step C: Calculate the loss (How wrong was the guess?)
          loss = criterion(outputs, labels)
          
          # Step D: Backward pass (Calculate gradients/backpropagation)
          loss.backward()
          
          # Step E: Optimizer step (Update the AI's "brain" weights)
          optimizer.step()
          
          running_loss += loss.item()
          
      print(f"   Epoch {epoch+1}/{epochs} completed. Average Loss: {running_loss/len(train_loader):.4f}")

  print("✅ Training Complete!")
  return model, test_data

def predict_random_image(model, test_data):
  """Picks a random test image, asks the PyTorch model to classify it."""
  print("\n🔮 Testing AI on a random piece of apparel...")
  
  # Pick a random image from the test set
  idx = random.randint(0, len(test_data) - 1)
  image, true_label = test_data[idx]
  
  # PyTorch expects a batch dimension, so we add one: [1, 1, 28, 28]
  image_batch = image.unsqueeze(0)
  
  # Put the model in evaluation mode (turns off training-specific behaviors like Dropout)
  model.eval()
  
  # We don't need to calculate gradients for predicting, which saves memory
  with torch.no_grad():
      output = model(image_batch)
      
  # Find the index of the highest prediction value
  _, predicted_idx = torch.max(output, 1)
  ai_guess = CLASSES[predicted_idx.item()]
  actual_item = CLASSES[true_label]

  print("-" * 40)
  print(f"👕 Actual Item:    {actual_item}")
  print(f"🤖 AI Prediction:  {ai_guess}")
  if ai_guess == actual_item:
      print("✅ The AI categorized it correctly!")
  else:
      print("❌ The AI was confused.")
  print("-" * 40)

  # Visualize using Matplotlib
  # PyTorch Tensors are shaped [ColorChannels, Height, Width]. 
  # Matplotlib expects [Height, Width, ColorChannels]. We use .permute() to fix this.
  image_np = image.permute(1, 2, 0).numpy()
  
  plt.imshow(image_np, cmap='gray')
  plt.title(f"AI Guess: {ai_guess}\nActual: {actual_item}", fontsize=14, fontweight='bold')
  plt.axis('off')
  
  print("Close the image window to exit the script.")
  plt.show()

if __name__ == "__main__":
  print("=== PyTorch Apparel Classifier ===")
  trained_model, testing_dataset = train_model()
  predict_random_image(trained_model, testing_dataset)

Code Explanation (PyTorch Concepts)

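Tensors (transforms.ToTensor()): In NumPy, data is stored in arrays. In PyTorch, data is stored in Tensors. Tensors behave much like arrays, with two additions: they can be moved to a graphics card (GPU), where large batches of math often run much faster, and they can record operations so that gradients can be computed automatically. transforms.ToTensor() converts each image into a float Tensor with pixel values scaled to the range 0 to 1. Note that the script above never moves anything to a GPU, so it runs on the CPU as written.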
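As a rough illustration, the sketch below converts a NumPy array into a Tensor by hand and moves it to a GPU if one is present. The manual scaling approximates what ToTensor does for an 8-bit grayscale image. The random pixel data and the device variable are assumptions; the script in this project does not select a device.

```python
import numpy as np
import torch

# A fake 28x28 grayscale image with 8-bit pixel values (stand-in data)
pixels = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)

# Add a channel dimension and rescale 0..255 integers to 0.0..1.0 floats,
# approximating the effect of ToTensor on this kind of image
tensor = torch.from_numpy(pixels).unsqueeze(0).float() / 255.0

# Use the GPU when PyTorch can see one, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tensor = tensor.to(device)

print(tensor.shape, tensor.dtype, tensor.device)
```

To train on a GPU, the model and every batch would also need a .to(device) call, which is a change to the script and not something it does today.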

nn.Module: Every neural network in PyTorch must inherit from this class. It is the fundamental building block.

__init__(): Think of this as the warehouse where you store all your network’s layers.

forward(): Think of this as the instruction manual. It tells the data exactly what order to flow through the warehouse.

The 5-Step Training Loop: This is the hallmark of PyTorch.

optimizer.zero_grad(): Clears the gradients left over from the previous batch of images. PyTorch adds new gradients to existing ones by default, so skipping this step would mix batches together.

model(images): Feeds the images through the forward() function.

criterion(): Measures the error.

loss.backward(): PyTorch’s autograd engine applies the chain rule backward through the network to work out how much each weight contributed to the error (the gradients).

optimizer.step(): Tweaks the connections between the neurons so the AI does better next time.
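Why the loop clears gradients first is easiest to see on a single number. In this minimal sketch, w is a hypothetical one-value "weight" and the loss is simply w squared, so the gradient at w = 2 should be 4.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(w * w).backward()            # d(w^2)/dw = 2w = 4
first = w.grad.item()
print(first)                  # 4.0

(w * w).backward()            # no reset: the new gradient is added to the old one
accumulated = w.grad.item()
print(accumulated)            # 8.0

w.grad.zero_()                # manual reset, the job optimizer.zero_grad() does in the loop
(w * w).backward()
after_reset = w.grad.item()
print(after_reset)            # 4.0
```

Without the reset, the second update would be based on a gradient twice as large as the true one.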

with torch.no_grad(): When you are just asking the AI for a prediction (not training it), you wrap the code in this block. It tells PyTorch not to record operations for gradient computation, which reduces memory use and can speed up predictions.
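The effect can be checked directly on the output tensor. The single-layer model and random input below are stand-ins chosen for brevity.

```python
import torch
import torch.nn as nn

model = nn.Linear(784, 10)   # stand-in model (assumption)
model.eval()
x = torch.rand(1, 784)

tracked = model(x)           # autograd records this computation

with torch.no_grad():
    untracked = model(x)     # same math, nothing recorded

print(tracked.requires_grad)    # True
print(untracked.requires_grad)  # False
```

Both calls produce the same numbers. The only difference is whether PyTorch keeps the bookkeeping needed for a later backward pass.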

Execution Guide

Install PyTorch: Open your terminal and run: pip install torch torchvision matplotlib numpy. The PyTorch packages are large, so the download can take a few minutes.

Save the file: Create a new Python file named pytorch_classifier.py and paste the code above.

Run the script: Execute the script from your terminal: python pytorch_classifier.py

Observe the Output: The script will create a folder called ./data and download the images into it. The average loss printed after each epoch should go down as the training loop runs. Finally, a Matplotlib window will pop up showing you an item of clothing along with the model’s guess and the actual label.