Build a CNN from Scratch Using Python and NumPy
Deep learning frameworks like TensorFlow and PyTorch make building CNNs extremely easy — sometimes too easy.
Because most operations happen behind the scenes, beginners often miss the core intuition of how convolution, pooling, and feature extraction truly work.

In this blog, we will build a complete forward pass of a CNN from scratch using only Python and NumPy.
No TensorFlow, no PyTorch, just raw mathematical operations. I have tried to keep it as simple as possible: no jargon, just the basic building blocks of a CNN.
By the end, you’ll understand:
- How convolution filters slide across images
- How ReLU introduces non-linearity
- How max pooling shrinks spatial dimensions
- How flattening and dense layers work
- How softmax generates final class probabilities
What Does a CNN Actually Do?
A CNN takes an image, extracts features, and predicts its class.
The core idea is:
Image → Convolution → ReLU → Pooling → Flatten → Fully Connected → Softmax
Each layer transforms the image into a more meaningful representation.
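Concretely, here is how the shapes evolve for the sizes used in the test run at the end of this post (a 28×28 grayscale image, one 3×3 filter, 2×2 pooling, and 10 classes):

```python
# Shape trace through the forward pass (assumed sizes, matching the test run below):
# input image:       (28, 28)
# after convolution: (26, 26)   # 28 - 3 + 1 = 26
# after ReLU:        (26, 26)   # element-wise, shape unchanged
# after max pooling: (13, 13)   # 2x2 windows, stride 2
# after flatten:     (169, 1)   # 13 * 13 = 169
# after dense:       (10, 1)    # one raw score (logit) per class
# after softmax:     (10, 1)    # probabilities summing to 1
```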
🧱 Step 1: Convolution Layer
A convolution layer applies small filters (e.g., 3×3) over the image to detect patterns like edges, curves, textures, etc.
Python Implementation:
def conv_forward(image, conv_filter):
filter_size = conv_filter.shape[1]
h, w = image.shape
output_size = h - filter_size + 1
output = np.zeros((output_size, output_size))
for i in range(output_size):
for j in range(output_size):
output[i][j] = np.sum(image[i:i+filter_size, j:j+filter_size] * conv_filter)
return output
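To see it in action, here is a small usage sketch with a hand-crafted vertical edge detector (the filter values and the toy image are illustrative choices, not part of the implementation above):

```python
import numpy as np

# A classic hand-crafted vertical edge detector (illustrative values)
edge_filter = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]])

# A toy image: bright on the left, dark on the right
img = np.zeros((5, 5))
img[:, :3] = 1.0

feature_map = conv_forward(img, edge_filter)
print(feature_map.shape)  # (3, 3): 5 - 3 + 1 = 3
print(feature_map)        # strongest responses where the brightness changes
```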
⚡ Step 2: ReLU Activation
ReLU (Rectified Linear Unit) introduces non-linearity.
Formula: ReLU(x) = max(0, x)
def relu(x):
    # Replace every negative value with 0; keep positives unchanged
    return np.maximum(0, x)
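A quick sanity check (the sample values are arbitrary):

```python
import numpy as np

x = np.array([[-2.0, 0.5],
              [ 3.0, -1.0]])
print(relu(x))
# [[0.  0.5]
#  [3.  0. ]]
```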
🧩 Step 3: Max Pooling
Pooling reduces image size while keeping important features.
def maxpool(x, size=2, stride=2):
    h, w = x.shape
    output_h = h // size
    output_w = w // size
    output = np.zeros((output_h, output_w))
    # Iterate over output positions so the indices never run past the output array
    for i in range(output_h):
        for j in range(output_w):
            # Keep only the maximum of each size x size window
            output[i, j] = np.max(x[i*stride:i*stride+size, j*stride:j*stride+size])
    return output
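For example, pooling a 4×4 map down to 2×2 (the values are arbitrary):

```python
import numpy as np

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [0, 1, 9, 2],
               [3, 2, 4, 6]])
print(maxpool(fm))
# [[4. 5.]
#  [3. 9.]]
```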
📉 Step 4: Flatten Layer
Flattening converts the 2D feature map into a 1D vector.
def flatten(x):
    # Reshape the 2D feature map into a (h * w, 1) column vector
    return x.reshape(-1, 1)
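For a 13×13 feature map this yields a (169, 1) column vector, which is exactly the input shape the dense layer below expects:

```python
import numpy as np

fm = np.zeros((13, 13))
print(flatten(fm).shape)  # (169, 1)
```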
🔗 Step 5: Fully Connected Layer
This layer learns the final decision boundary.
def dense_forward(x, weights, bias):
    # weights: (num_classes, num_features), x: (num_features, 1), bias: (num_classes, 1)
    return np.dot(weights, x) + bias
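A quick shape sketch, assuming 10 classes and 169 input features (the same sizes as the test run below):

```python
import numpy as np

W = np.random.randn(10, 169)   # one row of weights per class
b = np.random.randn(10, 1)
x = np.random.randn(169, 1)    # flattened feature vector
print(dense_forward(x, W, b).shape)  # (10, 1): one raw score (logit) per class
```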
🎯 Step 6: Softmax Output
Softmax turns raw scores into probabilities:
def softmax(logits):
    # Subtracting the max before exponentiating avoids numerical overflow
    e = np.exp(logits - np.max(logits))
    return e / np.sum(e)
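The outputs are non-negative and sum to 1, so they can be read as class probabilities (the sample logits here are arbitrary):

```python
import numpy as np

logits = np.array([[2.0], [1.0], [0.1]])
probs = softmax(logits)
print(probs.ravel())  # e.g. [0.659 0.242 0.099]
print(probs.sum())    # 1.0
```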
🚀 Putting It All Together (Full Forward Pass)
def cnn_forward(image, conv_filter, fc_weights, fc_bias):
    # 1. Convolution
    x = conv_forward(image, conv_filter)
    # 2. ReLU
    x = relu(x)
    # 3. Max Pooling
    x = maxpool(x)
    # 4. Flatten
    x = flatten(x)
    # 5. Fully Connected
    logits = dense_forward(x, fc_weights, fc_bias)
    # 6. Softmax
    probs = softmax(logits)
    return probs
🧪 Test Run Example
import numpy as np

image = np.random.randn(28, 28)        # fake 28x28 grayscale image
conv_filter = np.random.randn(3, 3)    # one random 3x3 filter
fc_weights = np.random.randn(10, 169)  # 169 = 13 * 13 (28 -> 26 after conv, 13 after pooling)
fc_bias = np.random.randn(10, 1)       # 10 output classes
output = cnn_forward(image, conv_filter, fc_weights, fc_bias)
print("Class probabilities:\n", output)
You now have a fully functional CNN forward pass built entirely with NumPy.
Conclusion
Building a CNN from scratch demystifies the inner workings of deep learning models.
By manually implementing convolution, activation functions, pooling, and fully connected layers, we gain a much clearer understanding of:
- How CNNs extract features
- How spatial transformations occur
- How dense layers compute final predictions
- How softmax translates scores into probabilities
This foundational knowledge helps you become a better deep learning engineer — even when using high-level frameworks.
📬 Want to connect or collaborate? Head over to the Contact page or find me on GitHub or LinkedIn.