The Complete Beginner’s Guide to Understanding Neural Networks

Ever wondered how your phone recognizes your face, or how Netflix knows exactly what show you’ll binge next? The answer is probably neural networks – the brain-inspired technology that’s quietly running a huge chunk of the AI we interact with every day.

Don’t worry if the term “neural network” sounds intimidating. By the end of this post, you’ll not only understand what they are, but you’ll even build a simple one yourself. No PhD required!

What Actually IS a Neural Network?

Think of a neural network like a really enthusiastic student who learns by example. Just like how you learned to recognize cats by seeing thousands of cat pictures as a kid, neural networks learn patterns by processing tons of data.

The “neural” part comes from how these systems loosely mimic brain neurons. In your brain, neurons receive signals, process them, and pass them along. Neural networks do something similar with mathematical operations instead of biological ones.

Here’s the basic idea: you feed the network a bunch of examples (like photos labeled “cat” or “not cat”), and it slowly learns to identify the patterns that make a cat… well, a cat. Pointy ears? Whiskers? That judgmental stare? The network figures it out.

The Building Blocks: Neurons and Layers

A neural network is made up of layers of artificial neurons (also called nodes). Each neuron takes some inputs, does a bit of math on them, and spits out an output.

Input Layer: This is where your data enters. If you’re analyzing images, each pixel might be one input.

Hidden Layer(s): This is where the magic happens. These layers process and transform the data, finding increasingly complex patterns.

Output Layer: This gives you the final answer. For a cat detector, it might output “85% chance this is a cat.”

The connections between neurons have different “weights”; think of these as how much influence one neuron has on another. During training, the network adjusts these weights to get better at its job.
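To make this concrete, here's a minimal sketch of a single artificial neuron in Python. The specific numbers are made up purely for illustration:

import numpy as np

# A single neuron: weighted sum of inputs, plus a bias, through an activation
inputs = np.array([0.5, 0.8, 0.2])     # e.g., three features from our data
weights = np.array([0.4, -0.6, 0.9])   # how much each input matters
bias = 0.1                             # an adjustable threshold

weighted_sum = np.dot(inputs, weights) + bias
output = 1 / (1 + np.exp(-weighted_sum))  # sigmoid squashes the result into (0, 1)
print(output)  # 0.5: this neuron is right on the fence

That's the whole job of one neuron. A network is just many of these wired together in layers, with training adjusting the weights and biases.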

How Do They Actually Learn?

Neural networks learn through a process that’s part trial-and-error, part mathematical optimization. Here’s the simplified version:

  1. Make a guess: The network processes an input and makes a prediction
  2. Check the answer: Compare the prediction to the correct answer
  3. Learn from mistakes: Adjust the weights to reduce the error
  4. Repeat: Do this thousands (or millions) of times

This process is called “backpropagation”; basically, the network works backwards from its mistakes to figure out which weights need tweaking.
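To see what "tweaking" means in code, here's a toy, single-weight version of the update rule (the numbers are invented for illustration):

# One gradient descent step on a single weight
learning_rate = 0.1   # how big a step to take
weight = 0.8          # the weight before the update

# Suppose backpropagation computed the gradient of the error with respect
# to this weight as 0.5 (i.e., increasing the weight increases the error)
gradient = 0.5

# Step in the opposite direction of the gradient to reduce the error
weight = weight - learning_rate * gradient
print(weight)  # 0.75: a small nudge toward less error

The full network we build below does the same thing, just for every weight at once.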

Let’s Build One! A Simple Neural Network in Python

Ready to get your hands dirty? We'll create a neural network that learns the XOR function (it outputs 1 when the inputs are different, 0 when they're the same). It's simple, but famously not linearly separable: no single straight line can split its outputs, so a lone neuron can't solve it. That makes it a perfect little showcase for why hidden layers matter.

First, let’s import what we need:

import numpy as np
import matplotlib.pyplot as plt

# Set random seed for reproducible results
np.random.seed(42)

Now, let’s create our neural network class:

class SimpleNeuralNetwork:
    def __init__(self):
        # Initialize weights randomly
        # We have 2 inputs, 4 hidden neurons, and 1 output
        self.weights1 = np.random.uniform(-1, 1, (2, 4))  # Input to hidden
        self.weights2 = np.random.uniform(-1, 1, (4, 1))  # Hidden to output
        
        # Biases (think of these as adjustable thresholds)
        self.bias1 = np.random.uniform(-1, 1, (1, 4))
        self.bias2 = np.random.uniform(-1, 1, (1, 1))
    
    def sigmoid(self, x):
        """Activation function - squashes values between 0 and 1"""
        return 1 / (1 + np.exp(-np.clip(x, -250, 250)))  # Clip to prevent overflow
    
    def sigmoid_derivative(self, x):
        """Derivative of sigmoid for backpropagation.

        Note: this expects x to already be a sigmoid *output* (an activation),
        since sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)).
        """
        return x * (1 - x)
    
    def forward(self, X):
        """Forward pass - make a prediction"""
        self.z1 = np.dot(X, self.weights1) + self.bias1
        self.a1 = self.sigmoid(self.z1)
        self.z2 = np.dot(self.a1, self.weights2) + self.bias2
        self.a2 = self.sigmoid(self.z2)
        return self.a2
    
    def backward(self, X, y, output):
        """Backward pass - learn from mistakes"""
        # Calculate the error
        output_error = y - output
        output_delta = output_error * self.sigmoid_derivative(output)
        
        # Calculate hidden layer error
        hidden_error = output_delta.dot(self.weights2.T)
        hidden_delta = hidden_error * self.sigmoid_derivative(self.a1)
        
        # Update weights and biases
        learning_rate = 0.5  # how big a step to take on each update
        self.weights2 += self.a1.T.dot(output_delta) * learning_rate
        self.bias2 += np.sum(output_delta, axis=0, keepdims=True) * learning_rate
        self.weights1 += X.T.dot(hidden_delta) * learning_rate
        self.bias1 += np.sum(hidden_delta, axis=0, keepdims=True) * learning_rate
    
    def train(self, X, y, epochs):
        """Train the network"""
        losses = []
        for i in range(epochs):
            output = self.forward(X)
            self.backward(X, y, output)
            
            # Calculate and store loss every 1000 epochs
            if i % 1000 == 0:
                loss = np.mean(np.square(y - output))
                losses.append(loss)
                print(f"Epoch {i}, Loss: {loss:.4f}")
        
        return losses

Now let’s create our training data (the XOR function):

# XOR training data
X = np.array([[0, 0],
              [0, 1], 
              [1, 0],
              [1, 1]])

y = np.array([[0],  # 0 XOR 0 = 0
              [1],  # 0 XOR 1 = 1  
              [1],  # 1 XOR 0 = 1
              [0]]) # 1 XOR 1 = 0

print("Training Data:")
print("Input -> Expected Output")
for i in range(len(X)):
    print(f"{X[i]} -> {y[i][0]}")

Time to train our network:

# Create and train the network
nn = SimpleNeuralNetwork()
print("\nTraining the network...")
losses = nn.train(X, y, 10000)

# Test our trained network
print("\nTesting the trained network:")
print("Input -> Predicted Output (Expected)")
for i in range(len(X)):
    prediction = nn.forward(np.array([X[i]]))
    print(f"{X[i]} -> {prediction[0][0]:.4f} ({y[i][0]})")

Let’s visualize how well it learned:

# Plot the training loss
plt.figure(figsize=(10, 6))
plt.plot(losses)
plt.title('Training Loss Over Time')
plt.xlabel('Epoch (x1000)')
plt.ylabel('Loss')
plt.grid(True)
plt.show()

# Test with some new data to see if it really learned XOR
test_cases = [[0.1, 0.9], [0.8, 0.2], [0.9, 0.9], [0.1, 0.1]]
print("\nTesting with slightly different inputs:")
for test in test_cases:
    prediction = nn.forward(np.array([test]))
    expected = 1 if (test[0] > 0.5) != (test[1] > 0.5) else 0
    print(f"{test} -> {prediction[0][0]:.4f} (expected ~{expected})")

What Just Happened?

Congratulations! You just built and trained a neural network from scratch. Here’s what happened:

  1. Forward Pass: We fed data through the network, layer by layer
  2. Loss Calculation: We measured how wrong our predictions were
  3. Backward Pass: We adjusted weights to reduce the error
  4. Repeat: We did this 10,000 times until the network got good at XOR

The network learned to map the XOR function without us explicitly programming the logic. It discovered the pattern through examples!
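One fun way to see this (a quick sketch, assuming you've just run the training code above so nn, X, and y are still defined) is to plot the network's output across the entire input square:

# Visualize the learned decision surface
xx, yy = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))
grid = np.column_stack([xx.ravel(), yy.ravel()])
preds = nn.forward(grid).reshape(xx.shape)

plt.figure(figsize=(6, 5))
plt.contourf(xx, yy, preds, levels=20, cmap='coolwarm')
plt.colorbar(label='Network output')
plt.scatter(X[:, 0], X[:, 1], c=y.ravel(), edgecolors='black', s=100)
plt.title('What the Network Learned')
plt.xlabel('Input 1')
plt.ylabel('Input 2')
plt.show()

You should see two "islands" where the output is close to 1 (around the corners where the inputs differ) and low regions near (0, 0) and (1, 1) where it's close to 0.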

Where Do We Go From Here?

This simple network barely scratches the surface. Real-world neural networks can have:

  • Millions or billions of parameters
  • Hundreds of layers (hence “deep” learning)
  • Specialized architectures for different tasks

But the core principles remain the same: layers of neurons, weighted connections, and learning from data.

The Big Picture

Neural networks aren’t magic – they’re sophisticated pattern recognition systems. They excel at finding complex relationships in data that would be nearly impossible to program manually.

Whether it’s recognizing your voice, translating languages, or beating humans at complex games, neural networks are learning these skills the same way our simple XOR network learned: through lots of examples and gradual improvement.

Now you understand (and can build) the same fundamental technology that powers some of the most impressive AI systems out there.