Solving XOR without Backpropagation: A Genetic Algorithm Approach 🧬
Source: Dev.to
Introduction
We usually hear about backpropagation and gradient descent when training neural networks. They are the industry standards, the "calculus" way of learning.
But what if we took a step back and looked at how nature learns? In this post we solve the classic XOR problem (a problem that a single neuron cannot solve) using a Genetic Algorithm (GA). No gradients are calculated; we simulate "survival of the fittest."
The XOR Problem
The XOR (exclusive OR) gate is a non-linear problem: no single straight line can separate its outputs, so a network with at least one hidden layer is required.
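To see this concretely, we can fit the best possible linear model to the four XOR points. The sketch below (not part of the original post) uses a NumPy least-squares fit; the optimal linear model collapses to predicting 0.5 for every input, getting none of the four cases right:

```python
import numpy as np

# XOR truth table
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

# Best linear model y ~ X @ w + b, found by least squares
A = np.hstack([X, np.ones((4, 1))])  # append a bias column
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
preds = A @ coeffs

print(preds)  # every prediction collapses to 0.5
```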
Genetic Algorithm Overview
In a GA we treat neural networks as individuals in a population.
- Fitness: How well the network solves XOR (the inverse of its error).
- Selection: The best networks reproduce.
- Crossover: We mix the weights of two parents.
- Mutation: Random noise is added to maintain genetic diversity.
Python Implementation
Note: This code assumes you have a basic `NeuralNetwork` class with a `.forward()` method and weight matrices `W1`, `W2`, etc.
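The exact class isn't shown in the post. A minimal sketch that matches the attribute names used below (`W1`/`b1` through `W3`/`b3`) could look like the following; the 2-4-4-1 layer sizes and sigmoid activations are assumptions, not taken from the original:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

class NeuralNetwork:
    """Minimal 2-4-4-1 feedforward net (layer sizes are an assumption)."""
    def __init__(self):
        self.W1 = np.random.randn(2, 4)
        self.b1 = np.zeros((1, 4))
        self.W2 = np.random.randn(4, 4)
        self.b2 = np.zeros((1, 4))
        self.W3 = np.random.randn(4, 1)
        self.b3 = np.zeros((1, 1))

    def forward(self, X):
        h1 = sigmoid(X @ self.W1 + self.b1)
        h2 = sigmoid(h1 @ self.W2 + self.b2)
        return sigmoid(h2 @ self.W3 + self.b3)
```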
Fitness Calculation
```python
import numpy as np
from basic_neural_network import NeuralNetwork

def calculate_fitness(nn):
    """
    Measures the XOR performance of the network.
    The lower the error, the higher the fitness (score).
    """
    X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y_target = np.array([[0], [1], [1], [0]])
    predictions = nn.forward(X_xor)
    mse = np.mean((y_target - predictions) ** 2)  # Mean Squared Error
    # Add a tiny constant to avoid division by zero
    return 1 / (mse + 1e-9)
```
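As a sanity check on the fitness scale, the `1e-9` constant caps the score at roughly one billion for a perfect network. The self-contained sketch below restates the fitness function and feeds it a stand-in object (not the post's `NeuralNetwork`) that returns the exact XOR targets:

```python
import numpy as np

class PerfectXOR:
    """Stand-in 'network' that returns the exact XOR targets."""
    def forward(self, X):
        return np.array([[0], [1], [1], [0]], dtype=float)

def calculate_fitness(nn):
    X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y_target = np.array([[0], [1], [1], [0]])
    mse = np.mean((y_target - nn.forward(X_xor)) ** 2)
    return 1 / (mse + 1e-9)

print(calculate_fitness(PerfectXOR()))  # ~1e9, the maximum possible score
```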
Mutation
```python
def mutate(nn, mutation_rate=0.1, strength=0.5):
    """
    Adds random noise to the network's weights (in place).
    """
    parameters = [nn.W1, nn.b1, nn.W2, nn.b2, nn.W3, nn.b3]
    for param in parameters:
        if np.random.rand() < mutation_rate:
            # Add random Gaussian noise
            noise = np.random.randn(*param.shape) * strength
            param += noise
```
Crossover
```python
def crossover(parent1, parent2):
    """
    Produces a child by averaging two parents' weights.
    """
    child = NeuralNetwork()
    child.W1 = (parent1.W1 + parent2.W1) / 2
    child.b1 = (parent1.b1 + parent2.b1) / 2
    child.W2 = (parent1.W2 + parent2.W2) / 2
    child.b2 = (parent1.b2 + parent2.b2) / 2
    child.W3 = (parent1.W3 + parent2.W3) / 2
    child.b3 = (parent1.b3 + parent2.b3) / 2
    return child
```
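Averaging is the simplest form of crossover, but it tends to pull the population toward its mean and can wash out diversity. A common alternative is uniform crossover, where each weight is inherited whole from one parent chosen by a coin flip. The sketch below is not from the original post and uses a minimal stand-in class with the same parameter names:

```python
import numpy as np

class NeuralNetwork:
    """Stand-in with the same parameter names as the post's network."""
    def __init__(self):
        self.W1 = np.random.randn(2, 4); self.b1 = np.zeros((1, 4))
        self.W2 = np.random.randn(4, 4); self.b2 = np.zeros((1, 4))
        self.W3 = np.random.randn(4, 1); self.b3 = np.zeros((1, 1))

def uniform_crossover(parent1, parent2):
    """Each weight is copied from one parent chosen by a coin flip."""
    child = NeuralNetwork()
    for name in ["W1", "b1", "W2", "b2", "W3", "b3"]:
        p1, p2 = getattr(parent1, name), getattr(parent2, name)
        mask = np.random.rand(*p1.shape) < 0.5
        setattr(child, name, np.where(mask, p1, p2))
    return child
```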
Main Evolution Loop
```python
# --- MAIN LOOP ---
population_size = 50
population = [NeuralNetwork() for _ in range(population_size)]
generation_count = 1000

print("Evolution starting...")
for gen in range(generation_count):
    # Evaluate fitness and sort (highest score first)
    population = sorted(population, key=calculate_fitness, reverse=True)
    best_individual = population[0]
    best_error = 1 / calculate_fitness(best_individual)

    if gen % 100 == 0:
        print(f"Gen {gen}: Lowest Error (MSE): {best_error:.5f}")
    if best_error < 0.001:
        print("Solution found!")
        break

    # Create new generation (elitism + crossover)
    new_population = []
    # Keep top 5 individuals unchanged (elitism)
    new_population.extend(population[:5])
    # Fill the rest by breeding top performers
    while len(new_population) < population_size:
        parent1 = population[np.random.randint(0, 10)]  # top 20%
        parent2 = population[np.random.randint(0, 10)]
        child = crossover(parent1, parent2)
        mutate(child, mutation_rate=0.3, strength=0.2)
        new_population.append(child)
    population = new_population
```
Final Test
```python
print("\n--- Training Complete ---")

# Re-sort in case the loop ended without an early break,
# so population[0] really is the best individual
population = sorted(population, key=calculate_fitness, reverse=True)

X_test = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
result = population[0].forward(X_test)

print("XOR Inputs:")
print(X_test)
print("Trained Network Predictions:")
print(np.round(result, 3))
```
Why Use a Genetic Algorithm Over Backpropagation?
- Sparse rewards: In many problems you only know the outcome at the end (e.g., winning a game).
- Non-differentiable domains: Discrete logic or complex physical simulations may not provide usable gradients.
Conclusion
Evolutionary approaches can solve problems where gradient-based methods struggle. Have you experimented with evolutionary code? Share your experiences in the comments! Happy coding!