Solving XOR without Backpropagation: A Genetic Algorithm Approach 🧬

Published: January 6, 2026 at 01:08 PM EST
3 min read
Source: Dev.to


Introduction

We usually hear about backpropagation and gradient descent when training neural networks. They are the industry standards, the "calculus" way of learning.
But what if we took a step back and looked at how nature learns? In this post we solve the classic XOR problem (a problem that a single neuron cannot solve) using a Genetic Algorithm (GA). No gradients are calculated; we simulate "survival of the fittest."

The XOR Problem

The XOR (exclusive OR) gate is a non-linear problem: no single straight line can separate its outputs. To see why, suppose a single neuron with weights w1, w2 and bias b could solve it. The "off" cases require b < 0 and w1 + w2 + b < 0, while the "on" cases require w1 + b > 0 and w2 + b > 0. Adding the two "on" inequalities gives w1 + w2 + 2b > 0, but adding the two "off" inequalities gives w1 + w2 + 2b < 0, a contradiction. A network with at least one hidden layer is therefore required.

Genetic Algorithm Overview

In a GA we treat neural networks as individuals in a population.

  • Fitness – How well the network solves XOR (inverse of error).
  • Selection – The best networks reproduce.
  • Crossover – We mix the weights of two parents.
  • Mutation – Random noise is added to maintain genetic diversity.

Python Implementation

Note: This code assumes you have a basic NeuralNetwork class with a .forward() method and weight matrices W1, W2, etc.
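For reference, here is one minimal sketch of such a class. The 2-4-4-1 layer sizes and sigmoid activations are my assumptions; the original class may differ, as long as it exposes a forward() method and the W1/b1 through W3/b3 attributes used below.

```python
import numpy as np

class NeuralNetwork:
    """A small 2-4-4-1 feedforward network with sigmoid activations.

    Layer sizes are illustrative; any shape mapping 2 inputs to 1 output
    works with the GA code below, since evolution never needs gradients.
    """

    def __init__(self):
        self.W1 = np.random.randn(2, 4)
        self.b1 = np.zeros((1, 4))
        self.W2 = np.random.randn(4, 4)
        self.b2 = np.zeros((1, 4))
        self.W3 = np.random.randn(4, 1)
        self.b3 = np.zeros((1, 1))

    @staticmethod
    def _sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(self, X):
        # Two hidden layers, then a single sigmoid output in (0, 1)
        h1 = self._sigmoid(X @ self.W1 + self.b1)
        h2 = self._sigmoid(h1 @ self.W2 + self.b2)
        return self._sigmoid(h2 @ self.W3 + self.b3)
```

Note that no activation derivatives are stored anywhere: the GA only ever calls forward(), which is exactly what makes this approach gradient-free.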

Fitness Calculation

import numpy as np
from basic_neural_network import NeuralNetwork

def calculate_fitness(nn):
    """
    Measures the XOR performance of the network.
    The lower the error, the higher the fitness (score).
    """
    X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y_target = np.array([[0], [1], [1], [0]])

    predictions = nn.forward(X_xor)
    mse = np.mean((y_target - predictions) ** 2)  # Mean Squared Error

    # Add a tiny constant to avoid division by zero
    return 1 / (mse + 1e-9)
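To see why the inverse-error formula behaves as a fitness score, here is a standalone sanity check that applies the same formula directly to hand-written prediction arrays (no network required); the example predictions are made up for illustration:

```python
import numpy as np

y_target = np.array([[0], [1], [1], [0]])

def fitness_from_predictions(preds):
    # Same formula as calculate_fitness, applied directly to predictions
    mse = np.mean((y_target - preds) ** 2)
    return 1 / (mse + 1e-9)

good = np.array([[0.05], [0.9], [0.95], [0.1]])  # close to the XOR targets
bad = np.array([[0.5], [0.5], [0.5], [0.5]])     # uninformative constant output

print(fitness_from_predictions(good) > fitness_from_predictions(bad))  # True
```

A network whose outputs are near the targets gets a fitness orders of magnitude higher than one stuck at 0.5, which is what drives selection in the main loop.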

Mutation

def mutate(nn, mutation_rate=0.1, strength=0.5):
    """
    Adds random noise to the network's weights.
    """
    parameters = [nn.W1, nn.b1, nn.W2, nn.b2, nn.W3, nn.b3]

    for param in parameters:
        if np.random.rand() < mutation_rate:
            # Add random Gaussian noise
            noise = np.random.randn(*param.shape) * strength
            param += noise

Crossover

def crossover(parent1, parent2):
    """
    Produces a child by averaging two parents' weights.
    """
    child = NeuralNetwork()
    child.W1 = (parent1.W1 + parent2.W1) / 2
    child.b1 = (parent1.b1 + parent2.b1) / 2
    child.W2 = (parent1.W2 + parent2.W2) / 2
    child.b2 = (parent1.b2 + parent2.b2) / 2
    child.W3 = (parent1.W3 + parent2.W3) / 2
    child.b3 = (parent1.b3 + parent2.b3) / 2
    return child

Main Evolution Loop

# --- MAIN LOOP ---

population_size = 50
population = [NeuralNetwork() for _ in range(population_size)]
generation_count = 1000

print("Evolution starting...")

for gen in range(generation_count):
    # Evaluate fitness and sort (highest score first)
    population = sorted(population, key=lambda x: calculate_fitness(x), reverse=True)

    best_individual = population[0]
    best_error = 1 / calculate_fitness(best_individual)

    if gen % 100 == 0:
        print(f"Gen {gen}: Lowest Error (MSE): {best_error:.5f}")

    # Check for convergence every generation, not only when printing
    if best_error < 0.001:
        print("Solution found!")
        break

    # Create new generation (elitism + crossover)
    new_population = []

    # Keep top 5 individuals unchanged (elitism)
    new_population.extend(population[:5])

    # Fill the rest by breeding top performers
    while len(new_population) < population_size:
        parent1 = population[np.random.randint(0, 10)]  # top 20%
        parent2 = population[np.random.randint(0, 10)]

        child = crossover(parent1, parent2)
        mutate(child, mutation_rate=0.3, strength=0.2)
        new_population.append(child)

    population = new_population

Final Test

print("\n--- Training Complete ---")
X_test = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
result = population[0].forward(X_test)

print("XOR Inputs:")
print(X_test)
print("Trained Network Predictions:")
print(np.round(result, 3))

Why Use a Genetic Algorithm Over Backpropagation?

  • Sparse rewards: In many problems you only know the outcome at the end (e.g., winning a game).
  • Non‑differentiable domains: Discrete logic or complex physical simulations may not provide usable gradients.
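To make the non-differentiable point concrete, here is a self-contained toy example: evolving a bit string toward a hidden target, where fitness is a discrete match count with no gradient anywhere. The problem and all parameters are illustrative, but the sort/select/crossover/mutate loop mirrors the XOR one above:

```python
import numpy as np

rng = np.random.default_rng(0)
target = rng.integers(0, 2, size=20)  # hidden bit pattern to discover

def fitness(genome):
    # Integer-valued score: undefined gradient, but a GA only needs a ranking
    return int(np.sum(genome == target))

population = [rng.integers(0, 2, size=20) for _ in range(30)]
for gen in range(200):
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == 20:
        break
    parents = population[:10]  # truncation selection, as in the XOR loop
    children = []
    while len(children) < 20:
        p1, p2 = rng.choice(10, size=2)
        mask = rng.integers(0, 2, size=20).astype(bool)  # uniform crossover
        child = np.where(mask, parents[p1], parents[p2])
        flip = rng.random(20) < 0.05                     # bit-flip mutation
        child = np.where(flip, 1 - child, child)
        children.append(child)
    population = parents + children  # elitism: parents survive unchanged

print(fitness(population[0]))
```

Backpropagation has nothing to grab onto here, yet the evolutionary loop converges quickly because it only compares scores, never differentiates them.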

Conclusion

Evolutionary approaches can solve problems where gradient‑based methods struggle. Have you experimented with evolutionary code? Share your experiences in the comments! Happy coding! 🚀
