Solving XOR without Backpropagation: A Genetic Algorithm Approach 🧬
Source: Dev.to
Introduction
We usually hear about backpropagation and gradient descent when training neural networks. They are the industry standards, the "calculus" way of learning.
But what if we took a step back and looked at how nature learns? In this post we solve the classic XOR problem (a problem that a single neuron cannot solve) using a Genetic Algorithm (GA). No gradients are calculated; we simulate "survival of the fittest."
The XOR Problem
The XOR (exclusive OR) gate is a non-linear problem: no single straight line can separate its outputs, so a network with at least one hidden layer is required.
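To see this concretely, we can fit the best possible linear model to the four XOR points. The sketch below (not part of the original post) uses a NumPy least-squares fit; the optimal linear model collapses to predicting 0.5 for every input, getting none of the four cases right:

```python
import numpy as np

# XOR truth table
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

# Best linear model y ~ X @ w + b, found by least squares
A = np.hstack([X, np.ones((4, 1))])  # append a bias column
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
preds = A @ coeffs

print(preds)  # every prediction collapses to 0.5
```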
Genetic Algorithm Overview
In a GA we treat neural networks as individuals in a population.
- Fitness: How well the network solves XOR (the inverse of its error).
- Selection: The best networks reproduce.
- Crossover: We mix the weights of two parents.
- Mutation: Random noise is added to maintain genetic diversity.
Python Implementation
Note: This code assumes you have a basic `NeuralNetwork` class with a `.forward()` method and weight matrices `W1`, `W2`, etc.
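The exact class isn't shown in the post. A minimal sketch that matches the attribute names used below (`W1`/`b1` through `W3`/`b3`) could look like the following; the 2-4-4-1 layer sizes and sigmoid activations are assumptions, not taken from the original:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

class NeuralNetwork:
    """Minimal 2-4-4-1 feedforward net (layer sizes are an assumption)."""
    def __init__(self):
        self.W1 = np.random.randn(2, 4)
        self.b1 = np.zeros((1, 4))
        self.W2 = np.random.randn(4, 4)
        self.b2 = np.zeros((1, 4))
        self.W3 = np.random.randn(4, 1)
        self.b3 = np.zeros((1, 1))

    def forward(self, X):
        h1 = sigmoid(X @ self.W1 + self.b1)
        h2 = sigmoid(h1 @ self.W2 + self.b2)
        return sigmoid(h2 @ self.W3 + self.b3)
```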
Fitness Calculation
```python
import numpy as np
from basic_neural_network import NeuralNetwork

def calculate_fitness(nn):
    """
    Measures the XOR performance of the network.
    The lower the error, the higher the fitness (score).
    """
    X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y_target = np.array([[0], [1], [1], [0]])
    predictions = nn.forward(X_xor)
    mse = np.mean((y_target - predictions) ** 2)  # Mean Squared Error
    # Add a tiny constant to avoid division by zero
    return 1 / (mse + 1e-9)
```
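As a sanity check on the fitness scale, the `1e-9` constant caps the score at roughly one billion for a perfect network. The self-contained sketch below restates the fitness function and feeds it a stand-in object (not the post's `NeuralNetwork`) that returns the exact XOR targets:

```python
import numpy as np

class PerfectXOR:
    """Stand-in 'network' that returns the exact XOR targets."""
    def forward(self, X):
        return np.array([[0], [1], [1], [0]], dtype=float)

def calculate_fitness(nn):
    X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y_target = np.array([[0], [1], [1], [0]])
    mse = np.mean((y_target - nn.forward(X_xor)) ** 2)
    return 1 / (mse + 1e-9)

print(calculate_fitness(PerfectXOR()))  # ~1e9, the maximum possible score
```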
Mutation
```python
def mutate(nn, mutation_rate=0.1, strength=0.5):
    """
    Adds random noise to the network's weights (in place).
    """
    parameters = [nn.W1, nn.b1, nn.W2, nn.b2, nn.W3, nn.b3]
    for param in parameters:
        if np.random.rand() < mutation_rate:
            # Add random Gaussian noise
            noise = np.random.randn(*param.shape) * strength
            param += noise
```
Crossover
```python
def crossover(parent1, parent2):
    """
    Produces a child by averaging two parents' weights.
    """
    child = NeuralNetwork()
    child.W1 = (parent1.W1 + parent2.W1) / 2
    child.b1 = (parent1.b1 + parent2.b1) / 2
    child.W2 = (parent1.W2 + parent2.W2) / 2
    child.b2 = (parent1.b2 + parent2.b2) / 2
    child.W3 = (parent1.W3 + parent2.W3) / 2
    child.b3 = (parent1.b3 + parent2.b3) / 2
    return child
```
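Averaging is the simplest form of crossover, but it tends to pull the population toward its mean and can wash out diversity. A common alternative is uniform crossover, where each weight is inherited whole from one parent chosen by a coin flip. The sketch below is not from the original post and uses a minimal stand-in class with the same parameter names:

```python
import numpy as np

class NeuralNetwork:
    """Stand-in with the same parameter names as the post's network."""
    def __init__(self):
        self.W1 = np.random.randn(2, 4); self.b1 = np.zeros((1, 4))
        self.W2 = np.random.randn(4, 4); self.b2 = np.zeros((1, 4))
        self.W3 = np.random.randn(4, 1); self.b3 = np.zeros((1, 1))

def uniform_crossover(parent1, parent2):
    """Each weight is copied from one parent chosen by a coin flip."""
    child = NeuralNetwork()
    for name in ["W1", "b1", "W2", "b2", "W3", "b3"]:
        p1, p2 = getattr(parent1, name), getattr(parent2, name)
        mask = np.random.rand(*p1.shape) < 0.5
        setattr(child, name, np.where(mask, p1, p2))
    return child
```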
Main Evolution Loop
```python
# --- MAIN LOOP ---
population_size = 50
population = [NeuralNetwork() for _ in range(population_size)]
generation_count = 1000

print("Evolution starting...")
for gen in range(generation_count):
    # Evaluate fitness and sort (highest score first)
    population = sorted(population, key=calculate_fitness, reverse=True)
    best_individual = population[0]
    best_error = 1 / calculate_fitness(best_individual)

    if gen % 100 == 0:
        print(f"Gen {gen}: Lowest Error (MSE): {best_error:.5f}")
    if best_error < 0.001:
        print("Solution found!")
        break

    # Create new generation (elitism + crossover)
    new_population = []
    # Keep top 5 individuals unchanged (elitism)
    new_population.extend(population[:5])
    # Fill the rest by breeding top performers
    while len(new_population) < population_size:
        parent1 = population[np.random.randint(0, 10)]  # top 20%
        parent2 = population[np.random.randint(0, 10)]
        child = crossover(parent1, parent2)
        mutate(child, mutation_rate=0.3, strength=0.2)
        new_population.append(child)
    population = new_population
```
Final Test
```python
print("\n--- Training Complete ---")

# Re-sort in case the loop ended without an early break,
# so population[0] really is the best individual
population = sorted(population, key=calculate_fitness, reverse=True)

X_test = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
result = population[0].forward(X_test)

print("XOR Inputs:")
print(X_test)
print("Trained Network Predictions:")
print(np.round(result, 3))
```
Why Use a Genetic Algorithm Over Backpropagation?
- Sparse rewards: In many problems you only know the outcome at the end (e.g., winning a game).
- Non-differentiable domains: Discrete logic or complex physical simulations may not provide usable gradients.
Conclusion
Evolutionary approaches can solve problems where gradient-based methods struggle. Have you experimented with evolutionary code? Share your experiences in the comments! Happy coding!