Beyond the Standard Model: Introducing the 'Cousins' of the Memory-Native Neural Network Family
The “Cousin” Philosophy
Unlike the standard models found in api.py, these cousins are specialized tools:
- Standalone architectures, not general‑purpose models
- Dependent on dedicated C‑compiled backends
- Designed for specific, complex temporal challenges that demand unique ways of remembering
1. DTPN — The Universal Persistence Hybrid
Dual‑Track Persistence Network
If the standard AMN is a master of context, DTPN is the master of persistence. It bridges immediate reaction and long‑term knowledge through three distinct retention tracks.
Track 1: The Echo (Temporal Fluidity)
- Retains a fraction of the immediate previous output (β factor)
- Ensures smooth transitions between time steps
Track 2: The State (Stateful Neurons)
- Individual neurons maintain a decaying internal reservoir (α factor)
- Acts as a medium‑term memory buffer
Track 3: The Manifold (Global Memory)
- A shared associative whiteboard
- Stores long‑term contextual information
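A minimal NumPy sketch of how the three tracks could interact inside a single layer; the class name, the α/β update rules, and the manifold read/write scheme are illustrative assumptions, not the project's actual API:

```python
import numpy as np

class DTPNLayer:
    """Toy layer combining the Echo, State, and Manifold tracks (illustrative only)."""

    def __init__(self, in_dim, out_dim, alpha=0.9, beta=0.3, manifold_slots=64, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, (out_dim, in_dim))
        self.alpha = alpha                                    # reservoir decay (Track 2)
        self.beta = beta                                      # echo fraction (Track 1)
        self.prev_out = np.zeros(out_dim)                     # Track 1: the Echo
        self.reservoir = np.zeros(out_dim)                    # Track 2: the State
        self.manifold = np.zeros((manifold_slots, out_dim))   # Track 3: the Manifold
        self.keys = rng.normal(0.0, 1.0, (manifold_slots, out_dim))

    def step(self, x):
        drive = np.tanh(self.W @ x)
        # Track 2: each neuron keeps a decaying internal reservoir of its own activity
        self.reservoir = self.alpha * self.reservoir + (1.0 - self.alpha) * drive
        # Track 3: read the shared associative manifold by key similarity
        attn = np.exp(self.keys @ drive)
        attn /= attn.sum()
        recalled = attn @ self.manifold
        # Track 1: blend an echo of the previous output into the new one
        out = (1.0 - self.beta) * np.tanh(drive + self.reservoir + recalled) + self.beta * self.prev_out
        # Write the new output back into the manifold, weighted by the same attention
        self.manifold += 0.01 * np.outer(attn, out)
        self.prev_out = out
        return out
```

In this sketch, β controls how much of the previous output echoes into the current one, while α sets how slowly each neuron's reservoir forgets.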
Best For: Tracking micro‑fluctuations, medium‑term states, and long‑term facts simultaneously.
2. Hyper‑AMN — The Multi‑Head Specialist
Multi‑Head Associative Manifold
While a standard AMN uses a single global memory manifold, Hyper‑AMN introduces a multi‑head memory system, akin to a brain with specialized compartments.
Head Gating Mechanism
Information is routed into domain‑specific manifolds:
- Spatial Manifold – Positional and structural patterns
- Emotional Manifold – Sentiment and tone
- Logical Manifold – Reasoning and causal links
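A rough NumPy sketch of what such a head-gating step could look like; the head names match the list above, but the gating rule and write rule are assumptions for illustration, not the library's API:

```python
import numpy as np

class HyperAMNMemory:
    """Toy multi-head memory: a softmax gate routes each feature to named manifolds."""

    HEADS = ("spatial", "emotional", "logical")

    def __init__(self, dim, slots=32, seed=0):
        rng = np.random.default_rng(seed)
        self.gate_W = rng.normal(0.0, 0.1, (len(self.HEADS), dim))        # gating weights
        self.keys = {h: rng.normal(0.0, 1.0, (slots, dim)) for h in self.HEADS}
        self.manifolds = {h: np.zeros((slots, dim)) for h in self.HEADS}

    def write(self, feature):
        # Softmax over heads: how strongly does each domain claim this feature?
        logits = self.gate_W @ feature
        gate = np.exp(logits - logits.max())
        gate /= gate.sum()
        for g, head in zip(gate, self.HEADS):
            attn = np.exp(self.keys[head] @ feature)
            attn /= attn.sum()
            # Gate-scaled, attention-weighted write into the head's own manifold
            self.manifolds[head] += 0.05 * g * np.outer(attn, feature)
        return dict(zip(self.HEADS, gate))
```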
Best For: Complex data streams where categorical separation (e.g., how something is said vs what is said) is essential.
3. SGW‑AMN — The “Conscious” Bottleneck
Sparse Global Workspace
Inspired by Global Workspace Theory, SGW‑AMN proposes that memory is strongest when forced through a bottleneck.
- Thousands of neurons compete
- Only a few enter a tiny global workspace
- Memory becomes attention by compression
This competitive routing ensures that only the most salient features are stored.
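A minimal sketch of a top-k bottleneck in NumPy; the sizes, the magnitude-based competition rule, and the random compression matrix are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N_NEURONS, WORKSPACE_SIZE, K = 4096, 16, 8
# Fixed random projection that compresses the winners down to workspace size (an assumption)
projection = rng.normal(0.0, 1.0 / np.sqrt(N_NEURONS), (WORKSPACE_SIZE, N_NEURONS))

def workspace_step(activations, workspace, write_rate=0.1):
    winners = np.argsort(np.abs(activations))[-K:]   # competition: keep the K most salient neurons
    sparse = np.zeros_like(activations)
    sparse[winners] = activations[winners]           # every other neuron is silenced
    # Compress the sparse winners into the tiny workspace: memory as attention by compression
    return (1.0 - write_rate) * workspace + write_rate * (projection @ sparse)

workspace = np.zeros(WORKSPACE_SIZE)
workspace = workspace_step(np.tanh(rng.normal(size=N_NEURONS)), workspace)
```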
Best For: Feature extraction and high‑noise environments where identifying the signal matters more than raw data volume.
4. NDM — The Fluid Network
Neural Differential Manifolds
NDM abandons static weight updates in favor of continuous weight evolution using Ordinary Differential Equations (ODEs).
# Example of continuous weight evolution
dW/dt = f(W, x, t)
- Learning follows Hebbian traces (“neurons that fire together, wire together”)
- The network rewires itself dynamically, achieving true neuroplasticity where structure and learning are inseparable.
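A toy NumPy sketch of what integrating that equation might look like, assuming a Hebbian drive term with weight decay and a simple Euler solver (the actual NDM dynamics and solver may differ):

```python
import numpy as np

def dW_dt(W, x, decay=0.01):
    """Hebbian drive for dW/dt = f(W, x): 'fire together, wire together' plus forgetting."""
    y = np.tanh(W @ x)                    # post-synaptic activity
    return np.outer(y, x) - decay * W     # correlation term minus slow decay

def evolve(W, x_stream, dt=0.05):
    # Euler integration: the weights flow continuously as inputs stream in
    for x in x_stream:
        W = W + dt * dW_dt(W, x)
    return W

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, (8, 4))
W = evolve(W, rng.normal(size=(200, 4)))   # the weights have rewired themselves along the way
```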
Best For: Non‑stationary environments where rules change faster than traditional training can adapt.
Summary of the Cousins
| Architecture | Key Innovation | Best For |
|---|---|---|
| DTPN | Triple‑Track Persistence | Maximum data retention across all time scales |
| Hyper‑AMN | Domain‑Specific Heads | Logic vs Emotion vs Structure separation |
| SGW‑AMN | Competitive Bottleneck | Extracting signal from noise |
| NDM | ODE Weight Evolution | Constantly changing environments |
The Experiment Continues
These cousins live on the fringe of memory‑native research and demonstrate that there is no one‑size‑fits‑all intelligence:
- Sometimes you need a bottleneck (SGW‑AMN)
- Sometimes you need specialization (Hyper‑AMN)
- Sometimes your weights must flow like liquid (NDM)
The project remains open‑source:
- Code is available
- C‑libraries are ready to compile
- Exploration has only just begun
Note on Development
While these architectures were originally conceptualized with assistance from Claude Sonnet 4.5, they have been manually edited, refined, and tested to function as standalone research‑grade models.
🔗 Join the experiment: GitHub Repository