AWS re:Invent 2025 – The new AI architecture that adapts and thinks just like humans (STP108)

Published: December 5, 2025 at 07:46 AM EST
3 min read
Source: Dev.to

Overview

In this session, Jan Chorowski and Victor Szczerba from Pathway introduce Baby Dragon Hatchling, a post‑Transformer AI architecture inspired by the brain’s sparse neural networks. They argue that Transformers lack continuous learning, are inefficient, and are unsuitable for long‑running enterprise tasks. Baby Dragon Hatchling features sparse activation and connectivity, enabling:

  • Continuous learning from thin datasets
  • Extended attention spans beyond two hours
  • Improved energy efficiency
  • Model observability for regulated environments

The architecture is positioned as a solution for enterprise needs through “sticky inference” with corporate data, with a planned mid‑year launch in partnership with AWS.

This article is auto‑generated from the original presentation; minor typos or inaccuracies may be present.

The Transformer’s Limitations and the Brain‑Inspired Baby Dragon Hatchling Architecture


“The Transformer is on its way out. Its days are numbered.” – Jan Chorowski

Why Transformers struggle with long‑running tasks

  • Limited long‑term memory – Models are trained once, released as a static snapshot, and do not adapt after deployment.
  • Inefficiency – Incremental benchmark gains require ten‑fold increases in model size and data, driving up cost and data‑collection effort.
  • Lack of interpretability – Bugs are hard to diagnose; fixing them often means scaling up data or switching models, which merely trades one failure for another.

These constraints make dense, one‑size‑fits‑all Transformers ill‑suited for enterprise workloads that demand continuous improvement, custom data, and regulatory compliance.

The brain as a model for continuous learning

The human brain learns continuously, integrating new information on the fly. Its key properties contrast sharply with dense Transformers:

  • Activation: the brain fires only the relevant neurons (sparse); in a Transformer, every layer processes every input (dense)
  • Connectivity: the brain’s pathways strengthen or weaken with experience (dynamic); a Transformer’s weights are fixed after training (static)
  • Energy: the brain handles complex tasks at low power; scaling up a Transformer dramatically increases compute cost
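
To make the contrast concrete, here is a minimal NumPy sketch of the sparse‑activation idea: a toy two‑layer network in which the second layer only processes the top‑k hidden units. This is purely illustrative; the actual Baby Dragon Hatchling mechanism, layer sizes, and gating rule are not described in the session, and every name below (W1, W2, k) is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen arbitrarily for illustration.
d_in, d_hidden, d_out, k = 512, 4096, 512, 128

x = rng.standard_normal(d_in)
W1 = rng.standard_normal((d_hidden, d_in)) / np.sqrt(d_in)
W2 = rng.standard_normal((d_out, d_hidden)) / np.sqrt(d_hidden)

# Dense path: the second layer multiplies against every hidden unit.
h = np.maximum(W1 @ x, 0.0)            # ReLU hidden activations
y_dense = W2 @ h                       # ~ d_out * d_hidden multiplies

# Sparse path: keep only the k most active hidden units ("only relevant
# neurons fire") and let the next layer touch just those columns.
idx = np.argpartition(h, -k)[-k:]      # indices of the top-k activations
y_sparse = W2[:, idx] @ h[idx]         # ~ d_out * k multiplies

print("second-layer multiplies (dense): ", d_out * d_hidden)   # 2,097,152
print("second-layer multiplies (sparse):", d_out * k)           # 65,536
```

With these toy sizes, the sparse path does roughly 3% of the dense path’s second‑layer work, which is the kind of saving the brain comparison is pointing at.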

Baby Dragon Hatchling: core ideas

  1. Sparse activation & connectivity – mimics the brain’s selective firing, reducing compute and energy use.
  2. Continuous learning – the model updates its parameters during operation, allowing it to improve from thin, domain‑specific datasets (see the sketch after this list).
  3. Extended attention windows – supports coherent reasoning over tasks lasting hours, not just a few sentences.
  4. Observability & auditability – built‑in tooling for monitoring model behavior, essential for regulated industries.
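
“Continuous learning” here contrasts with the train‑once snapshot criticized earlier: parameters keep changing while the system serves requests. The sketch below, a deliberately tiny online least‑squares model, shows the pattern of predict, observe feedback, update immediately; it is not Pathway’s actual update rule, which the session does not specify.

```python
import numpy as np

rng = np.random.default_rng(1)

# A "thin", domain-specific stream: 50 (input, target) pairs arriving one at
# a time, standing in for feedback collected while the system is in use.
d = 16
true_w = rng.standard_normal(d)
stream = []
for _ in range(50):
    x = rng.standard_normal(d)
    stream.append((x, true_w @ x + 0.01 * rng.standard_normal()))

w = np.zeros(d)      # parameters stay live after "deployment"
lr = 0.05

for t, (x, target) in enumerate(stream, start=1):
    pred = w @ x                 # 1) serve the prediction (inference)
    err = pred - target          # 2) observe feedback during operation
    w -= lr * err * x            # 3) immediate SGD step, no offline retrain
    if t % 10 == 0:
        print(f"after {t:2d} examples, |error| = {abs(err):.3f}")
```

The error shrinks as examples stream in, even though no separate retraining run ever happens; that is the behavior a continuously learning architecture promises at much larger scale.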

Enterprise implications

  • Sticky inference – models can retain corporate knowledge securely, reducing the need to re‑upload large datasets (a conceptual sketch follows this list).
  • Mid‑year launch with AWS – integration with AWS services (e.g., SageMaker, Bedrock) will enable seamless deployment and scaling.
  • Energy and cost savings – sparsity translates to lower inference costs, making large‑scale, long‑running AI applications financially viable.
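
One way to picture “sticky inference” is as a small amount of per‑customer model state that persists between sessions instead of re‑shipping corporate data each time. The following is a conceptual sketch under that reading only; it is not Pathway’s or AWS’s API, and every name in it (serve, load_state, tenant_state) is hypothetical.

```python
import json
from pathlib import Path

# Conceptual sketch: per-tenant state that persists between sessions, so
# knowledge learned during one call is available in the next. The storage
# layout and security model are placeholders, not the real mechanism.

STATE_DIR = Path("./tenant_state")
STATE_DIR.mkdir(exist_ok=True)

def load_state(tenant_id: str) -> dict:
    """Restore whatever has already been learned for this tenant."""
    path = STATE_DIR / f"{tenant_id}.json"
    return json.loads(path.read_text()) if path.exists() else {"facts": []}

def save_state(tenant_id: str, state: dict) -> None:
    """Persist the updated state so the next session starts from it."""
    (STATE_DIR / f"{tenant_id}.json").write_text(json.dumps(state))

def serve(tenant_id: str, query: str, new_fact: str | None = None) -> str:
    state = load_state(tenant_id)
    if new_fact:                              # knowledge absorbed in-session
        state["facts"].append(new_fact)
    answer = f"[{len(state['facts'])} retained facts] answer to: {query}"
    save_state(tenant_id, state)              # the knowledge "sticks"
    return answer

# First session teaches a fact; the second reuses it without re-uploading data.
print(serve("acme-corp", "What is our refund window?", new_fact="Refunds within 30 days"))
print(serve("acme-corp", "What is our refund window?"))
```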

Watch the full presentation

