Starting from scratch: Training a 30M Topological Transformer

Published: January 18, 2026 at 06:39 AM EST

1 min read

Source: Hacker News

Article URL: https://www.tuned.org.uk/posts/013_the_topological_transformer_training_tauformer
Comments URL: https://news.ycombinator.com/item?id=46666963
Points: 4


Understanding ReLU Through Visual Python Examples

Using the ReLU Activation Function: In the previous articles we used back‑propagation and plotted graphs to predict values correctly. All those examples employe...

Time Series Isn’t Enough: How Graph Neural Networks Change Demand Forecasting

Why modeling SKUs as a network reveals what traditional forecasts miss

Show HN: The Hessian of tall-skinny networks is easy to invert

It turns out the inverse of the Hessian of a deep net is easy to apply to a vector. Doing this naively takes cubically many operations in the number of layers s...

Rethinking Learning Dynamics in AI Models: An Early Theory from Experimentation

Observing Representation Instability During Neural Network Training: While experimenting with neural network training behaviors, I noticed a recurring pattern t...
