Starting from scratch: Training a 30M Topological Transformer
Source: Hacker News
Article URL: https://www.tuned.org.uk/posts/013_the_topological_transformer_training_tauformer
Comments URL: https://news.ycombinator.com/item?id=46666963
Points: 4
Using the ReLU Activation Function: In the previous articles we used back‑propagation and plotted graphs to predict values correctly. All those examples employe...
Why modeling SKUs as a network reveals what traditional forecasts miss. The post Time Series Isn’t Enough: How Graph Neural Networks Change Demand Forecasting ap...
It turns out the inverse of the Hessian of a deep net is easy to apply to a vector. Doing this naively takes cubically many operations in the number of layers s...
Observing Representation Instability During Neural Network Training: While experimenting with neural network training behaviors, I noticed a recurring pattern t...