Cross Entropy Derivatives, Part 6: Using gradient descent to reach the final result
Optimizing the Bias b_3 – Getting the Exact Value In the previous article (https://dev.to/rijultp/cross-entropy-derivatives-part-5-optimizing-bias-with-backpropa...
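The teaser above describes optimizing a single bias `b_3` with gradient descent. As a minimal sketch (the network shape, input, learning rate, and target below are all illustrative assumptions, not taken from the article), here is a sigmoid output neuron whose bias is driven toward the value that minimizes binary cross-entropy; for sigmoid plus cross-entropy the bias gradient simplifies to `p - y`:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

z = 1.5    # fixed pre-bias input to the output neuron (assumption)
y = 1.0    # target
b3 = 0.0   # bias we optimize
lr = 0.5   # illustrative learning rate

for step in range(200):
    p = sigmoid(z + b3)
    # For a sigmoid output with cross-entropy loss, dL/db3 = (p - y)
    b3 -= lr * (p - y)

# The prediction approaches the target as b3 grows
print(sigmoid(z + b3))
```

The update rule is the plain gradient-descent step `b3 -= lr * dL/db3`; only the simplified gradient `(p - y)` is specific to this loss/activation pairing.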
[Cover image: Deep Learning Without Backpropagation]
Introduction: Beyond the Static Model CRAM‑Net (Conversational Reasoning & Memory Network) represents a fundamental shift in neural architecture—from static weig...
[Cover image: Cross Entropy Derivatives, Part 3: Chain Rule for a Single Output Class]
_Crazy experiment by me, author: @hejhdiss (https://dev.to/hejhdiss)._ Note: The codebase in the repository was originally written by Claude Sonnet, but I edited a...
Introduction In the previous article we reviewed the key ideas needed to work with derivatives of cross‑entropy. In this article we set up the derivative step‑...
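The derivative setup that teaser leads into culminates in a well-known identity: for softmax outputs `p` and a one-hot target `y`, the gradient of cross-entropy with respect to the logits is simply `p - y`. A short check of that identity against a central-difference numerical gradient (the logit and target values below are illustrative):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # shift for numerical stability
    return e / e.sum()

def cross_entropy(z, y):
    return -np.sum(y * np.log(softmax(z)))

z = np.array([0.2, -1.0, 0.7])
y = np.array([0.0, 0.0, 1.0])   # one-hot target

analytic = softmax(z) - y

# Central-difference numerical gradient, one logit at a time
eps = 1e-6
numeric = np.array([
    (cross_entropy(z + eps * np.eye(3)[i], y)
     - cross_entropy(z - eps * np.eye(3)[i], y)) / (2 * eps)
    for i in range(3)
])
print(np.allclose(analytic, numeric, atol=1e-6))  # True
```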
How to Get Day‑One Relevance When You Don’t Have Data and Probably Never Did Everyone wants an “AI‑powered matching engine.” In practice, that usually means on...
Why Neural Networks Explode — A Simple Fix That Helps Training some neural networks, especially RNNs, can feel like steering a boat in a storm, because small c...
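The teaser's "simple fix" is cut off, so the exact remedy the article proposes is unknown; one standard fix for exploding gradients in RNN training is global gradient-norm clipping, sketched here with plain NumPy arrays standing in for per-layer gradients:

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Scale all gradients so their combined L2 norm is at most max_norm."""
    total = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, max_norm / (total + 1e-12))
    return [g * scale for g in grads]

grads = [np.array([3.0, 4.0]), np.array([12.0])]   # global norm = 13
clipped = clip_by_global_norm(grads, 5.0)
norm = np.sqrt(sum(np.sum(g ** 2) for g in clipped))
print(round(norm, 3))  # 5.0
```

Clipping the *global* norm (rather than each gradient separately) preserves the direction of the overall update while bounding its size, which is why it steadies training without changing where the step points.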
Using the ReLU Activation Function In the previous articles we used back‑propagation and plotted graphs to predict values correctly. All those examples employe...
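Swapping the earlier activations for ReLU amounts to two tiny functions: the activation itself and its subgradient, which is 0 for negative inputs and 1 for positive ones (the convention at exactly 0 is a free choice; 0 is used below). Input values are illustrative:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Subgradient: 0 where x < 0, 1 where x > 0, 0 chosen at x == 0
    return (x > 0).astype(float)

x = np.array([-2.0, 0.0, 3.5])
print(relu(x))       # [0.  0.  3.5]
print(relu_grad(x))  # [0. 0. 1.]
```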
Article URL: https://www.tuned.org.uk/posts/013_the_topological_transformer_training_tauformer Comments URL: https://news.ycombinator.com/item?id=46666963 Point...
Why meaning moved from definitions to structure — and what that changed for modern AI When engineers talk about semantic search, embeddings, or LLMs that “unde...
It turns out the inverse of the Hessian of a deep net is easy to apply to a vector. Doing this naively takes cubically many operations in the number of layers s...
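The linked post's method is truncated, so what follows is not its algorithm; it is a standard way to apply an inverse Hessian to a vector without ever forming or inverting `H`: conjugate gradients, which only needs Hessian-vector products. Here the Hessian comes from a small quadratic stand-in `L(w) = 0.5 * w^T A w`, so the HVP is just `A @ v` (the matrix `A` is an illustrative symmetric positive-definite example):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))
A = M @ M.T + 5 * np.eye(5)   # SPD stand-in for the Hessian

def hvp(v):
    return A @ v              # Hessian-vector product

def solve_hinv_v(hvp, v, iters=50, tol=1e-10):
    """Return x with H @ x ~= v via conjugate gradients (H SPD)."""
    x = np.zeros_like(v)
    r = v - hvp(x)            # residual
    p = r.copy()              # search direction
    rs = r @ r
    for _ in range(iters):
        Ap = hvp(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

v = rng.normal(size=5)
x = solve_hinv_v(hvp, v)
print(np.allclose(A @ x, v, atol=1e-8))  # True
```

In a real deep net the `hvp` function would be supplied by automatic differentiation (a gradient-of-gradient product) rather than an explicit matrix, which is what makes the cost per product linear in the model size.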