language models

1 week ago · ai

What Really Happens When an LLM Chooses the Next Token🤯

LLM outputs sometimes feel stable. Sometimes they suddenly become random. Often, the only thing that changed is a parameter. So what actually happens at the mom...

#LLM #token sampling #probability distribution #language models #inference #temperature #top‑k #top‑p
1 week ago · ai

LLM poetry and the 'greatness' question: Experiments by Gwern and Mercor

Article URL: https://hollisrobbinsanecdotal.substack.com/p/llm-poetry-and-the-greatness-question Comments URL: https://news.ycombinator.com/item?id=46575268 Poi...

#LLM #poetry #AI creativity #Gwern #Mercor #language models #generative AI
1 week ago · ai

Task-free intelligence testing of LLMs

Article URL: https://www.marble.onl/posts/tapping/index.html Comments URL: https://news.ycombinator.com/item?id=46545587 Points: 11 Comments: 1...

#LLM #intelligence testing #evaluation #benchmark #language models
1 week ago · ai

Understanding DLCM: A Deep Dive into Its Core Architecture and the Power of Causal Encoding

Modern Language Models and the Dynamic Latent Concept Model (DLCM): Modern language models have evolved beyond simple token‑by‑token processing, and the Dynamic L...

#DLCM #causal encoding #language models #model architecture #deep learning #transformers #hierarchical modeling
1 week ago · ai

AI Models Are Starting to Learn by Asking Themselves Questions

An AI model that learns without human input—by posing interesting queries for itself—might point the way to superintelligence....

#self-supervised learning #self-questioning AI #meta-learning #language models #artificial general intelligence
1 week ago · ai

I broke GPT-2: How I proved Semantic Collapse using Geometry (The Ainex Limit)

TL;DR: I forced GPT‑2 to learn from its own output for 20 generations. By Generation 20, the model lost 66% of its semantic volume and began hallucinating state...

#GPT-2 #semantic collapse #synthetic data #language models #AI safety #model degradation #geometry analysis
1 week ago · ai

What I Learned Trying (and Mostly Failing) to Understand Attention Heads

What I initially believed: Before digging in, I implicitly believed a few things. If an attention head consistently attends to a specific token, that token is...

#attention #transformers #language models #interpretability #machine learning #neural networks #NLP
2 weeks ago · ai

The US Invaded Venezuela and Captured Nicolás Maduro. ChatGPT Disagrees

Some AI chatbots have a surprisingly good handle on breaking news. Others decidedly don't....

#ChatGPT #AI fact-checking #misinformation #news verification #language models
2 weeks ago · ai

Recursive Language Models

Article URL: https://arxiv.org/abs/2512.24601 Comments URL: https://news.ycombinator.com/item?id=46475395 Points: 8 Comments: 0...

#language models #recursive models #machine learning #deep learning #arxiv
2 weeks ago · ai

Instructions Are Not Control

#prompt engineering #LLM #jailbreak #AI safety #language models
3 weeks ago · ai

I Asked for a Parrot. The AI Gave Me a Crow and Set It Free.

I asked an AI model to generate a parrot. It confidently generated a crow. And then—metaphorically—set it free. > “Maine bola tota bana, isne kavva bana ke uda...

#prompt engineering #AI alignment #language models #model behavior #creativity vs correctness
3 weeks ago · ai

Part 2: Why Transformers Still Forget

Part 2 – Why Long‑Context Language Models Still Struggle with Memory (second of a three‑part series). In Part 1 (https://forem.com/harvesh_kumar/part-1-long-context-...

#transformers #long-context #memory #language-models #deep-learning #AI-research
