language models

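1 week ago · ai

What Actually Happens When an LLM Picks the Next Token 🤯

LLM output sometimes feels stable. Sometimes it suddenly turns random. Often the only thing that changed is a single parameter. So what actually happens at that moment…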

#LLM #token sampling #probability distribution #language models #inference #temperature #top‑k #top‑p
1 week ago · ai

LLM Poetry and the "Greatness" Question: Experiments by Gwern and Mercor


#LLM #poetry #AI creativity #Gwern #Mercor #language models #generative AI
1 week ago · ai

Task-Free Intelligence Testing for LLMs


#LLM #intelligence testing #evaluation #benchmark #language models
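1 week ago · ai

Understanding DLCM: A Deep Dive into Its Core Architecture and the Power of Causal Encoding

Modern language models and the Dynamic Latent Concept Model (DLCM): modern language models have moved beyond simple token-by-token processing, and the Dynamic L…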

#DLCM #causal encoding #language models #model architecture #deep learning #transformers #hierarchical modeling
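1 week ago · ai

AI Models Are Starting to Learn by Asking Themselves Questions

An AI model that learns without human input, by posing interesting queries to itself, may point the way toward superintelligence…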

#self-supervised learning #self-questioning AI #meta-learning #language models #artificial general intelligence
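1 week ago · ai

I Broke GPT-2: How I Used Geometry to Prove Semantic Collapse (The Ainex Limit)

TL;DR: I forced GPT-2 to learn from its own output for 20 generations. By generation 20, the model had lost 66% of its semantic volume and had entered a hallucinating state.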

#GPT-2 #semantic collapse #synthetic data #language models #AI safety #model degradation #geometry analysis
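1 week ago · ai

What I Learned While Trying (and Mostly Failing) to Understand Attention Heads

My initial beliefs: before digging in, I implicitly believed a few things: - If an attention head consistently attends to a specific token, then that token is…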

#attention #transformers #language models #interpretability #machine learning #neural networks #NLP
2 weeks ago · ai

The US Invaded Venezuela and Captured Nicolás Maduro. ChatGPT Disagrees

Some AI chatbots have a surprisingly good grasp of breaking news; others clearly do not…

#ChatGPT #AI fact-checking #misinformation #news verification #language models
2 weeks ago · ai

Recursive Language Models

Article URL: https://arxiv.org/abs/2512.24601 Comments URL: https://news.ycombinator.com/item?id=46475395 Points: 8 Comments: 0...

#language models #recursive models #machine learning #deep learning #arxiv
2 weeks ago · ai

Instructions Are Not Control

Cover image: Instructions Are Not Control https://media2.dev.to/dynamic/image/width=1000,height=420,fit=cover,gravity=auto,format=auto/https%3A%2F%2Fdev-to-u...

#prompt engineering #LLM #jailbreak #AI safety #language models
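3 weeks ago · ai

I Asked for a Parrot. The AI Gave Me a Crow and Set It Free.

I asked an AI model to generate a parrot. It confidently generated a crow. Then, metaphorically speaking, it set the crow free. > "I asked for a parrot, but it turned into a crow and flew away…"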

#prompt engineering #AI alignment #language models #model behavior #creativity vs correctness
3 weeks ago · ai

Part 2: Why Transformers Still Forget

Part 2 – Why long-context language models still struggle with memory (the second part of a three-part series). In Part 1 https://forem.com/harvesh_kumar/part-1-long-context-...

#transformers #long-context #memory #language-models #deep-learning #AI-research
