EUNO.NEWS
  • All (20931) +237
  • AI (3154) +13
  • DevOps (932) +6
  • Software (11018) +167
  • IT (5778) +50
  • Education (48)
  • Notice
  • 1 week ago · ai

    What I Learned Trying (and Mostly Failing) to Understand Attention Heads

    What I initially believed: Before digging in, I implicitly believed a few things — if an attention head consistently attends to a specific token, that token is...

    #attention #transformers #language models #interpretability #machine learning #neural networks #NLP
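    The teaser's premise — that a head "consistently attends to" a specific token — can be made concrete with a minimal single-head attention sketch. All names, shapes, and data here are illustrative assumptions, not taken from the article:

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax.
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def attention_pattern(Q, K):
        """Attention weights for one head: softmax(Q K^T / sqrt(d))."""
        d = Q.shape[-1]
        return softmax(Q @ K.T / np.sqrt(d))

    rng = np.random.default_rng(0)
    seq_len, d = 5, 8
    Q = rng.normal(size=(seq_len, d))
    K = rng.normal(size=(seq_len, d))

    A = attention_pattern(Q, K)   # (seq_len, seq_len); each row sums to 1
    favorite = A.argmax(axis=1)   # the token each position attends to most
    print(favorite)
    ```

    Inspecting `argmax` rows of the attention matrix across many inputs is the usual first step in checking whether a head has a fixed "favorite" token — the article's point is that this inference is less reliable than it looks.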
  • 1 month ago · ai

    [Paper] Controlling changes to attention logits

    Stability of neural network weights is critical when training transformer models. The query and key weights are particularly problematic, as they tend to grow l...

    #attention #transformer training #learning rate scaling #model stability #research paper
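    The teaser notes that growing query/key weights destabilize attention logits. One widely used mitigation — not necessarily the method of the linked paper — is QK normalization, which unit-normalizes queries and keys so each logit is a bounded cosine similarity. A minimal sketch, with illustrative shapes:

    ```python
    import numpy as np

    def l2_normalize(x, axis=-1, eps=1e-6):
        return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

    def qk_norm_logits(Q, K, scale=1.0):
        """Attention logits with QK-norm: unit-normalized q and k make
        each logit a cosine similarity, bounded in [-scale, scale]
        no matter how large the underlying weights grow."""
        return scale * (l2_normalize(Q) @ l2_normalize(K).T)

    rng = np.random.default_rng(1)
    Q = 100.0 * rng.normal(size=(4, 16))  # deliberately huge activations
    K = 100.0 * rng.normal(size=(4, 16))

    logits = qk_norm_logits(Q, K)
    print(np.abs(logits).max())  # bounded by scale=1.0
    ```

    Without the normalization, scaling Q and K by 100 would blow the logits up by a factor of 10,000; with it, the bound holds regardless of weight magnitude, which is why variants of this idea appear in training-stability work.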
RSS GitHub © 2026