EUNO.NEWS EUNO.NEWS
  • All (20349) +286
  • AI (3104) +14
  • DevOps (907) +7
  • Software (10509) +190
  • IT (5781) +75
  • Education (48)
  • Notice
  • All (20349) +286
    • AI (3104) +14
    • DevOps (907) +7
    • Software (10509) +190
    • IT (5781) +75
    • Education (48)
  • Notice
  • All (20349) +286
  • AI (3104) +14
  • DevOps (907) +7
  • Software (10509) +190
  • IT (5781) +75
  • Education (48)
  • Notice
Sources Tags Search
한국어 English 中文
  • 1个月前 · ai

    [论文] 控制对注意力 logits 的更改

    在训练 transformer 模型时,神经网络权重的稳定性至关重要。查询(query)和键(key)权重尤其成问题,因为它们倾向于增长……

    #attention #transformer training #learning rate scaling #model stability #research paper
EUNO.NEWS
RSS GitHub © 2026