long-context

12 hours ago · ai

Part 2: Why Transformers Still Forget

Part 2 – Why Long-Context Language Models Still Struggle with Memory (second in a three-part series) In Part 1, https://forem.com/harvesh_kumar/part-1-long-context-...

#transformers #long-context #memory #language-models #deep-learning #AI-research
2 days ago · ai

Mixtral of Experts

Overview Mixtral 8x7B is a language model that achieves both speed and intelligence by distributing work across many small experts. It is a Sparse Mixtu...

#Mixtral #Mixture of Experts #Sparse MoE #large language models #LLM #open-source #long-context #coding #multilingual