memory architecture

1 month ago · ai

GAM takes aim at “context rot”: A dual-agent memory architecture that outperforms long-context LLMs

For all their superhuman power, today’s AI models suffer from a surprisingly human flaw: They forget. Give an AI assistant a sprawling conversation, a multi-ste...

#context rot #dual-agent memory #long-context LLMs #memory architecture #AI assistants #large language models #VentureBeat

1 month ago · ai

[Paper] Beluga: A CXL-Based Memory Architecture for Scalable and Efficient LLM KVCache Management

The rapid increase in LLM model sizes and the growing demand for long-context inference have made memory a critical bottleneck in GPU-accelerated serving system...

#CXL #LLM #KVCache #memory architecture #inference acceleration