[Paper] Splitwise: Collaborative Edge-Cloud Inference for LLMs via Lyapunov-Assisted DRL
Deploying large language models (LLMs) on edge devices is challenging because these devices have limited memory and power. Cloud-only inference reduces device burde...