[Paper] AugServe: Adaptive Request Scheduling for Augmented Large Language Model Inference Serving
As augmented large language models (LLMs) with external tools become increasingly popular in web applications, improving augmented LLM inference serving efficie...