[Paper] Konflux: Optimized Function Fusion for Serverless Applications
Source: arXiv - 2601.11156v1
Overview
The paper “Konflux: Optimized Function Fusion for Serverless Applications” tackles a practical pain point for anyone building on Function‑as‑a‑Service (FaaS) platforms: deciding which functions to bundle together to cut costs and latency. By introducing a lightweight emulator that can evaluate every possible fusion layout locally, the authors show how developers can avoid costly trial‑and‑error on the live cloud.
Key Contributions
- Full‑coverage fusion analysis – systematic enumeration and evaluation of all possible function‑fusion configurations for a given serverless workflow.
- Konflux emulator – platform‑agnostic, low‑overhead local runtime that mimics performance and pricing behavior of real FaaS services.
- Pareto‑optimal fusion set – empirical evidence that only a small subset of configurations dominates the cost‑latency trade‑off, dramatically simplifying the decision space.
- Pricing‑model sensitivity study – demonstrates how different cloud provider pricing schemes (per‑invocation, memory‑seconds, cold‑start penalties) shift the optimal fusion choices.
Methodology
- Model the application graph – represent a serverless app as a directed acyclic graph (DAG) where nodes are functions and edges are data/control dependencies.
- Generate fusion candidates – combinatorial enumeration produces every way of merging adjacent nodes that respects the DAG’s ordering (e.g., f1+f2, f2+f3, f1+f2+f3, etc.).
- Emulate the FaaS platform – Konflux runs the fused functions on a local container that reproduces:
  - Cold‑start latency (configurable image size & memory)
  - Execution time (measured from the original function code)
  - Pricing semantics (memory‑seconds, request‑based fees, network egress)
- Benchmark each configuration – the emulator records total latency (including intra‑fusion overhead) and estimated cost under multiple pricing models.
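- Pareto analysis – discard every configuration that another configuration matches or beats on both cost and latency (and strictly beats on at least one); the remaining, non‑dominated configurations form the Pareto frontier.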
The whole pipeline runs in minutes on a developer laptop, compared to hours or days of real‑cloud testing.
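The candidate-generation step can be sketched for the simplest case, a linear pipeline. This is a minimal illustration, not Konflux's actual code: the function name `chain_fusion_layouts` is hypothetical, and it assumes that every edge between neighbouring functions is either fused or kept as a remote call. General DAGs with branches admit more layouts than this chain-only sketch produces.

```python
from itertools import product


def chain_fusion_layouts(functions):
    """Yield every split of an ordered chain into contiguous fused groups.

    Hypothetical helper for illustration; assumes a linear pipeline.
    """
    n = len(functions)
    if n == 0:
        return
    # One flag per edge: True = keep a remote call (cut), False = fuse neighbours.
    for cuts in product([False, True], repeat=n - 1):
        layout, group = [], [functions[0]]
        for fn, cut in zip(functions[1:], cuts):
            if cut:
                layout.append(tuple(group))
                group = [fn]
            else:
                group.append(fn)
        layout.append(tuple(group))
        yield tuple(layout)


layouts = list(chain_fusion_layouts(["f1", "f2", "f3"]))
for layout in layouts:
    # Fused functions are joined with "+", separate deployments with " | ".
    print(" | ".join("+".join(group) for group in layout))
```

For a chain of n functions this yields 2^(n-1) layouts, which is why exhaustive evaluation is only practical when each layout is cheap to test.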
Results & Findings
| Application (example) | # Functions | # Fusion Configs | Best‑latency config | Best‑cost config | Pareto size |
|---|---|---|---|---|---|
| Image‑processing pipeline | 5 | 52 | Fuse all 5 (single container) | Keep functions separate | 3 |
| Event‑driven ETL | 7 | 429 | Fuse hot‑path (3 functions) | Fuse cold‑path (2 functions) | 4 |
| IoT data aggregator | 4 | 15 | Fuse 2 adjacent functions | No fusion | 2 |
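The "Pareto size" column counts the configurations that survive a dominance filter. A minimal sketch of such a filter follows; the `Result` type, the configuration names, and all numbers are made up for illustration and are not taken from the paper.

```python
from typing import NamedTuple


class Result(NamedTuple):
    name: str
    cost: float     # estimated cost per workflow run (arbitrary units)
    latency: float  # end-to-end latency in milliseconds


def dominates(a, b):
    """True if a is no worse than b on both metrics and better on at least one."""
    return (a.cost <= b.cost and a.latency <= b.latency
            and (a.cost < b.cost or a.latency < b.latency))


def pareto_front(results):
    """Keep every result that no other result dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


# Hypothetical measurements, not results from the paper.
results = [
    Result("no-fusion", 1.0, 900.0),
    Result("fuse-f1-f2", 1.2, 600.0),
    Result("fuse-f2-f3", 1.3, 700.0),  # dominated by fuse-f1-f2
    Result("fuse-all", 2.0, 400.0),
]
front = pareto_front(results)
print([r.name for r in front])
```

The pairwise check is quadratic in the number of configurations, which is fine for the few hundred layouts listed in the table.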
Key take‑aways
- Only 2–5 configurations per app end up on the Pareto front, despite the exponentially growing number of possible configurations.
- Pricing model matters: under a per‑invocation model, fusing many functions reduces request fees and is often optimal; under a memory‑seconds model, the extra memory overhead of a larger bundle can make smaller fusions cheaper.
- Cold‑start reduction is the dominant latency win; merging functions that share a cold‑start penalty yields the biggest speedups.
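The pricing sensitivity described above can be illustrated with a toy cost model. This sketch rests on stated assumptions rather than on the paper's emulator: a fused function is assumed to provision the memory of its largest member and to run its members back to back, and the prices are arbitrary units chosen only to make one fee component dominate.

```python
def cost_per_run(deployed, request_fee, gb_second_price):
    """Cost of one workflow run.

    deployed: list of (memory_gb, duration_s), one entry per deployed function.
    """
    return sum(request_fee + mem * dur * gb_second_price for mem, dur in deployed)


def fuse(members):
    """Assumption: fused memory is the largest member's, durations add up."""
    return (max(mem for mem, _ in members), sum(dur for _, dur in members))


# Hypothetical workload: two small functions and one memory-hungry function.
separate = [(0.128, 0.2), (0.128, 0.2), (1.0, 0.2)]
fused = [fuse(separate)]

# Made-up prices in arbitrary units.
request_heavy = dict(request_fee=1.0, gb_second_price=0.1)
memory_heavy = dict(request_fee=0.01, gb_second_price=1.0)

for label, pricing in [("request-heavy", request_heavy), ("memory-heavy", memory_heavy)]:
    print(label,
          "separate:", round(cost_per_run(separate, **pricing), 4),
          "fused:", round(cost_per_run(fused, **pricing), 4))
```

Under the request-heavy prices the fused layout wins because it pays one request fee instead of three; under the memory-heavy prices the separate layout wins because the small functions no longer pay for the large function's memory.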
Practical Implications
- Rapid “what‑if” testing – developers can iterate on fusion decisions locally without incurring cloud bills, fitting naturally into CI pipelines.
- Cost‑aware deployment tooling – Konflux’s emulator can be integrated into serverless frameworks (e.g., Serverless, SAM, Pulumi) to auto‑suggest the optimal fusion layout before push.
- Vendor‑agnostic optimization – because the emulator abstracts pricing rules, teams can compare AWS Lambda, Azure Functions, Google Cloud Run‑on‑GKE, etc., and pick the best provider for a given workload.
- Simplified architecture – knowing that only a handful of configurations matter lets architects focus on code modularity and observability rather than exhaustive performance testing.
Limitations & Future Work
- Static analysis only – assumes deterministic execution times; workloads with high variance (e.g., ML inference) may need runtime profiling.
- Resource constraints ignored – memory/CPU limits per function are not enforced during emulation, which could affect feasibility on actual platforms.
- Network effects simplified – inter‑function data transfer costs are approximated; real‑world VPC or cross‑region traffic could shift the optimal frontier.
- Future directions suggested by the authors: extend Konflux to handle dynamic scaling policies, incorporate real‑time telemetry for adaptive fusion, and evaluate multi‑tenant scenarios where shared resources impact cold‑start behavior.
Authors
- Niklas Kowallik
- Trever Schirmer
- David Bermbach
Paper Information
- arXiv ID: 2601.11156v1
- Categories: cs.DC
- Published: January 16, 2026