[Paper] Konflux: Optimized Function Fusion for Serverless Applications

Published: January 16, 2026 at 05:16 AM EST

Source: arXiv - 2601.11156v1

Overview

The paper “Konflux: Optimized Function Fusion for Serverless Applications” tackles a practical pain point for anyone building on Function‑as‑a‑Service (FaaS) platforms: deciding which functions to bundle together to cut costs and latency. By introducing a lightweight emulator that can evaluate every possible fusion layout locally, the authors show how developers can avoid costly trial‑and‑error on the live cloud.

Key Contributions

  • Full‑coverage fusion analysis – systematic enumeration and evaluation of all possible function‑fusion configurations for a given serverless workflow.
  • Konflux emulator – platform‑agnostic, low‑overhead local runtime that mimics performance and pricing behavior of real FaaS services.
  • Pareto‑optimal fusion set – empirical evidence that only a small subset of configurations dominates the cost‑latency trade‑off, dramatically simplifying the decision space.
  • Pricing‑model sensitivity study – demonstrates how different cloud provider pricing schemes (per‑invocation, memory‑seconds, cold‑start penalties) shift the optimal fusion choices.

Methodology

  1. Model the application graph – represent a serverless app as a directed acyclic graph (DAG) where nodes are functions and edges are data/control dependencies.
  2. Generate fusion candidates – combinatorial enumeration produces every way of merging adjacent nodes that respects the DAG’s ordering (e.g., f1+f2, f2+f3, f1+f2+f3).
  3. Emulate the FaaS platform – Konflux runs the fused functions on a local container that reproduces:
    • Cold‑start latency (configurable image size & memory)
    • Execution time (measured from the original function code)
    • Pricing semantics (memory‑seconds, request‑based fees, network egress)
  4. Benchmark each configuration – the emulator records total latency (including intra‑fusion overhead) and estimated cost under multiple pricing models.
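  5. Pareto analysis – discard every configuration that is dominated, meaning some other configuration is at least as good on both cost and latency and strictly better on at least one. The configurations that remain form the Pareto‑optimal frontier.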
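
To make step 2 concrete, here is a minimal sketch for the simplest case only: a linear chain of functions, where a fusion configuration is a choice of which edges to fuse across. The function name `chain_fusions` and the chain‑only restriction are assumptions made for illustration; the paper's enumeration covers general DAGs, so its configuration counts will differ from the 2^(n-1) this sketch produces.

```python
from itertools import product


def chain_fusions(functions):
    """Yield every split of a linear chain into contiguous fused groups.

    Hypothetical helper: handles chains only, not general DAGs.
    """
    n = len(functions)
    if n == 0:
        return
    # One boolean per edge: True cuts the chain there, False fuses across it.
    for cuts in product([False, True], repeat=n - 1):
        groups = []
        current = [functions[0]]
        for fn, cut in zip(functions[1:], cuts):
            if cut:
                groups.append(tuple(current))
                current = [fn]
            else:
                current.append(fn)
        groups.append(tuple(current))
        yield tuple(groups)


configs = list(chain_fusions(["f1", "f2", "f3"]))
for config in configs:
    # Print each configuration as e.g. "f1+f2 | f3"
    print(" | ".join("+".join(group) for group in config))
```

Each yielded configuration is a tuple of groups, and every group would be deployed as one fused function. The unfused baseline (every function in its own group) is included as one of the candidates.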

The whole pipeline runs in minutes on a developer laptop, compared to hours or days of real‑cloud testing.
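
The Pareto step at the end of the pipeline can be sketched as a simple dominance filter. This is an illustrative sketch, not Konflux's code: the `Measurement` type, the function names, and all numbers are made up for the example.

```python
from typing import NamedTuple


class Measurement(NamedTuple):
    name: str
    cost: float      # estimated cost per run (arbitrary units)
    latency: float   # end-to-end latency (arbitrary units)


def dominates(a, b):
    """True if a is at least as good as b on both axes and strictly better on one."""
    return (a.cost <= b.cost and a.latency <= b.latency
            and (a.cost < b.cost or a.latency < b.latency))


def pareto_front(measurements):
    """Keep only the measurements that no other measurement dominates."""
    return [m for m in measurements
            if not any(dominates(other, m) for other in measurements)]


# Hypothetical numbers, chosen only to show one dominated configuration.
samples = [
    Measurement("no fusion", cost=1.0, latency=9.0),
    Measurement("fuse f1+f2", cost=1.5, latency=6.0),
    Measurement("fuse f2+f3", cost=2.0, latency=7.0),  # worse than f1+f2 on both axes
    Measurement("fuse all", cost=3.0, latency=4.0),
]
front = pareto_front(samples)
print([m.name for m in front])
```

The quadratic comparison is fine for a few hundred configurations; a sort‑based sweep would be the usual choice for much larger candidate sets.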

Results & Findings

| Application (example) | # Functions | # Fusion configs | Best‑latency config | Best‑cost config | Pareto size |
| --- | --- | --- | --- | --- | --- |
| Image‑processing pipeline | 5 | 52 | Fuse all 5 (single container) | Keep functions separate | 3 |
| Event‑driven ETL | 7 | 429 | Fuse hot‑path (3 functions) | Fuse cold‑path (2 functions) | 4 |
| IoT data aggregator | 4 | 15 | Fuse 2 adjacent functions | No fusion | 2 |

Key take‑aways

  • Only 2–5 configurations per app end up on the Pareto front, despite the exponentially growing number of possibilities.
  • Pricing model matters: under a per‑invocation model, fusing many functions reduces request fees and is often optimal; under a memory‑seconds model, the extra memory overhead of a larger bundle can make smaller fusions cheaper.
  • Cold‑start reduction is the dominant latency win; merging functions that share a cold‑start penalty yields the biggest speedups.
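
The pricing‑model take‑away can be illustrated with a toy calculation. The two cost functions below are simplified assumptions, not the paper's model or any provider's actual price list: a fused group is billed as one request, and it is provisioned with the memory of its hungriest member for the whole combined duration. All names, rates, and figures are hypothetical.

```python
def request_cost(groups, fee_per_request):
    """Per-invocation pricing: one fee per deployed (possibly fused) function."""
    return len(groups) * fee_per_request


def memory_seconds_cost(groups, rate_per_gb_second):
    """Memory-seconds pricing under the assumption that a fused group is
    sized for its largest member and billed for the summed duration."""
    total = 0.0
    for group in groups:
        memory_gb = max(fn["memory_gb"] for fn in group)
        duration_s = sum(fn["duration_s"] for fn in group)
        total += memory_gb * duration_s * rate_per_gb_second
    return total


# Hypothetical functions: a short memory-heavy step and a long lightweight one.
resize = {"memory_gb": 2.0, "duration_s": 1.0}
notify = {"memory_gb": 0.5, "duration_s": 4.0}

separate = [[resize], [notify]]
fused = [[resize, notify]]

print(request_cost(separate, 1.0), request_cost(fused, 1.0))
print(memory_seconds_cost(separate, 1.0), memory_seconds_cost(fused, 1.0))
```

Under these assumptions fusion halves the request fees, but it raises the memory‑seconds bill because the lightweight step now runs at the larger memory size. Which effect wins depends on the rates, which is why the optimal layout shifts between pricing models.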

Practical Implications

  • Rapid “what‑if” testing – developers can iterate on fusion decisions locally without incurring cloud bills, fitting naturally into CI pipelines.
  • Cost‑aware deployment tooling – Konflux’s emulator can be integrated into serverless frameworks (e.g., Serverless, SAM, Pulumi) to auto‑suggest the optimal fusion layout before push.
  • Vendor‑agnostic optimization – because the emulator abstracts pricing rules, teams can compare AWS Lambda, Azure Functions, Google Cloud Run‑on‑GKE, etc., and pick the best provider for a given workload.
  • Simplified architecture – knowing that only a handful of configurations matter lets architects focus on code modularity and observability rather than exhaustive performance testing.

Limitations & Future Work

  • Static analysis only – assumes deterministic execution times; workloads with high variance (e.g., ML inference) may need runtime profiling.
  • Resource constraints ignored – memory/CPU limits per function are not enforced during emulation, which could affect feasibility on actual platforms.
  • Network effects simplified – inter‑function data transfer costs are approximated; real‑world VPC or cross‑region traffic could shift the optimal frontier.
  • Future directions suggested by the authors: extend Konflux to handle dynamic scaling policies, incorporate real‑time telemetry for adaptive fusion, and evaluate multi‑tenant scenarios where shared resources impact cold‑start behavior.

Authors

  • Niklas Kowallik
  • Trever Schirmer
  • David Bermbach

Paper Information

  • arXiv ID: 2601.11156v1
  • Categories: cs.DC
  • Published: January 16, 2026