AWS Lambda Durable Functions: Build Workflows That Last

Published: December 3, 2025 at 04:33 PM EST
4 min read
Source: Dev.to

What Are Durable Functions?

Durable functions are Lambda functions that can pause and resume. When your function waits for a callback or sleeps for an hour, Lambda checkpoints its state and stops execution. When it’s time to continue, Lambda resumes exactly where it left off—with all variables and context intact.

This isn’t a new compute model. It’s regular Lambda with automatic state management. You write normal async/await code. Lambda makes it durable.

A Simple Example

Here’s a workflow that creates an order, waits 5 minutes, then sends a notification:

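import { DurableContext, withDurableExecution } from '@aws/durable-execution-sdk-js';

// createOrder and sendEmail are application helpers assumed to be defined elsewhere.
export const handler = withDurableExecution(
  async (event: any, context: DurableContext) => {
    // Step 1: create the order; the result is checkpointed under 'create-order'.
    const order = await context.step('create-order', async () => {
      return createOrder(event.items);
    });

    // Pause for 5 minutes (300 seconds) without consuming compute.
    await context.wait({ seconds: 300 });

    // Step 2: runs after the wait, using the checkpointed order.
    await context.step('send-notification', async () => {
      return sendEmail(order.customerId, order.id);
    });

    return { orderId: order.id, status: 'completed' };
  }
);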

That’s it. No state machines to configure, no databases to manage, no polling loops. The function pauses during the wait, costs nothing while idle, and resumes automatically after 5 minutes.

Key Capabilities

  • Long execution times – Workflows can run for up to 1 year. Individual invocations are still limited to 15 minutes, but the workflow continues across multiple invocations.
  • Automatic checkpointing – Lambda saves your function’s state at each step. If something fails, the function resumes from the last checkpoint—not from the beginning.
  • Built‑in retries – Configure retry strategies with exponential backoff. Lambda handles the retry logic and timing automatically.
  • Wait for callbacks – Pause execution until an external event arrives. Perfect for human approvals, webhook responses, or async API results.
  • Parallel execution – Run multiple operations concurrently and wait for all to complete. Lambda manages the coordination.
  • Nested workflows – Invoke other durable functions and compose complex workflows from simple building blocks.
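
To make the "exponential backoff" in the retries bullet concrete, here is a minimal sketch of how such delays are commonly computed. The helper name backoffDelayMs and its baseMs and maxMs parameters are hypothetical and are not options of the durable execution SDK; the SDK's own retry configuration may look different.

```typescript
// Hypothetical helper, not part of the durable execution SDK.
// Delay before retry number `attempt` (1-based): baseMs * 2^(attempt - 1),
// capped at maxMs so the delay never grows without bound.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  const delay = baseMs * Math.pow(2, attempt - 1);
  return Math.min(delay, maxMs);
}

// Delays for the first seven retries.
const delays = [1, 2, 3, 4, 5, 6, 7].map((n) => backoffDelayMs(n));
console.log(delays.join(', '));
// prints "1000, 2000, 4000, 8000, 16000, 32000, 60000"
```

Each retry waits twice as long as the previous one until the cap is reached, which gives a struggling dependency progressively more time to recover.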

How It Works: The Replay Model

Durable functions use a replay‑based execution model. When your function resumes, Lambda replays it from the start—but instead of re‑executing operations, it uses checkpointed results.

First invocation – Your function runs, executing each step and checkpointing results.
Wait or callback – Function pauses; Lambda saves state and stops execution.
Resume – Lambda invokes your function again, replaying from the start.
Replay – Operations return checkpointed results instantly instead of re‑executing.
Continue – Function proceeds past the wait with all context intact.

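This model ensures your function always sees consistent state, even across failures and restarts. Because the function body runs again on every replay, the code around your operations must be deterministic: each operation executes once, and every replay returns the same checkpointed result.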
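
The replay steps above can be sketched with a toy engine. This is not the SDK's implementation: ToyContext, Suspend, invoke, and the plain-object checkpoint store are all hypothetical names used for illustration, and the sketch is synchronous to keep it short.

```typescript
// Toy replay engine: an illustration of the idea, NOT the AWS SDK.
type Checkpoints = Record<string, unknown>;

// Thrown to end the current invocation when the workflow must pause.
class Suspend {
  constructor(public reason: string) {}
}

class ToyContext {
  executed: string[] = []; // steps that really ran during this invocation

  constructor(private checkpoints: Checkpoints) {}

  // Run fn once; on later invocations return the saved result instead.
  step<T>(name: string, fn: () => T): T {
    if (name in this.checkpoints) {
      return this.checkpoints[name] as T;
    }
    const result = fn();
    this.checkpoints[name] = result;
    this.executed.push(name);
    return result;
  }

  // First time through: record the wait and suspend. On replay: fall through.
  wait(name: string): void {
    if (name in this.checkpoints) return;
    this.checkpoints[name] = true;
    throw new Suspend(name);
  }
}

function workflow(ctx: ToyContext): string {
  const orderId = ctx.step('create-order', () => 'order-1');
  ctx.wait('pause');
  ctx.step('send-notification', () => `notified ${orderId}`);
  return orderId;
}

// One "invocation": run the workflow from the top against the shared store.
function invoke(checkpoints: Checkpoints): { done: boolean; executed: string[]; result?: string } {
  const ctx = new ToyContext(checkpoints);
  try {
    const result = workflow(ctx);
    return { done: true, executed: ctx.executed, result };
  } catch (err) {
    if (err instanceof Suspend) return { done: false, executed: ctx.executed };
    throw err;
  }
}

const store: Checkpoints = {};
const first = invoke(store);  // runs create-order, then suspends at the wait
const second = invoke(store); // create-order comes from the store; send-notification runs
console.log(first.executed, second.executed);
```

In this sketch the second invocation never re-runs create-order, which is why anything non-deterministic (random values, the current time) belongs inside a step, where its result is recorded once and reused.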

Learn more: Understanding the Replay Model

Common Use Cases

  • Approval workflows – Wait for human approval before proceeding. The function pauses until someone clicks approve or reject.
  • Saga patterns – Coordinate distributed transactions with compensating actions. If a step fails, automatically roll back previous steps.
  • Scheduled tasks – Wait for specific times or intervals. Process data at midnight, send reminders after 24 hours, or retry every 5 minutes.
  • API orchestration – Call multiple APIs with retries and error handling. Coordinate responses and handle partial failures gracefully.
  • Data processing pipelines – Process large datasets in stages with checkpoints. Resume from the last successful stage if something fails.
  • Event‑driven workflows – React to external events like webhooks, IoT signals, or user actions. Wait for events and continue processing when they arrive.
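
The saga pattern from the list above can be sketched in a few lines. The SagaStep interface and runSaga function are hypothetical names, not SDK features; in a durable function, each action and compensation would presumably be wrapped in its own context.step() so that it is checkpointed.

```typescript
// Each saga step pairs an action with the compensation that undoes it.
interface SagaStep {
  name: string;
  action: () => void;
  compensate: () => void;
}

// Run steps in order; on failure, undo the completed ones in reverse order.
function runSaga(steps: SagaStep[]): { ok: boolean; log: string[] } {
  const log: string[] = [];
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      step.action();
      completed.push(step);
      log.push(`done:${step.name}`);
    } catch (err) {
      log.push(`failed:${step.name}`);
      for (let i = completed.length - 1; i >= 0; i--) {
        completed[i].compensate();
        log.push(`undone:${completed[i].name}`);
      }
      return { ok: false, log };
    }
  }
  return { ok: true, log };
}

// The third step fails, so the first two are rolled back, newest first.
const outcome = runSaga([
  { name: 'reserve-stock', action: () => {}, compensate: () => {} },
  { name: 'charge-card', action: () => {}, compensate: () => {} },
  {
    name: 'book-shipping',
    action: () => { throw new Error('carrier unavailable'); },
    compensate: () => {},
  },
]);
console.log(outcome.log.join(' | '));
// prints "done:reserve-stock | done:charge-card | failed:book-shipping | undone:charge-card | undone:reserve-stock"
```

Compensations run in reverse order because later steps usually depend on earlier ones, so the most recent side effect is undone first.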

Testing Your Workflows

Testing long‑running workflows doesn’t mean waiting hours. The Durable Execution SDK includes a testing library that runs your functions locally in milliseconds:

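import { LocalDurableTestRunner } from '@aws/durable-execution-sdk-js-testing';
// Assumes the handler is exported from index.ts; adjust the path to your project.
import { handler } from './index';

const runner = new LocalDurableTestRunner({
  handlerFunction: handler,
});

// Runs the whole workflow locally; time-based waits are skipped.
const execution = await runner.run();

// The expected orderId depends on what your createOrder helper (or its mock) returns.
expect(execution.getStatus()).toBe('SUCCEEDED');
expect(execution.getResult()).toEqual({ orderId: '123', status: 'completed' });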

The test runner simulates checkpoints, skips time‑based waits, and lets you inspect every operation. You can test callbacks, retries, and failures without deploying to AWS.
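
One way to picture how a test runner can skip time-based waits is a simulated clock. The sketch below is only an illustration of that idea under assumed names (FakeClock, orderFlow); it does not show the API or internals of the SDK's testing library.

```typescript
// Hypothetical fake clock; not the API of the SDK's testing library.
class FakeClock {
  private nowMs = 0;

  // "Sleeping" only advances the simulated time; nothing actually blocks.
  sleep(seconds: number): void {
    this.nowMs += seconds * 1000;
  }

  elapsedSeconds(): number {
    return this.nowMs / 1000;
  }
}

// A workflow that "waits" 5 minutes between its two phases.
function orderFlow(clock: FakeClock): string {
  clock.sleep(300);
  return 'completed';
}

const clock = new FakeClock();
const started = Date.now();
const flowStatus = orderFlow(clock);
const realMs = Date.now() - started;
console.log(flowStatus, clock.elapsedSeconds(), realMs);
```

The workflow observes 300 simulated seconds passing while the test itself finishes almost immediately in real time.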

Learn more: Testing Durable Functions

Deploying with AWS SAM

Deploy durable functions using AWS SAM with a few key configurations:

Resources:
  OrderProcessorFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/order-processor
      Handler: index.handler
      Runtime: nodejs22.x
      DurableConfig:
        ExecutionTimeout: 900
        RetentionPeriodInDays: 7
    Metadata:
      BuildMethod: esbuild
      BuildProperties:
        EntryPoints:
          - index.ts

The DurableConfig property enables durable execution and sets the workflow timeout. SAM automatically handles IAM permissions for checkpointing and state management.

Learn more: Deploying Durable Functions with SAM

When to Use Durable Functions

  • Your workflow spans multiple steps with waits or callbacks.
  • You need automatic retries with exponential backoff.
  • You want to coordinate multiple async operations.
  • Your process requires human approval or external events.
  • You need to handle long‑running tasks without managing state.
  • You prefer writing workflows as code rather than configuration.

Getting Started

  1. Install the SDK
    npm install @aws/durable-execution-sdk-js
  2. Write your function – Wrap your handler with withDurableExecution().
  3. Use durable operations – context.step(), context.wait(), context.waitForCallback().
  4. Test locally – Use LocalDurableTestRunner for fast iteration.
  5. Deploy with SAM – Add DurableConfig to your template.
  6. Monitor execution – Use Amazon CloudWatch and AWS X‑Ray for observability.
