Deploying a Serverless AI Agent with AWS Bedrock, Lambda, and API Gateway

Published: December 27, 2025 at 08:15 PM EST
3 min read
Source: Dev.to

Overview

This guide walks through building a question‑answering service powered by generative AI using Amazon Bedrock. The architecture accepts prompts via HTTP and returns model‑generated responses while keeping costs minimal through a fully serverless stack.

Data Flow

  1. External clients send HTTP requests to API Gateway.
  2. API Gateway routes requests to a Lambda function.
  3. The Lambda function invokes Amazon Bedrock’s Nova Micro model.
  4. The Lambda container image is stored in ECR (Elastic Container Registry).

Requirements

| Aspect | Requirement |
| --- | --- |
| Prompt Processing | Accept prompts and return Nova Micro completions |
| HTTP Endpoint | Expose an endpoint for triggering responses |
| Estimated Volume | ~100 monthly requests (for cost estimation) |
| Automation | Fully automated deployment via GitHub Actions |
| Availability | 99.9%+ monthly uptime |
| Security | IAM-scoped Bedrock access, OpenID Connect auth, HTTPS-only |
| Observability | Structured logging with CloudWatch dashboards |

Authentication, input sanitization, and authorization are excluded to keep the focus on the core GenAI implementation.

Cost Estimation

Based on an estimated 22 input tokens and 232 output tokens per request:

| Service | Monthly Cost | Notes |
| --- | --- | --- |
| Bedrock (Nova Micro) | ~$0.003 | 2,200 input / 23,200 output tokens |
| Lambda | Free | Within free tier (1M requests, 400K GB-seconds) |
| API Gateway | Free (Year 1) | ~$0.0004/month after the free tier |
| ECR | ~$0.01 | 300 MB image after 500 MB free tier |

Monthly Requests vs. Cost

| Requests | Approx. Cost |
| --- | --- |
| 1,000 | ~$0.04 |
| 10,000 | ~$0.39 |
| 100,000 | ~$3.76 |
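The Bedrock line item can be sanity-checked with a quick calculation. A minimal sketch, assuming on-demand Nova Micro rates of roughly $0.035 per million input tokens and $0.14 per million output tokens (verify against the current AWS pricing page; the table totals above also include API Gateway and other charges):

```typescript
// Rough Bedrock-only cost model for Nova Micro, using the per-request
// token counts from the estimate above (22 input / 232 output).
// Unit prices are assumptions; check current AWS pricing before relying on them.
const INPUT_PRICE_PER_M = 0.035;  // USD per 1M input tokens (assumed)
const OUTPUT_PRICE_PER_M = 0.14;  // USD per 1M output tokens (assumed)

function bedrockCost(requests: number, inputTokens = 22, outputTokens = 232): number {
  const inputCost = (requests * inputTokens / 1_000_000) * INPUT_PRICE_PER_M;
  const outputCost = (requests * outputTokens / 1_000_000) * OUTPUT_PRICE_PER_M;
  return inputCost + outputCost;
}

console.log(bedrockCost(100).toFixed(4)); // → 0.0033 (≈ the ~$0.003 Bedrock line above)
```

At 100,000 requests this gives about $3.33 for Bedrock alone, which is consistent with the ~$3.76 all-in figure in the table.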

Project Setup

mkdir -p handler terraform
cd handler
pnpm init -y
pnpm --package=typescript dlx tsc --init
mkdir -p src __tests__
touch src/{app,env,index}.ts

pnpm add -D @types/node tsx typescript
pnpm add ai @ai-sdk/amazon-bedrock zod dotenv
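The Docker build stage later runs `pnpm run build` and the runtime stage copies compiled output from `dist/src`, which implies a `build` script in `handler/package.json` and a tsconfig that emits to `dist`. A minimal sketch of the relevant fragment (the exact script name and compiler settings are assumptions, not shown in the original post):

```json
{
  "scripts": {
    "build": "tsc"
  }
}
```

With `"outDir": "dist"` in `tsconfig.json` (and `rootDir`/`include` covering the project root), `tsc` emits `dist/src/index.js`, matching the `COPY --from=builder /usr/src/app/dist/src ./` line in the runtime stage.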

Application Architecture

flowchart TB
    A["Lambda Handler<br/>*Parses events, returns responses*"] --> B["Application Logic<br/>*Manages prompts & orchestration*"]
    B --> C["Bedrock Integration<br/>*Model invocation via AI SDK*"]

Lambda Handler (src/index.ts)

import { main } from './app'; // orchestration logic lives in src/app.ts

export const handler = async (event: any, context: any) => {
    try {
        const body = event.body ? JSON.parse(event.body) : {};
        const prompt = body.prompt ?? "Welcome from Warike technologies";
        const response = await main(prompt);
        return {
            statusCode: 200,
            body: JSON.stringify({ success: true, data: response }),
        };
    } catch (error) {
        return {
            statusCode: 500,
            body: JSON.stringify({
                success: false,
                error: error instanceof Error ? error.message : 'Unexpected error'
            }),
        };
    }
};
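The handler's parsing path can be exercised locally without touching AWS. A minimal sketch with `main` stubbed out (the stub and event shape are illustrative; a real API Gateway proxy event carries many more fields than the `body` the handler reads):

```typescript
// Local smoke test of the handler's event parsing, with main() stubbed
// so no Bedrock call is made.
const main = async (prompt: string) => `echo: ${prompt}`; // stub, not the real app logic

const handler = async (event: any) => {
  try {
    const body = event.body ? JSON.parse(event.body) : {};
    const prompt = body.prompt ?? "Welcome from Warike technologies";
    const response = await main(prompt);
    return { statusCode: 200, body: JSON.stringify({ success: true, data: response }) };
  } catch (error) {
    return {
      statusCode: 500,
      body: JSON.stringify({
        success: false,
        error: error instanceof Error ? error.message : "Unexpected error",
      }),
    };
  }
};

// A well-formed event returns 200; malformed JSON in `body` falls into the 500 branch.
handler({ body: JSON.stringify({ prompt: "hello" }) }).then((res) => {
  console.log(res.statusCode, res.body); // 200 {"success":true,"data":"echo: hello"}
});
```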

Bedrock Utility (src/utils/bedrock.ts)

import { generateText } from 'ai';
import { createAmazonBedrock } from '@ai-sdk/amazon-bedrock';
import { config } from '../config';

export async function generateResponse(prompt: string) {
    const { regionId, modelId } = config({});
    const bedrock = createAmazonBedrock({ region: regionId });

    const { text, usage } = await generateText({
        model: bedrock(modelId),
        system: "You are a helpful assistant.",
        prompt, // generateText accepts a plain string prompt
    });

    console.log(`model: ${modelId}, response: ${text}, usage: ${JSON.stringify(usage)}`);
    return text;
}

Environment Variables

AWS_REGION=us-west-2
AWS_BEDROCK_MODEL='amazon.nova-micro-v1:0'
AWS_BEARER_TOKEN_BEDROCK='aws_bearer_token_bedrock'

Security Note: Use short-lived API keys only, and keep them out of version control.
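Since the project sets up `src/env.ts`, that file is a natural place to validate these variables at startup. A minimal sketch using plain checks (the installed zod package could express this more declaratively; the variable names follow the `.env` example above):

```typescript
// Validates the environment variables listed above and maps them to the
// names the Bedrock utility expects. A zod schema could replace these
// manual checks; this sketch stays dependency-free.
interface Env {
  regionId: string;
  modelId: string;
}

function loadEnv(env: Record<string, string | undefined>): Env {
  const regionId = env.AWS_REGION;
  const modelId = env.AWS_BEDROCK_MODEL;
  if (!regionId) throw new Error("AWS_REGION is required");
  if (!modelId) throw new Error("AWS_BEDROCK_MODEL is required");
  return { regionId, modelId };
}

console.log(loadEnv({ AWS_REGION: "us-west-2", AWS_BEDROCK_MODEL: "amazon.nova-micro-v1:0" }));
```

Failing fast on missing configuration turns a confusing runtime Bedrock error into a clear message in CloudWatch logs.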

Docker Build

Build Stage

# Build Stage
FROM node:22-alpine AS builder
WORKDIR /usr/src/app
RUN corepack enable
COPY package.json pnpm-lock.yaml* ./
RUN pnpm install --frozen-lockfile
COPY . .
RUN pnpm run build

Runtime Stage

# Runtime Stage
FROM public.ecr.aws/lambda/nodejs:22
WORKDIR ${LAMBDA_TASK_ROOT}
COPY --from=builder /usr/src/app/dist/src ./ 
COPY --from=builder /usr/src/app/node_modules ./node_modules
CMD [ "index.handler" ]

Infrastructure Components

  • API Gateway – HTTP protocol with Lambda integration, CORS headers, JSON access logs.
  • Bedrock Permissions – Nova Micro inference profile access via IAM.
  • Lambda Function – 900‑second timeout, CloudWatch logging enabled.

📝 The ECR seeding resource requires Docker running locally.

CI/CD with GitHub Actions

flowchart LR
    A[Push to Main] --> B[Build & Test]
    B --> C[Build Docker Image]
    C --> D[Push to ECR]
    D --> E[Deploy Lambda]

The workflow (triggered on pushes to main) handles:

  1. Building and testing the code.
  2. Creating the Docker image.
  3. Pushing the image to ECR.
  4. Deploying the Lambda function via Terraform.
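The steps above might look roughly like the following workflow file. This is a sketch only: the secret name, action versions, and paths are assumptions, and the real workflow likely adds an explicit ECR login/push step (e.g. via `aws-actions/amazon-ecr-login`) ahead of the Terraform deploy:

```yaml
name: deploy
on:
  push:
    branches: [main]

permissions:
  id-token: write   # OpenID Connect token for keyless AWS authentication
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}  # assumed secret name
          aws-region: us-west-2
      - run: corepack enable
      - run: pnpm install --frozen-lockfile && pnpm test   # build & test
        working-directory: handler
      - run: terraform -chdir=terraform apply -auto-approve # image push + Lambda deploy
```

The `id-token: write` permission is what enables the OpenID Connect authentication listed in the requirements, avoiding long-lived AWS access keys in repository secrets.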

Testing the Endpoint

curl -sS "https://123456.execute-api.us-west-2.amazonaws.com/dev/" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Heeey hoe gaat het?"}' | jq

Expected Response

{
  "success": true,
  "data": "Hoi! Het gaat prima, bedankt voor het vragen..."
}

Observability

CloudWatch dashboards provide visibility into errors and performance metrics.

Cleanup

terraform destroy

Conclusion

  • Serverless GenAI with API Gateway, Lambda, and Bedrock’s Nova Micro delivers a functional, cost‑effective solution.
  • Pricing remains negligible even at significant scale.
  • Terraform manages infrastructure; GitHub Actions automates deployment.
  • The foundation readily supports more sophisticated generative AI applications.