Deploying a Serverless AI Agent with AWS Bedrock, Lambda, and API Gateway

Published: December 27, 2025 at 08:15 PM EST
3 min read
Source: Dev.to

Overview

This guide walks through building a question‑answering service powered by generative AI using Amazon Bedrock. The architecture accepts prompts via HTTP and returns model‑generated responses while keeping costs minimal through a fully serverless stack.

Data Flow

  1. External clients send HTTP requests to API Gateway.
  2. API Gateway routes requests to a Lambda function.
  3. The Lambda function invokes Amazon Bedrock’s Nova Micro model.
  4. The Lambda container image is stored in ECR (Elastic Container Registry).

Requirements

| Aspect | Requirement |
| --- | --- |
| Prompt Processing | Accept prompts and return Nova Micro completions |
| HTTP Endpoint | Expose an endpoint for triggering responses |
| Estimated Volume | ~100 monthly requests (for cost estimation) |
| Automation | Fully automated deployment via GitHub Actions |
| Availability | 99.9%+ monthly uptime |
| Security | IAM-scoped Bedrock access, OpenID Connect auth, HTTPS-only |
| Observability | Structured logging with CloudWatch dashboards |

Authentication, input sanitization, and authorization are excluded to keep the focus on the core GenAI implementation.

Cost Estimation

Based on an estimated 22 input tokens and 232 output tokens per request:

| Service | Monthly Cost | Notes |
| --- | --- | --- |
| Bedrock (Nova Micro) | ~$0.003 | 2,200 input / 23,200 output tokens |
| Lambda | Free | Within free tier (1M requests, 400K GB-seconds) |
| API Gateway | Free (Year 1) | ~$0.0004/month after the free tier |
| ECR | ~$0.01 | 300 MB image after 500 MB free tier |

Monthly Requests vs. Cost

| Requests | Approx. Cost |
| --- | --- |
| 1,000 | ~$0.04 |
| 10,000 | ~$0.39 |
| 100,000 | ~$3.76 |
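The Bedrock line item can be sanity-checked with a quick calculation. A minimal sketch, assuming on-demand Nova Micro rates of roughly $0.035 per million input tokens and $0.14 per million output tokens (verify against the current AWS pricing page; the table totals above also include API Gateway and other charges):

```typescript
// Rough Bedrock-only cost model for Nova Micro, using the per-request
// token counts from the estimate above (22 input / 232 output).
// Unit prices are assumptions; check current AWS pricing before relying on them.
const INPUT_PRICE_PER_M = 0.035;  // USD per 1M input tokens (assumed)
const OUTPUT_PRICE_PER_M = 0.14;  // USD per 1M output tokens (assumed)

function bedrockCost(requests: number, inputTokens = 22, outputTokens = 232): number {
  const inputCost = (requests * inputTokens / 1_000_000) * INPUT_PRICE_PER_M;
  const outputCost = (requests * outputTokens / 1_000_000) * OUTPUT_PRICE_PER_M;
  return inputCost + outputCost;
}

console.log(bedrockCost(100).toFixed(4)); // → 0.0033 (≈ the ~$0.003 Bedrock line above)
```

At 100,000 requests this gives about $3.33 for Bedrock alone, which is consistent with the ~$3.76 all-in figure in the table.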

Project Setup

mkdir -p handler terraform
cd handler
pnpm init -y
pnpm --package=typescript dlx tsc --init
mkdir -p src __tests__
touch src/{app,env,index}.ts

pnpm add -D @types/node tsx typescript
pnpm add ai @ai-sdk/amazon-bedrock zod dotenv
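The Docker build stage later runs `pnpm run build` and the runtime stage copies compiled output from `dist/src`, which implies a `build` script in `handler/package.json` and a tsconfig that emits to `dist`. A minimal sketch of the relevant fragment (the exact script name and compiler settings are assumptions, not shown in the original post):

```json
{
  "scripts": {
    "build": "tsc"
  }
}
```

With `"outDir": "dist"` in `tsconfig.json` (and `rootDir`/`include` covering the project root), `tsc` emits `dist/src/index.js`, matching the `COPY --from=builder /usr/src/app/dist/src ./` line in the runtime stage.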

Application Architecture

flowchart TB
    A["Lambda Handler<br/>*Parses events, returns responses*"] --> B["Application Logic<br/>*Manages prompts & orchestration*"]
    B --> C["Bedrock Integration<br/>*Model invocation via AI SDK*"]

Lambda Handler (src/index.ts)

import { main } from './app'; // orchestration logic lives in src/app.ts

export const handler = async (event: any, context: any) => {
    try {
        const body = event.body ? JSON.parse(event.body) : {};
        const prompt = body.prompt ?? "Welcome from Warike technologies";
        const response = await main(prompt);
        return {
            statusCode: 200,
            body: JSON.stringify({ success: true, data: response }),
        };
    } catch (error) {
        return {
            statusCode: 500,
            body: JSON.stringify({
                success: false,
                error: error instanceof Error ? error.message : 'Unexpected error'
            }),
        };
    }
};
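The handler's parsing path can be exercised locally without touching AWS. A minimal sketch with `main` stubbed out (the stub and event shape are illustrative; a real API Gateway proxy event carries many more fields than the `body` the handler reads):

```typescript
// Local smoke test of the handler's event parsing, with main() stubbed
// so no Bedrock call is made.
const main = async (prompt: string) => `echo: ${prompt}`; // stub, not the real app logic

const handler = async (event: any) => {
  try {
    const body = event.body ? JSON.parse(event.body) : {};
    const prompt = body.prompt ?? "Welcome from Warike technologies";
    const response = await main(prompt);
    return { statusCode: 200, body: JSON.stringify({ success: true, data: response }) };
  } catch (error) {
    return {
      statusCode: 500,
      body: JSON.stringify({
        success: false,
        error: error instanceof Error ? error.message : "Unexpected error",
      }),
    };
  }
};

// A well-formed event returns 200; malformed JSON in `body` falls into the 500 branch.
handler({ body: JSON.stringify({ prompt: "hello" }) }).then((res) => {
  console.log(res.statusCode, res.body); // 200 {"success":true,"data":"echo: hello"}
});
```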

Bedrock Utility (src/utils/bedrock.ts)

import { generateText } from 'ai';
import { createAmazonBedrock } from '@ai-sdk/amazon-bedrock';
import { config } from '../config';

export async function generateResponse(prompt: string) {
    const { regionId, modelId } = config({});
    const bedrock = createAmazonBedrock({ region: regionId });

    const { text, usage } = await generateText({
        model: bedrock(modelId),
        system: "You are a helpful assistant.",
        prompt, // generateText accepts a plain string prompt
    });

    console.log(`model: ${modelId}, response: ${text}, usage: ${JSON.stringify(usage)}`);
    return text;
}

Environment Variables

AWS_REGION=us-west-2
AWS_BEDROCK_MODEL='amazon.nova-micro-v1:0'
AWS_BEARER_TOKEN_BEDROCK='aws_bearer_token_bedrock'

Security Note: Use short-lived API keys only, and keep them out of version control.
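Since the project sets up `src/env.ts`, that file is a natural place to validate these variables at startup. A minimal sketch using plain checks (the installed zod package could express this more declaratively; the variable names follow the `.env` example above):

```typescript
// Validates the environment variables listed above and maps them to the
// names the Bedrock utility expects. A zod schema could replace these
// manual checks; this sketch stays dependency-free.
interface Env {
  regionId: string;
  modelId: string;
}

function loadEnv(env: Record<string, string | undefined>): Env {
  const regionId = env.AWS_REGION;
  const modelId = env.AWS_BEDROCK_MODEL;
  if (!regionId) throw new Error("AWS_REGION is required");
  if (!modelId) throw new Error("AWS_BEDROCK_MODEL is required");
  return { regionId, modelId };
}

console.log(loadEnv({ AWS_REGION: "us-west-2", AWS_BEDROCK_MODEL: "amazon.nova-micro-v1:0" }));
```

Failing fast on missing configuration turns a confusing runtime Bedrock error into a clear message in CloudWatch logs.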

Docker Build

Build Stage

# Build Stage
FROM node:22-alpine AS builder
WORKDIR /usr/src/app
RUN corepack enable
COPY package.json pnpm-lock.yaml* ./
RUN pnpm install --frozen-lockfile
COPY . .
RUN pnpm run build

Runtime Stage

# Runtime Stage
FROM public.ecr.aws/lambda/nodejs:22
WORKDIR ${LAMBDA_TASK_ROOT}
COPY --from=builder /usr/src/app/dist/src ./ 
COPY --from=builder /usr/src/app/node_modules ./node_modules
CMD [ "index.handler" ]

Infrastructure Components

  • API Gateway – HTTP protocol with Lambda integration, CORS headers, JSON access logs.
  • Bedrock Permissions – Nova Micro inference profile access via IAM.
  • Lambda Function – 900‑second timeout, CloudWatch logging enabled.

📝 The ECR seeding resource requires Docker running locally.

CI/CD with GitHub Actions

flowchart LR
    A[Push to Main] --> B[Build & Test]
    B --> C[Build Docker Image]
    C --> D[Push to ECR]
    D --> E[Deploy Lambda]

The workflow (triggered on pushes to main) handles:

  1. Building and testing the code.
  2. Creating the Docker image.
  3. Pushing the image to ECR.
  4. Deploying the Lambda function via Terraform.
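The steps above might look roughly like the following workflow file. This is a sketch only: the secret name, action versions, and paths are assumptions, and the real workflow likely adds an explicit ECR login/push step (e.g. via `aws-actions/amazon-ecr-login`) ahead of the Terraform deploy:

```yaml
name: deploy
on:
  push:
    branches: [main]

permissions:
  id-token: write   # OpenID Connect token for keyless AWS authentication
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}  # assumed secret name
          aws-region: us-west-2
      - run: corepack enable
      - run: pnpm install --frozen-lockfile && pnpm test   # build & test
        working-directory: handler
      - run: terraform -chdir=terraform apply -auto-approve # image push + Lambda deploy
```

The `id-token: write` permission is what enables the OpenID Connect authentication listed in the requirements, avoiding long-lived AWS access keys in repository secrets.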

Testing the Endpoint

curl -sS "https://123456.execute-api.us-west-2.amazonaws.com/dev/" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Heeey hoe gaat het?"}' | jq

Expected Response

{
  "success": true,
  "data": "Hoi! Het gaat prima, bedankt voor het vragen..."
}

Observability

CloudWatch dashboards provide visibility into errors and performance metrics.

Cleanup

terraform destroy

Conclusion

  • Serverless GenAI with API Gateway, Lambda, and Bedrock’s Nova Micro delivers a functional, cost‑effective solution.
  • Pricing remains negligible even at significant scale.
  • Terraform manages infrastructure; GitHub Actions automates deployment.
  • The foundation readily supports more sophisticated generative AI applications.