Deploying a Serverless AI Agent with AWS Bedrock, Lambda, and API Gateway
Source: Dev.to
Overview
This guide walks through building a question‑answering service powered by generative AI using Amazon Bedrock. The architecture accepts prompts via HTTP and returns model‑generated responses while keeping costs minimal through a fully serverless stack.
Data Flow
- External clients send HTTP requests to API Gateway.
- API Gateway routes requests to a Lambda function.
- The Lambda function invokes Amazon Bedrock’s Nova Micro model.
- The Lambda container image is stored in ECR (Elastic Container Registry).
Requirements
| Aspect | Requirement |
|---|---|
| Prompt Processing | Accept prompts and return Nova Micro completions |
| HTTP Endpoint | Expose an endpoint for triggering responses |
| Estimated Volume | ~100 monthly requests (for cost estimation) |
| Automation | Fully automated deployment via GitHub Actions |
| Availability | 99.9%+ monthly uptime |
| Security | IAM‑scoped Bedrock access, OpenID Connect auth, HTTPS‑only |
| Observability | Structured logging with CloudWatch dashboards |
Authentication, input sanitization, and authorization for the runtime endpoint are excluded to keep the focus on the core GenAI implementation; the OpenID Connect requirement above refers to the deployment pipeline's access to AWS.
Cost Estimation
Based on an estimated 22 input tokens and 232 output tokens per request:
| Service | Monthly Cost | Notes |
|---|---|---|
| Bedrock (Nova Micro) | ~ $0.003 | 2,200 input / 23,200 output tokens |
| Lambda | Free | Within free tier (1 M requests, 400 K GB‑seconds) |
| API Gateway | Free (Year 1) | ~ $0.0004/month after the free tier |
| ECR | ~ $0.01 | 300 MB image after 500 MB free tier |
Monthly Requests vs. Cost
| Requests | Approx. Cost |
|---|---|
| 1,000 | ~ $0.04 |
| 10,000 | ~ $0.39 |
| 100,000 | ~ $3.76 |
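The Bedrock line in the table can be reproduced with a few lines of arithmetic. A minimal sketch, assuming the Nova Micro list prices of $0.035 per 1M input tokens and $0.14 per 1M output tokens (verify against current Bedrock pricing):

```typescript
// Assumed Nova Micro list prices in USD per 1M tokens -- check current pricing.
const INPUT_PRICE_PER_1M = 0.035;
const OUTPUT_PRICE_PER_1M = 0.14;

// Estimate the monthly Bedrock cost from request volume and per-request token counts.
function bedrockMonthlyCost(
  requests: number,
  inputTokensPerRequest = 22,
  outputTokensPerRequest = 232,
): number {
  const inputCost = (requests * inputTokensPerRequest / 1_000_000) * INPUT_PRICE_PER_1M;
  const outputCost = (requests * outputTokensPerRequest / 1_000_000) * OUTPUT_PRICE_PER_1M;
  return inputCost + outputCost;
}

console.log(bedrockMonthlyCost(100).toFixed(4)); // ≈ 0.0033, matching the ~$0.003 estimate
```

Note that output tokens dominate the cost (roughly 40× the input cost here), so prompt design that keeps completions short has the biggest impact on the bill.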
Project Setup
mkdir -p handler terraform
cd handler
pnpm init
pnpm --package=typescript dlx tsc --init
mkdir -p src __tests__
touch src/{app,env,index}.ts
pnpm add -D @types/node tsx typescript
pnpm add ai @ai-sdk/amazon-bedrock zod dotenv
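The Dockerfile below runs `pnpm run build` and copies the compiled output from `dist/src`, so package.json needs a matching build script. A minimal sketch (script names and the `tsc` outDir/rootDir layout are assumptions, not shown in the original):

```json
{
  "name": "handler",
  "private": true,
  "scripts": {
    "build": "tsc --outDir dist --rootDir .",
    "test": "tsx --test __tests__"
  }
}
```

With `rootDir` set to the project root, `src/index.ts` compiles to `dist/src/index.js`, which is what the runtime stage copies into the Lambda task root.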
Application Architecture
flowchart TB
  A["Lambda Handler<br/><i>Parses events, returns responses</i>"] --> B["Application Logic<br/><i>Manages prompts &amp; orchestration</i>"]
  B --> C["Bedrock Integration<br/><i>Model invocation via AI SDK</i>"]
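The middle layer of this diagram corresponds to src/app.ts. The original doesn't show it, so here is a hypothetical sketch: in the real module, `generate` would default to `generateResponse` from the Bedrock utility; it is a parameter here so the orchestration logic can be exercised without calling AWS.

```typescript
// Hypothetical src/app.ts sketch: the orchestration layer between the Lambda
// handler and the Bedrock utility. `Generate` stands in for generateResponse.
type Generate = (prompt: string) => Promise<string>;

export async function main(prompt: string, generate: Generate): Promise<string> {
  // Normalize and reject empty prompts before spending tokens on a model call.
  const trimmed = prompt.trim();
  if (trimmed.length === 0) {
    throw new Error('Prompt must not be empty');
  }
  return generate(trimmed);
}
```

Keeping validation here, rather than in the handler, means the same checks apply no matter how the function is invoked.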
Lambda Handler (src/index.ts)
import { main } from './app';

export const handler = async (event: { body?: string }) => {
  try {
    // API Gateway delivers the request body as a JSON string.
    const body = event.body ? JSON.parse(event.body) : {};
    const prompt = body.prompt ?? 'Welcome from Warike technologies';
    const response = await main(prompt);
    return {
      statusCode: 200,
      body: JSON.stringify({ success: true, data: response }),
    };
  } catch (error) {
    return {
      statusCode: 500,
      body: JSON.stringify({
        success: false,
        error: error instanceof Error ? error.message : 'Unexpected error',
      }),
    };
  }
};
Bedrock Utility (src/utils/bedrock.ts)
import { generateText } from 'ai';
import { createAmazonBedrock } from '@ai-sdk/amazon-bedrock';
import { config } from '../config';

export async function generateResponse(prompt: string) {
  const { regionId, modelId } = config({});
  const bedrock = createAmazonBedrock({ region: regionId });
  const { text, usage } = await generateText({
    model: bedrock(modelId),
    system: 'You are a helpful assistant.',
    prompt,
  });
  console.log(`model: ${modelId}, response: ${text}, usage: ${JSON.stringify(usage)}`);
  return text;
}
Environment Variables
AWS_REGION=us-west-2
AWS_BEDROCK_MODEL='amazon.nova-micro-v1:0'
AWS_BEARER_TOKEN_BEDROCK='aws_bearer_token_bedrock'
Security Note: Use short‑lived API keys only.
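A cheap way to enforce that note is to validate configuration once at cold start. A hypothetical src/env.ts sketch (the variable names match the block above; the zod dependency from the setup step could replace these manual checks with a schema):

```typescript
// Hypothetical src/env.ts sketch: fail fast on missing configuration.
export interface Env {
  regionId: string;
  modelId: string;
}

export function loadEnv(
  source: Record<string, string | undefined> = process.env,
): Env {
  // Region has a sensible default; the model ID is mandatory.
  const regionId = source.AWS_REGION ?? 'us-west-2';
  const modelId = source.AWS_BEDROCK_MODEL;
  if (!modelId) {
    throw new Error('AWS_BEDROCK_MODEL is required');
  }
  return { regionId, modelId };
}
```

Throwing during module initialization surfaces misconfiguration as an immediate deploy-time failure instead of a cryptic runtime error on the first request.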
Docker Build
Build Stage
# Build Stage
FROM node:22-alpine AS builder
WORKDIR /usr/src/app
RUN corepack enable
COPY package.json pnpm-lock.yaml* ./
RUN pnpm install --frozen-lockfile
COPY . .
RUN pnpm run build
Runtime Stage
# Runtime Stage
FROM public.ecr.aws/lambda/nodejs:22
WORKDIR ${LAMBDA_TASK_ROOT}
COPY --from=builder /usr/src/app/dist/src ./
COPY --from=builder /usr/src/app/node_modules ./node_modules
CMD [ "index.handler" ]
Infrastructure Components
- API Gateway – HTTP protocol with Lambda integration, CORS headers, JSON access logs.
- Bedrock Permissions – Nova Micro inference profile access via IAM.
- Lambda Function – 900‑second timeout, CloudWatch logging enabled.
📝 The ECR seeding resource requires Docker running locally.
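The IAM scoping mentioned above can be expressed as a Terraform policy document. A hypothetical fragment (the resource ARN is an assumption; inference-profile ARNs are account- and region-specific, so substitute your own):

```hcl
# Hypothetical policy fragment for the Lambda execution role, restricting
# Bedrock access to the Nova Micro model only.
data "aws_iam_policy_document" "bedrock_invoke" {
  statement {
    effect = "Allow"
    actions = [
      "bedrock:InvokeModel",
      "bedrock:InvokeModelWithResponseStream",
    ]
    resources = [
      "arn:aws:bedrock:us-west-2::foundation-model/amazon.nova-micro-v1:0",
    ]
  }
}
```

Scoping to a single model ARN, rather than `bedrock:*` on `*`, keeps a compromised function from invoking more expensive models.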
CI/CD with GitHub Actions
flowchart LR
A[Push to Main] --> B[Build & Test]
B --> C[Build Docker Image]
C --> D[Push to ECR]
D --> E[Deploy Lambda]
The workflow (triggered on pushes to main) handles:
- Building and testing the code.
- Creating the Docker image.
- Pushing the image to ECR.
- Deploying the Lambda function via Terraform.
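The steps above can be sketched as a workflow file. A hypothetical .github/workflows/deploy.yml (the role ARN secret name and `ECR_REPOSITORY` variable are placeholders; the OIDC permissions block is what satisfies the "OpenID Connect auth" requirement without long-lived AWS keys):

```yaml
# Hypothetical deployment workflow sketch.
name: deploy
on:
  push:
    branches: [main]
permissions:
  id-token: write   # OIDC federation to AWS, no stored access keys
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          aws-region: us-west-2
      - uses: aws-actions/amazon-ecr-login@v2
      - run: docker build -t "$ECR_REPOSITORY:$GITHUB_SHA" handler
      - run: docker push "$ECR_REPOSITORY:$GITHUB_SHA"
      - run: terraform -chdir=terraform init
      - run: terraform -chdir=terraform apply -auto-approve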
Testing the Endpoint
curl -sS "https://123456.execute-api.us-west-2.amazonaws.com/dev/" \
-H "Content-Type: application/json" \
-d '{"prompt":"Heeey hoe gaat het?"}' | jq
Expected Response
{
"success": true,
"data": "Hoi! Het gaat prima, bedankt voor het vragen..."
}
Observability
CloudWatch dashboards provide visibility into errors and performance metrics.
Cleanup
terraform destroy
Conclusion
- Serverless GenAI with API Gateway, Lambda, and Bedrock’s Nova Micro delivers a functional, cost‑effective solution.
- Costs stay low even at scale: roughly $3.76/month at 100,000 requests.
- Terraform manages infrastructure; GitHub Actions automates deployment.
- The foundation readily supports more sophisticated generative AI applications.