Building AI Agents on AWS in 2025: A Practitioner's Guide to Bedrock, AgentCore, and Beyond
Source: Dev.to
1. The Shift: From “Using AI” to “Orchestrating Agents”
Before we dive into specific services, it’s helpful to understand the conceptual direction AWS is taking.
From 2024 to 2025
| Year | Typical workflow |
|---|---|
| 2024 | call an LLM → get a response → display it to the user (often with some RAG for context) |
| 2025 | build autonomous agents that plan, execute, learn, and operate independently |
What this means for architecture
- 2024 style: user request → LLM → response
- 2025 style: user goal → agent network → coordinated actions → outcome
Why it matters
- AgentCore, Nova Act, and Q Developer now expose agentic capabilities that embody this shift.
- The real test will be how these models perform under production workloads at scale—an area I’m still evaluating.
2. The AWS Gen AI Landscape
Below is a concise view of the AWS generative‑AI ecosystem.
Foundation Layer
- Amazon Bedrock – Multi‑model access and orchestration.
- Amazon SageMaker AI – Custom training and deployment.
Agent Infrastructure
- Amazon Bedrock AgentCore – Full‑stack for building, deploying, and operating agents.
- Nova Act – Specialized browser‑automation agents.
Models
- Amazon Nova 2 family – AWS’s own frontier models.
- Third‑party models – Claude, Llama, Mistral, and 100+ others via Bedrock.
Development Tools
- Amazon Q Developer – AI‑assisted coding directly in your IDE.
- Kiro – Agentic IDE with spec‑driven development.
- PartyRock – No‑code Bedrock playground.
Supporting Services
- S3 Vectors – Native vector storage for Retrieval‑Augmented Generation (RAG).
- CloudWatch – Agent observability and monitoring.
We’ll dive into each component in the sections that follow.
3. Amazon Bedrock: The Multi‑Model Foundation
Bedrock is the central hub for accessing foundation models on AWS. If you’ve been away for a while, here’s what changed in 2025:
Model Expansion
Bedrock now offers nearly 100 serverless foundation models, with 100+ additional models available through the Bedrock Marketplace. The December 2025 expansion added 18 open‑weight models, including:
| Provider | Model(s) |
|---|---|
| Google | Gemma 3 |
| MiniMax AI | MiniMax M2 |
| Mistral | Mistral Large 3, Ministral series |
| Moonshot AI | Kimi K2 |
| NVIDIA | Nemotron Nano 2 |
| Anthropic | Claude 4.5 (Nov 2025) – the most capable Claude to date |
Reinforcement Fine‑Tuning
Instead of traditional fine‑tuning with labelled datasets, you provide feedback signals and let the model learn through reinforcement. AWS claims 66 % accuracy gains over base models without deep‑ML expertise.
Practical upside – customise model behaviour using your existing evaluation criteria rather than creating massive training datasets.
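To make the contrast concrete, here is a toy sketch of the difference between a supervised example and a reward-scored one. The record shapes and the `score_summary` grader are purely illustrative assumptions, not Bedrock's actual fine‑tuning schema.

```python
# Illustrative only: these record shapes are assumptions, not Bedrock's
# actual reinforcement fine-tuning API.

# Traditional supervised fine-tuning: every example needs a gold completion.
supervised_example = {
    "prompt": "Summarize this support ticket ...",
    "completion": "Customer reports login failures after the 2.3 update.",
}

# Reinforcement fine-tuning: you score model outputs with your existing
# evaluation criteria instead of authoring gold completions.
def score_summary(candidate: str) -> float:
    """Toy grader: reward short summaries that mention the issue area."""
    reward = 0.0
    if "login" in candidate.lower():
        reward += 0.5
    if len(candidate.split()) <= 20:
        reward += 0.5
    return reward

feedback_record = {
    "prompt": "Summarize this support ticket ...",
    "candidate": "Customer reports login failures after the 2.3 update.",
    "reward": score_summary("Customer reports login failures after the 2.3 update."),
}
```

The point is that the grader encodes criteria you already use for evaluation, so no massive labelled dataset is required.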
Cross‑Region Inference
Bedrock now supports intelligent routing across regions for high‑availability scenarios. If your primary region is under load, requests automatically route to secondary regions. You configure this in the model‑access settings.
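Bedrock performs this routing server‑side, but the behaviour is easy to picture as a client‑side fallback loop. The sketch below is illustrative only; the region names and the `invoke` stub (which pretends the primary region is saturated) are assumptions.

```python
# A client-side sketch of region fallback. Bedrock's cross-region inference
# does this routing for you server-side; this only illustrates the idea.
PRIMARY = "us-east-1"
SECONDARIES = ["us-west-2", "eu-west-1"]

class RegionOverloaded(Exception):
    pass

def invoke(region: str, prompt: str) -> str:
    # Stand-in for a real Bedrock call; pretend the primary is saturated.
    if region == PRIMARY:
        raise RegionOverloaded(region)
    return f"response from {region}"

def invoke_with_fallback(prompt: str) -> str:
    # Try the primary first, then fall through the secondary list in order.
    for region in [PRIMARY, *SECONDARIES]:
        try:
            return invoke(region, prompt)
        except RegionOverloaded:
            continue
    raise RuntimeError("all regions overloaded")

print(invoke_with_fallback("hello"))  # falls through to us-west-2
```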
4. Amazon Bedrock AgentCore: The Deep Dive
AgentCore is where I’ve spent most of my time this year. It progressed from preview (July) → GA (October) → significantly expanded (December). Below is a component‑by‑component overview and guidance on when to use each.
4.1 AgentCore Runtime
The Runtime supplies the execution environment for agents.
Session Isolation
- Each agent session runs in complete isolation with low latency.
- Ideal for agents that handle sensitive data or need guaranteed resource allocation.
```python
# Sessions are isolated automatically.
# Each invocation gets its own execution context.
from bedrock_agentcore import AgentRuntime

runtime = AgentRuntime()
session = runtime.create_session(
    agent_id="my-agent",
    session_config={
        "isolation_level": "full",
        "timeout_seconds": 28_800  # 8 hours max
    }
)
```
Long‑Running Workloads
- Sessions can run for up to 8 hours.
- Useful for agents that must wait for external events, poll systems, or orchestrate multi‑step workflows that span hours rather than seconds.
Bidirectional Streaming (added Dec 2025)
- Enables natural voice interactions where the agent can listen and respond simultaneously.
- Supports interruptions mid‑conversation—crucial for voice‑first experiences.
Tip: Use this feature when building voice agents; it’s a major improvement over the classic request‑response model.
4.2 AgentCore Memory
Memory lets agents retain context across interactions.
Episodic Memory
- Introduced in the December update.
- Agents learn from experiences and build knowledge over time, moving beyond treating each session as independent.
```python
from bedrock_agentcore import AgentMemory

memory = AgentMemory(
    memory_type="episodic",
    retention_policy={
        "max_episodes": 1_000,
        "decay_factor": 0.95
    }
)

# Agent learns from each interaction
memory.record_episode(
    context=session_context,
    action_taken=agent_action,
    outcome=result,
    feedback=user_feedback
)
```
| Aspect | Details |
|---|---|
| Status | Early‑stage; more production testing needed. |
| Benefit | Agents improve over time as they accumulate useful episodes. |
| Risk | Potential drift if feedback isn’t curated or if retention policies are mis‑configured. |
| Best‑Fit Use Cases | Personal assistants, recommendation engines, or any workflow that benefits from learning user preferences. |
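To see how `max_episodes` and `decay_factor` might interact, here is a toy retention sketch: episodes are scored by feedback discounted by age, and only the top few survive. The scoring rule is an assumption for illustration; AgentCore's actual retention internals aren't documented here.

```python
# Toy episodic retention: score = feedback * decay^age, keep the top N.
DECAY_FACTOR = 0.95
MAX_EPISODES = 3  # tiny cap so eviction is visible

def retention_score(feedback: float, age: int) -> float:
    """Older episodes count for less; well-rated ones decay from a higher base."""
    return feedback * (DECAY_FACTOR ** age)

def prune(episodes: list[dict]) -> list[dict]:
    """Keep only the highest-scoring MAX_EPISODES episodes."""
    ranked = sorted(
        episodes,
        key=lambda e: retention_score(e["feedback"], e["age"]),
        reverse=True,
    )
    return ranked[:MAX_EPISODES]

episodes = [
    {"id": "a", "feedback": 1.0, "age": 10},
    {"id": "b", "feedback": 0.9, "age": 0},
    {"id": "c", "feedback": 0.2, "age": 1},
    {"id": "d", "feedback": 1.0, "age": 0},
]
kept = [e["id"] for e in prune(episodes)]  # low-value episode "c" is evicted
```

Note how this is exactly where the drift risk in the table comes from: if feedback is noisy, the wrong episodes survive the pruning.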
Bottom line: Leverage runtime isolation and long‑running sessions for secure, heavyweight workloads, and adopt bidirectional streaming for voice‑first agents. When you need continuity across interactions, start experimenting with episodic memory, but monitor drift and set sensible retention policies.
4.3 AgentCore Gateway
The Gateway handles tool integration. Its killer feature is the ability to convert existing APIs into Model Context Protocol (MCP)‑compatible tools with minimal code.
MCP Integration
MCP is becoming the standard for how LLMs interact with external tools. If you have existing REST APIs, the Gateway can expose them as MCP tools that any agent can discover and use.
```python
from bedrock_agentcore import Gateway

gateway = Gateway()

# Convert an existing API to an MCP-compatible tool
gateway.register_api(
    name="customer_lookup",
    endpoint="https://api.mycompany.com/customers",
    schema=openapi_spec,
    authentication={
        "type": "oauth2",
        "credentials_vault": "my-vault"
    }
)
```
Tool Discovery
Agents can query the Gateway to discover available tools dynamically. This is especially useful for multi‑agent systems where hard‑coding tool availability is undesirable.
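A minimal sketch of what dynamic discovery looks like in practice: tools register with capability tags, and agents query by the capabilities they need. The registry shape and tag scheme are assumptions; the Gateway's real discovery API may differ.

```python
# Minimal tool registry with capability-based discovery (illustrative only).
REGISTRY: dict[str, dict] = {}

def register_tool(name: str, description: str, tags: set[str]) -> None:
    REGISTRY[name] = {"description": description, "tags": tags}

def discover(required_tags: set[str]) -> list[str]:
    """Return tools whose tags cover everything the agent asked for."""
    return sorted(
        name for name, tool in REGISTRY.items()
        if required_tags <= tool["tags"]
    )

register_tool("customer_lookup", "Fetch customer records", {"crm", "read"})
register_tool("refund_processor", "Issue refunds", {"billing", "write"})
register_tool("invoice_search", "Search invoices", {"billing", "read"})

print(discover({"billing"}))          # both billing tools
print(discover({"billing", "read"}))  # only invoice_search
```

The benefit for multi-agent systems is that new tools become usable the moment they register, with no agent-side code changes.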
4.4 AgentCore Identity
The Identity component handles authentication and authorisation for agent actions.
OAuth Integration
- Agents can authenticate with external services on behalf of users.
- The Identity service manages refresh tokens securely—credentials are never handled directly by the agent.
Secure Vault Storage
- Credentials are stored in vaults with encryption and strict access controls.
- The December update added native integration with additional OAuth‑enabled services.
```python
from bedrock_agentcore import Identity

identity = Identity()

# Agent acts on behalf of a user
user_context = identity.establish_user_context(
    user_id="user-123",
    oauth_provider="google",
    scopes=["calendar.read", "calendar.write"]
)

# Agent can now access the user's calendar
calendar_response = agent.invoke_tool(
    "google_calendar",
    action="list_events",
    user_context=user_context
)
```
4.5 AgentCore Observability
Observability plugs into CloudWatch for comprehensive monitoring.
What You Get
- End‑to‑end agent execution traces
- Latency metrics per component
- Token usage tracking
- Error rates and patterns
- Custom dashboards
The integration also works with open‑source frameworks such as LangChain, LangGraph, and CrewAI.
4.6 Policy and Evaluations
Added in December 2025, these are the guardrails for production deployment.
Policy (Preview)
Policy intercepts every tool call in real time. You define boundaries in natural language, and they’re converted to Cedar—AWS’s open‑source policy language.
```
# Natural language policy
"Agent can only process refunds under $500 without human approval"
```

```cedar
// Converted to Cedar automatically
permit(
    principal,
    action == Action::"process_refund",
    resource
) when {
    resource.amount < 500
};
```
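To make the runtime behaviour concrete, here is the same guardrail as a plain‑Python interceptor. The $500 threshold is the article's example; the function shape and the permit/escalate vocabulary are assumptions for illustration.

```python
# The refund guardrail as a tool-call interceptor (illustrative only).
REFUND_LIMIT = 500  # dollars, per the natural-language policy above

def check_tool_call(action: str, params: dict) -> str:
    """Return 'permit' or 'escalate' for a proposed tool call."""
    if action == "process_refund" and params.get("amount", 0) >= REFUND_LIMIT:
        return "escalate"  # route to human approval
    return "permit"

assert check_tool_call("process_refund", {"amount": 120}) == "permit"
assert check_tool_call("process_refund", {"amount": 500}) == "escalate"
```

Note the boundary: the Cedar rule permits `amount < 500`, so a $500 refund already requires approval.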
10. Supporting Services
A few other services worth knowing:
Amazon SageMaker AI
- Serverless MLflow – zero‑infrastructure experimentation.
- HyperPod – checkpoint‑less training with automatic recovery from failures.
- Up to 95 % training‑cluster efficiency.
PartyRock
The no‑code Bedrock playground. Free daily usage, no credit‑card required. Great for quick prototyping before you write real code.
S3 Vectors
Native vector storage in S3:
- 2 billion vectors per index
- 20 trillion vectors per bucket
- 100 ms query latency
- Up to 90 % cost reduction vs. specialized vector databases
For RAG applications, S3 Vectors removes the need for a separate vector database. The cost savings alone make it worth investigating.
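What a vector query does under the hood is rank stored embeddings by similarity to the query embedding. This pure‑Python sketch stands in for an S3 Vectors index call; the tiny 3‑dimensional "embeddings" and document IDs are made up.

```python
# Cosine-similarity retrieval over a toy in-memory "index" (illustrative).
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

INDEX = {
    "doc-pricing":  [0.9, 0.1, 0.0],
    "doc-security": [0.1, 0.9, 0.1],
    "doc-onboard":  [0.2, 0.2, 0.9],
}

def query(embedding: list[float], top_k: int = 2) -> list[str]:
    """Return the top_k document IDs most similar to the query embedding."""
    ranked = sorted(INDEX, key=lambda k: cosine(embedding, INDEX[k]), reverse=True)
    return ranked[:top_k]

print(query([1.0, 0.0, 0.1]))  # pricing doc ranks first
```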
11. Production Patterns
Observations from building with these services
1. Start with Bedrock, Add AgentCore When Needed
Don’t reach for AgentCore immediately. Simple Bedrock invocations handle most use cases. Use AgentCore only when you need:
- Multi‑step workflows with tool usage
- Session isolation for concurrent users
- Episodic memory across interactions
- Production‑grade observability
2. Policy Before Production
If you’re deploying agents that take real actions, set up policy guardrails early. Defining boundaries upfront is far easier than retro‑fitting them after an incident.
3. Monitor Token Usage
Agentic workflows consume more tokens than single‑shot invocations. The agent’s internal reasoning, tool calls, and iterative refinement all add up. Build cost‑monitoring into your architecture from the start.
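A minimal cost meter makes the fan-out visible: one user turn can trigger several model calls. The per‑token prices below are placeholders, not real Bedrock pricing; plug in your model's actual rates.

```python
# Per-session token cost accounting (placeholder prices, illustrative only).
PRICE_PER_1K = {"input": 0.003, "output": 0.015}  # USD, placeholder rates

class TokenMeter:
    def __init__(self):
        self.counts = {"input": 0, "output": 0}

    def record(self, input_tokens: int, output_tokens: int) -> None:
        self.counts["input"] += input_tokens
        self.counts["output"] += output_tokens

    def cost(self) -> float:
        return sum(
            self.counts[kind] / 1000 * PRICE_PER_1K[kind]
            for kind in self.counts
        )

meter = TokenMeter()
# One user turn fans out into planning, two tool calls, and a final answer.
meter.record(1200, 300)   # planning step
meter.record(800, 150)    # tool call 1
meter.record(900, 200)    # tool call 2
meter.record(2000, 500)   # final response
print(f"session cost: ${meter.cost():.4f}")
```

Four model calls for one user turn is not unusual for an agent, which is why single-shot cost intuitions undershoot.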
4. MCP Is the Standard
Model Context Protocol (MCP) is becoming ubiquitous. When building new APIs or integrations, consider MCP compatibility from the beginning—it will make your tools accessible to a broader range of agent frameworks.
12. Where Does This Leave Us?
The AWS Gen AI ecosystem in 2025 is comprehensive – arguably too comprehensive. There are multiple overlapping ways to achieve similar goals, and the “right” approach depends heavily on your specific requirements.
My Current Mental Model
| Goal | Recommended Service |
|---|---|
| Simple interactions | Bedrock (direct invocation) |
| Complex workflows | AgentCore |
| Browser automation | Nova Act |
| Development – inline assistance | Q Developer |
| Development – spec‑driven projects | Kiro |
| Custom models | Forge (if you have the data and commitment) |
| Retrieval‑augmented generation (RAG) | S3 Vectors + Bedrock Knowledge Bases |
Is agentic AI the future of software development? Probably, in some form.
Will these specific services be the lasting implementations? That’s less certain. AWS has deprecated services before, and the AI landscape moves fast.
What I can say is that building with these tools today is genuinely productive. The developer experience has improved dramatically over the past year. Whether you’re building customer‑facing agents, internal automation, or AI‑assisted development tools, AWS has the pieces you need.
The real challenge – and fun – is how you put them together.
What are you building with these services? I’d love to hear about your use cases and any patterns you’ve discovered.