Building AI Agents on AWS in 2025: A Practitioner's Guide to Bedrock, AgentCore, and Beyond
Source: Dev.to
1. The Shift: From “Using AI” to “Orchestrating Agents”
Before we dive into specific services, it’s helpful to understand the conceptual direction AWS is taking.
From 2024 to 2025
| Year | Typical workflow |
|---|---|
| 2024 | call an LLM → get a response → display it to the user (often with some RAG for context) |
| 2025 | build autonomous agents that plan, execute, learn, and operate independently |
What this means for architecture
- 2024 style: user request → LLM → response
- 2025 style: user goal → agent network → coordinated actions → outcome
Why it matters
- AgentCore, Nova Act, and Q Developer now expose agentic capabilities that embody this shift.
- The real test will be how these models perform under production workloads at scale—an area I’m still evaluating.
2. The AWS Gen AI Landscape
Below is a concise view of the AWS generative‑AI ecosystem.
Foundation Layer
- Amazon Bedrock – Multi‑model access and orchestration.
- Amazon SageMaker AI – Custom training and deployment.
Agent Infrastructure
- Amazon Bedrock AgentCore – Full‑stack for building, deploying, and operating agents.
- Nova Act – Specialized browser‑automation agents.
Models
- Amazon Nova 2 family – AWS’s own frontier models.
- Third‑party models – Claude, Llama, Mistral, and 100+ others via Bedrock.
Development Tools
- Amazon Q Developer – AI‑assisted coding directly in your IDE.
- Kiro – Agentic IDE with spec‑driven development.
- PartyRock – No‑code Bedrock playground.
Supporting Services
- S3 Vectors – Native vector storage for Retrieval‑Augmented Generation (RAG).
- CloudWatch – Agent observability and monitoring.
We’ll dive into each component in the sections that follow.
3. Amazon Bedrock: The Multi‑Model Foundation
Bedrock is the central hub for accessing foundation models on AWS. If you’ve been away for a while, here’s what changed in 2025:
Model Expansion
Bedrock now offers nearly 100 serverless foundation models, with 100+ additional models available through the Bedrock Marketplace. The December 2025 expansion added 18 open‑weight models, including:
| Provider | Model(s) |
|---|---|
| Google | Gemma 3 |
| MiniMax AI | MiniMax M2 |
| Mistral | Mistral Large 3, Ministral series |
| Moonshot AI | Kimi K2 |
| NVIDIA | Nemotron Nano 2 |
| Anthropic | Claude 4.5 (Nov 2025) – the most capable Claude to date |
Reinforcement Fine‑Tuning
Instead of traditional fine‑tuning with labelled datasets, you provide feedback signals and let the model learn through reinforcement. AWS claims 66 % accuracy gains over base models without deep‑ML expertise.
Practical upside – customise model behaviour using your existing evaluation criteria rather than creating massive training datasets.
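To make the contrast concrete, here is a toy sketch of the difference between a supervised example and a reward-scored one. The record shapes and the `score_summary` grader are purely illustrative assumptions, not Bedrock's actual fine‑tuning schema.

```python
# Illustrative only: these record shapes are assumptions, not Bedrock's
# actual reinforcement fine-tuning API.

# Traditional supervised fine-tuning: every example needs a gold completion.
supervised_example = {
    "prompt": "Summarize this support ticket ...",
    "completion": "Customer reports login failures after the 2.3 update.",
}

# Reinforcement fine-tuning: you score model outputs with your existing
# evaluation criteria instead of authoring gold completions.
def score_summary(candidate: str) -> float:
    """Toy grader: reward short summaries that mention the issue area."""
    reward = 0.0
    if "login" in candidate.lower():
        reward += 0.5
    if len(candidate.split()) <= 20:
        reward += 0.5
    return reward

feedback_record = {
    "prompt": "Summarize this support ticket ...",
    "candidate": "Customer reports login failures after the 2.3 update.",
    "reward": score_summary("Customer reports login failures after the 2.3 update."),
}
```

The point is that the grader encodes criteria you already use for evaluation, so no massive labelled dataset is required.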
Cross‑Region Inference
Bedrock now supports intelligent routing across regions for high‑availability scenarios. If your primary region is under load, requests automatically route to secondary regions. You configure this in the model‑access settings.
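Bedrock performs this routing server‑side, but the behaviour is easy to picture as a client‑side fallback loop. The sketch below is illustrative only; the region names and the `invoke` stub (which pretends the primary region is saturated) are assumptions.

```python
# A client-side sketch of region fallback. Bedrock's cross-region inference
# does this routing for you server-side; this only illustrates the idea.
PRIMARY = "us-east-1"
SECONDARIES = ["us-west-2", "eu-west-1"]

class RegionOverloaded(Exception):
    pass

def invoke(region: str, prompt: str) -> str:
    # Stand-in for a real Bedrock call; pretend the primary is saturated.
    if region == PRIMARY:
        raise RegionOverloaded(region)
    return f"response from {region}"

def invoke_with_fallback(prompt: str) -> str:
    # Try the primary first, then fall through the secondary list in order.
    for region in [PRIMARY, *SECONDARIES]:
        try:
            return invoke(region, prompt)
        except RegionOverloaded:
            continue
    raise RuntimeError("all regions overloaded")

print(invoke_with_fallback("hello"))  # falls through to us-west-2
```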
4. Amazon Bedrock AgentCore: The Deep Dive
AgentCore is where I’ve spent most of my time this year. It progressed from preview (July) → GA (October) → significantly expanded (December). Below is a component‑by‑component overview and guidance on when to use each.
4.1 AgentCore Runtime
The Runtime supplies the execution environment for agents.
Session Isolation
- Each agent session runs in complete isolation with low latency.
- Ideal for agents that handle sensitive data or need guaranteed resource allocation.
```python
# Sessions are isolated automatically.
# Each invocation gets its own execution context.
from bedrock_agentcore import AgentRuntime

runtime = AgentRuntime()
session = runtime.create_session(
    agent_id="my-agent",
    session_config={
        "isolation_level": "full",
        "timeout_seconds": 28_800  # 8 hours max
    }
)
```
Long‑Running Workloads
- Sessions can run for up to 8 hours.
- Useful for agents that must wait for external events, poll systems, or orchestrate multi‑step workflows that span hours rather than seconds.
Bidirectional Streaming (added Dec 2025)
- Enables natural voice interactions where the agent can listen and respond simultaneously.
- Supports interruptions mid‑conversation—crucial for voice‑first experiences.
Tip: Use this feature when building voice agents; it’s a major improvement over the classic request‑response model.
4.2 AgentCore Memory
Memory lets agents retain context across interactions.
Episodic Memory
- Introduced in the December update.
- Agents learn from experiences and build knowledge over time, moving beyond treating each session as independent.
```python
from bedrock_agentcore import AgentMemory

memory = AgentMemory(
    memory_type="episodic",
    retention_policy={
        "max_episodes": 1_000,
        "decay_factor": 0.95
    }
)

# Agent learns from each interaction
memory.record_episode(
    context=session_context,
    action_taken=agent_action,
    outcome=result,
    feedback=user_feedback
)
```
| Aspect | Details |
|---|---|
| Status | Early‑stage; more production testing needed. |
| Benefit | Agents improve over time as they accumulate useful episodes. |
| Risk | Potential drift if feedback isn’t curated or if retention policies are mis‑configured. |
| Best‑Fit Use Cases | Personal assistants, recommendation engines, or any workflow that benefits from learning user preferences. |
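To see how `max_episodes` and `decay_factor` might interact, here is a toy retention sketch: episodes are scored by feedback discounted by age, and only the top few survive. The scoring rule is an assumption for illustration; AgentCore's actual retention internals aren't documented here.

```python
# Toy episodic retention: score = feedback * decay^age, keep the top N.
DECAY_FACTOR = 0.95
MAX_EPISODES = 3  # tiny cap so eviction is visible

def retention_score(feedback: float, age: int) -> float:
    """Older episodes count for less; well-rated ones decay from a higher base."""
    return feedback * (DECAY_FACTOR ** age)

def prune(episodes: list[dict]) -> list[dict]:
    """Keep only the highest-scoring MAX_EPISODES episodes."""
    ranked = sorted(
        episodes,
        key=lambda e: retention_score(e["feedback"], e["age"]),
        reverse=True,
    )
    return ranked[:MAX_EPISODES]

episodes = [
    {"id": "a", "feedback": 1.0, "age": 10},
    {"id": "b", "feedback": 0.9, "age": 0},
    {"id": "c", "feedback": 0.2, "age": 1},
    {"id": "d", "feedback": 1.0, "age": 0},
]
kept = [e["id"] for e in prune(episodes)]  # low-value episode "c" is evicted
```

Note how this is exactly where the drift risk in the table comes from: if feedback is noisy, the wrong episodes survive the pruning.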
Bottom line: Leverage runtime isolation and long‑running sessions for secure, heavyweight workloads, and adopt bidirectional streaming for voice‑first agents. When you need continuity across interactions, start experimenting with episodic memory, but monitor drift and set sensible retention policies.
4.3 AgentCore Gateway
The Gateway handles tool integration. Its killer feature is the ability to convert existing APIs into Model Context Protocol (MCP)‑compatible tools with minimal code.
MCP Integration
MCP is becoming the standard for how LLMs interact with external tools. If you have existing REST APIs, the Gateway can expose them as MCP tools that any agent can discover and use.
```python
from bedrock_agentcore import Gateway

gateway = Gateway()

# Convert an existing API to an MCP-compatible tool
gateway.register_api(
    name="customer_lookup",
    endpoint="https://api.mycompany.com/customers",
    schema=openapi_spec,
    authentication={
        "type": "oauth2",
        "credentials_vault": "my-vault"
    }
)
```
Tool Discovery
Agents can query the Gateway to discover available tools dynamically. This is especially useful for multi‑agent systems where hard‑coding tool availability is undesirable.
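A minimal sketch of what dynamic discovery looks like in practice: tools register with capability tags, and agents query by the capabilities they need. The registry shape and tag scheme are assumptions; the Gateway's real discovery API may differ.

```python
# Minimal tool registry with capability-based discovery (illustrative only).
REGISTRY: dict[str, dict] = {}

def register_tool(name: str, description: str, tags: set[str]) -> None:
    REGISTRY[name] = {"description": description, "tags": tags}

def discover(required_tags: set[str]) -> list[str]:
    """Return tools whose tags cover everything the agent asked for."""
    return sorted(
        name for name, tool in REGISTRY.items()
        if required_tags <= tool["tags"]
    )

register_tool("customer_lookup", "Fetch customer records", {"crm", "read"})
register_tool("refund_processor", "Issue refunds", {"billing", "write"})
register_tool("invoice_search", "Search invoices", {"billing", "read"})

print(discover({"billing"}))          # both billing tools
print(discover({"billing", "read"}))  # only invoice_search
```

The benefit for multi-agent systems is that new tools become usable the moment they register, with no agent-side code changes.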
4.4 AgentCore Identity
The Identity component handles authentication and authorisation for agent actions.
OAuth Integration
- Agents can authenticate with external services on behalf of users.
- The Identity service manages refresh tokens securely—credentials are never handled directly by the agent.
Secure Vault Storage
- Credentials are stored in vaults with encryption and strict access controls.
- The December update added native integration with additional OAuth‑enabled services.
```python
from bedrock_agentcore import Identity

identity = Identity()

# Agent acts on behalf of a user
user_context = identity.establish_user_context(
    user_id="user-123",
    oauth_provider="google",
    scopes=["calendar.read", "calendar.write"]
)

# Agent can now access the user's calendar
calendar_response = agent.invoke_tool(
    "google_calendar",
    action="list_events",
    user_context=user_context
)
```
4.5 AgentCore Observability
Observability plugs into CloudWatch for comprehensive monitoring.
What You Get
- End‑to‑end agent execution traces
- Latency metrics per component
- Token usage tracking
- Error rates and patterns
- Custom dashboards
The integration also works with open‑source frameworks such as LangChain, LangGraph, and CrewAI.
4.6 Policy and Evaluations
Added in December 2025, these are the guardrails for production deployment.
Policy (Preview)
Policy intercepts every tool call in real time. You define boundaries in natural language, and they’re converted to Cedar—AWS’s open‑source policy language.
```
# Natural language policy
"Agent can only process refunds under $500 without human approval"
```

```cedar
// Converted to Cedar automatically
permit(
    principal,
    action == Action::"process_refund",
    resource
) when {
    resource.amount < 500
};
```
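To make the runtime behaviour concrete, here is the same guardrail as a plain‑Python interceptor. The $500 threshold is the article's example; the function shape and the permit/escalate vocabulary are assumptions for illustration.

```python
# The refund guardrail as a tool-call interceptor (illustrative only).
REFUND_LIMIT = 500  # dollars, per the natural-language policy above

def check_tool_call(action: str, params: dict) -> str:
    """Return 'permit' or 'escalate' for a proposed tool call."""
    if action == "process_refund" and params.get("amount", 0) >= REFUND_LIMIT:
        return "escalate"  # route to human approval
    return "permit"

assert check_tool_call("process_refund", {"amount": 120}) == "permit"
assert check_tool_call("process_refund", {"amount": 500}) == "escalate"
```

Note the boundary: the Cedar rule permits `amount < 500`, so a $500 refund already requires approval.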
10. Supporting Services
A few other services worth knowing:
Amazon SageMaker AI
- Serverless MLflow – zero‑infrastructure experimentation.
- HyperPod – checkpoint‑less training with automatic recovery from failures.
- Up to 95 % training‑cluster efficiency.
PartyRock
The no‑code Bedrock playground. Free daily usage, no credit‑card required. Great for quick prototyping before you write real code.
S3 Vectors
Native vector storage in S3:
- 2 billion vectors per index
- 20 trillion vectors per bucket
- 100 ms query latency
- Up to 90 % cost reduction vs. specialized vector databases
For RAG applications, S3 Vectors removes the need for a separate vector database. The cost savings alone make it worth investigating.
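What a vector query does under the hood is rank stored embeddings by similarity to the query embedding. This pure‑Python sketch stands in for an S3 Vectors index call; the tiny 3‑dimensional "embeddings" and document IDs are made up.

```python
# Cosine-similarity retrieval over a toy in-memory "index" (illustrative).
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

INDEX = {
    "doc-pricing":  [0.9, 0.1, 0.0],
    "doc-security": [0.1, 0.9, 0.1],
    "doc-onboard":  [0.2, 0.2, 0.9],
}

def query(embedding: list[float], top_k: int = 2) -> list[str]:
    """Return the top_k document IDs most similar to the query embedding."""
    ranked = sorted(INDEX, key=lambda k: cosine(embedding, INDEX[k]), reverse=True)
    return ranked[:top_k]

print(query([1.0, 0.0, 0.1]))  # pricing doc ranks first
```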
11. Production Patterns
Observations from building with these services
1. Start with Bedrock, Add AgentCore When Needed
Don’t reach for AgentCore immediately. Simple Bedrock invocations handle most use cases. Use AgentCore only when you need:
- Multi‑step workflows with tool usage
- Session isolation for concurrent users
- Episodic memory across interactions
- Production‑grade observability
2. Policy Before Production
If you’re deploying agents that take real actions, set up policy guardrails early. Defining boundaries upfront is far easier than retro‑fitting them after an incident.
3. Monitor Token Usage
Agentic workflows consume more tokens than single‑shot invocations. The agent’s internal reasoning, tool calls, and iterative refinement all add up. Build cost‑monitoring into your architecture from the start.
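A minimal cost meter makes the fan-out visible: one user turn can trigger several model calls. The per‑token prices below are placeholders, not real Bedrock pricing; plug in your model's actual rates.

```python
# Per-session token cost accounting (placeholder prices, illustrative only).
PRICE_PER_1K = {"input": 0.003, "output": 0.015}  # USD, placeholder rates

class TokenMeter:
    def __init__(self):
        self.counts = {"input": 0, "output": 0}

    def record(self, input_tokens: int, output_tokens: int) -> None:
        self.counts["input"] += input_tokens
        self.counts["output"] += output_tokens

    def cost(self) -> float:
        return sum(
            self.counts[kind] / 1000 * PRICE_PER_1K[kind]
            for kind in self.counts
        )

meter = TokenMeter()
# One user turn fans out into planning, two tool calls, and a final answer.
meter.record(1200, 300)   # planning step
meter.record(800, 150)    # tool call 1
meter.record(900, 200)    # tool call 2
meter.record(2000, 500)   # final response
print(f"session cost: ${meter.cost():.4f}")
```

Four model calls for one user turn is not unusual for an agent, which is why single-shot cost intuitions undershoot.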
4. MCP Is the Standard
Model Context Protocol (MCP) is becoming ubiquitous. When building new APIs or integrations, consider MCP compatibility from the beginning—it will make your tools accessible to a broader range of agent frameworks.
12. Where Does This Leave Us?
The AWS Gen AI ecosystem in 2025 is comprehensive – arguably too comprehensive. There are multiple overlapping ways to achieve similar goals, and the “right” approach depends heavily on your specific requirements.
My Current Mental Model
| Goal | Recommended Service |
|---|---|
| Simple interactions | Bedrock (direct invocation) |
| Complex workflows | AgentCore |
| Browser automation | Nova Act |
| Development – inline assistance | Q Developer |
| Development – spec‑driven projects | Kiro |
| Custom models | Forge (if you have the data and commitment) |
| Retrieval‑augmented generation (RAG) | S3 Vectors + Bedrock Knowledge Bases |
Is agentic AI the future of software development? Probably, in some form.
Will these specific services be the lasting implementations? That’s less certain. AWS has deprecated services before, and the AI landscape moves fast.
What I can say is that building with these tools today is genuinely productive. The developer experience has improved dramatically over the past year. Whether you’re building customer‑facing agents, internal automation, or AI‑assisted development tools, AWS has the pieces you need.
The real challenge – and fun – is how you put them together.
What are you building with these services? I’d love to hear about your use cases and any patterns you’ve discovered.