Building AI Agents on AWS in 2025: A Practitioner's Guide to Bedrock, AgentCore, and Beyond

Published: January 2, 2026, 6:51 PM EST
7 min read
Source: Dev.to

1. The Shift: From “Using AI” to “Orchestrating Agents”

Before we dive into specific services, it’s helpful to understand the conceptual direction AWS is taking.

From 2024 to 2025

| Year | Typical workflow |
|------|------------------|
| 2024 | Call an LLM → get a response → display it to the user (often with some RAG for context) |
| 2025 | Build autonomous agents that plan, execute, learn, and operate independently |

What this means for architecture

  • 2024 style

    user request → LLM → response
  • 2025 style

    user goal → agent network → coordinated actions → outcome

Why it matters

  • AgentCore, Nova Act, and Q Developer now expose agentic capabilities that embody this shift.
  • The real test will be how these models perform under production workloads at scale—an area I’m still evaluating.

2. The AWS Gen AI Landscape

Below is a concise view of the AWS generative‑AI ecosystem.

Foundation Layer

  • Amazon Bedrock – Multi‑model access and orchestration.
  • Amazon SageMaker AI – Custom training and deployment.

Agent Infrastructure

  • Amazon Bedrock AgentCore – Full‑stack for building, deploying, and operating agents.
  • Nova Act – Specialized browser‑automation agents.

Models

  • Amazon Nova 2 family – AWS’s own frontier models.
  • Third‑party models – Claude, Llama, Mistral, and 100+ others via Bedrock.

Development Tools

  • Amazon Q Developer – AI‑assisted coding directly in your IDE.
  • Kiro – Agentic IDE with spec‑driven development.
  • PartyRock – No‑code Bedrock playground.

Supporting Services

  • S3 Vectors – Native vector storage for Retrieval‑Augmented Generation (RAG).
  • CloudWatch – Agent observability and monitoring.

We’ll dive into each component in the sections that follow.

3. Amazon Bedrock: The Multi‑Model Foundation

Bedrock is the central hub for accessing foundation models on AWS. If you’ve been away for a while, here’s what changed in 2025:

Model Expansion

Bedrock now offers nearly 100 serverless foundation models, with 100+ additional models available through the Bedrock Marketplace. The December 2025 expansion added 18 open‑weight models, including:

| Provider | Model(s) |
|----------|----------|
| Google | Gemma 3 |
| MiniMax AI | MiniMax M2 |
| Mistral | Mistral Large 3, Ministral series |
| Moonshot AI | Kimi K2 |
| NVIDIA | Nemotron Nano 2 |
| Anthropic | Claude 4.5 (Nov 2025) – the most capable Claude to date |

Reinforcement Fine‑Tuning

Instead of traditional fine‑tuning with labelled datasets, you provide feedback signals and let the model learn through reinforcement. AWS claims up to 66 % accuracy gains over base models, without requiring deep ML expertise.

Practical upside – customise model behaviour using your existing evaluation criteria rather than creating massive training datasets.

Cross‑Region Inference

Bedrock now supports intelligent routing across regions for high‑availability scenarios. If your primary region is under load, requests automatically route to secondary regions. You configure this in the model‑access settings.

4. Amazon Bedrock AgentCore: The Deep Dive

AgentCore is where I’ve spent most of my time this year. It progressed from preview (July) to GA (October) to a significantly expanded release (December). Below is a component‑by‑component overview and guidance on when to use each.

4.1 AgentCore Runtime

The Runtime supplies the execution environment for agents.

Session Isolation

  • Each agent session runs in complete isolation with low latency.
  • Ideal for agents that handle sensitive data or need guaranteed resource allocation.
```python
# Sessions are isolated automatically.
# Each invocation gets its own execution context.
from bedrock_agentcore import AgentRuntime

runtime = AgentRuntime()
session = runtime.create_session(
    agent_id="my-agent",
    session_config={
        "isolation_level": "full",
        "timeout_seconds": 28_800  # 8 hours max
    }
)
```

Long‑Running Workloads

  • Sessions can run for up to 8 hours.
  • Useful for agents that must wait for external events, poll systems, or orchestrate multi‑step workflows that span hours rather than seconds.
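As a concrete illustration, here is how an 8‑hour session might babysit a nightly batch job. The `session` object is assumed to come from the runtime snippet earlier in this section; `source.batch_ready()` and `session.invoke()` are hypothetical stand‑ins for your own integration code:

```python
import time

def reconcile_when_ready(session, source, poll_seconds: int = 300,
                         max_hours: float = 8.0):
    """Poll an external system until its nightly batch lands, then hand
    the result to the agent. The session stays alive between polls, so
    no external scheduler or step function is needed."""
    deadline = time.monotonic() + max_hours * 3600
    while time.monotonic() < deadline:
        if source.batch_ready():
            return session.invoke(f"Reconcile the latest batch from {source.name}")
        time.sleep(poll_seconds)
    raise TimeoutError("Batch did not arrive within the session window")
```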

Bidirectional Streaming (added Dec 2025)

  • Enables natural voice interactions where the agent can listen and respond simultaneously.
  • Supports interruptions mid‑conversation—crucial for voice‑first experiences.

Tip: Use this feature when building voice agents; it’s a major improvement over the classic request‑response model.

4.2 AgentCore Memory

Memory lets agents retain context across interactions.

Episodic Memory

  • Introduced in the December update.
  • Agents learn from experiences and build knowledge over time, moving beyond treating each session as independent.
```python
from bedrock_agentcore import AgentMemory

memory = AgentMemory(
    memory_type="episodic",
    retention_policy={
        "max_episodes": 1_000,
        "decay_factor": 0.95
    }
)

# Agent learns from each interaction
memory.record_episode(
    context=session_context,
    action_taken=agent_action,
    outcome=result,
    feedback=user_feedback
)
```
| Aspect | Details |
|--------|---------|
| Status | Early‑stage; more production testing needed. |
| Benefit | Agents improve over time as they accumulate useful episodes. |
| Risk | Potential drift if feedback isn’t curated or if retention policies are mis‑configured. |
| Best‑fit use cases | Personal assistants, recommendation engines, or any workflow that benefits from learning user preferences. |
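The `decay_factor` in the retention policy hints at how this works: older episodes should count for less. A toy illustration of exponential decay weighting (my own sketch of the idea, not AgentCore's actual ranking algorithm):

```python
def decayed_score(relevance: float, age_in_sessions: int,
                  decay_factor: float = 0.95) -> float:
    """Weight an episode's relevance by its age: each session of age
    multiplies the score by decay_factor."""
    return relevance * (decay_factor ** age_in_sessions)

# A fresh episode keeps its full score; with decay_factor=0.95 the weight
# drops below half after roughly 14 sessions.
```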

Bottom line: Leverage runtime isolation and long‑running sessions for secure, heavyweight workloads, and adopt bidirectional streaming for voice‑first agents. When you need continuity across interactions, start experimenting with episodic memory, but monitor drift and set sensible retention policies.

4.3 AgentCore Gateway

The Gateway handles tool integration. Its killer feature is the ability to convert existing APIs into Model Context Protocol (MCP)‑compatible tools with minimal code.

MCP Integration

MCP is becoming the standard for how LLMs interact with external tools. If you have existing REST APIs, the Gateway can expose them as MCP tools that any agent can discover and use.

```python
from bedrock_agentcore import Gateway

gateway = Gateway()

# Convert an existing API to an MCP‑compatible tool
gateway.register_api(
    name="customer_lookup",
    endpoint="https://api.mycompany.com/customers",
    schema=openapi_spec,
    authentication={
        "type": "oauth2",
        "credentials_vault": "my-vault"
    }
)
```

Tool Discovery

Agents can query the Gateway to discover available tools dynamically. This is especially useful for multi‑agent systems where hard‑coding tool availability is undesirable.
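The discovery pattern looks roughly like this: the agent asks the Gateway for its current tool descriptors and filters them against what the task needs. The `gateway.list_tools()` call and the descriptor shape here are illustrative, not the exact SDK surface:

```python
def pick_tools(descriptors: list[dict], needed: set[str]) -> list[str]:
    """Return the names of MCP tool descriptors whose advertised
    capabilities cover everything in `needed`."""
    return [d["name"] for d in descriptors
            if needed <= set(d.get("capabilities", []))]

# descriptors = gateway.list_tools()   # hypothetical discovery call
descriptors = [
    {"name": "customer_lookup", "capabilities": ["read", "customers"]},
    {"name": "refund_tool", "capabilities": ["write", "payments"]},
]
print(pick_tools(descriptors, {"read", "customers"}))  # ['customer_lookup']
```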

4.4 AgentCore Identity

The Identity component handles authentication and authorisation for agent actions.

OAuth Integration

  • Agents can authenticate with external services on behalf of users.
  • The Identity service manages refresh tokens securely—credentials are never handled directly by the agent.

Secure Vault Storage

  • Credentials are stored in vaults with encryption and strict access controls.
  • The December update added native integration with additional OAuth‑enabled services.
```python
from bedrock_agentcore import Identity

identity = Identity()

# Agent acts on behalf of a user
user_context = identity.establish_user_context(
    user_id="user-123",
    oauth_provider="google",
    scopes=["calendar.read", "calendar.write"]
)

# Agent can now access the user's calendar
calendar_response = agent.invoke_tool(
    "google_calendar",
    action="list_events",
    user_context=user_context
)
```

4.5 AgentCore Observability

Observability plugs into CloudWatch for comprehensive monitoring.

What You Get

  • End‑to‑end agent execution traces
  • Latency metrics per component
  • Token usage tracking
  • Error rates and patterns
  • Custom dashboards

The integration also works with open‑source frameworks such as LangChain, LangGraph, and CrewAI.

4.6 Policy and Evaluations

Added in December 2025, these are the guardrails for production deployment.

Policy (Preview)

Policy intercepts every tool call in real time. You define boundaries in natural language, and they’re converted to Cedar—AWS’s open‑source policy language.

```
# Natural language policy
"Agent can only process refunds under $500 without human approval"

# Converted to Cedar automatically
permit(
    principal,
    action == Action::"process_refund",
    resource
) when {
    resource.amount < 500
}
```

10. Supporting Services

A few other services worth knowing:

Amazon SageMaker AI

  • Serverless MLflow – zero‑infrastructure experimentation.
  • HyperPod – checkpoint‑less training with automatic recovery from failures.
  • Up to 95 % training‑cluster efficiency.

PartyRock

The no‑code Bedrock playground. Free daily usage, no credit‑card required. Great for quick prototyping before you write real code.

S3 Vectors

Native vector storage in S3:

  • 2 billion vectors per index
  • 20 trillion vectors per bucket
  • 100 ms query latency
  • Up to 90 % cost reduction vs. specialized vector databases

For RAG applications, S3 Vectors removes the need for a separate vector database. The cost savings alone make it worth investigating.

11. Production Patterns

Observations from building with these services

1. Start with Bedrock, Add AgentCore When Needed

Don’t reach for AgentCore immediately. Simple Bedrock invocations handle most use cases. Use AgentCore only when you need:

  • Multi‑step workflows with tool usage
  • Session isolation for concurrent users
  • Episodic memory across interactions
  • Production‑grade observability

2. Policy Before Production

If you’re deploying agents that take real actions, set up policy guardrails early. Defining boundaries upfront is far easier than retro‑fitting them after an incident.

3. Monitor Token Usage

Agentic workflows consume more tokens than single‑shot invocations. The agent’s internal reasoning, tool calls, and iterative refinement all add up. Build cost‑monitoring into your architecture from the start.
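One practical starting point: the Converse API reports `inputTokens` and `outputTokens` in each response's `usage` block, so per‑invocation cost tracking is only a few lines. The rates below are placeholders; look up current pricing for your model:

```python
# Placeholder per-1K-token rates; substitute your model's actual pricing.
RATE_PER_1K = {"input": 0.003, "output": 0.015}

def invocation_cost(usage: dict) -> float:
    """Estimate USD cost from the `usage` block of a Bedrock Converse response."""
    return (usage["inputTokens"] / 1000) * RATE_PER_1K["input"] \
         + (usage["outputTokens"] / 1000) * RATE_PER_1K["output"]

# usage = response["usage"]  # from a converse() call
print(round(invocation_cost({"inputTokens": 2000, "outputTokens": 500}), 4))  # 0.0135
```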

4. MCP Is the Standard

Model Context Protocol (MCP) is becoming ubiquitous. When building new APIs or integrations, consider MCP compatibility from the beginning—it will make your tools accessible to a broader range of agent frameworks.

12. Where Does This Leave Us?

The AWS Gen AI ecosystem in 2025 is comprehensive – arguably too comprehensive. There are multiple overlapping ways to achieve similar goals, and the “right” approach depends heavily on your specific requirements.

My Current Mental Model

| Goal | Recommended service |
|------|---------------------|
| Simple interactions | Bedrock (direct invocation) |
| Complex workflows | AgentCore |
| Browser automation | Nova Act |
| Development – inline assistance | Q Developer |
| Development – spec‑driven projects | Kiro |
| Custom models | Forge (if you have the data and commitment) |
| Retrieval‑augmented generation (RAG) | S3 Vectors + Bedrock Knowledge Bases |

Is agentic AI the future of software development? Probably, in some form.
Will these specific services be the lasting implementations? That’s less certain. AWS has deprecated services before, and the AI landscape moves fast.

What I can say is that building with these tools today is genuinely productive. The developer experience has improved dramatically over the past year. Whether you’re building customer‑facing agents, internal automation, or AI‑assisted development tools, AWS has the pieces you need.

The real challenge – and fun – is how you put them together.


What are you building with these services? I’d love to hear about your use cases and any patterns you’ve discovered.
