Amazon Nova 2 Multimodal Embeddings with Amazon S3 Vectors and AWS Java SDK - Part 1 Introduction

Published: January 12, 2026 at 11:53 AM EST

4 min read

Source: Dev.to

Introduction

Throughout this series, we’ll use Amazon Nova 2 Multimodal Embeddings to create embeddings and store them in Amazon S3 Vectors. Over the course of the series we will:

Introduce Amazon Nova 2 Multimodal Embeddings and Amazon S3 Vectors.
Show how to create and store text and image embeddings (audio and video will be covered in later parts).
Explain how to perform similarity search on the stored embeddings in S3 Vectors.

All examples use the AWS Java SDK (there are many Python examples, but Java is gaining traction in the ML/AI space).
The code for this series is available in the GitHub repository amazon‑nova‑2‑multimodal‑embeddings – please give it a ⭐ and follow me for more examples.

Amazon Nova Multimodal Embeddings

Amazon Nova Multimodal Embeddings is a single model that supports text, documents, images, video, and audio, enabling cross‑modal retrieval. It maps each content type into a unified semantic space, allowing you to perform:

Unimodal vector operations
Cross‑modal vector operations
Multimodal vector operations

When content is passed through Nova, the model converts it into a vector – a set of numerical values that capture semantic meaning. Similar content yields vectors that are close together in this space.
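To make "close together" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The three-dimensional vectors and the names cat, kitten, and invoice are made up for illustration; real Nova embeddings have hundreds or thousands of dimensions, and this is not code from the series repository.

```java
final class CosineSimilarityDemo {

    /** Cosine similarity of two equal-length, non-zero vectors (1.0 means same direction). */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical toy embeddings, chosen by hand for the example.
        float[] cat = {0.9f, 0.1f, 0.0f};
        float[] kitten = {0.8f, 0.2f, 0.1f};
        float[] invoice = {0.0f, 0.1f, 0.9f};
        System.out.println("cat vs kitten:  " + cosine(cat, kitten));
        System.out.println("cat vs invoice: " + cosine(cat, invoice));
    }
}
```

With these toy values the cat/kitten pair scores much higher than the cat/invoice pair, which is the behavior you would expect from semantically related versus unrelated content.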

Key Features

Unified support for text, images, document images, video, and audio (up to 8K tokens of text, or 30 seconds of video or audio).
Synchronous & asynchronous APIs – choose the mode that fits your workflow.
Large‑file segmentation (async API) – automatically splits long text, video, or audio into user‑defined segments, generating one embedding per segment.
Video‑with‑audio processing – obtain a single embedding for both modalities or two separate embeddings.
Embedding purpose – optimize embeddings for downstream tasks such as retrieval/RAG/search, classification, or clustering.
Dimension sizes – trade off accuracy against storage cost with four options: 3072, 1024, 384, 256.
Input methods – provide content via an S3 URI or inline as a Base64‑encoded string.
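Two of the features above are easy to illustrate with plain Java. The sketch below estimates the raw size of stored embeddings for each dimension option and Base64-encodes content for inline input. It assumes each vector component is a 4-byte float32 and ignores metadata and index overhead, so treat the numbers as a lower bound rather than a billing estimate. The class and method names are hypothetical and do not come from the AWS SDK.

```java
import java.util.Base64;

final class EmbeddingRequestSketch {

    /** Raw bytes for `count` vectors, assuming float32 components and no overhead. */
    static long rawBytes(int dimensions, long count) {
        return (long) dimensions * Float.BYTES * count;
    }

    /** Base64-encodes content so it could be sent inline instead of via an S3 URI. */
    static String toInlineBase64(byte[] content) {
        return Base64.getEncoder().encodeToString(content);
    }

    public static void main(String[] args) {
        for (int dims : new int[] {3072, 1024, 384, 256}) {
            System.out.printf("%d dims: %d bytes per vector%n", dims, rawBytes(dims, 1));
        }
        // In practice the bytes would come from a file, e.g. Files.readAllBytes(path).
        System.out.println(toInlineBase64(new byte[] {104, 105}));
    }
}
```

Under the float32 assumption, a 3072-dimension vector takes 12,288 bytes and a 256-dimension vector takes 1,024 bytes, a 12x difference that adds up quickly across millions of vectors.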

Introduction to Amazon S3 Vectors

Amazon S3 Vectors delivers purpose‑built, cost‑optimized vector storage for AI agents, inference, RAG, and semantic search. It inherits S3’s elasticity, durability, and availability while offering:

Sub‑second latency for infrequent queries
≈ 100 ms latency for frequent queries

You interact with S3 Vectors through a dedicated set of API operations—no infrastructure provisioning required.

Core Components

Component	Description
Vector buckets	A new bucket type designed specifically for storing and querying vectors.
Vector indexes	Within a bucket, indexes organize your vectors and enable similarity queries.
Vectors	Stored in an index; each vector is an embedding that preserves semantic relationships (text, image, audio, etc.). Metadata can be attached for filtering (e.g., timestamps, categories, user preferences).

Key Features

Purpose‑built storage for vectors
- First cloud object storage optimized for vector data.
- Elastic, durable, and cost‑effective.
- Automatically optimizes storage as you write, update, and delete vectors, ensuring the best price‑performance at scale.
Similarity queries
- Retrieve the most similar vectors to a query vector in sub‑second (infrequent) or ≈ 100 ms (frequent) response times.
- Attach metadata (key‑value pairs) to vectors for filtering results.
- Supported metadata types: string, number, boolean, list. By default, all metadata is filterable unless explicitly marked non‑filterable.
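The idea of a filtered similarity query can be sketched with an in-memory stand-in. This is not the S3 Vectors API or its internal algorithm: it is a brute-force scan written only to show how a metadata filter and a top-K nearest-neighbor lookup combine. All class, method, and key names are hypothetical, and cosine distance is used as one possible distance metric.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class InMemoryVectorIndex {

    /** One stored vector: a key, the embedding values, and key-value metadata. */
    static final class Entry {
        final String key;
        final float[] values;
        final Map<String, Object> metadata;

        Entry(String key, float[] values, Map<String, Object> metadata) {
            this.key = key;
            this.values = values;
            this.metadata = metadata;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void put(String key, float[] values, Map<String, Object> metadata) {
        entries.add(new Entry(key, values, metadata));
    }

    /** Keys of the topK entries nearest to the query among those passing the filter. */
    List<String> query(float[] query, int topK, Predicate<Map<String, Object>> filter) {
        return entries.stream()
                .filter(e -> filter.test(e.metadata))
                .sorted(Comparator.comparingDouble((Entry e) -> cosineDistance(query, e.values)))
                .limit(topK)
                .map(e -> e.key)
                .collect(Collectors.toList());
    }

    /** Cosine distance = 1 - cosine similarity; assumes non-zero, equal-length vectors. */
    static double cosineDistance(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

A caller would add entries with put(...) and then run something like query(queryVector, 2, m -> "text".equals(m.get("category"))) to get the two nearest text vectors. A managed service avoids the full scan shown here, but the contract (query vector in, filtered nearest keys out) has the same shape.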

Next Steps

Create text and image embeddings with Amazon Nova Multimodal Embeddings (Java SDK).
Store the embeddings in an S3 Vectors bucket and index.
Run similarity searches against the stored vectors.

Happy embedding!

Managing Access for Vector Buckets

You can manage access for resources in vector buckets with IAM and Service Control Policies in AWS Organizations.
S3 Vectors uses a different service namespace than Amazon S3: the s3vectors namespace. Therefore, you can design policies specifically for the S3 Vectors service and its resources.

You can design policies to grant access to:

an individual vector index,
all vector indexes within a vector bucket, or
all vector buckets in an account.
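The three scopes above map to three resource patterns in a policy. The helper below builds them as strings. The ARN layout used here (arn:aws:s3vectors:REGION:ACCOUNT:bucket/BUCKET/index/INDEX) is an assumption based on the s3vectors namespace and should be checked against the S3 Vectors documentation before use; the region, account ID, bucket, and index names are placeholders.

```java
final class S3VectorsArnSketch {

    /** ARN for one vector index (assumed layout, verify against the AWS docs). */
    static String indexArn(String region, String accountId, String bucket, String index) {
        return String.format("arn:aws:s3vectors:%s:%s:bucket/%s/index/%s",
                region, accountId, bucket, index);
    }

    /** Pattern matching every vector index inside one vector bucket. */
    static String allIndexesInBucket(String region, String accountId, String bucket) {
        return indexArn(region, accountId, bucket, "*");
    }

    /** Pattern matching every vector bucket in the account for the given region. */
    static String allBucketsInAccount(String region, String accountId) {
        return String.format("arn:aws:s3vectors:%s:%s:bucket/*", region, accountId);
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        String region = "us-east-1";
        String account = "111122223333";
        System.out.println(indexArn(region, account, "media-embeddings", "images"));
        System.out.println(allIndexesInBucket(region, account, "media-embeddings"));
        System.out.println(allBucketsInAccount(region, account));
    }
}
```

Each string would go into the Resource element of an IAM policy statement whose actions use the s3vectors prefix, which keeps the policy separate from any permissions granted on regular S3 buckets.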

Integration with AWS Services

Amazon OpenSearch Service

Optimize vector storage costs while continuing to use OpenSearch API operations.
Ideal for workloads that need advanced search functionality, such as:
- hybrid search,
- aggregations,
- advanced filtering, and
- faceted search.
You can also export a snapshot of an S3 vector index to Amazon OpenSearch Serverless for high QPS and low‑latency vector search.

Amazon Bedrock Knowledge Bases

Select a vector index in S3 Vectors as your vector store to save on storage costs for Retrieval‑Augmented Generation (RAG) applications.

Amazon Bedrock in SageMaker Unified Studio

Develop and test knowledge bases using S3 Vectors as your vector store.

Conclusion

In this part of the series, we introduced the goal of the series and presented Amazon Nova 2 Multimodal Embeddings and Amazon S3 Vectors.
In the next part, we’ll cover creating and storing text and image embeddings.
