How S3 Vectors Work: A Friendly Guide to AWS’s New Vector Store

Published: December 7, 2025 at 05:24 AM EST


Source: Dev.to

Why Vectors?

We all know what Amazon S3 is, right? It’s the well‑known, cheap, durable object store that has been around for nearly two decades. When you first hear about S3 Vectors, a few questions pop up:

What are vectors?
Are they only for mathematicians?
How do they relate to AI?
Why do they carry the S3 prefix?

Below we’ll answer these questions and show how S3 Vectors work.

What Is a Vector Store and Why Do We Need It in AI?

A vector is a mathematical object that has both magnitude (size) and direction, usually represented as an ordered list of numbers. For example, the word "tell" might be represented as [0, 2, 1, 10, …, 3].
In AI these vectors are called embeddings. The numbers are not random; an embedding model produces them so that they encode semantic meaning. For instance, the words Espresso and Latte would have very similar vectors because they refer to similar types of coffee.
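To make "very similar vectors" concrete, here is a minimal sketch that compares toy vectors with cosine similarity. The four-dimensional values are invented for illustration and do not come from a real embedding model; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 4-dimensional "embeddings", for illustration only.
toy_vectors = {
    "espresso": [0.9, 0.8, 0.1, 0.0],
    "latte":    [0.8, 0.9, 0.2, 0.1],
    "laptop":   [0.0, 0.1, 0.9, 0.8],
}

coffee_pair = cosine_similarity(toy_vectors["espresso"], toy_vectors["latte"])
mixed_pair = cosine_similarity(toy_vectors["espresso"], toy_vectors["laptop"])

print(f"espresso vs latte:  {coffee_pair:.3f}")
print(f"espresso vs laptop: {mixed_pair:.3f}")
```

The two coffee words point in almost the same direction, so their score is close to 1, while the coffee/laptop pair scores much lower.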

Vector Databases

Vector databases (vector DBs) are built for similarity search using specialized Approximate Nearest Neighbor (ANN) algorithms. Relational databases can perform similarity search, but not efficiently at scale.
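For intuition, this is roughly what an exact (non-approximate) similarity search looks like: every stored vector is scored against the query. It is a sketch with made-up two-dimensional data and a hypothetical helper name, `exact_nearest`; it is not how any particular vector DB is implemented. The cost grows linearly with the number of stored vectors, which is the work ANN algorithms try to avoid.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_nearest(query, stored, k=2):
    """Exhaustive search: score every stored vector, keep the k closest."""
    scored = [(euclidean(query, vec), key) for key, vec in stored.items()]
    scored.sort()  # smallest distance first
    return [(key, dist) for dist, key in scored[:k]]

# Made-up 2-dimensional vectors, for illustration only.
stored = {
    "doc-a": [0.0, 0.0],
    "doc-b": [1.0, 1.0],
    "doc-c": [5.0, 5.0],
}

nearest = exact_nearest([0.9, 1.2], stored, k=2)
for key, dist in nearest:
    print(f"{key}: {dist:.3f}")
```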

Typical workflow:

Your application queries the vector store.
The store retrieves the nearest vectors (chunks).
Those chunks are supplied to a Large Language Model (LLM) as context.

Example: Suppose a chatbot receives the question “Are there any discounts for students on your website?” If the vector DB contains relevant information, the application retrieves the matching chunks, adds them to the prompt, and the LLM answers using that context.

A vector store acts as an augmented knowledge base for your LLM, allowing you to inject arbitrary information without expensive model training. This approach is known as Retrieval‑Augmented Generation (RAG). Vectors can represent text, images, or music: anything that can be embedded.

S3 Vector Store Structure

The structure mirrors traditional S3:

Bucket – the top-level container; its name is unique within a region.
Index – lives inside a bucket and has a unique name within it; it holds the vectors and is configured with the options below.

Index Configuration Options

Setting	Description
Dimension	Numeric value (1–4096) indicating how many numbers each vector contains. Example: a dimension of 5 yields vectors like `[1, 2, 1, 4, 5]`.
Distance Metric	Choose Cosine (angular similarity) or Euclidean (straight‑line distance) to define similarity during queries.
Encryption	Use bucket‑level encryption or override it for the index. Options: • SSE‑KMS (Server‑Side Encryption with AWS KMS keys) • SSE‑S3 (Server‑Side Encryption with Amazon S3 managed keys)
Tagging	Attach arbitrary tags (e.g., `region`, `category`, `audience`) to each vector. Tags are key-value metadata (the API in the example below calls them `metadata`), enabling filtered searches beyond pure vector similarity. AWS can auto‑generate tags or you can set them manually.
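The constraints in the table can be checked before you create an index. The helper below is a hypothetical sketch: `build_index_settings` is not an AWS function, and the keys `dataType`, `dimension`, and `distanceMetric` are my assumption about what a boto3 `create_index` call on the `s3vectors` client expects, so verify them against the SDK documentation before use.

```python
ALLOWED_METRICS = {"cosine", "euclidean"}

def build_index_settings(bucket, index, dimension, metric):
    """Validate index options against the table above and return them as a dict."""
    if not isinstance(dimension, int) or not 1 <= dimension <= 4096:
        raise ValueError("dimension must be an integer between 1 and 4096")
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {sorted(ALLOWED_METRICS)}")
    return {
        "vectorBucketName": bucket,   # same keyword style as the query calls
        "indexName": index,
        "dataType": "float32",        # assumed parameter name and value
        "dimension": dimension,       # assumed parameter name
        "distanceMetric": metric,     # assumed parameter name
    }

# 1024 is assumed here to match the embedding model's output size.
settings = build_index_settings("my-sales-bucket", "my-sales-index", 1024, "cosine")
print(settings)
```

The dimension has to match the length of the vectors your embedding model produces, so it is worth validating once up front instead of discovering a mismatch when you insert vectors.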

Example: Creating and Querying Vectors

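Below is a complete Python example. It assumes the bucket and index already exist and that the index dimension matches the size of the embeddings. The example:

Generates embeddings with Amazon Titan (model ID `amazon.titan-embed-text-v2:0`) through Amazon Bedrock.
Stores the vectors in an S3 vector index.
Performs a semantic search, optionally filtered by a tag (metadata).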

import boto3
import json

region = "us-west-2"
bucket = "my-sales-bucket"
index = "my-sales-index"

bedrock = boto3.client("bedrock-runtime", region_name=region)
s3vectors = boto3.client("s3vectors", region_name=region)

# --- 1. Populate the vector index with sample sales data ---
items = [
    {
        "key": "laptop-general-sale",
        "text": "10% off all laptops this weekend in our electronics section.",
        "metadata": {"category": "laptop", "audience": "all", "discount": "10%"}
    },
    {
        "key": "laptop-student-sale",
        "text": "15% discount on lightweight laptops for students and university use.",
        "metadata": {"category": "laptop", "audience": "students", "discount": "15%"}
    },
    {
        "key": "phone-sale",
        "text": "20% off latest smartphones with long‑lasting battery.",
        "metadata": {"category": "phone", "audience": "all", "discount": "20%"}
    }
]

vectors = []
for item in items:
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": item["text"]})
    )
    embedding = json.loads(resp["body"].read())["embedding"]

    vectors.append({
        "key": item["key"],
        "data": {"float32": embedding},
        "metadata": {
            "source_text": item["text"],
            **item["metadata"]
        }
    })

s3vectors.put_vectors(
    vectorBucketName=bucket,
    indexName=index,
    vectors=vectors
)

# --- 2. Query the index for "laptops for students" ---
query_text = "Do you have any sales on laptops for students?"

resp = bedrock.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    body=json.dumps({"inputText": query_text})
)
query_embedding = json.loads(resp["body"].read())["embedding"]

# Plain semantic search
response = s3vectors.query_vectors(
    vectorBucketName=bucket,
    indexName=index,
    queryVector={"float32": query_embedding},
    topK=3,
    returnDistance=True,
    returnMetadata=True
)
print("Top matches:")
print(json.dumps(response["vectors"], indent=2))

# Optional: restrict to laptop category only
response_filtered = s3vectors.query_vectors(
    vectorBucketName=bucket,
    indexName=index,
    queryVector={"float32": query_embedding},
    topK=3,
    filter={"category": "laptop"},
    returnDistance=True,
    returnMetadata=True
)
print("Laptop‑only matches:")
print(json.dumps(response_filtered["vectors"], indent=2))
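Raw JSON is hard to scan, so here is a small sketch for turning a query response into readable lines. It assumes each entry in `response["vectors"]` carries `key`, `distance`, and `metadata`, as requested in the calls above. The sample response is hand-written with invented distance values so the snippet runs without AWS access; it is not real service output.

```python
def summarize_matches(response, max_distance=None):
    """Return one readable line per match, optionally dropping distant ones."""
    lines = []
    for match in response.get("vectors", []):
        distance = match["distance"]
        if max_distance is not None and distance > max_distance:
            continue  # too far from the query to be useful
        text = match.get("metadata", {}).get("source_text", "<no text stored>")
        lines.append(f"{match['key']} (distance={distance:.3f}): {text}")
    return lines

# Hand-written stand-in for a query response; distances are made up.
sample_response = {
    "vectors": [
        {
            "key": "laptop-student-sale",
            "distance": 0.31,
            "metadata": {"source_text": "15% discount on lightweight laptops for students and university use."},
        },
        {
            "key": "phone-sale",
            "distance": 0.78,
            "metadata": {"source_text": "20% off latest smartphones."},
        },
    ]
}

for line in summarize_matches(sample_response, max_distance=0.5):
    print(line)
```

A distance cutoff like this is one simple way to avoid passing weak matches to the LLM; the right threshold depends on your data and distance metric.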

Query used: “Do you have any sales on laptops for students?”

Note: Prices shown in the example are approximate. For detailed pricing, refer to the official AWS documentation.

I hope you found this guide useful! Feel free to experiment with your own data and let the vector store work its magic for your applications. If you have any questions, drop them in the comments below.
