How to Use the Claude API with Python

Published: May 3, 2026 at 06:06 AM EDT
7 min read
Source: Dev.to

You Have a Python Script. You Want It to Think.

That’s the whole premise. This tutorial shows you how to connect your code to Claude — Anthropic’s AI model — so it can read, reason, and respond inside your own projects.

I wrote this after spending an afternoon figuring it out myself. No prior AI experience is needed. If you’ve written a Python function before, you can follow this.

Before You Start

Two things are worth knowing up front.

  1. The API costs money.
    Not much ($5 gets you weeks of normal usage), but it isn't free like the Claude.ai chat interface. You'll need to add credits to your Anthropic account before your first call.

  2. You need Python 3.9 or later.
    Check your version:

    python --version

    If it's older than 3.9, install a newer version from python.org.
    On Windows, be sure to check “Add Python to PATH” during installation; skipping it breaks everything.

Setup

Create a folder, set up a virtual environment, and install the SDK.

mkdir claude-project
cd claude-project
python -m venv venv

Activate the environment:

# macOS / Linux
source venv/bin/activate

# Windows
venv\Scripts\activate

Your terminal prompt should now start with (venv). That's how you know it's active. If you install packages without the venv active, they land in your global Python installation instead of the project.

pip install anthropic python-dotenv

Your API Key

  1. Go to the Anthropic Console, create an account, and generate a key under API Keys.

  2. Store it in a .env file in your project folder:

    ANTHROPIC_API_KEY=your-key-here
  3. Keep the file out of version control:

    echo .env >> .gitignore

Why? A public API key gets discovered, used, and charged to you within hours.
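If the key doesn't seem to load, a quick sanity check helps before you debug anything else. A minimal sketch using only the standard library (key_loaded is a name invented here); run it after load_dotenv(), or after setting the variable in your shell:

```python
import os

def key_loaded() -> bool:
    # True when ANTHROPIC_API_KEY is set and non-empty in the environment.
    # load_dotenv() must have run first for a .env file to count.
    return bool(os.environ.get("ANTHROPIC_API_KEY"))

if __name__ == "__main__":
    print("API key found" if key_loaded() else "API key missing - check your .env file")
```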

The First Call

Here’s what talking to Claude from Python looks like:

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "What is a REST API?"
        }
    ]
)

print(message.content[0].text)

Run the script – Claude will answer in your terminal.

Three things to understand about this call

| Parameter | Meaning |
| --- | --- |
| model | Which version of Claude you're using. claude-sonnet-4-6 is the default for most use cases: fast and capable. |
| max_tokens | Maximum length of Claude's response. Set it too low and the reply gets cut off mid-sentence. 1024 is a safe starting point. |
| messages | A list of turns in the conversation. Each turn has a role (user for your messages, assistant for Claude). |

What Comes Back

The response object holds more than just text:

print(message.content[0].text)       # Claude's response
print(message.stop_reason)           # Why it stopped, usually "end_turn"
print(message.usage.input_tokens)    # Tokens in your message
print(message.usage.output_tokens)   # Tokens in Claude's reply

A token is a chunk of text, on average about three-quarters of an English word. Watching token counts matters because that's what you're billed for.
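Since billing is per token, it's handy to turn those usage numbers into a rough dollar figure. A sketch with placeholder per-million-token prices; the defaults below are illustrative, not quoted rates, so check Anthropic's current pricing page:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_per_million: float = 3.00,
                  output_per_million: float = 15.00) -> float:
    # Returns an approximate cost in dollars.
    # The price arguments are illustrative defaults, not official rates.
    return (input_tokens / 1_000_000 * input_per_million
            + output_tokens / 1_000_000 * output_per_million)
```

Pass it message.usage.input_tokens and message.usage.output_tokens from any response to see what a call roughly cost.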

Giving Claude a Role

By default, Claude is a general assistant. A system prompt changes that. Think of it as a briefing you give before the conversation starts:

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a Python code reviewer. Be direct. Point out issues first, then explain why.",
    messages=[
        {
            "role": "user",
            "content": "Review this: for i in range(len(my_list)): print(my_list[i])"
        }
    ]
)

print(message.content[0].text)

Same model, completely different behavior. The system prompt is where most of the real control lives.

Conversations

The API has no memory. Every call starts fresh unless you pass the history yourself.

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()

history = []

def chat(message: str) -> str:
    # Add the user message to the history
    history.append({"role": "user", "content": message})

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a helpful programming assistant.",
        messages=history
    )

    # Extract Claude's reply and add it to the history
    reply = response.content[0].text
    history.append({"role": "assistant", "content": reply})

    return reply

print(chat("What is a decorator in Python?"))
print(chat("Show me a real example."))
print(chat("How would that work in Flask?"))

Each call passes the full history. Claude reads it, understands the context, and continues the thread.

Common mistake: appending the user message but forgetting to append Claude’s reply. If you do that, the next request arrives without context, and Claude answers as if the conversation never happened.
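Because you resend the whole history on every call, long conversations get slower and more expensive over time. One common fix is to keep only the most recent turns. A minimal sketch (the cutoff of 10 exchanges is arbitrary; tune it to your use case):

```python
def trim_history(history: list, max_pairs: int = 10) -> list:
    # Keep only the last max_pairs user/assistant exchanges.
    # Each exchange is two entries: the user message and the assistant reply.
    return history[-max_pairs * 2:]
```

Call it on history before passing it to client.messages.create. Note this drops the oldest context entirely; summarizing old turns instead is a fancier alternative.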

Streaming

Waiting for a full response before printing anything works fine for scripts, but for anything user‑facing, streaming feels much better.

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain recursion simply."}
    ]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

The text appears token by token as Claude generates it, the same live-typing experience you get in the Claude.ai interface.
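Often you want both the live printout and the complete reply afterwards, for example to append it to conversation history. A small helper, sketched in plain Python, that consumes any chunk iterator such as the stream.text_stream above (print_and_collect is a name invented here):

```python
def print_and_collect(chunks) -> str:
    # Print each streamed chunk as it arrives, then return the full text.
    parts = []
    for text in chunks:
        print(text, end="", flush=True)
        parts.append(text)
    print()  # final newline after the stream ends
    return "".join(parts)
```

Inside the `with client.messages.stream(...)` block, `reply = print_and_collect(stream.text_stream)` gives you the whole answer as a string once streaming finishes.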

A Real Use Case

Here’s a function worth keeping. It summarizes any text you pass to it:

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()

def summarize(text: str, sentences: int = 3) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        system=f"Summarize the following text in {sentences} sentences. Return only the summary.",
        messages=[{"role": "user", "content": text}]
    )
    return response.content[0].text

article = """
The James Webb Space Telescope has captured the deepest infrared image
of the universe ever taken. The image covers a patch of sky approximately
the size of a grain of sand held at arm's length. It contains thousands
of galaxies, some of which formed less than a billion years after the
Big Bang. Scientists believe this data will reshape our understanding
of how the earliest galaxies formed and evolved.
"""

print(summarize(article, sentences=2))

Change the system prompt and it becomes a translator, a classifier, a data extractor. The pattern is always the same.
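That "same pattern, different system prompt" idea can be factored out into a small factory. A sketch (make_task is a name invented here, not part of the SDK); it takes the client as an argument, which also lets you pass a stub in tests:

```python
def make_task(client, system_prompt: str,
              model: str = "claude-sonnet-4-6", max_tokens: int = 512):
    # Returns a one-argument function specialized by the system prompt.
    def run(text: str) -> str:
        response = client.messages.create(
            model=model,
            max_tokens=max_tokens,
            system=system_prompt,
            messages=[{"role": "user", "content": text}],
        )
        return response.content[0].text
    return run
```

With it, `translate = make_task(client, "Translate the user's text to French. Return only the translation.")` and `classify = make_task(client, "Label this text as positive, negative, or neutral. Return only the label.")` are each one line.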

Handling Errors

Networks fail. Rate limits happen. Wrap your calls:

from dotenv import load_dotenv
from anthropic import Anthropic, APIError, RateLimitError, APIConnectionError

load_dotenv()
client = Anthropic()

def ask(question: str) -> str:
    try:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            messages=[{"role": "user", "content": question}]
        )
        return response.content[0].text

    except RateLimitError:
        return "Rate limit reached. Wait a moment and try again."

    except APIConnectionError:
        return "Connection failed. Check your internet."

    except APIError as e:
        return f"API error: {e}"

Choosing a Model

| Model | Use when |
| --- | --- |
| claude-sonnet-4-6 | Most things. Fast, capable, cost-effective |
| claude-opus-4-6 | Hard problems that need deep reasoning |
| claude-haiku-4-5-20251001 | High volume, simple tasks, lowest cost |

Start with Sonnet. Switch if you have a reason.

Things That Will Catch You

  • The API costs money. Claude.ai’s web UI doesn’t. Add credits before you start.
  • load_dotenv() doesn’t call itself. If your key isn’t loading, this is probably why.
  • max_tokens being too low cuts responses mid‑thought. Raise it if answers feel incomplete.
  • The conversation history needs both sides: user messages and Claude’s replies. Miss one and the context breaks.
  • On macOS/Linux, python might point to Python 2. Use python3 if things aren’t working as expected.
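The max_tokens pitfall is detectable in code: when a reply hits the length ceiling, the response's stop_reason is "max_tokens" instead of "end_turn". A tiny guard (was_truncated is a name invented here):

```python
def was_truncated(message) -> bool:
    # "max_tokens" means the reply was cut off at the length limit
    return message.stop_reason == "max_tokens"
```

If it returns True, retry the call with a larger max_tokens rather than shipping a half-finished answer.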

What’s Next

The foundation is here. Where it goes depends on what you’re building.

  • Tool use lets Claude call your own Python functions — useful when you want it to interact with real data or external services.
  • Vision lets you send images alongside text, so Claude can read screenshots, diagrams, or documents.
  • Async support via AsyncAnthropic is worth exploring if you’re handling multiple requests at once.
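As a taste of the async route, the fan-out pattern looks like this. The sketch assumes you've wrapped AsyncAnthropic's messages.create in your own ask coroutine; ask_many is a name invented here:

```python
import asyncio

async def ask_many(ask, questions):
    # Fire all requests concurrently instead of awaiting them one by one.
    # `ask` is any async function taking a question and returning a string.
    return await asyncio.gather(*(ask(q) for q in questions))
```

Then `answers = asyncio.run(ask_many(ask, ["What is GIL?", "What is asyncio?"]))` runs both requests at once and returns the answers in the same order as the questions.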

The full documentation is in Anthropic's developer docs.

The Whole Thing in Ten Lines

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Your message here."}]
)

print(response.content[0].text)