Deploying a Machine Learning Model to AWS SageMaker Complete Guide - PART 01

Published: December 18, 2025 at 05:39 PM EST
4 min read
Source: Dev.to

Deploying a Custom Machine‑Learning Model on AWS

In this guide we’ll walk through how to deploy a custom model (or any model from Hugging Face) on AWS using a handful of core services:

| Service | What it does |
| --- | --- |
| Amazon SageMaker | The "sage maker": the platform that trains, hosts, and scales ML models. |
| Amazon API Gateway | Exposes your model as a public HTTP API so other developers can call it. |
| AWS Lambda | A lightweight, serverless function that routes, validates, and formats incoming requests before they reach the model. |

⚠️ Cost note – For learning purposes you can keep the bill under ≈ $5 if you shut everything down after ~6 h. Set a billing alarm at $5 to avoid surprises.
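One way to set that $5 alarm programmatically is with `boto3` and CloudWatch's `EstimatedCharges` metric. A minimal sketch (assuming billing metrics are enabled in the account's Billing preferences; they are only published in `us-east-1`, and the SNS topic ARN is something you would supply yourself):

```python
def billing_alarm_params(threshold_usd=5.0, sns_topic_arn=None):
    """Build the kwargs for CloudWatch put_metric_alarm on estimated charges."""
    params = {
        "AlarmName": f"billing-over-{threshold_usd:g}-usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # billing data only updates a few times a day
        "EvaluationPeriods": 1,
        "Threshold": threshold_usd,
        "ComparisonOperator": "GreaterThanThreshold",
    }
    if sns_topic_arn:
        # Notify this topic (e.g., an email subscription) when the alarm fires.
        params["AlarmActions"] = [sns_topic_arn]
    return params
```

To apply it, pass the result to the CloudWatch client in `us-east-1`: `boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**billing_alarm_params(5.0))`.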

1️⃣ Open SageMaker

  1. Sign in to the AWS console.
  2. In the top‑left search bar type “SageMaker” and select the service.
    You should see a dashboard similar to the screenshot below.

SageMaker home page

2️⃣ Create a Notebook Instance

A SageMaker notebook is a managed Jupyter environment that runs on an ML‑optimized EC2 instance.

  1. In the left sidebar click “Notebook instances”.
  2. Click “Create notebook instance” and fill out the form:
| Field | Recommended value / notes |
| --- | --- |
| Notebook instance name | `translation-model-01` (or any name you like) |
| Instance type | Choose a modest instance (e.g., `ml.t2.medium`). You can change it later, but larger instances cost more per hour. |
| IAM role | Either select an existing role or let SageMaker create a new one. The default role is fine for now, but keep the principle of least privilege in mind: avoid granting full S3 access unless necessary. |
| Encryption | Keep the default settings unless you have specific compliance requirements. |

Create notebook form

Why IAM roles?
An IAM role is an identity that grants only the permissions required for a specific task. Using a role limits the blast radius if a vulnerability is exploited – the attacker only gets the role’s permissions, not full account access.
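To make "least privilege" concrete, here is a sketch of a policy document that grants read/write access to a single S3 bucket only (the bucket name is hypothetical; in practice you would attach a policy like this to the notebook's role via the IAM console or `put_role_policy`):

```python
import json

def scoped_s3_policy(bucket_name):
    """Return an IAM policy allowing read/write on one bucket, nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access is scoped to keys inside this bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                # Listing requires a separate statement on the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
        ],
    }

print(json.dumps(scoped_s3_policy("my-ml-artifacts"), indent=2))
```

Compare this with `AmazonS3FullAccess`: if the notebook is compromised, the scoped policy exposes one bucket instead of every bucket in the account.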

Creating the role and notebook

Create role screen

Then click Create role. A few minutes later you’ll see a green success message and the role will be auto‑selected. Leave the rest of the details as they are and click Create notebook.

Create notebook screen

Now that the notebook has been created (it takes about 5–10 minutes), open it. You'll see two options at the top: Jupyter and JupyterLab. Both are interactive environments that let you write code, run it, keep data and variables intact, and then rewrite, re-run, and refine your code on the fly. This workflow is called a read–eval–print loop (REPL).

Jupyter vs JupyterLab

To create a new notebook:

  1. File > New > Notebook
  2. When prompted, select a kernel. The default kernel works fine for our use case.

New notebook dialog

A kernel is essentially the interpreter that executes the code in your notebook cells. Think of it as the same Python interpreter you'd install locally, but here it comes pre‑configured out of the box.

Kernel selection screen

JupyterLab main editor

JupyterLab functions like an IDE: it provides an editor, file explorer, terminal, and many other tools.

In the screenshot below we created a notebook file named Untitled.ipynb (you can rename it later; .ipynb stands for IPython Notebook). Inside, we'll write the same code we would in Google Colab.

A simple Python snippet is already written; run it with Shift + Enter. The output appears directly below the cell, showing:

Hello World!
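The REPL behavior is easy to see with two cells: state defined in one cell stays alive in the kernel, so a later cell can reuse it. A minimal illustration (each comment marks a separate notebook cell):

```python
# Cell 1: define a variable; the kernel keeps it in memory.
greeting = "Hello World!"

# Cell 2: run later, possibly after editing and re-running Cell 1.
print(greeting)          # → Hello World!
print(greeting.upper())  # → HELLO WORLD!
```

If you restart the kernel, that in-memory state is lost and the cells must be re-run from the top.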

3️⃣ (Optional) Review IAM Role Permissions

After the notebook is created you’ll see a screen like this:

IAM role configuration

  • The default policy gives the notebook read/write access to an automatically created S3 bucket.
  • If you need additional resources (e.g., other S3 buckets, ECR repositories), attach the necessary policies after you understand the security implications.

4️⃣ Next Steps (Overview)

Once the notebook instance is running you can:

  1. Clone a Hugging Face repo or upload your own model files.
  2. Create a SageMaker endpoint to host the model for real‑time inference.
  3. Set up an API Gateway that forwards HTTP requests to a Lambda function.
  4. Write a Lambda handler that:
    • Validates the incoming payload.
    • Calls the SageMaker endpoint.
    • Returns the model’s response to the caller.
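Step 4 can be sketched as a single handler. This is a minimal, assumption-laden version: the endpoint name `translation-model-endpoint` and the `{"inputs": ...}` payload shape are hypothetical placeholders for whatever your deployed model actually expects:

```python
import json

def validate_payload(event):
    """Return the input text, or raise ValueError on a malformed request."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        raise ValueError("body is not valid JSON")
    text = body.get("inputs")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("'inputs' must be a non-empty string")
    return text

def lambda_handler(event, context):
    # 1. Validate the incoming payload.
    try:
        text = validate_payload(event)
    except ValueError as err:
        return {"statusCode": 400, "body": json.dumps({"error": str(err)})}

    # 2. Call the SageMaker endpoint (lazy import keeps the module
    #    importable and testable outside AWS).
    import boto3
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName="translation-model-endpoint",  # hypothetical name
        ContentType="application/json",
        Body=json.dumps({"inputs": text}),
    )

    # 3. Return the model's response to the caller.
    result = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(result)}
```

API Gateway proxy integration delivers the HTTP body as the `body` string on `event`, which is why the handler parses it with `json.loads` rather than receiving a dict directly.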

The rest of the tutorial (model deployment, API creation, Lambda code, testing) follows the same pattern – each step is illustrated with screenshots and concise instructions.

📌 Quick Recap

  • SageMaker = training & hosting environment.
  • API Gateway = public HTTP entry point.
  • Lambda = request validation & routing (server‑less).
  • Cost control = shut down resources when not in use & set a billing alarm at $5.

Happy deploying! 🚀
