Deploying a Machine Learning Model to AWS SageMaker Complete Guide - PART 01
Source: Dev.to
Deploying a Custom Machine‑Learning Model on AWS
In this guide we’ll walk through how to deploy a custom model (or any model from Hugging Face) on AWS using a handful of core services:
| Service | What it does |
|---|---|
| Amazon SageMaker | The “sage maker” – the platform that trains, hosts, and scales ML models. |
| Amazon API Gateway | Exposes your model as a public HTTP API so other developers can call it. |
| AWS Lambda | A lightweight, server‑less function that routes, validates, and formats incoming requests before they reach the model. |
⚠️ Cost note – For learning purposes you can keep the bill under ≈ $5 if you shut everything down after ~6 h. Set a billing alarm at $5 to avoid surprises.
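The billing alarm mentioned above can also be created from code. The sketch below builds the parameters for a CloudWatch alarm on the `EstimatedCharges` metric; the alarm name and the commented-out SNS topic ARN are placeholders, and note that billing metrics only exist in `us-east-1`.

```python
# Sketch: a $5 billing alarm via CloudWatch. Alarm name and SNS topic
# are assumptions; AWS billing metrics are published only in us-east-1.
alarm_params = {
    "AlarmName": "billing-over-5-usd",            # any name you like
    "Namespace": "AWS/Billing",
    "MetricName": "EstimatedCharges",
    "Dimensions": [{"Name": "Currency", "Value": "USD"}],
    "Statistic": "Maximum",
    "Period": 21600,                              # evaluated over 6-hour windows
    "EvaluationPeriods": 1,
    "Threshold": 5.0,
    "ComparisonOperator": "GreaterThanThreshold",
    # "AlarmActions": ["arn:aws:sns:us-east-1:<account-id>:billing-alerts"],
}

def create_billing_alarm():
    # Requires boto3 and AWS credentials; deferred so the sketch runs anywhere.
    import boto3
    boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**alarm_params)
```

You also need to enable "Receive Billing Alerts" in the account's billing preferences before this metric is published.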
1️⃣ Open SageMaker
- Sign in to the AWS console.
- In the top‑left search bar type “SageMaker” and select the service.
You should see a dashboard similar to the screenshot below.

2️⃣ Create a Notebook Instance
A SageMaker notebook is a managed Jupyter environment that runs on an ML‑optimized EC2 instance.
- In the left sidebar click “Notebook instances”.
- Click “Create notebook instance” and fill out the form:
| Field | Recommended value / notes |
|---|---|
| Notebook instance name | translation-model-01 (or any name you like) |
| Instance type | Choose a modest instance (e.g., ml.t2.medium). You can change it later, but larger instances cost more per hour. |
| IAM role | Either select an existing role or let SageMaker create a new one. The default role is fine for now, but keep the principle of least privilege in mind – avoid granting full S3 access unless necessary. |
| Encryption | Keep the default settings unless you have specific compliance requirements. |
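The same form can be filled in programmatically with boto3's `create_notebook_instance` call. This is a hedged sketch: the role ARN is a placeholder, and the name and instance type mirror the recommendations in the table above.

```python
# Sketch: creating the notebook instance from code instead of the console.
# The RoleArn is a placeholder; substitute the role created below.
notebook_params = {
    "NotebookInstanceName": "translation-model-01",
    "InstanceType": "ml.t2.medium",
    "RoleArn": "arn:aws:iam::<account-id>:role/SageMakerNotebookRole",  # placeholder
}

def create_notebook():
    # Requires boto3 and AWS credentials; deferred so the sketch runs anywhere.
    import boto3
    boto3.client("sagemaker").create_notebook_instance(**notebook_params)
```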

Why IAM roles?
An IAM role is an identity that grants only the permissions required for a specific task. Using a role limits the blast radius if a vulnerability is exploited – the attacker only gets the role’s permissions, not full account access.
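To make "least privilege" concrete, the default SageMaker role boils down to two JSON documents: a trust policy stating who may assume the role, and a permissions policy stating what it may do. The sketch below shows both as Python dicts; the bucket name is a placeholder.

```python
# Hypothetical sketch of the two policies behind a SageMaker notebook role.
# Trust policy: only the SageMaker service may assume this role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sagemaker.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# Permissions policy: least privilege — one bucket, not "arn:aws:s3:::*".
s3_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::my-sagemaker-bucket",      # placeholder bucket
            "arn:aws:s3:::my-sagemaker-bucket/*",
        ],
    }],
}
```

If this role were ever compromised, the attacker could touch only that one bucket — that is the "blast radius" the paragraph above describes.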
Creating the role and notebook

Then click Create role. After a few minutes you’ll see a green success message and the role will be auto‑selected. Leave the remaining settings as they are and click Create notebook instance.
Once the notebook instance is running (provisioning takes about 5–10 minutes), open it. You’ll see two options at the top: Jupyter and JupyterLab. Both are interactive environments that let you write code, run it, keep data and variables intact, and then rewrite, re‑run, and refine your code on the fly. This workflow is called a read–eval–print loop (REPL).
To create a new notebook:
- File > New > Notebook
- When prompted, select a kernel. The default kernel works fine for our use case.
A kernel is essentially the interpreter that executes your Python code (Markdown cells are rendered as text, not executed). Think of it as the same interpreter you’d install locally, but here it comes pre‑configured out of the box.
JupyterLab main editor
JupyterLab functions like an IDE: it provides an editor, file explorer, terminal, and many other tools.
In the screenshot below we created a notebook file named Untitled.ipynb (you can rename it later; .ipynb stands for IPython Notebook). Inside, we’ll write the same code we would in Google Colab.
A simple Python snippet is already written; run it with Shift + Enter. The output appears directly below the cell:
Hello World!
3️⃣ (Optional) Review IAM Role Permissions
After the notebook is created you’ll see a screen like this:

- The default policy gives the notebook read/write access to an automatically created S3 bucket.
- If you need additional resources (e.g., other S3 buckets, ECR repositories), attach the necessary policies after you understand the security implications.
4️⃣ Next Steps (Overview)
Once the notebook instance is running you can:
- Clone a Hugging Face repo or upload your own model files.
- Create a SageMaker endpoint to host the model for real‑time inference.
- Set up an API Gateway that forwards HTTP requests to a Lambda function.
- Write a Lambda handler that:
  - Validates the incoming payload.
  - Calls the SageMaker endpoint.
  - Returns the model’s response to the caller.
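The Lambda handler described above can be sketched as follows. The endpoint name is a placeholder, the expected payload shape (`{"text": ...}`) is an assumption, and the SageMaker Runtime client is injectable so the function can be exercised without AWS credentials.

```python
import json

def lambda_handler(event, context, runtime_client=None):
    """Sketch: validate the request, forward it to a SageMaker endpoint,
    and return the model's response. Endpoint name is a placeholder."""
    # 1. Validate the incoming payload.
    body = json.loads(event.get("body") or "{}")
    text = body.get("text")
    if not text:
        return {"statusCode": 400,
                "body": json.dumps({"error": "missing 'text' field"})}

    # 2. Call the SageMaker endpoint (real client created lazily so the
    #    sketch stays testable without AWS credentials).
    if runtime_client is None:
        import boto3
        runtime_client = boto3.client("sagemaker-runtime")

    response = runtime_client.invoke_endpoint(
        EndpointName="translation-model-endpoint",   # placeholder name
        ContentType="application/json",
        Body=json.dumps({"inputs": text}),
    )

    # 3. Return the model's response to the caller.
    result = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(result)}
```

API Gateway passes the HTTP body through as `event["body"]`, which is why the handler parses a JSON string rather than receiving a dict directly.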
The rest of the tutorial (model deployment, API creation, Lambda code, testing) follows the same pattern – each step is illustrated with screenshots and concise instructions.
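As a preview of the endpoint-hosting step, deploying a Hugging Face model with the SageMaker Python SDK can be sketched like this. The model ID, task, framework versions, and instance type are all illustrative assumptions, not values from this tutorial.

```python
# Sketch: hosting a Hugging Face model on a SageMaker endpoint.
# Model ID, versions, and instance type are assumed examples.
hub_config = {
    "HF_MODEL_ID": "Helsinki-NLP/opus-mt-en-fr",  # assumed example model
    "HF_TASK": "translation",
}

def deploy(role_arn):
    # Requires the `sagemaker` SDK and AWS credentials; import is deferred
    # so the sketch can be read and tested anywhere.
    from sagemaker.huggingface import HuggingFaceModel
    model = HuggingFaceModel(
        env=hub_config,
        role=role_arn,
        transformers_version="4.26",   # assumed supported version combo
        pytorch_version="1.13",
        py_version="py39",
    )
    return model.deploy(initial_instance_count=1,
                        instance_type="ml.m5.xlarge")
```

Remember to call `predictor.delete_endpoint()` when you finish experimenting — endpoints bill per hour whether or not they receive traffic.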
📌 Quick Recap
- SageMaker = training & hosting environment.
- API Gateway = public HTTP entry point.
- Lambda = request validation & routing (server‑less).
- Cost control = shut down resources when not in use & set a billing alarm at $5.
Happy deploying! 🚀



