Automated Cloud Migrations with Kiro and the Arm MCP Server

Published: December 19, 2025 at 07:11 PM EST
Source: Dev.to


Introduction

AWS positions Graviton as the best price‑performance option among EC2 instance families, and the recent Graviton 5 announcement makes the value proposition even stronger. For many workloads, migrating to Graviton lowers cost and can improve performance as well.

Most applications move over seamlessly, but what if you have x86‑specific optimizations in your code?

Good news: you don’t have to do the migration manually any more.

In this post I’ll show how to use Kiro (AWS’s agentic IDE) together with the Arm MCP Server to automate the entire migration process—Docker images, SIMD intrinsics, compiler flags, the whole thing.

What the Arm MCP Server Gives You

The Arm MCP Server implements the Model Context Protocol (MCP)—a standard way for AI coding assistants to tap into specialized tools. When you connect it to Kiro, you get an agent that can:

  • Check Docker images for arm64 support without digging through manifests.
  • 🔍 Scan your codebase for x86‑specific code (intrinsics, build flags, etc.).
  • 📚 Search Arm’s knowledge base for migration guidance and intrinsic equivalents.
  • 🧪 Analyze assembly for performance characteristics.

Example: Migrating a Legacy Benchmarking App

Imagine you’ve inherited a legacy benchmarking application that’s tightly coupled to x86. Below is the original Dockerfile.

FROM centos:6

# CentOS 6 reached EOL – use vault mirrors
RUN sed -i 's|^mirrorlist=|#mirrorlist=|g' /etc/yum.repos.d/CentOS-Base.repo && \
    sed -i 's|^#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-Base.repo

# Install EPEL repository (required for some development tools)
RUN yum install -y epel-release && \
    sed -i 's|^mirrorlist=|#mirrorlist=|g' /etc/yum.repos.d/epel.repo && \
    sed -i 's|^#baseurl=http://download.fedoraproject.org/pub/epel|baseurl=http://archives.fedoraproject.org/pub/archive/epel|g' /etc/yum.repos.d/epel.repo

# Install Developer Toolset 2 for better C++11 support (GCC 4.8)
RUN yum install -y centos-release-scl && \
    sed -i 's|^mirrorlist=|#mirrorlist=|g' /etc/yum.repos.d/CentOS-SCLo-scl.repo && \
    sed -i 's|^mirrorlist=|#mirrorlist=|g' /etc/yum.repos.d/CentOS-SCLo-scl-rh.repo && \
    sed -i 's|^# baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-SCLo-scl.repo && \
    sed -i 's|^# baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-SCLo-scl-rh.repo

# Install build tools
RUN yum install -y \
    devtoolset-2-gcc \
    devtoolset-2-gcc-c++ \
    devtoolset-2-binutils \
    make \
    && yum clean all

WORKDIR /app
COPY *.h *.cpp ./

# AVX2 intrinsics are used in the code
RUN scl enable devtoolset-2 "g++ -O2 -mavx2 -o benchmark \
    main.cpp \
    matrix_operations.cpp \
    -std=c++11"

CMD ["./benchmark"]

Why This Dockerfile Won’t Work on Graviton

| Issue | Explanation |
|---|---|
| centos:6 | No arm64 variant – the base image is x86‑only. |
| -mavx2 | An x86‑only compiler flag; Arm CPUs don’t understand AVX2. |
| AVX2 intrinsics (__m256, etc.) | Won’t compile on Arm; you need Arm‑equivalent SIMD intrinsics. |

If you didn’t spot these problems yourself, that’s fine—Kiro + Arm MCP Server will.

The Problematic Matrix Multiplication Code

Below is the matrix_operations.cpp file that uses AVX2 intrinsics for double‑precision matrix multiplication. (The snippet is intentionally minimal – the full implementation contains the actual multiplication kernels.)

// matrix_operations.cpp
#include "matrix_operations.h"
#include <vector>
#include <random>
#include <iostream>
#include <immintrin.h>   // AVX2 intrinsics

// ---------------------------------------------------------------------------
// Matrix constructor
// ---------------------------------------------------------------------------
Matrix::Matrix(size_t r, size_t c) : rows(r), cols(c) {
    // Allocate a rows × cols matrix filled with zeros
    data.resize(rows, std::vector<double>(cols, 0.0));
}

// ---------------------------------------------------------------------------
// Fill the matrix with random values in the range [0.0, 10.0)
// ---------------------------------------------------------------------------
void Matrix::randomize() {
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_real_distribution<> dis(0.0, 10.0);

    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            data[i][j] = dis(gen);
        }
    }
}

// ---------------------------------------------------------------------------
// (The multiplication routine that uses AVX2 intrinsics is omitted here)
// ---------------------------------------------------------------------------

Why This Code Fails on ARM v8 AArch64

| Issue | Explanation |
|---|---|
| AVX2 intrinsics (_mm256_*) | The x86‑64 specific intrinsics are not available on ARM v8. |
| Missing include guards / headers | The original snippet omitted several standard headers, causing compilation errors. |
| Incorrect std::vector initialization | std::vector(cols, 0.0) should be std::vector<double>(cols, 0.0). |
| Truncated loops / syntax errors | Parts of the randomize method were incomplete. |

Suggested Migration Path (Using the Kiro Assistant)

  1. Detect the Architecture
    • Kiro reads the CI configuration (.github/workflows/*.yml) and determines that the runner is arm64.
  2. Identify Incompatible Intrinsics
    • It scans the source for AVX2 intrinsics (_mm256_*) and flags them as non‑portable.
  3. Propose a Portable Alternative
    • Kiro suggests replacing AVX2 code with ARM NEON equivalents or falling back to a scalar implementation if the performance impact is acceptable. NEON registers are 128 bits wide, so the natural type for doubles is float64x2_t (two lanes), and each 256‑bit AVX2 operation becomes two NEON operations.
  4. Automated Refactor (Optional PR)
    • Kiro can generate a pull request that:
      • Updates the Docker/CI image to an ARM‑compatible base.
      • Adjusts the build command (e.g., adds -march=armv8-a+simd).
      • Substitutes the AVX2 intrinsics with NEON‑compatible code or a portable scalar loop.
  5. Verification
    • After the changes, Kiro runs the benchmark on an arm64 runner, compares the performance against the original x86 build, and reports any delta.
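
To make step 3 concrete, here is a minimal sketch of what a NEON replacement with a scalar fallback can look like. It is not the code Kiro generates and not the post's multiplication kernel (which is omitted above); the helper name dot_product is hypothetical, and the example assumes a plain dot product over contiguous doubles, which is the inner operation of a matrix multiply.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#if defined(__aarch64__)
#include <arm_neon.h>   // NEON intrinsics (AArch64 only)
#endif

// Hypothetical helper: dot product of two equally sized double vectors.
double dot_product(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::size_t i = 0;
    double total = 0.0;

#if defined(__aarch64__)
    // NEON path: two doubles per 128-bit register.
    float64x2_t acc = vdupq_n_f64(0.0);
    for (; i + 2 <= n; i += 2) {
        const float64x2_t va = vld1q_f64(a.data() + i);
        const float64x2_t vb = vld1q_f64(b.data() + i);
        acc = vfmaq_f64(acc, va, vb);   // acc += va * vb (fused multiply-add)
    }
    total = vaddvq_f64(acc);            // horizontal add of both lanes
#endif

    // Scalar path: the whole input on other architectures,
    // or the leftover tail element on Arm.
    for (; i < n; ++i) {
        total += a[i] * b[i];
    }
    return total;
}
```

Because the scalar loop picks up wherever the NEON loop stops, the same source builds and gives the same kind of result on x86 and on Graviton, which makes it easy to check a migrated kernel against a reference.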

TL;DR

  • Problem: An x86‑specific Docker base, compiler flags, and AVX2 intrinsics prevent the code from running on Graviton (ARM) instances.
  • Solution: Use Kiro + Arm MCP Server to automatically detect and replace the offending parts (e.g., AVX2 intrinsics, x86‑only pre‑processor checks).
  • Result: A clean, ARM‑compatible Docker image and source code that runs on the latest Graviton 5 instances, delivering cost‑effective performance gains.

Give it a try—your future self (and your AWS bill) will thank you!

Header file matrix_operations.h

#ifndef MATRIX_OPERATIONS_H
#define MATRIX_OPERATIONS_H

#include <cstddef>          // size_t
#include <vector>           // std::vector
#include <iostream>         // std::cout (optional, for debugging)

class Matrix {
private:
    std::vector<std::vector<double>> data;   // 2‑D storage
    size_t rows;
    size_t cols;

public:
    // ctor
    Matrix(size_t r, size_t c);

    // fill the matrix with random values in the range [0.0, 10.0)
    void randomize();

    // matrix multiplication (returns a new matrix)
    Matrix multiply(const Matrix& other) const;

    // sum of all elements – handy for a quick sanity check
    double sum() const;

    // simple accessors
    size_t getRows() const { return rows; }
    size_t getCols() const { return cols; }
};

// Benchmark driver – defined in a separate .cpp file
void benchmark_matrix_ops();

#endif // MATRIX_OPERATIONS_H

What Was Fixed?

| Issue | Original | Fixed |
|---|---|---|
| Missing include guard end | #endif // MATRIX_OPERATIONS_H appeared before the class definition | Moved the guard to wrap the whole file |
| #include line was empty | #include | Added required headers (<cstddef>, <vector>, <iostream>) |
| Vector declaration syntax | std::vector> data; | Changed to std::vector<std::vector<double>> data; |
| Unclosed comment / stray text | "Time: " | Removed – not part of a header file |
| Function prototypes missing semicolons | none | Added ; after each prototype where needed |
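
The header declares multiply and sum, but their bodies are not shown in this post. As a rough sketch of the "portable scalar loop" option, the functions below do the same work on the same vector‑of‑vectors layout. The names Grid, multiply_scalar, and sum_all are hypothetical and written as free functions so the example stands alone; the real member functions may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same storage layout as Matrix::data in the header above.
using Grid = std::vector<std::vector<double>>;

// Hypothetical scalar stand-in for Matrix::multiply.
Grid multiply_scalar(const Grid& a, const Grid& b) {
    const std::size_t n = a.size();                 // rows of a
    const std::size_t inner = b.size();             // rows of b
    const std::size_t m = inner ? b[0].size() : 0;  // cols of b
    Grid out(n, std::vector<double>(m, 0.0));

    for (std::size_t i = 0; i < n; ++i) {
        assert(a[i].size() == inner);  // cols of a must equal rows of b
        // i-k-j order keeps the innermost loop walking rows contiguously.
        for (std::size_t k = 0; k < inner; ++k) {
            const double aik = a[i][k];
            for (std::size_t j = 0; j < m; ++j) {
                out[i][j] += aik * b[k][j];
            }
        }
    }
    return out;
}

// Hypothetical scalar stand-in for Matrix::sum.
double sum_all(const Grid& g) {
    double total = 0.0;
    for (const auto& row : g) {
        for (double v : row) {
            total += v;
        }
    }
    return total;
}
```

A version like this contains no architecture‑specific code, so it is a handy reference when checking that a SIMD rewrite still produces the same numbers.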

Source file main.cpp

#include "matrix_operations.h"
#include <iostream>

int main() {
    std::cout << "Matrix Operations Benchmark\n"
              << "============================\n";

#if defined(__x86_64__) || defined(_M_X64)
    std::cout << "Running on x86-64 architecture with AVX2 optimisations\n";
#else
    std::cout << "Running on a non-x86 architecture (e.g., ARM/Graviton)\n";
#endif

    // Execute the benchmark – implementation lives in matrix_operations.cpp
    benchmark_matrix_ops();

    return 0;
}

What Was Fixed?

| Issue | Original | Fixed |
|---|---|---|
| Missing include for <iostream> | #include (empty) | Added #include <iostream> |
| #error directive prevented compilation on ARM | #error "This code requires x86-64 architecture with AVX2 support" | Replaced with a runtime message; the code now compiles on any architecture. |
| Inconsistent indentation / stray spaces | Mixed tabs/spaces | Normalised to 4‑space indentation |
| Unnecessary using namespace std; | (not present) – kept explicit std:: for clarity | No change needed |
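
Once the #error is gone, the SIMD headers themselves still need a guard, because <immintrin.h> does not exist on Arm toolchains. The sketch below shows one way to pick a header and report the chosen path at compile time. The helper name simd_backend is hypothetical, and the macros used (__AVX2__, __aarch64__, _M_X64, _M_ARM64) are the usual compiler‑defined ones rather than anything specific to this project.

```cpp
#include <string>

// Pull in only the intrinsics header that exists for the target.
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>   // x86 SIMD intrinsics
#elif defined(__aarch64__) || defined(_M_ARM64)
#include <arm_neon.h>    // Arm NEON intrinsics
#endif

// Hypothetical helper: names the SIMD path selected at compile time.
std::string simd_backend() {
#if defined(__AVX2__)
    return "avx2";     // x86 build with AVX2 enabled (e.g. -mavx2)
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "neon";     // 64-bit Arm, NEON is always available
#else
    return "scalar";   // anything else falls back to plain loops
#endif
}
```

Printing simd_backend() from main gives a more specific message than the two‑way x86/non‑x86 split, and the same checks can gate the AVX2 and NEON kernels in matrix_operations.cpp.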

How to Make the Project ARM‑Ready

  1. Remove AVX2 intrinsics – replace them with NEON intrinsics, a portable SIMD library (e.g., xsimd), or plain loops that the compiler can auto‑vectorise.

  2. Update the Dockerfile

    # Use a multi-arch base image that publishes an arm64 variant
    FROM ubuntu:22.04
    
    # Install build tools
    RUN apt-get update && apt-get install -y \
        build-essential \
        && rm -rf /var/lib/apt/lists/*
    
    # Copy source and build with the flags from step 3
    WORKDIR /app
    COPY *.h *.cpp ./
    RUN g++ -O3 -march=armv8.2-a+simd -std=c++20 -I. -o matrix_demo main.cpp matrix_operations.cpp
    
    CMD ["./matrix_demo"]

  3. Compile with the appropriate flags

    g++ -O3 -march=armv8.2-a+simd -std=c++20 -I. -o matrix_demo main.cpp matrix_operations.cpp

  4. Run the container on Graviton – the image will now start without the x86‑specific checks and will benefit from the ARM‑native SIMD extensions.

Quick Test

# Build for arm64 (natively on an ARM host, or under emulation on x86 if QEMU/binfmt is set up)
docker build --platform linux/arm64 -t matrix-bench:arm .

# Run
docker run --rm --platform linux/arm64 matrix-bench:arm

Expected output:

Matrix Operations Benchmark
============================
Running on a non-x86 architecture (e.g., ARM/Graviton)
[benchmark output…]
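
The body of benchmark_matrix_ops is not shown in this post, so the actual benchmark output depends on the real implementation. As a rough idea of how such a driver might time a multiplication, here is a small sketch built on std::chrono. The helper name time_ms is hypothetical.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping the multiply call in a lambda and passing it to time_ms on both the x86 and the arm64 build gives comparable numbers, provided both runs use the same matrix sizes and optimisation level.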

Now the code is fully portable, and you can take advantage of the cost‑effective performance that Graviton 5 instances provide. Happy coding!

Migrating to ARM with Kiro + MCP

The code contains many x86‑specific intrinsics, but Kiro can handle the conversion automatically.

1. Configure Kiro to Connect to the ARM MCP Server

Create the file .kiro/settings/mcp.json in the project root:

{
  "mcpServers": {
    "arm-mcp": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-v", "/path/to/your/code:/workspace",
        "armlimited/arm-mcp:1.0.1"
      ]
    }
  }
}

Notes

  • The command runs the ARM MCP Server via Docker.
  • Replace /path/to/your/code with the absolute path to your project.
  • Kiro will automatically pick up the new server. Verify the connection by typing /mcp in the chat; you should see arm-mcp listed with its tools.

2. Quick Checks in the Chat

You can ask Kiro to perform ad‑hoc checks, e.g.:

Check the base image in the Dockerfile for ARM compatibility

Kiro will use the MCP tools and report that centos:6 only supports amd64.

3. Automate Migrations with a Steering Document

Create .kiro/steering/arm-migration.md:

---
inclusion: manual
---

Your goal is to migrate a codebase from x86 to ARM. Use the MCP server tools to help you with this. Check for x86‑specific dependencies (build flags, intrinsics, libraries, etc.) and replace them with ARM‑equivalent ones, ensuring compatibility and optimizing performance. Examine Dockerfiles, version files