My first impressions on ROCm and Strix Halo
Source: Hacker News
Here I’ll share my first impressions of ROCm on Strix Halo, along with how I set everything up.

The machine offers 128 GB of memory, efficiently shared between the CPU and GPU.
OS choice and driver installation
I’m used to working with Ubuntu, so I stuck with the supported 24.04 LTS version and just followed the official installation instructions.
BIOS update
Things wouldn’t work without a BIOS update: PyTorch was unable to find the GPU. The update was easily done from the BIOS settings; the machine connected to my Wi‑Fi network and downloaded it automatically.
BIOS settings and Grub changes
In the BIOS settings, make sure to set the reserved video memory to a low value and let memory be shared between the CPU and GPU via the GTT (Graphics Translation Table). The reserved memory can be as low as 512 MB.
Implications
- The CPU cannot use the GPU‑reserved memory.
- The GPU can use the total of Reserved + GTT, but utilizing both simultaneously can be less efficient than a single large GTT pool due to fragmentation and addressing overhead.
- Some legacy games or software may see the GPU memory as 512 MB and refuse to work; this hasn’t happened to me so far.
Then edit /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash ttm.pages_limit=32768000 amdgpu.gttsize=114688"
Then run sudo update-grub and reboot.
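After rebooting, you can check how much memory the amdgpu driver actually exposes. A minimal sketch reading the driver's standard sysfs files; the card index may differ on your machine, and the files are simply skipped on systems without amdgpu:

```python
import glob
import os

def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30

# amdgpu exposes its memory pools under /sys/class/drm/card*/device/
for dev in sorted(glob.glob("/sys/class/drm/card*/device")):
    for name in ("mem_info_vram_total", "mem_info_gtt_total"):
        path = os.path.join(dev, name)
        if os.path.exists(path):  # only present for amdgpu devices
            with open(path) as f:
                print(f"{path}: {to_gib(int(f.read())):.1f} GiB")
```

With the settings above, mem_info_vram_total should report roughly the 512 MB reserved carve-out and mem_info_gtt_total the large GTT pool.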
Note:
amdgpu.gttsize shouldn’t cover the whole of system memory. Leave some memory (≈ 4–12 GB) for the CPU (total memory − reserved GPU memory − GTT) to keep the Linux kernel stable.
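As a sanity check, the numbers above can be tied together. This assumes a 4 KiB page size and treats the machine's 128 GB as 128 GiB for simplicity:

```python
PAGE_SIZE = 4096   # ttm allocates in 4 KiB pages
GIB = 2**30
MIB = 2**20

ttm_pages_limit = 32768000   # from the Grub line
gtt_size_mib = 114688        # amdgpu.gttsize is given in MiB
total_mem_gib = 128          # machine total, treated as GiB
reserved_gib = 0.5           # 512 MB reserved in the BIOS

ttm_limit_gib = ttm_pages_limit * PAGE_SIZE / GIB      # 125.0 GiB
gtt_gib = gtt_size_mib * MIB / GIB                     # 112.0 GiB
cpu_left_gib = total_mem_gib - reserved_gib - gtt_gib  # 15.5 GiB

print(f"ttm limit: {ttm_limit_gib} GiB, GTT: {gtt_gib} GiB, "
      f"left for the CPU: {cpu_left_gib} GiB")
```

So the Grub line gives the GPU a 112 GiB GTT pool while keeping around 15 GiB for the CPU, comfortably inside the recommended headroom.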
PyTorch with UV
Because of PyTorch’s complex dependency graph, I ended up using uv with the following pyproject.toml:
[project]
name = "myproject"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = [
"torch==2.11.0+rocm7.2",
"triton-rocm",
]
[tool.uv]
environments = ["sys_platform == 'linux'"]
[[tool.uv.index]]
name = "pytorch-rocm"
url = "https://download.pytorch.org/whl/rocm7.2"
explicit = true
[tool.uv.sources]
torch = { index = "pytorch-rocm" }
torchvision = { index = "pytorch-rocm" }
triton-rocm = { index = "pytorch-rocm" }
You can add a convenient alias to your .bashrc:
alias pytorch='uvx --extra-index-url https://download.pytorch.org/whl/rocm7.2 \
  --index-strategy unsafe-best-match \
  --with torch==2.11.0+rocm7.2,triton-rocm \
  ipython -c "import torch; print(f\"ROCM: {torch.version.hip}\"); \
  print(f\"GPU available: {torch.cuda.is_available()}\"); import torch.nn as nn" -i'
Llama.cpp
Download the model:
uvx hf download Qwen/Qwen3.6-35B-A3B --local-dir /some_path/models/qwen3.6
Clone the llama.cpp repository and convert the model to GGUF:
git clone https://github.com/ggerganov/llama.cpp.git /some_path/llama.cpp
cd /some_path/models/qwen3.6 && \
uvx --extra-index-url https://download.pytorch.org/whl/rocm7.2 \
--index-strategy unsafe-best-match \
--with torch==2.11.0+rocm7.2,triton-rocm,transformers \
ipython /some_path/llama.cpp/convert_hf_to_gguf.py \
-- . --outfile model.gguf
Then run the server container:
podman run --rm -it --name qwen-coder \
--device /dev/kfd --device /dev/dri \
--security-opt label=disable --group-add keep-groups \
-e HSA_OVERRIDE_GFX_VERSION=11.5.0 \
-p 8080:8080 -v /some_path/models:/models:z \
ghcr.io/ggml-org/llama.cpp:server-rocm \
-m /models/qwen3.6/model.gguf -ngl 99 -c 327680 \
--host 0.0.0.0 --port 8080 \
--flash-attn on --no-mmap
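With 112 GiB of GTT, it's worth estimating whether the weights plus the 327,680-token KV cache actually fit. A rough sketch: the parameter count is read off the model name, but the layer/head dimensions below are illustrative placeholders, not Qwen's real config:

```python
GIB = 2**30

# Weights: a 35B-parameter model converted to an f16 GGUF uses
# ~2 bytes per parameter (no quantization applied yet).
n_params = 35e9
weights_gib = n_params * 2 / GIB          # ~65 GiB

# KV cache: 2 tensors (K and V) per layer, per KV head.
# These dims are ILLUSTRATIVE, not the model's actual config.
n_layers, n_kv_heads, head_dim = 48, 4, 128
n_ctx = 327680                            # -c from the server command
bytes_per_elt = 2                         # f16 cache
kv_gib = 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elt / GIB

print(f"weights ≈ {weights_gib:.0f} GiB, KV cache ≈ {kv_gib:.0f} GiB, "
      f"total ≈ {weights_gib + kv_gib:.0f} GiB of 112 GiB GTT")
```

Under these assumptions the f16 model just fits; a quantized GGUF (4-bit K-quants are roughly 4.5 bits per weight) would shrink the weights to around a third of that and leave far more headroom.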
Opencode
I’m using Podman to run Opencode; the setup instructions are in my repo.
Here is the configuration that lets Opencode work with the Llama.cpp server:
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"local": {
"options": {
"baseURL": "http://localhost:8080/v1",
"apiKey": "any-string",
"reasoningEffort": "auto",
"textVerbosity": "high",
"supportsToolCalls": true
},
"models": {
"qwen-coder-local": {}
}
}
},
"model": "local/qwen-coder-local",
"permission": {
"*": "ask",
"read": {
"*": "allow",
"*.env": "deny",
"**/secrets/**": "deny"
},
"bash": "allow",
"edit": "allow",
"glob": "allow",
"grep": "allow",
"websearch": "allow",
"codesearch": "allow",
"webfetch": "allow"
},
"disabled_providers": [
"opencode"
]
}
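Before wiring up Opencode, you can smoke-test the server's OpenAI-compatible endpoint directly. A minimal stdlib-only sketch; the payload follows the standard chat-completions format, and the actual call is left commented out since it needs the container from the previous section running:

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "qwen-coder-local") -> dict:
    """Build a chat-completions payload for the llama.cpp server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }

def ask(prompt: str, base_url: str = "http://localhost:8080/v1") -> str:
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer any-string"},  # any string works
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# print(ask("Say hello in one word."))  # requires the server to be running
```

If this round-trips, the same baseURL and apiKey values in the Opencode config above should work unchanged.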
Conclusion
So, as promised, my first impressions are: so far, so good. I was able to play with PyTorch and run Qwen 3.6 on Llama.cpp with a large context window. There were some rough edges, but overall it was quite worth it.