My first impressions on ROCm and Strix Halo
Source: Hacker News
Here I’ll share my first impressions of ROCm on Strix Halo, along with how I set everything up.

The machine offers 128 GB of memory, efficiently shared between the CPU and GPU.
OS choice and driver installation
I’m used to working with Ubuntu, so I stuck with the supported 24.04 LTS version and just followed the official installation instructions.
BIOS update
Things wouldn’t work without a BIOS update: PyTorch was unable to find the GPU. The update was easily done from the BIOS settings; the machine connected to my Wi‑Fi network and downloaded it automatically.
BIOS settings and Grub changes
In the BIOS settings, make sure to set the reserved video memory to a low value and let memory be shared between the CPU and GPU via the GTT (Graphics Translation Table). The reserved memory can be as low as 512 MB.
Implications
- The CPU cannot use the GPU‑reserved memory.
- The GPU can use the total of Reserved + GTT, but utilizing both simultaneously can be less efficient than a single large GTT pool due to fragmentation and addressing overhead.
- Some legacy games or software may see the GPU memory as 512 MB and refuse to work; this hasn’t happened to me so far.
Then edit /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash ttm.pages_limit=32768000 amdgpu.gttsize=114688"
Then run sudo update-grub and reboot.
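After rebooting, you can check how much memory the amdgpu driver actually exposes. A minimal sketch reading the driver's standard sysfs files; the card index may differ on your machine, and the files are simply skipped on systems without amdgpu:

```python
import glob
import os

def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30

# amdgpu exposes its memory pools under /sys/class/drm/card*/device/
for dev in sorted(glob.glob("/sys/class/drm/card*/device")):
    for name in ("mem_info_vram_total", "mem_info_gtt_total"):
        path = os.path.join(dev, name)
        if os.path.exists(path):  # only present for amdgpu devices
            with open(path) as f:
                print(f"{path}: {to_gib(int(f.read())):.1f} GiB")
```

With the settings above, mem_info_vram_total should report roughly the 512 MB reserved carve-out and mem_info_gtt_total the large GTT pool.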
Note:
amdgpu.gttsize shouldn’t cover the whole of system memory. Leave some memory (≈ 4–12 GB) for the CPU (total memory − reserved GPU memory − GTT) to keep the Linux kernel stable.
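As a sanity check, the numbers above can be tied together. This assumes a 4 KiB page size and treats the machine's 128 GB as 128 GiB for simplicity:

```python
PAGE_SIZE = 4096   # ttm allocates in 4 KiB pages
GIB = 2**30
MIB = 2**20

ttm_pages_limit = 32768000   # from the Grub line
gtt_size_mib = 114688        # amdgpu.gttsize is given in MiB
total_mem_gib = 128          # machine total, treated as GiB
reserved_gib = 0.5           # 512 MB reserved in the BIOS

ttm_limit_gib = ttm_pages_limit * PAGE_SIZE / GIB      # 125.0 GiB
gtt_gib = gtt_size_mib * MIB / GIB                     # 112.0 GiB
cpu_left_gib = total_mem_gib - reserved_gib - gtt_gib  # 15.5 GiB

print(f"ttm limit: {ttm_limit_gib} GiB, GTT: {gtt_gib} GiB, "
      f"left for the CPU: {cpu_left_gib} GiB")
```

So the Grub line gives the GPU a 112 GiB GTT pool while keeping around 15 GiB for the CPU, comfortably inside the recommended headroom.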
PyTorch with UV
Because of PyTorch’s complex dependency graph, I ended up using uv with the following pyproject.toml:
[project]
name = "myproject"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = [
"torch==2.11.0+rocm7.2",
"triton-rocm",
]
[tool.uv]
environments = ["sys_platform == 'linux'"]
[[tool.uv.index]]
name = "pytorch-rocm"
url = "https://download.pytorch.org/whl/rocm7.2"
explicit = true
[tool.uv.sources]
torch = { index = "pytorch-rocm" }
torchvision = { index = "pytorch-rocm" }
triton-rocm = { index = "pytorch-rocm" }
You can add a convenient alias to your .bashrc:
alias pytorch='uvx --extra-index-url https://download.pytorch.org/whl/rocm7.2 \
  --index-strategy unsafe-best-match \
  --with torch==2.11.0+rocm7.2,triton-rocm \
  ipython -c "import torch; print(f\"ROCM: {torch.version.hip}\"); \
  print(f\"GPU available: {torch.cuda.is_available()}\"); import torch.nn as nn" -i'
Llama.cpp
Download the model:
uvx hf download Qwen/Qwen3.6-35B-A3B --local-dir /some_path/models/qwen3.6
Clone the llama.cpp repository and convert the model to GGUF:
git clone https://github.com/ggerganov/llama.cpp.git /some_path/llama.cpp
cd /some_path/models/qwen3.6 && \
uvx --extra-index-url https://download.pytorch.org/whl/rocm7.2 \
--index-strategy unsafe-best-match \
--with torch==2.11.0+rocm7.2,triton-rocm,transformers \
ipython /some_path/llama.cpp/convert_hf_to_gguf.py \
-- . --outfile model.gguf
Then run the server container:
podman run --rm -it --name qwen-coder \
--device /dev/kfd --device /dev/dri \
--security-opt label=disable --group-add keep-groups \
-e HSA_OVERRIDE_GFX_VERSION=11.5.0 \
-p 8080:8080 -v /some_path/models:/models:z \
ghcr.io/ggml-org/llama.cpp:server-rocm \
-m /models/qwen3.6/model.gguf -ngl 99 -c 327680 \
--host 0.0.0.0 --port 8080 \
--flash-attn on --no-mmap
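With 112 GiB of GTT, it's worth estimating whether the weights plus the 327,680-token KV cache actually fit. A rough sketch: the parameter count is read off the model name, but the layer/head dimensions below are illustrative placeholders, not Qwen's real config:

```python
GIB = 2**30

# Weights: a 35B-parameter model converted to an f16 GGUF uses
# ~2 bytes per parameter (no quantization applied yet).
n_params = 35e9
weights_gib = n_params * 2 / GIB          # ~65 GiB

# KV cache: 2 tensors (K and V) per layer, per KV head.
# These dims are ILLUSTRATIVE, not the model's actual config.
n_layers, n_kv_heads, head_dim = 48, 4, 128
n_ctx = 327680                            # -c from the server command
bytes_per_elt = 2                         # f16 cache
kv_gib = 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elt / GIB

print(f"weights ≈ {weights_gib:.0f} GiB, KV cache ≈ {kv_gib:.0f} GiB, "
      f"total ≈ {weights_gib + kv_gib:.0f} GiB of 112 GiB GTT")
```

Under these assumptions the f16 model just fits; a quantized GGUF (4-bit K-quants are roughly 4.5 bits per weight) would shrink the weights to around a third of that and leave far more headroom.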
Opencode
I’m using Podman to run Opencode; the setup instructions are in my repo.
Here is the configuration that lets Opencode work with the Llama.cpp server:
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"local": {
"options": {
"baseURL": "http://localhost:8080/v1",
"apiKey": "any-string",
"reasoningEffort": "auto",
"textVerbosity": "high",
"supportsToolCalls": true
},
"models": {
"qwen-coder-local": {}
}
}
},
"model": "local/qwen-coder-local",
"permission": {
"*": "ask",
"read": {
"*": "allow",
"*.env": "deny",
"**/secrets/**": "deny"
},
"bash": "allow",
"edit": "allow",
"glob": "allow",
"grep": "allow",
"websearch": "allow",
"codesearch": "allow",
"webfetch": "allow"
},
"disabled_providers": [
"opencode"
]
}
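Before wiring up Opencode, you can smoke-test the server's OpenAI-compatible endpoint directly. A minimal stdlib-only sketch; the payload follows the standard chat-completions format, and the actual call is left commented out since it needs the container from the previous section running:

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "qwen-coder-local") -> dict:
    """Build a chat-completions payload for the llama.cpp server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }

def ask(prompt: str, base_url: str = "http://localhost:8080/v1") -> str:
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer any-string"},  # any string works
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# print(ask("Say hello in one word."))  # requires the server to be running
```

If this round-trips, the same baseURL and apiKey values in the Opencode config above should work unchanged.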
Conclusion
So, as promised, my first impressions are: so far, so good. I was able to play with PyTorch and run Qwen 3.6 on Llama.cpp with a large context window. There were some rough edges, but overall it was quite worth it.