Ubuntu 24.04.2 + AMD Vega56 + llama.cpp
1. Install Ubuntu 24.04.2
- Do not use 24.04.4; it is not supported
- https://mirrors.sohu.com/ubuntu-releases/24.04.2/
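Before writing the ISO to a USB stick, it's worth verifying the download against the `SHA256SUMS` file the mirror publishes in the same directory. A minimal sketch; the demo file below is a stand-in for the real ISO (for the real thing, fetch `SHA256SUMS` from the mirror and run only the last command):

```shell
# Verify a download against a SHA256SUMS file, as published next to the ISO.
# The demo file is a stand-in; with the real ISO, download SHA256SUMS from
# the same mirror directory and run the final command unchanged.
printf 'demo iso contents' > ubuntu-24.04.2-demo.iso
sha256sum ubuntu-24.04.2-demo.iso > SHA256SUMS   # the mirror provides this file
sha256sum -c --ignore-missing SHA256SUMS         # prints "... OK" on success
```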
2. Install the AMD driver
~$ sudo apt update
~$ sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
~$ sudo apt install python3-setuptools python3-wheel
~$ sudo usermod -a -G render,video $LOGNAME
~$ wget https://repo.radeon.com/amdgpu-install/6.3.3/ubuntu/noble/amdgpu-install_6.3.60303-1_all.deb
~$ sudo apt install ./amdgpu-install_6.3.60303-1_all.deb
~$ sudo apt update
~$ sudo apt install amdgpu-dkms rocm
~$ sudo amdgpu-install --no-dkms
~$ sudo apt install rocm-dev
~$ sudo reboot
3. Check whether the installation succeeded
~$ export PATH=/opt/rocm/bin:$PATH
~$ rocminfo
~$ rocm-smi
~$ hipcc --version
HIP version: 6.3.42134-a9a80e791
AMD clang version 18.0.0git (https://github.com/RadeonOpenCompute/llvm-project roc-6.3.3 25012 e5bf7e55c91490b07c49d8960fa7983d864936c4)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-6.3.3/lib/llvm/bin
Configuration file: /opt/rocm-6.3.3/lib/llvm/bin/clang++.cfg
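The `export PATH=...` in step 3 only lasts for the current shell. A one-liner to make it persistent across logins (idempotent, so it is safe to run more than once):

```shell
# Persist the ROCm PATH entry in ~/.bashrc, adding it only if absent.
LINE='export PATH=/opt/rocm/bin:$PATH'
grep -qxF "$LINE" ~/.bashrc 2>/dev/null || echo "$LINE" >> ~/.bashrc
```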
4. Build and install llama.cpp (ollama is too much hassle)
~$ git clone --depth=1 https://github.com/ggerganov/llama.cpp
~$ cd llama.cpp
# GGML_USE_HIP / GGML_HIPBLAS are not CMake options; in current llama.cpp the HIP backend is enabled with GGML_HIP
~$ cmake -B build \
    -DGGML_HIP=ON \
    -DAMDGPU_TARGETS=gfx900 \
    -DCMAKE_PREFIX_PATH=/opt/rocm \
    -DCMAKE_BUILD_TYPE=Release
# build
~$ cmake --build build -j$(nproc)
# if the full build errors out partway, I build just these two targets separately
~$ cmake --build build -j$(nproc) --target llama-cli
~$ cmake --build build -j$(nproc) --target llama-server
5. Download the model gemma-4-E4B-it-Q5_K_M.gguf
~$ sudo apt install aria2   # the package is named aria2; it provides the aria2c binary
~$ aria2c -x 16 -s 16 "https://cdn-lfs-cn-1.modelscope.cn/prod/lfs-objects/44/e9/292ff1b243a0923e8c46ed58111e46a6c300be74797b6dd54ae0e66b5bba?filename=gemma-4-E4B-it-Q5_K_M.gguf&namespace=unsloth&repository=gemma-4-E4B-it-GGUF&revision=master&tag=model&auth_key=1775583418-ee4c008381624c4ca11096720b246e79-0-0a5874921a93f20fc7ca3094678b8e7a"
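A GGUF file starts with the 4-byte ASCII magic `GGUF`, so a quick header check catches a truncated download or an HTML error page saved under the model's name before you try to load it. A sketch with a stand-in file (the `check_gguf` helper is mine, not part of llama.cpp):

```shell
# Cheap sanity check: every GGUF file begins with the ASCII magic "GGUF".
check_gguf() {
  if [ "$(head -c 4 "$1")" = "GGUF" ]; then
    echo "$1: looks like GGUF"
  else
    echo "$1: NOT a GGUF file (truncated or wrong content?)"
  fi
}
# Demo with a stand-in file; point it at the downloaded model instead:
printf 'GGUF\0\0\0\0' > demo.gguf
check_gguf demo.gguf   # demo.gguf: looks like GGUF
```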
6. Run llama.cpp
# the binaries end up under llama.cpp/build/bin; add that to PATH or call them by full path
~$ llama-cli --version
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 8176 MiB):
Device 0: Radeon RX Vega, gfx900:xnack- (0x900), VMM: no, Wave Size: 64, VRAM: 8176 MiB
version: 8672 (25eec6f32)
built with Clang 18.0.0 for Linux x86_64
~$ llama-cli -m ~/Downloads/gemma-4-E4B-it-Q5_K_M.gguf -ngl 999
~$ llama-server -m ~/Downloads/gemma-4-E4B-it-Q5_K_M.gguf -ngl 999 -c 2048 --host 0.0.0.0
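Once `llama-server` is up it exposes an OpenAI-compatible HTTP API (port 8080 by default, since no `--port` was given above). A request sketch; the prompt text and `max_tokens` value are just examples:

```shell
# Query the running llama-server via its OpenAI-compatible chat endpoint.
# Adjust host/port to match how the server was started.
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "messages": [{"role": "user", "content": "Hello, who are you?"}],
        "max_tokens": 64
      }'
```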
- rocm-smi shows GPU utilization while the model is running