How to Set Up a ROCm Fine-Tuning Environment for the AMD Ryzen AI MAX+ 395 in WSL

This method does not depend on a specific Linux kernel version. For example, my WSL kernel is:

Linux DESKTOP-H25OAU7 6.6.87.2-microsoft-standard-WSL2 #1 SMP PREEMPT_DYNAMIC Thu Jun  5 18:30:46 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

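As you can see, this is well below the kernel version ROCm normally requires. For this reason AMD provides a dedicated runtime for WSL, installed as described below. Note: do NOT switch the apt sources to a mirror inside WSL, otherwise the ROCm installation will fail with package-not-found errors.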
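Before following the steps below, it can help to confirm that the shell really is running on the WSL2 kernel. The following is a minimal sketch that checks a kernel release string for the `microsoft-standard-WSL2` suffix seen in the output above; the helper name `is_wsl2_kernel` is hypothetical and not part of ROCm or WSL.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): succeeds when the given kernel
# release string carries the WSL2 suffix shown in the uname output above.
is_wsl2_kernel() {
    case "$1" in
        *microsoft-standard-WSL2*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: pass the running kernel release.
#   is_wsl2_kernel "$(uname -r)" && echo "WSL2 kernel detected"
```

Taking the release string as an argument, rather than calling `uname` inside the function, keeps the check easy to try against any kernel string.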

This method comes from https://github.com/ROCm/ROCm/issues/4952

How to Set up ROCm in WSL on Strix Halo Systems

First, run wsl --install Ubuntu-24.04 from Windows (PowerShell or Command Prompt).

In Ubuntu 24.04:

Install deps:

sudo apt update
sudo apt install -y ca-certificates wget gpg

Add AMD repo signing key:

sudo mkdir --parents --mode=0755 /etc/apt/keyrings
wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null

Add AMD apt sources:

sudo tee /etc/apt/sources.list.d/rocm.list << EOF
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/latest noble main
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/graphics/latest/ubuntu noble main
EOF

Set higher priority for AMD repo:

sudo tee /etc/apt/preferences.d/rocm-pin-600 << EOF
Package: *
Pin: release o=repo.radeon.com
Pin-Priority: 600
EOF
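The pin above gives packages whose release origin is repo.radeon.com a priority of 600, which is higher than apt's default of 500, so AMD's builds are preferred when Ubuntu ships a package of the same name. As a rough sanity check, the sketch below reads a preferences file and prints the priority recorded for a given origin. The function name `pin_priority_for` is hypothetical, and the parser only handles the simple `Pin: release o=...` stanza written above.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): print the Pin-Priority of the stanza
# whose pin line is exactly "Pin: release o=<origin>".
#   $1 = origin, e.g. repo.radeon.com
#   $2 = path to an apt preferences file
pin_priority_for() {
    awk -v want="Pin: release o=$1" '
        $0 == want                     { found = 1; next }
        found && $1 == "Pin-Priority:" { print $2; exit }
        /^[[:space:]]*$/               { found = 0 }   # blank line ends a stanza
    ' "$2"
}

# Usage:
#   pin_priority_for repo.radeon.com /etc/apt/preferences.d/rocm-pin-600
```

Once `sudo apt update` has been run, `apt-cache policy` reports the priority apt actually applies to each repository, which is the more authoritative check.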

Install WSL-specific runtime:

sudo apt update
sudo apt install hsa-runtime-rocr4wsl-amdgpu

Then download the latest .deb from https://github.com/ROCm/librocdxg/releases and install it with apt (for a file in the current directory: sudo apt install ./<downloaded-file>.deb).

Then you can install rocm meta package:

sudo apt install rocm

Finally, add these lines to your .bashrc:

export HSA_ENABLE_DXG_DETECTION=1
export PATH=/opt/rocm/bin:$PATH
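If these steps are scripted or repeated, appending blindly would duplicate the two lines in .bashrc. A minimal sketch of an idempotent append follows; the helper name `append_once` is hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): append a line to a file only if an
# identical line is not already present.
#   $1 = exact line to add
#   $2 = target file (created if missing)
append_once() {
    touch "$2"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Usage (single quotes keep $PATH unexpanded until .bashrc is sourced):
#   append_once 'export HSA_ENABLE_DXG_DETECTION=1' "$HOME/.bashrc"
#   append_once 'export PATH=/opt/rocm/bin:$PATH' "$HOME/.bashrc"
```

Open a new shell, or run `source ~/.bashrc`, for the variables to take effect.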

Verification

sudo apt install rocminfo
rocminfo
WSL environment detected.
Load librocdxg.so successully!
Load all DTIF APIs OK!
=====================
HSA System Attributes
=====================
Runtime Version:         1.1
Runtime Ext Version:     1.15
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE
System Endianness:       LITTLE
Mwaitx:                  DISABLED
XNACK enabled:           NO
DMAbuf Support:          YES
VMM Support:             YES
==========
HSA Agents
==========
*******
Agent 1
*******
  Name:                    AMD RYZEN AI MAX+ 395 w/ Radeon 8060S
  Uuid:                    CPU-XX
  Marketing Name:          AMD RYZEN AI MAX+ 395 w/ Radeon 8060S
  Vendor Name:             CPU
  Feature:                 None specified
  Profile:                 FULL_PROFILE
  Float Round Mode:        NEAR
  Max Queue Number:        0(0x0)
  Queue Min Size:          0(0x0)
  Queue Max Size:          0(0x0)
  Queue Type:              MULTI
  Node:                    0
  Device Type:             CPU
  Cache Info:
    L1:                      49152(0xc000) KB
  Chip ID:                 0(0x0)
  Cacheline Size:          64(0x40)
  BDFID:                   0
  Internal Node ID:        0
  Compute Unit:            32
  SIMDs per CU:            0
  Shader Engines:          0
  Shader Arrs. per Eng.:   0
  Memory Properties:
  Features:                None
  Pool Info:
    Pool 1
      Segment:                 GLOBAL; FLAGS: FINE GRAINED
      Size:                    32419956(0x1eeb074) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
    Pool 2
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    32419956(0x1eeb074) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
    Pool 3
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    32419956(0x1eeb074) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
    Pool 4
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED
      Size:                    32419956(0x1eeb074) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
  ISA Info:
*******
Agent 2
*******
  Name:                    gfx1151
  Uuid:                    GPU-ffffffffffffffff
  Marketing Name:          AMD Radeon(TM) 8060S Graphics
  Vendor Name:             AMD
  Feature:                 KERNEL_DISPATCH
  Profile:                 BASE_PROFILE
  Float Round Mode:        NEAR
  Max Queue Number:        128(0x80)
  Queue Min Size:          64(0x40)
  Queue Max Size:          131072(0x20000)
  Queue Type:              MULTI
  Node:                    1
  Device Type:             GPU
  Cache Info:
    L1:                      32(0x20) KB
    L2:                      2048(0x800) KB
    L3:                      32768(0x8000) KB
  Chip ID:                 5510(0x1586)
  Cacheline Size:          64(0x40)
  Max Clock Freq. (MHz):   2900
  BDFID:                   50176
  Internal Node ID:        1
  Compute Unit:            40
  SIMDs per CU:            2
  Shader Engines:          2
  Shader Arrs. per Eng.:   2
  Coherent Host Access:    FALSE
  Memory Properties:
  Features:                KERNEL_DISPATCH
  Fast F16 Operation:      TRUE
  Wavefront Size:          32(0x20)
  Workgroup Max Size:      1024(0x400)
  Workgroup Max Size per Dimension:
    x                        1024(0x400)
    y                        1024(0x400)
    z                        1024(0x400)
  Max Waves Per CU:        32(0x20)
  Max Work-item Per CU:    1024(0x400)
  Grid Max Size:           4294967295(0xffffffff)
  Grid Max Size per Dimension:
    x                        2147483647(0x7fffffff)
    y                        65535(0xffff)
    z                        65535(0xffff)
  Max fbarriers/Workgrp:   32
  Packet Processor uCode:: 29
  SDMA engine uCode::      14
  IOMMU Support::          None
  Pool Info:
    Pool 1
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED
      Size:                    33107874(0x1f92fa2) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:2048KB
      Alloc Alignment:         4KB
      Accessible by all:       FALSE
    Pool 2
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    33107874(0x1f92fa2) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:2048KB
      Alloc Alignment:         4KB
      Accessible by all:       FALSE
    Pool 3
      Segment:                 GROUP
      Size:                    64(0x40) KB
      Allocatable:             FALSE
      Alloc Granule:           0KB
      Alloc Recommended Granule:0KB
      Alloc Alignment:         0KB
      Accessible by all:       FALSE
  ISA Info:
    ISA 1
      Name:                    amdgcn-amd-amdhsa--gfx1151
      Machine Models:          HSA_MACHINE_MODEL_LARGE
      Profiles:                HSA_PROFILE_BASE
      Default Rounding Mode:   NEAR
      Default Rounding Mode:   NEAR
      Fast f16:                TRUE
      Workgroup Max Size:      1024(0x400)
      Workgroup Max Size per Dimension:
        x                        1024(0x400)
        y                        1024(0x400)
        z                        1024(0x400)
      Grid Max Size:           4294967295(0xffffffff)
      Grid Max Size per Dimension:
        x                        2147483647(0x7fffffff)
        y                        65535(0xffff)
        z                        65535(0xffff)
      FBarrier Max Size:       32
    ISA 2
      Name:                    amdgcn-amd-amdhsa--gfx11-generic
      Machine Models:          HSA_MACHINE_MODEL_LARGE
      Profiles:                HSA_PROFILE_BASE
      Default Rounding Mode:   NEAR
      Default Rounding Mode:   NEAR
      Fast f16:                TRUE
      Workgroup Max Size:      1024(0x400)
      Workgroup Max Size per Dimension:
        x                        1024(0x400)
        y                        1024(0x400)
        z                        1024(0x400)
      Grid Max Size:           4294967295(0xffffffff)
      Grid Max Size per Dimension:
        x                        2147483647(0x7fffffff)
        y                        65535(0xffff)
        z                        65535(0xffff)
      FBarrier Max Size:       32
*** Done ***
Unload librocdxg.so successully!
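The key thing to look for in this output is a GPU agent whose Name is a gfx target (gfx1151 here). Rather than reading the whole dump, the agent names can be filtered out. This is a minimal sketch that assumes rocminfo prints one `Name:` field per line, as it does in a normal terminal; the helper name `gpu_targets` is hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): read rocminfo output on stdin and
# print each distinct agent name that starts with "gfx". ISA names such as
# amdgcn-amd-amdhsa--gfx1151 are skipped because they do not start with "gfx".
gpu_targets() {
    awk '$1 == "Name:" && $2 ~ /^gfx/ { print $2 }' | sort -u
}

# Usage:
#   rocminfo | gpu_targets
```

An empty result suggests that only the CPU agent was detected, in which case HSA_ENABLE_DXG_DETECTION and the librocdxg installation are the first things to recheck.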

PyTorch

pip install --index-url https://repo.amd.com/rocm/whl/gfx1151/ -U torch torchvision torchaudio
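After the wheels are installed, a quick check is whether PyTorch can see the GPU. ROCm builds of PyTorch expose the device through the `torch.cuda` namespace, and `torch.version.hip` holds the HIP version of the build (it is `None` on non-ROCm builds). The sketch below wraps that check in a shell function; the function name `check_torch_gpu` is hypothetical, and it assumes `python3` is the interpreter the wheels were installed into.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): report the PyTorch version, the HIP
# version it was built against, and whether a GPU is visible.
check_torch_gpu() {
    python3 - <<'PY'
try:
    import torch
except ImportError:
    print("torch is not installed in this Python environment")
else:
    print("torch", torch.__version__)
    print("hip", torch.version.hip)
    print("gpu available", torch.cuda.is_available())
PY
}

# Usage:
#   check_torch_gpu
```

If "gpu available" is False while rocminfo lists the gfx1151 agent, check that the shell running Python has HSA_ENABLE_DXG_DETECTION=1 set.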