当前位置: 首页 > news >正文

mindie推理框架

  华为自己的大模型推理框架,链接:https://www.hiascend.com/document/detail/zh/mindie/,左上角选版本,目前是2.3.0,据说2.2.rc1最稳定

  三种推理方式:镜像、物理机、容器。镜像方式最省,但还不完美,跑完镜像要进去改配置、手工启动服务。物理机和容器都是要手工安装CANN、mindie等组件。

  mindie镜像下载地址:https://www.hiascend.com/developer/ascendhub/detail/af85b724a7e5469ebd7ea13c3439d48f

  容器启动命令:

docker run -it -d --net=host --shm-size=16g --privileged --restart always --name qwen72b --device=/dev/davinci_manager --device=/dev/hisi_hdc --device=/dev/devmm_svm -v /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro -v /usr/local/sbin:/usr/local/sbin:ro -v /app/model:/model:ro swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.2.RC1-800I-A2-py311-openeuler24.03-lts bash

  进入容器:docker exec -it qwen72b bash

  环境配置:若服务启动报错:bin/mindieservice_daemon: error while loading shared libraries: libtorch.so: cannot open shared object file: No such file or directory,需配置环境变量

export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/torch/lib/:$LD_LIBRARY_PATH

  修改配置:/usr/local/Ascend/mindie/latest/mindie-service/conf/config.json

{"Version" : "1.0.0","LogConfig" :{"logLevel" : "Info","logFileSize" : 20,"logFileNum" : 20,"logPath" : "logs/mindie-server.log"},"ServerConfig" :{"ipAddress" : "192.168.68.12","managementIpAddress" : "127.0.0.2","port" : 9000,"managementPort" : 1027,"metricsPort" : 1028,"allowAllZeroIpListening" : false,"maxLinkNum" : 1000,"httpsEnabled" : false,"fullTextEnabled" : false,"tlsCaPath" : "security/ca/","tlsCaFile" : ["ca.pem"],"tlsCert" : "security/certs/server.pem","tlsPk" : "security/keys/server.key.pem","tlsPkPwd" : "security/pass/key_pwd.txt","tlsCrlPath" : "security/certs/","tlsCrlFiles" : ["server_crl.pem"],"managementTlsCaFile" : ["management_ca.pem"],"managementTlsCert" : "security/certs/management/server.pem","managementTlsPk" : "security/keys/management/server.key.pem","managementTlsPkPwd" : "security/pass/management/key_pwd.txt","managementTlsCrlPath" : "security/management/certs/","managementTlsCrlFiles" : ["server_crl.pem"],"kmcKsfMaster" : "tools/pmt/master/ksfa","kmcKsfStandby" : "tools/pmt/standby/ksfb","inferMode" : "standard","interCommTLSEnabled" : true,"interCommPort" : 1121,"interCommTlsCaPath" : "security/grpc/ca/","interCommTlsCaFiles" : ["ca.pem"],"interCommTlsCert" : "security/grpc/certs/server.pem","interCommPk" : "security/grpc/keys/server.key.pem","interCommPkPwd" : "security/grpc/pass/key_pwd.txt","interCommTlsCrlPath" : "security/grpc/certs/","interCommTlsCrlFiles" : ["server_crl.pem"],"openAiSupport" : "vllm"},"BackendConfig" : {"backendName" : "mindieservice_llm_engine","modelInstanceNumber" : 1,"npuDeviceIds" : [[0,1,2,3,4,5,6,7]],"tokenizerProcessNumber" : 8,"multiNodesInferEnabled" : false,"multiNodesInferPort" : 1120,"interNodeTLSEnabled" : true,"interNodeTlsCaPath" : "security/grpc/ca/","interNodeTlsCaFiles" : ["ca.pem"],"interNodeTlsCert" : "security/grpc/certs/server.pem","interNodeTlsPk" : "security/grpc/keys/server.key.pem","interNodeTlsPkPwd" : "security/grpc/pass/mindie_server_key_pwd.txt","interNodeTlsCrlPath" : "security/grpc/certs/","interNodeTlsCrlFiles" : ["server_crl.pem"],"interNodeKmcKsfMaster" : "tools/pmt/master/ksfa","interNodeKmcKsfStandby" : "tools/pmt/standby/ksfb","ModelDeployConfig" :{"maxSeqLen" : 32768,"maxInputTokenLen" : 16384,"truncation" : false,"ModelConfig" : [
                {"modelInstanceType" : "Standard","modelName" : "dsqwen32b","modelWeightPath" : "/model/deepseek/DeepSeek-R1-Distill-Qwen-32B","worldSize" : 8,"cpuMemSize" : 5,"npuMemSize" : -1,"backendType" : "atb","trustRemoteCode" : false}]},"ScheduleConfig" :{"templateType" : "Standard","templateName" : "Standard_LLM","cacheBlockSize" : 128,"maxPrefillBatchSize" : 50,"maxPrefillTokens" : 16384,"prefillTimeMsPerReq" : 150,"prefillPolicyType" : 0,"decodeTimeMsPerReq" : 50,"decodePolicyType" : 0,"maxBatchSize" : 200,"maxIterTimes" : 16384,"maxPreemptCount" : 0,"supportSelectBatch" : false,"maxQueueDelayMicroseconds" : 5000}}
}

  修改环境变量:

  启动服务

http://www.jsqmd.com/news/443558/

相关文章:

  • 深入解析:Android音频系列(09)-AudioPolicyManager代码解析
  • CMake 接入第三方库的三种方式:add_subdirectory、FetchContent 与 find_package(C++ 工程入门第六课)
  • 变压器怎么选?聚焦能效、安全与场景适配的实战指南 - 深度智识库
  • 2026最新音乐艺考推荐!辽宁省统考/考研/校考优质音乐艺考机构权威榜单发布 - 十大品牌榜
  • 【Part 3 Unity VR眼镜端播放器开发与优化】第四节|高分辨率VR全景视频播放性能优化 - 指南
  • 2026年济南室内LED显示屏哪家好,高口碑供应商与品牌推荐 - 工业设备
  • 低空慧眼:2026军用2D成像无人机蜂群系统供应商深度解析 - 品牌2026
  • 盒马鲜生卡回收、使用全流程解析 - 团团收购物卡回收
  • 2026深度横评10款护颈枕:从人体工学到材质解码,帮你找回满分深睡体验! - 博客万
  • 279_尚硅谷_管道的注意事项和细节(1)
  • SSH 登录/退出实时监控脚本
  • 揭秘无人机通信链路两大攻击手段:姿态欺骗与电量伪造
  • 2026年投融资纠纷律师价格大揭秘,北京哪家收费合理? - 工业品网
  • 2026成都等地最新别墅装修品牌推荐:全场景覆盖,这家环保家装实力领跑 - 十大品牌榜
  • 2026成都等地最新房屋装修公司推荐:全场景适配,这家实力领跑 - 十大品牌榜
  • 长芯微LPA4112完全P2P替代ADA4522,是一款高精度双通道放大器,采用了自稳零和斩波技术
  • 2026年广州有机硅消泡剂厂家年度排名,哪家性价比高 - 工业品牌热点
  • 像素级清晰:2026战区地形三维成像无人机蜂群系统供应商洞察 - 品牌2026
  • 娱乐办公两不误,【虚拟屏】远程办公的隐私保护神器
  • 2026年牛饲料生产厂哪家技术强,为你揭秘靠谱品牌 - myqiye
  • 2026最新苏州婚纱摄影综合实力TOP10榜单正式发布 - charlieruizvin
  • 深度解析:如何在供应链黑盒中构建嵌入式系统的安全防线
  • ffplayer面试总结
  • 2026年变压器/箱式变电站/配电柜/电抗器/光伏一体机厂家推荐:陕西变压器全品类实力详解 - 深度智识库
  • 二次元影像测量仪什么牌子好
  • 2026年 太空舱厂家推荐排行榜:二手/民宿/景区/露营/酒店/户外/装配式/可移动/一体式/移动/预制/智能太空舱全方位解析 - 品牌企业推荐师(官方)
  • 帝国cms.5版的编辑器默认会清除多余的word代码,如果要保留word格式怎么修改?EmpireCMS
  • 盘点2026年上海口碑好的减震器冲击试验机生产厂家,解决选购难题 - 工业推荐榜
  • 光学动作捕捉技术:机器人科研领域的数据基石与NOKOV度量动捕的应用实践
  • 国内靠谱的https证书供应商有哪些?2026年https证书申请/https证书购买渠道推荐 - 麦麦唛