
CANN/AMCT Large Model Quantization Example

AMCT Large Model Quantization

[Free download link] amct: AMCT is the model compression tool repository provided by CANN, tuned for Ascend AI processors. Project address: https://gitcode.com/cann/amct

1 Quantization Prerequisites

1.1 Install Dependencies

The dependency packages for this sample are listed in requirements.txt.

Note that the torch_npu version must match the installed Python and torch versions, and the CANN package must also be installed.
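Before running the sample, it can help to print the versions that have to agree with each other. The following is a minimal sketch using only the Python standard library. The distribution names passed in ("torch" and "torch_npu") are assumptions about how the packages are registered in your environment, and the script only reports versions; it does not know which combinations are compatible.

```python
from importlib import metadata
import platform


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Report the interpreter version and the two packages whose versions must match.
# The names below are assumptions; adjust them to your environment.
print("python", platform.python_version())
for name, version in installed_versions(["torch", "torch_npu"]).items():
    print(name, version if version is not None else "NOT INSTALLED")
```

Compare the printed versions against the compatibility information published for torch_npu and CANN.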

1.2 Model and Dataset Preparation

This sample uses the Llama2-7b, Qwen2-7b, and Qwen3-8b models with the wikitext2 dataset as examples. Download the models yourself and pass the model path to the script. The dataset is loaded online.
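As a rough illustration of this preparation step, the sketch below parses a --model_path argument and loads a local checkpoint plus the wikitext2 test split. It is not the code of the sample scripts: the function names are hypothetical, and using the Hugging Face transformers and datasets libraries with the "wikitext" / "wikitext-2-raw-v1" dataset identifiers is an assumption about how the data is loaded online.

```python
import argparse


def parse_args(argv=None):
    """Parse the model path the same way the sample commands pass it."""
    parser = argparse.ArgumentParser(description="Load a model and wikitext2")
    parser.add_argument("--model_path", required=True,
                        help="Directory of the locally downloaded model")
    return parser.parse_args(argv)


def load_model_and_data(model_path):
    """Hypothetical helper: load model, tokenizer, and the wikitext2 test split."""
    # Imported lazily so that argument parsing works without these packages.
    from datasets import load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto")
    # Assumed dataset identifiers; this call needs network access.
    test_split = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
    return model, tokenizer, test_split


# Example usage (downloads the dataset, so it is left commented out):
# args = parse_args(["--model_path=/data/Llama2_7b_hf/"])
# model, tokenizer, data = load_model_and_data(args.model_path)
```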

1.3 Simple Quantization Configuration

The quantization configuration used in this sample is built into the tool and can be imported as follows:

from amct_pytorch import HIFP8_CAST_CFG

To customize the configuration, refer to the documentation and construct the required quantization configuration dict.

The cast algorithm supports weight-only quantization and full quantization. The supported quantization types and quantization configurations are:

Field | Type | Description | Value Range | Notes
batch_num | uint32 | Number of batches used for quantization | 1 | /
skip_layers | str | Layers to skip during quantization | / | Supports fuzzy matching: if the configured string equals a layer name or is a substring of it, that layer is skipped and no quantization configuration is generated for it. The string must contain letters or digits.
weights.type | str | Quantized weight type | 'hifloat8' | /
weights.symmetric | bool | Symmetric quantization | TRUE | /
weights.strategy | str | Quantization granularity | 'tensor', 'channel' | /
inputs.type | str | Quantized activation type | 'hifloat8' | /
inputs.symmetric | bool | Symmetric quantization | TRUE | /
inputs.strategy | str | Quantization granularity | 'tensor', 'token' | /
algorithm | dict | Quantization algorithm configuration used | {'cast'} | /
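To make the fields in the table above more concrete, here is a minimal plain-Python sketch that builds a configuration dict, checks it against the listed value ranges, and mimics the described skip_layers matching. Several things are assumptions rather than the tool's actual schema: the nested layout derived from the dotted field names, storing skip_layers as a list of strings, the empty dict under 'cast', and all helper names. For real use, import HIFP8_CAST_CFG or follow the tool documentation.

```python
# Allowed values copied from the table; the nested layout is an assumption.
ALLOWED = {
    "weights.type": {"hifloat8"},
    "weights.symmetric": {True},
    "weights.strategy": {"tensor", "channel"},
    "inputs.type": {"hifloat8"},
    "inputs.symmetric": {True},
    "inputs.strategy": {"tensor", "token"},
}


def make_cast_config(weight_strategy="channel", input_strategy="token",
                     skip_layers=()):
    """Build a hypothetical cast-quantization config from the table's fields."""
    return {
        "batch_num": 1,
        "skip_layers": list(skip_layers),  # assumed to be a list of strings
        "weights": {"type": "hifloat8", "symmetric": True,
                    "strategy": weight_strategy},
        "inputs": {"type": "hifloat8", "symmetric": True,
                   "strategy": input_strategy},
        "algorithm": {"cast": {}},  # assumption: the table only shows {'cast'}
    }


def lookup(cfg, dotted_key):
    """Resolve a dotted field name such as 'weights.type' in a nested dict."""
    node = cfg
    for part in dotted_key.split("."):
        node = node[part]
    return node


def validate(cfg):
    """Return a list of problems; an empty list means the table's ranges hold."""
    errors = []
    for key, allowed in ALLOWED.items():
        value = lookup(cfg, key)
        if value not in allowed:
            errors.append(f"{key}={value!r} is outside the documented range")
    for pattern in cfg["skip_layers"]:
        # The table requires each string to contain letters or digits.
        if not any(ch.isalnum() for ch in pattern):
            errors.append(f"skip_layers entry {pattern!r} has no letter or digit")
    return errors


def should_skip(layer_name, skip_layers):
    """Fuzzy match: skip when a pattern equals or is a substring of the name."""
    return any(pattern in layer_name for pattern in skip_layers)


cfg = make_cast_config(skip_layers=["lm_head"])
print(validate(cfg))
print(should_skip("model.lm_head", cfg["skip_layers"]))
```

Substring matching also covers the exact-match case, so a single `in` test is enough for the rule described in the Notes column.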

2 Quantization Example

2.1 Calling via the Interface

Step 1. Run the following commands in the current directory to execute the sample programs. Adjust the model path to match your environment:

python3 src/run_llama2_samples.py --model_path=/data/Llama2_7b_hf/
python3 src/run_qwen_samples.py --model_path=/data/Qwen2-7b/
python3 src/run_qwen_samples.py --model_path=/data/Qwen3-8B/

If output like the following appears, quantization succeeded:

Test time taken: 1.0 min 59.24865388870239 s Score: 5.477707

Here, Score is the perplexity (PPL) of the quantized model. For reference values, see the following table:

Model | Calibration Set | Dataset | Pre-quantization PPL | Post-quantization PPL
LLAMA2-7B | pileval | wikitext2 | 5.472 | 5.524
QWEN2-7B | pileval | wikitext2 | 7.137 | 7.188
QWEN3-8B | pileval | wikitext2 | 9.715 | 9.745
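For readers who want to interpret these numbers, the sketch below shows the usual definition of perplexity (the exponential of the mean per-token negative log-likelihood), extracts the Score value from an output line like the one shown earlier, and computes the relative PPL increase for each row of the table. The function names are hypothetical, and the sample scripts may compute PPL with different windowing or batching; only the table values and the log line format come from this document.

```python
import math
import re


def perplexity(token_nlls):
    """PPL = exp(mean negative log-likelihood), with NLLs in natural log."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))


def parse_score(line):
    """Extract the number after 'Score:' from a sample output line."""
    match = re.search(r"Score:\s*([0-9]+(?:\.[0-9]+)?)", line)
    if match is None:
        raise ValueError("no Score field found")
    return float(match.group(1))


def relative_increase(before, after):
    """Fractional PPL change caused by quantization."""
    return (after - before) / before


# (pre-quantization PPL, post-quantization PPL) from the table above.
PPL_TABLE = {
    "LLAMA2-7B": (5.472, 5.524),
    "QWEN2-7B": (7.137, 7.188),
    "QWEN3-8B": (9.715, 9.745),
}

for name, (pre, post) in PPL_TABLE.items():
    print(f"{name}: {relative_increase(pre, post):+.2%}")

print(parse_score("Test time taken: 1.0 min 59.24865388870239 s Score: 5.477707"))
```

On the table values, the relative increase stays below 1% for all three models.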

After inference succeeds, the quantization log file ./amct_log/amct_pytorch.log is generated under the current directory.
