
CANN/AMCT Accuracy-Based Auto-Calibration API

news 2026/5/10 3:36:56

accuracy_based_auto_calibration

amct: AMCT is the model compression tool repository provided by CANN and tailored to Ascend AI processors. Project address: https://gitcode.com/cann/amct

Product Support

Product	Supported
Ascend 950PR/Ascend 950DT	√
Atlas A3 training series products/Atlas A3 inference series products	√
Atlas A2 training series products/Atlas A2 inference series products	√

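Function Description

Runs an automatic calibration process based on the model and configuration file supplied by the user, and searches for a quantization configuration that meets the target accuracy. It outputs a fake_quant model that can be used for accuracy simulation in an ONNX Runtime environment, and a deploy model that can run inference on the AI processor.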

Function Prototype

accuracy_based_auto_calibration(model,model_evaluator,config_file,record_file,save_dir,input_data,input_names,output_names,dynamic_axes,strategy='BinarySearch',sensitivity='CosineSimilarity')

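Parameters

Parameter	Input/Output	Description
model	Input	Meaning: the user's torch model. Data type: torch.nn.Module
model_evaluator	Input	Meaning: the Python instance that performs calibration and accuracy evaluation during automatic quantization. Data type: Python instance
config_file	Input	Meaning: the quantization configuration file generated by the user. Data type: string
record_file	Input	Meaning: path of the file that stores the quantization factors. If a file already exists at this path, it is overwritten. Data type: string
save_dir	Input	Meaning: path where the models are saved. The path must include the model name prefix, for example ./quantized_model/*model. Data type: string
input_data	Input	Meaning: the model's input data. A single torch.tensor is treated as tuple(torch.tensor). Data type: tuple
input_names	Input	Meaning: names of the model inputs, shown in modfied_onnx_file. Default: None. Data type: list(string)
output_names	Input	Meaning: names of the model outputs, shown in modfied_onnx_file. Default: None. Data type: list(string)
dynamic_axes	Input	Meaning: specifies the dynamic axes of the model inputs and outputs. For example, for an input named inputs (NCHW) whose N, H, and W sizes are not fixed, and an output named outputs (NL) whose N size is not fixed, use {"inputs": [0,2,3], "outputs": [0]}. Default: None. Data type: dict<string, dict<python:int, string>> or dict<string, list(int)>
strategy	Input	Meaning: the strategy used to search for a quantization configuration that meets the accuracy requirement. The default is binary search. Data type: string or Python instance. Default: BinarySearch
sensitivity	Input	Meaning: the metric used to evaluate how sensitive each quantizable layer is to quantization. The default is cosine similarity. Data type: string or Python instance. Default: CosineSimilarity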
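
The two defaults, CosineSimilarity and BinarySearch, can be pictured with a small pure-Python sketch. This is not AMCT's implementation: the function names (cosine_similarity, rank_by_sensitivity, binary_search_rollback), the layer names, the similarity values, and the idea that the search keeps the most sensitive layers in full precision are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_sensitivity(similarities):
    """Order layer names from most to least sensitive.

    A lower similarity between the original and quantized layer outputs
    is treated as higher sensitivity (an assumption of this sketch).
    """
    return sorted(similarities, key=similarities.get)


def binary_search_rollback(ranked_layers, is_accurate_enough):
    """Find the shortest prefix of ranked_layers to leave unquantized.

    is_accurate_enough(skipped) must return True when quantizing every
    layer except those in `skipped` meets the accuracy target. The search
    assumes that skipping more layers never makes accuracy worse.
    """
    lo, hi = 0, len(ranked_layers)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_accurate_enough(set(ranked_layers[:mid])):
            hi = mid          # target met: try skipping fewer layers
        else:
            lo = mid + 1      # target missed: skip more layers
    return ranked_layers[:lo]


# Hypothetical per-layer similarities (illustrative values only).
similarities = {"conv1": 0.90, "conv2": 0.99, "fc": 0.95}
ranked = rank_by_sensitivity(similarities)


def toy_evaluator(skipped):
    """Toy accuracy model: each quantized layer costs (1 - similarity)."""
    loss = sum(1.0 - s for name, s in similarities.items() if name not in skipped)
    return loss < 0.08


skip_layers = binary_search_rollback(ranked, toy_evaluator)
print(ranked, skip_layers)
```

In this toy setup, only the most sensitive layer has to stay unquantized for the target to be met. In the real API, the pass/fail decision comes from the metric_eval() method of the model_evaluator instance.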

Return Value

None

Example

import os

import torch

import amct_pytorch as amct
from amct_pytorch.common.auto_calibration import AutoCalibrationEvaluatorBase


# You need to implement the AutoCalibrationEvaluator's calibration(),
# evaluate() and metric_eval() functions.
# model_forward() is not defined in this snippet; it is assumed to be a
# user-provided helper that runs inference and returns a tuple whose first
# element is the top-1 accuracy.
class AutoCalibrationEvaluator(AutoCalibrationEvaluatorBase):
    """Subclass of AutoCalibrationEvaluatorBase."""

    def __init__(self, target_loss, batch_num):
        super(AutoCalibrationEvaluator, self).__init__()
        self.target_loss = target_loss
        self.batch_num = batch_num

    def calibration(self, model):
        """Implement the calibration function of AutoCalibrationEvaluatorBase.

        calibration() must complete the calibration inference procedure,
        so the number of inference batches must be >= the batch_num
        passed to create_quant_config.
        """
        model_forward(model=model, batch_size=32, iterations=self.batch_num)

    def evaluate(self, model):
        """Implement the evaluate function of AutoCalibrationEvaluatorBase.

        params: model: a torch.nn.Module
        return: the accuracy of the input model on the eval dataset, or
                another metric that describes the 'accuracy' of the model
        """
        top1, _ = model_forward(model=model, batch_size=32, iterations=5)
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        return top1

    def metric_eval(self, original_metric, new_metric):
        """Implement the metric_eval function of AutoCalibrationEvaluatorBase.

        params: original_metric: the accuracy returned by evaluate() on the
                    non-quantized model
                new_metric: the accuracy returned by evaluate() on the
                    fake quant model
        return:
            [0]: whether the accuracy loss between the non-quantized model
                 and the fake quant model satisfies the requirement
            [1]: the accuracy loss between the non-quantized model and the
                 fake quant model
        """
        loss = original_metric - new_metric
        if loss * 100 < self.target_loss:
            return True, loss
        return False, loss

...

# 1. step1 create the quant config json file
config_json_file = os.path.join(TMP, 'config.json')
skip_layers = []
batch_num = 2
amct.create_quant_config(
    config_json_file, model, input_data, skip_layers, batch_num
)

# 2. step2 construct the instance of AutoCalibrationEvaluator
evaluator = AutoCalibrationEvaluator(target_loss=0.5, batch_num=batch_num)

# 3. step3 use accuracy_based_auto_calibration to quantize the model
record_file = os.path.join(TMP, 'scale_offset_record.txt')
result_path = os.path.join(PATH, 'result/mobilenet_v2')
amct.accuracy_based_auto_calibration(
    model=model,
    model_evaluator=evaluator,
    config_file=config_json_file,
    record_file=record_file,
    save_dir=result_path,
    input_data=input_data,
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={
        'input': {0: 'batch_size'},
        'output': {0: 'batch_size'}
    },
    strategy='BinarySearch',
    sensitivity='CosineSimilarity'
)
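
The threshold check in metric_eval() can be tried on its own. The sketch below copies that comparison into a standalone function (the extra target_loss argument and the accuracy values are illustrative assumptions, not measured results). It assumes evaluate() returns accuracy as a fraction in [0, 1], in which case target_loss=0.5 reads as a tolerated drop of 0.5 percentage points.

```python
def metric_eval(original_metric, new_metric, target_loss):
    """Standalone copy of the comparison used in the example evaluator.

    Assumes both metrics are fractions in [0, 1], so loss * 100 is the
    accuracy drop in percentage points.
    """
    loss = original_metric - new_metric
    return loss * 100 < target_loss, loss


# Hypothetical accuracies: a 0.3-point drop passes a 0.5-point budget.
passed, small_loss = metric_eval(0.718, 0.715, target_loss=0.5)

# Hypothetical accuracies: a 1.0-point drop does not.
failed, large_loss = metric_eval(0.718, 0.708, target_loss=0.5)

print(passed, failed)
```

If evaluate() already returns a percentage instead, the multiplication by 100 should be dropped or target_loss scaled to match.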

Output files: