CANN/hccl Test Guide
HCCL Test
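hccl: The Huawei Collective Communication Library (HCCL) is a high-performance collective communication library built on Ascend AI processors. It provides high-performance, highly reliable communication for compute clusters. Project address: https://gitcode.com/cann/hccl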
This directory contains the HCCL test code, which is divided into system tests (ST) and unit tests (UT).
Directory Structure
test/
├── st/                                       # System Test
│   ├── algorithm/                            # Algorithm analyzer tests
│   │   ├── testcase/                         # Algorithm test cases
│   │   ├── utils/                            # Test utility code
│   │   │   └── src/
│   │   │       ├── aicpu/                    # AICPU-related stubs
│   │   │       ├── common/                   # Common utilities
│   │   │       │   ├── exception/            # Exception handling
│   │   │       │   └── utils/                # Utility functions
│   │   │       ├── hccl_depends_stub/        # HCCL dependency interface stubs
│   │   │       ├── hccl_proxy/               # Simulated communicator implementation
│   │   │       │   ├── communicator/
│   │   │       │   └── topo_model/
│   │   │       ├── hccl_verifier/            # Verifier
│   │   │       │   ├── mem_conflict_check/   # Memory conflict checking
│   │   │       │   ├── semantics_check/      # Semantics checking
│   │   │       │   ├── singletask_check/     # Single-task checking
│   │   │       │   └── task_graph_generator/ # Task graph generation
│   │   │       ├── sim_world/                # Simulation world implementation
│   │   │       └── ut/                       # Algorithm analyzer UT tests
│   │   ├── figures/                          # Test illustration images
│   │   ├── CMakeLists.txt
│   │   ├── README.md                         # Algorithm analyzer detailed guide
│   │   └── build.sh                          # Compilation and execution script
└── ut/                                       # Unit Test
    └── common/
        └── prepare_ut_env/                   # UT environment preparation code

Test Types
System Test (ST)
System tests mainly verify the correctness of HCCL collective communication algorithm logic, including memory operation validation and semantics validation.
Algorithm Analyzer
The algorithm analyzer verifies algorithm logic and memory operation functions by simulating the HCCL single-operator execution flow.
Principle:
- The algorithm analyzer stubs the dependencies (hcomm and runtime interfaces) of the HCCL single-operator flow to obtain the Task sequences of all ranks during algorithm execution.
- The Task information of all ranks is assembled into a directed acyclic graph.
- Validation is performed on this graph using graph algorithms, including memory read-write conflict checking and semantics checking.
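The graph-building step can be pictured with a minimal Python sketch. It is not the analyzer's actual code (the repository itself is not written in Python), and everything in it is an assumption made for illustration: the `(rank, index)` node naming, the rule that tasks on one rank run in issue order, the `sync_edges` input format, and the function names `build_task_graph` and `topological_order`.

```python
from collections import deque


def build_task_graph(rank_tasks, sync_edges):
    """Build a task graph from per-rank task sequences (hypothetical format).

    rank_tasks: {rank: [task_name, ...]}, listed in issue order.
    sync_edges: [((rank, idx), (rank, idx)), ...], each a cross-rank
                "signal happens before wait" pair.
    Returns {node: set_of_successor_nodes}, where a node is (rank, idx).
    """
    graph = {}
    for rank, tasks in rank_tasks.items():
        for idx in range(len(tasks)):
            graph.setdefault((rank, idx), set())
            if idx > 0:
                # Assumption: tasks issued on one rank execute in order.
                graph[(rank, idx - 1)].add((rank, idx))
    for src, dst in sync_edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return graph


def topological_order(graph):
    """Kahn's algorithm; raises ValueError if the graph is not acyclic."""
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for succ in successors:
            indegree[succ] += 1
    ready = deque(sorted(n for n, deg in indegree.items() if deg == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for succ in sorted(graph[node]):
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)
    if len(order) != len(graph):
        raise ValueError("task graph contains a cycle")
    return order


# Rank 0 writes then posts a signal; rank 1 waits on it, then reads.
rank_tasks = {0: ["write", "post"], 1: ["wait", "read"]}
sync_edges = [((0, 1), (1, 0))]
order = topological_order(build_task_graph(rank_tasks, sync_edges))
```

A successful topological sort confirms that the combined graph is acyclic, which is the precondition for the graph-based checks described here.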
Core Functions:
- Memory conflict checking: Analyzes the synchronization relationships in the Task graph to detect possible read-write conflicts.
- Semantics checking: Simulates Task graph execution, records data movement information, and verifies whether the data movement in the output memory meets the operator requirements.
For details, see the Algorithm Analyzer Guide.
Unit Test (UT)
Run the following commands in the repository root directory:
# Compile and run all unit test cases
bash build.sh --ut
# Compile and run all system test cases
bash build.sh --st
