CANN/HCCL Ring Batch Send/Receive Sample

Point-to-Point Communication - HcclBatchSendRecv (Ring)

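[Free download link] hccl: The Huawei Collective Communication Library (HCCL) is a high-performance collective communication library for Ascend AI processors. It provides high-performance, highly reliable communication for compute clusters. Project address: https://gitcode.com/cann/hccl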

Sample Description

This sample demonstrates how to use the HcclBatchSendRecv() API to implement point-to-point communication in a ring topology. It covers the following functions:

  • Call aclrtGetDeviceCount() to query the number of available devices.

  • Call HcclGetRootInfo() on rank 0, which acts as the root rank, to generate the rootinfo identifier.

    The rootinfo identifier contains the device IP address and device ID. This information must be broadcast to all ranks in the cluster to initialize the communicator.

  • In each thread, call HcclCommInitRootInfo() to initialize the communicator based on the rootinfo identifier.

  • Call the HcclBatchSendRecv() API to send data to the next node while receiving data from the previous node, and display the result.
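The ring pattern in the last step comes down to each rank pairing one send to its next neighbor with one receive from its previous neighbor, with wrap-around at both ends. The following is a minimal host-side sketch of that bookkeeping only. The names Direction, RingItem, NextRank, PrevRank, and BuildRingBatch are hypothetical stand-ins introduced for illustration; they are not HCCL types or functions, and the sketch makes no HCCL or ACL calls.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the HCCL types.
enum class Direction { Send, Recv };

struct RingItem {
    Direction direction;  // whether this entry sends or receives
    uint32_t remoteRank;  // peer rank on the ring
};

// Rank that receives our data: the next node, wrapping from the last rank to 0.
inline uint32_t NextRank(uint32_t rank, uint32_t rankSize) {
    return (rank + 1) % rankSize;
}

// Rank whose data we receive: the previous node, wrapping from 0 to the last rank.
// Adding rankSize before subtracting avoids unsigned underflow at rank 0.
inline uint32_t PrevRank(uint32_t rank, uint32_t rankSize) {
    return (rank + rankSize - 1) % rankSize;
}

// Build the two-entry batch for one rank: a send to the next node and a
// receive from the previous node, meant to be submitted together.
inline std::vector<RingItem> BuildRingBatch(uint32_t rank, uint32_t rankSize) {
    assert(rankSize >= 2 && rank < rankSize);
    return {
        {Direction::Send, NextRank(rank, rankSize)},
        {Direction::Recv, PrevRank(rank, rankSize)},
    };
}
```

For example, BuildRingBatch(0, 8) yields a send to rank 1 and a receive from rank 7. In the real sample, the equivalent pair of entries would be handed to HcclBatchSendRecv() in a single call, so the send and the receive are issued as one batch rather than as two separately ordered operations.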

Directory Structure

├── main.cc                  # Sample source file
├── Makefile                 # Compilation and build configuration file
└── batch_send_recv_ring     # Compiled executable file

Environment Preparation

Environment Requirements

This sample supports the following products in a single-server N-card configuration (N >= 2):

  • Ascend 950PR / Ascend 950DT
  • Atlas A3 Training Series Products / Atlas A3 Inference Series Products
  • Atlas A2 Training Series Products
  • Atlas Training Series Products

Setting Environment Variables

# Set the CANN environment variables. The following uses the default installation path for the root user as an example.
source /usr/local/Ascend/cann/set_env.sh

Compiling and Running the Sample

Run the following commands in the sample code directory:

make
make test

Note: You can set the HCCL_OP_EXPANSION_MODE environment variable to configure where the task orchestration of communication algorithms is expanded. For the values supported on each product model, see the description of this environment variable in the Environment Variable List.

# Set the orchestration expansion location of communication algorithms to the AI CPU on the Device side.
# The Device side automatically selects the corresponding scheduler based on the hardware model.
export HCCL_OP_EXPANSION_MODE=AI_CPU

Sample Output

The sendBuf content on each node is initialized to that node's Device ID. Each node sends its data to the next node and receives data from the previous node, so each node ends up with the Device ID of the previous node.

Found 8 NPU device(s) available
rankId: 0, output: [ 7 7 7 7 7 7 7 7 ]
rankId: 1, output: [ 0 0 0 0 0 0 0 0 ]
rankId: 2, output: [ 1 1 1 1 1 1 1 1 ]
rankId: 3, output: [ 2 2 2 2 2 2 2 2 ]
rankId: 4, output: [ 3 3 3 3 3 3 3 3 ]
rankId: 5, output: [ 4 4 4 4 4 4 4 4 ]
rankId: 6, output: [ 5 5 5 5 5 5 5 5 ]
rankId: 7, output: [ 6 6 6 6 6 6 6 6 ]
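The expected output can be cross-checked on the host without any NPU. The sketch below simulates one ring step under two assumptions that are not stated in the sample: the Device ID equals the rank ID, and every buffer holds count elements. SimulateRingStep and FormatLine are hypothetical helper names introduced for illustration, and the code does not use HCCL.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Simulate one ring step on the host. Each rank's send buffer is assumed to be
// filled with its own ID (Device ID == rank ID is an assumption of this sketch),
// so after the exchange each rank holds the ID of the previous rank.
inline std::vector<std::vector<uint32_t>> SimulateRingStep(uint32_t rankSize,
                                                           uint32_t count) {
    std::vector<std::vector<uint32_t>> recv(rankSize);
    for (uint32_t rank = 0; rank < rankSize; ++rank) {
        uint32_t prev = (rank + rankSize - 1) % rankSize;  // wraps 0 -> rankSize - 1
        recv[rank].assign(count, prev);  // every element of prev's sendBuf is `prev`
    }
    return recv;
}

// Format one line in the style of the sample output shown above.
inline std::string FormatLine(uint32_t rank, const std::vector<uint32_t>& data) {
    std::ostringstream out;
    out << "rankId: " << rank << ", output: [";
    for (uint32_t value : data) {
        out << ' ' << value;
    }
    out << " ]";
    return out.str();
}
```

Calling SimulateRingStep(8, 8) and passing each row to FormatLine reproduces the eight rankId lines above, which makes it a convenient reference when the device count differs from 8.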

Author's statement: Parts of this article were generated with AI assistance (AIGC) and are provided for reference only.
