Execution Configuration Optimizations Sample Introduction
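[Free download link] asc-devkit: This project is the operator development language released by CANN for Ascend AI processors. It natively supports the C and C++ standards, consists mainly of a class library and a language extension layer, and provides multi-level APIs to meet operator development needs across a range of scenarios. Project address: https://gitcode.com/cann/asc-devkit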
Overview
These samples cover execution configuration optimization for operators: how to choose gridDim- and blockDim-related settings in the SIMT programming model when a kernel is invoked directly with the `<<<>>>` syntax. The samples currently focus on thread block configuration, including setting the maximum number of threads per block, so that registers are fully used and operator execution performance improves.
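The link between a per-block thread cap and register usage can be sketched with simple arithmetic. The snippet below is a minimal illustration, not code from the samples: it assumes a fixed register file shared by all threads of one block, and the size `kAssumedRegisterFile` as well as the function names are hypothetical, not Ascend specification values. Under that assumption, halving the thread cap doubles the register budget available to each thread, which is why a lower cap can remove register spilling.

```cpp
#include <cstdint>

// Hypothetical register-file size shared by the threads of one block.
// This is an illustrative number, NOT a documented Ascend value.
constexpr std::uint32_t kAssumedRegisterFile = 32768;

// Upper bound on registers per thread when the compiler must guarantee
// that maxThreadsPerBlock threads fit in the register file at once.
constexpr std::uint32_t RegistersPerThread(std::uint32_t registerFile,
                                           std::uint32_t maxThreadsPerBlock) {
    return maxThreadsPerBlock == 0 ? 0 : registerFile / maxThreadsPerBlock;
}

// True if a kernel that needs `neededRegs` registers per thread would
// exceed its budget under the given cap (and therefore spill).
constexpr bool WouldSpill(std::uint32_t neededRegs,
                          std::uint32_t registerFile,
                          std::uint32_t maxThreadsPerBlock) {
    return neededRegs > RegistersPerThread(registerFile, maxThreadsPerBlock);
}

// Compile-time checks for the assumed numbers above.
static_assert(RegistersPerThread(kAssumedRegisterFile, 1024) == 32,
              "1024 threads leave 32 registers each");
static_assert(RegistersPerThread(kAssumedRegisterFile, 512) == 64,
              "512 threads leave 64 registers each");
```

With these assumed numbers, a kernel that needs 48 registers per thread would spill under a 1024-thread cap but fit under a 512-thread cap. The real budget depends on the hardware and compiler, so measured results from the samples should take precedence over this estimate.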
Sample List
| Directory Name | Description |
|---|---|
| sincos_compute | Uses a sincos computation to compare performance between the default `__launch_bounds__` value and a value of 512. It demonstrates how register spilling affects performance in SIMT programming and how to configure the maximum thread count per block. |
| grid_dim_config | Uses the Gather operator to demonstrate how different thread block configuration strategies affect operator performance at different data sizes, and provides corresponding optimization guidance. |
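As a rough illustration of what a thread block configuration strategy can look like, the sketch below picks a block count from the data size. It shows one common approach and is an assumption, not the strategy used by the grid_dim_config sample: cover the input with a ceiling division, then cap the block count at a hypothetical limit `maxBlocks` so that large inputs are handled by a per-thread loop. All function and parameter names are illustrative.

```cpp
#include <cstdint>

// Ceiling division for unsigned values; b must be non-zero.
// Assumes a + b - 1 does not overflow 64 bits.
constexpr std::uint64_t CeilDiv(std::uint64_t a, std::uint64_t b) {
    return (a + b - 1) / b;
}

// Clamp v into the range [lo, hi].
constexpr std::uint64_t Clamp(std::uint64_t v, std::uint64_t lo,
                              std::uint64_t hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

// Number of blocks to launch: enough to cover n elements, at least 1,
// and never more than the hypothetical cap maxBlocks.
// Returns 0 for an invalid configuration (blockDim or maxBlocks == 0).
constexpr std::uint32_t ChooseGridDim(std::uint64_t n,
                                      std::uint32_t blockDim,
                                      std::uint32_t maxBlocks) {
    return (blockDim == 0 || maxBlocks == 0)
               ? 0
               : static_cast<std::uint32_t>(
                     Clamp(CeilDiv(n, blockDim), 1, maxBlocks));
}

// Loop trip count per thread once the grid has been fixed.
constexpr std::uint64_t ElementsPerThread(std::uint64_t n,
                                          std::uint32_t blockDim,
                                          std::uint32_t gridDim) {
    return (blockDim == 0 || gridDim == 0)
               ? 0
               : CeilDiv(n, static_cast<std::uint64_t>(blockDim) * gridDim);
}
```

For a small input such as 1000 elements with 256 threads per block, this yields 4 blocks and one element per thread. For 1048576 elements with a cap of 64 blocks, it yields 64 blocks and 64 elements per thread. Whether a capped grid or a one-element-per-thread grid is faster depends on the data size and hardware, which is what the sample measures.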
Content declaration: Parts of this article were generated with AI assistance (AIGC) and are for reference only.