Original poster: 途中_jed

[CANN Training Camp] + Study check-in + ascendoptest tool test

途中_jed posted on 2025-12-5 16:57:22


Building an efficient, easy-to-use Ascend C operator test tool on Ascend CANN: a combined precision and performance verification approach

Environment setup

Use the Huawei Cloud Notebook service and load the following image so that the development environment is consistent:

swr.cn-southwest-2.myhuaweicloud.com/chenhui/cann8.3.rc1_python_3_9:3.0

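Getting the projects

Clone the two remote repositories to set up the local workspace: ops-math holds the operator source, and ascendoptest is the test tool.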

git clone https://gitcode.com/cann/ops-math.git
git clone https://gitee.com/sutonghua/ascendoptest.git

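Building the operator

After cloning, load the CANN environment, enter the ops-math directory, and run the build script to produce the operator package. Here ascendxxxx stands for the target SoC version.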

source /home/ma-user/Ascend/ascend-toolkit/set_env.sh
cd ops-math
bash build.sh --pkg --soc=ascendxxxx --ops=add_example

Installation

Run the generated package to install the operator so that the test tool can call it.

./build_out/cann-ops-math-custom_linux-aarch64.run

Test preparation

Before running any test, enter the ascendoptest directory and install the ml_dtypes dependency.

cd ascendoptest
pip install ml_dtypes

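Key environment variables

Extend LD_LIBRARY_PATH so that the custom operator's op_api library and, for simulation runs, the simulator libraries can be found when the operator is loaded and executed.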

export LD_LIBRARY_PATH=/home/ma-user/Ascend/ascend-toolkit/latest/opp/vendors/custom_math/op_api/lib/:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=/home/ma-user/Ascend/ascend-toolkit/latest/tools/simulator/Ascendxxxxx/lib:$LD_LIBRARY_PATH
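A wrong or missing directory in LD_LIBRARY_PATH usually only shows up later as a load failure, so it can help to check the entries up front. The following is a minimal sketch using only the Python standard library; the function name missing_library_dirs is hypothetical and is not part of ascendoptest or CANN.

```python
import os


def missing_library_dirs(ld_library_path: str) -> list:
    """Return the entries of an LD_LIBRARY_PATH-style string that are not existing directories."""
    # Empty entries (for example from a trailing separator) are ignored.
    entries = [p for p in ld_library_path.split(os.pathsep) if p]
    return [p for p in entries if not os.path.isdir(p)]


if __name__ == "__main__":
    # Report every configured directory that does not exist on this machine.
    for path in missing_library_dirs(os.environ.get("LD_LIBRARY_PATH", "")):
        print(f"not found: {path}")
```

Running it after the two export commands prints nothing when every directory exists.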

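Creating the operator prototype file

Create a JSON operator description file that declares the structure of the input and output parameters. For example: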

[
  {
    "op": "AddExample",
    "input_desc": [
      {
        "name": "x",
        "param_type": "required",
        "format": ["ND", "ND"],
        "type": ["float", "int32"]
      },
      {
        "name": "y",
        "param_type": "required",
        "format": ["ND", "ND"],
        "type": ["float", "int32"]
      }
    ],
    "output_desc": [
      {
        "name": "z",
        "param_type": "required",
        "format": ["ND", "ND"],
        "type": ["float", "int32"]
      }
    ]
  }
]
Save this file as add_example_prototype.json.
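In the prototype above, every parameter lists two formats and two types. Assuming the two lists are meant to pair up position by position (an assumption, not something the post states), a quick consistency check could look like the sketch below. The helper check_prototype is hypothetical and not part of ascendoptest.

```python
import json


def check_prototype(ops: list) -> list:
    """Return human-readable problems found in a prototype description."""
    problems = []
    for op in ops:
        op_name = op.get("op", "<unnamed>")
        for section in ("input_desc", "output_desc"):
            for param in op.get(section, []):
                formats = param.get("format", [])
                types = param.get("type", [])
                # Assumption: format[i] and type[i] describe the same variant.
                if len(formats) != len(types):
                    problems.append(
                        f"{op_name}.{param.get('name')}: "
                        f"{len(formats)} formats but {len(types)} types"
                    )
    return problems


# Same content as the prototype shown in the post, compacted.
PROTOTYPE_TEXT = """
[{"op": "AddExample",
  "input_desc": [
    {"name": "x", "param_type": "required", "format": ["ND", "ND"], "type": ["float", "int32"]},
    {"name": "y", "param_type": "required", "format": ["ND", "ND"], "type": ["float", "int32"]}],
  "output_desc": [
    {"name": "z", "param_type": "required", "format": ["ND", "ND"], "type": ["float", "int32"]}]}]
"""

if __name__ == "__main__":
    found = check_prototype(json.loads(PROTOTYPE_TEXT))
    print(found or "prototype looks consistent")
```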

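Creating the test case configuration file

Create the test case configuration file. Each entry describes one test scenario, so cases for other data types and shapes can be added as further entries.

add_example_cases.json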
[
    {
        "case_name": "Test_001",
        "op_name": "AddExample",
        "case_path": "",
        "expect_func":"./custom_add.py:custom_add",
        "input_desc": [
            {
                "name": "x",
                "format": "ND",
                "data_type": "float",
                "param_type":"required",
                "shape": [32,4,4,4],
                "data_path":"",
                "value_range":[0,100]
            },
            {
                "name": "y",
                "format": "ND",
                "data_type": "float",
                "param_type":"required",
                "shape": [32,4,4,4],
                "data_path":"",
                "value_range":[0,100]
            }
        ],
        "output_desc": [
            {
                "name": "z",
                "format": "ND",
                "data_type": "float",
                "param_type":"required",
                "shape": [32,4,4,4],
                "data_path":"",
                "golden_path":"",
                "err_threshold":[0.001,0.001]
            }
        ],
        "attr_desc": [
        ]
    }
]
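Writing many near-identical cases by hand is error-prone, so the file can also be generated. The sketch below reuses only the field names that appear in the case above; the helper names make_case and make_cases and the extra shape [8, 16] are hypothetical, and whether the tool accepts every generated combination has not been verified here.

```python
import json


def make_case(index: int, data_type: str, shape: list) -> dict:
    """Build one AddExample case with the same fields as the hand-written one."""
    def tensor(name, **extra):
        desc = {"name": name, "format": "ND", "data_type": data_type,
                "param_type": "required", "shape": list(shape), "data_path": ""}
        desc.update(extra)
        return desc

    return {
        "case_name": f"Test_{index:03d}",
        "op_name": "AddExample",
        "case_path": "",
        "expect_func": "./custom_add.py:custom_add",
        "input_desc": [tensor("x", value_range=[0, 100]),
                       tensor("y", value_range=[0, 100])],
        "output_desc": [tensor("z", golden_path="", err_threshold=[0.001, 0.001])],
        "attr_desc": [],
    }


def make_cases(data_types, shapes) -> list:
    """One case per (data type, shape) combination, numbered from Test_001."""
    combos = [(t, s) for t in data_types for s in shapes]
    return [make_case(i, t, s) for i, (t, s) in enumerate(combos, start=1)]


if __name__ == "__main__":
    cases = make_cases(["float", "int32"], [[32, 4, 4, 4], [8, 16]])
    print(json.dumps(cases, indent=4))
```

Redirecting the output to add_example_cases.json gives four cases, the first of which matches the hand-written Test_001.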

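Adding the reference function for the test case

Create custom_add.py, the file named in the expect_func field of the test case ("./custom_add.py:custom_add"). Its custom_add function computes the expected result that the operator output is compared against.

custom_add.py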
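def custom_add(a, b):
    # Reference result: the sum of the two inputs.
    c = a + b
    # Returned as a one-element list, matching the single output z.
    return [c]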
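The expect_func value has the form "file path:function name". How ascendoptest resolves it internally is not shown in the post; the sketch below is only one plausible way to turn such a string into a callable with the standard library, and load_expect_func is a hypothetical name.

```python
import importlib.util
import os


def load_expect_func(spec: str):
    """Resolve a '<file.py>:<function>' string to a callable."""
    # Split on the last colon so the path part may itself contain colons.
    path, _, func_name = spec.rpartition(":")
    if not path or not func_name:
        raise ValueError(f"expected '<file.py>:<function>', got {spec!r}")
    module_name = os.path.splitext(os.path.basename(path))[0]
    module_spec = importlib.util.spec_from_file_location(module_name, path)
    if module_spec is None or module_spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(module_spec)
    module_spec.loader.exec_module(module)  # runs the file's top-level code
    return getattr(module, func_name)
```

With custom_add.py in the current directory, load_expect_func("./custom_add.py:custom_add")(1.5, 2.0) returns [3.5].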

Precision test

Run the test script to compare the custom operator's output with the expected result and check its numerical accuracy.

python run_test.py -i add_example_prototype.json -c add_example_cases.json  --op-type "custom"  --op-path "/home/ma-user/Ascend/ascend-toolkit/latest/opp/vendors/custom_math/op_api"
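The test case sets err_threshold to [0.001, 0.001], but the post does not say how the two numbers are used. The sketch below assumes the convention common to CANN operator test tools: the first value is the per-element error threshold and the second is the largest tolerated fraction of elements exceeding it. Both that reading and the function compare_outputs are assumptions, not ascendoptest's actual comparison code.

```python
def compare_outputs(actual, golden, err_threshold=(0.001, 0.001)):
    """Return (passed, bad_ratio) under the assumed two-threshold rule."""
    if len(actual) != len(golden):
        raise ValueError("actual and golden must have the same length")
    if not golden:
        return True, 0.0
    element_tol, max_bad_ratio = err_threshold
    bad = 0
    for a, g in zip(actual, golden):
        err = abs(a - g)
        if g != 0:
            err /= abs(g)  # relative error; absolute error when golden is 0
        if err > element_tol:
            bad += 1
    bad_ratio = bad / len(golden)
    # Assumption: a ratio equal to the limit still counts as a pass.
    return bad_ratio <= max_bad_ratio, bad_ratio
```

For example, with 1000 elements, two elements that are off by 1% give a bad ratio of 0.002, which fails under the thresholds above.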

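Application-level performance test

Add --msprof and an output directory (-d) to profile the run at the application level, measuring end-to-end execution time and resource usage.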

python run_test.py -i add_example_prototype.json -c add_example_cases.json  --op-type "custom"  --op-path "/home/ma-user/Ascend/ascend-toolkit/latest/opp/vendors/custom_math/op_api" --msprof -d ./msprof
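The run_test.py invocations in this post share the same base arguments and differ only in the trailing profiling flags (--msprof, --op, --sim, -d). A small helper can make that explicit. The mode names below are hypothetical labels; only the flags themselves are taken from the commands in the post.

```python
import shlex

OP_PATH = "/home/ma-user/Ascend/ascend-toolkit/latest/opp/vendors/custom_math/op_api"

# Extra flags per test mode, as used in the commands in this post.
MODE_FLAGS = {
    "precision": [],
    "application": ["--msprof"],
    "op": ["--msprof", "--op"],
    "simulator": ["--msprof", "--op", "--sim"],
}


def build_command(mode: str, profile_dir: str = "./msprof") -> list:
    """Return the argv list for one of the run_test.py invocations."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(MODE_FLAGS)}")
    argv = ["python", "run_test.py",
            "-i", "add_example_prototype.json",
            "-c", "add_example_cases.json",
            "--op-type", "custom",
            "--op-path", OP_PATH]
    flags = MODE_FLAGS[mode]
    if flags:
        # Profiling modes also need an output directory.
        argv += flags + ["-d", profile_dir]
    return argv


if __name__ == "__main__":
    for mode in MODE_FLAGS:
        print(shlex.join(build_command(mode)))
```

The returned list can be passed directly to subprocess.run from inside the ascendoptest directory.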

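Op-level performance test

Add --op to profile the single operator at fine granularity and analyze how efficiently it runs on the NPU. Sample output: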

python run_test.py -i add_example_prototype.json -c add_example_cases.json  --op-type "custom"  --op-path "/home/ma-user/Ascend/ascend-toolkit/latest/opp/vendors/custom_math/op_api" --msprof --op -d ./msprof


"""
2025-11-19 16:19:35 [INFO]  Performance Summary Report:

        1) MTE2 bandwidth utilization lower than 80% when active.
        2) MTE3 bandwidth utilization lower than 80% when active.
        3) aivector compute usage lower than 20%.

2025-11-19 16:19:35 [INFO]  Operator Basic Information:

        Op Name: AddExample_a1532827238e1555db7b997c7bce2928_high_performance_0
        Op Type: vector
        Task Duration(us): 8.140000
        Block Dim: 8
        Mix Block Dim: 
        Device Id: 0
        Pid: 98883
        Current Freq: N/A
        Rated Freq: 1650

2025-11-19 16:19:35 [INFO]  Profiling results saved in /home/ma-user/work/ascendoptest/msprof/Test_001_20251119161924/OPPROF_20251119161926_VXHEJATSWLPWGQUL
2025-11-19 16:19:35 [INFO]  Profiling data parse finished.
2025-11-19 16:19:35 [INFO]  Op profiling finish. Welcome to next use.
case_name: Test_001, output_name: z  compare passed
************************************************************
************************************************************
run case Test_001 result:

case_name,name,data_path,golden_path,compare_result
Test_001, x, /home/ma-user/work/ascendoptest/op_test/addexample_test_001_20251119161924/input/x.bin,,
Test_001, y, /home/ma-user/work/ascendoptest/op_test/addexample_test_001_20251119161924/input/y.bin,,
Test_001, z, op_test/addexample_test_001_20251119161924/output/z.bin, op_test/addexample_test_001_20251119161924/output/golden_z.bin,pass
************************************************************
************************************************************
end run case Test_001
"""
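When several cases are profiled, it is convenient to pull figures such as the task duration out of the log automatically. The sketch below parses the "Operator Basic Information" lines exactly as they appear in the output above; it assumes the indented "Key: value" layout stays the same, and parse_basic_info is a hypothetical helper.

```python
import re

# Lines copied from the op-level profiling output above.
SAMPLE_LOG = """\
        Op Name: AddExample_a1532827238e1555db7b997c7bce2928_high_performance_0
        Op Type: vector
        Task Duration(us): 8.140000
        Block Dim: 8
        Mix Block Dim:
        Device Id: 0
        Pid: 98883
        Current Freq: N/A
        Rated Freq: 1650
"""


def parse_basic_info(log_text: str) -> dict:
    """Collect indented 'Key: value' lines into a dict of strings."""
    info = {}
    for line in log_text.splitlines():
        # Only indented lines whose key starts with a letter are considered,
        # which skips timestamped [INFO] lines and the numbered summary items.
        match = re.match(r"^\s+([A-Za-z][\w ()]*?):[ \t]*(.*)$", line)
        if match:
            info[match.group(1)] = match.group(2).strip()
    return info


if __name__ == "__main__":
    info = parse_basic_info(SAMPLE_LOG)
    print(float(info["Task Duration(us)"]), int(info["Block Dim"]))
```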

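Op Simulator test

Add --sim to run the operator in the simulator, which verifies its behavior ahead of time and helps with debugging and optimization. Sample output: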

python run_test.py -i add_example_prototype.json -c add_example_cases.json  --op-type "custom"  --op-path "/home/ma-user/Ascend/ascend-toolkit/latest/opp/vendors/custom_math/op_api" --msprof --op  --sim -d ./msprof

"""
start run case Test_001
gen data x  success, data save in  /home/ma-user/work/ascendoptest/op_test/addexample_test_001_20251119162308/input/x.bin
gen data y  success, data save in  /home/ma-user/work/ascendoptest/op_test/addexample_test_001_20251119162308/input/y.bin
gen golden data to:  /home/ma-user/work/ascendoptest/op_test/addexample_test_001_20251119162308/output/golden_z.bin
2025-11-19 16:23:10 [INFO]  Op profiling analysis start.
2025-11-19 16:23:10 [INFO]  Running simulation task: Binary Simulation Running, use simulator in LD_LIBRARY_PATH
[INFO]  Running case: Test_001
[INFO] Config file [config_stars.json] from environment variable [CAMODEL_CONFIG_PATH]. Path: /home/ma-user/work/ascendoptest/msprof/Test_001_20251119162308/OPPROF_20251119162310_JNDLRIBCWWWNCXVK/device0/tmp_dump/config/config_stars.json
[INFO] Config file is found, path is /home/ma-user/work/ascendoptest/msprof/Test_001_20251119162308/OPPROF_20251119162310_JNDLRIBCWWWNCXVK/device0/tmp_dump/config/config_stars.json.
[FuncCache]: size:0x20000, line_size:128, way_num:16, line_num:1024, idx_num:64
             idx_lsb:7, idx_mask:0x3f, tag_lsb:13, tag_mask:0xffffffffffffffff, ofst_mask:0x7f

[TmSim]: Run in parallel worker mode, core num is: 24

[INFO] Config file [config.json] from environment variable [CAMODEL_CONFIG_PATH]. Path: /home/ma-user/work/ascendoptest/msprof/Test_001_20251119162308/OPPROF_20251119162310_JNDLRIBCWWWNCXVK/device0/tmp_dump/config/config.json
[INFO] AicWrapper attach AIC 0, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 1, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 2, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 3, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 4, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 5, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 6, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 7, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 8, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 9, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 10, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 11, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 12, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 13, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 14, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 15, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 16, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 17, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 18, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 19, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 20, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 21, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 22, num_vec_core=2, num_subcore=3
[INFO] AicWrapper attach AIC 23, num_vec_core=2, num_subcore=3
[INFO] Chip 0 AIC / Scheduler / Soc periods: 200.0000 / 200.0000 / 105.0000
[INFO] chip 0 die 0 device created
>> Start ModelParsim with 24 threads for 25 thread units, mode is 0
================================================================================
>>>>                                                                        
>>>>                             " PEM MODEL "                                
>>>>             Total no. of 1 chip(s) Model Init Success!                
>>>>                                                                        
================================================================================
[INFO] Model Start Time: 2025-11-19 16:23:12
[DRVSTUB_LOG] driver_api.c:550 sendSwapBuf:swapbuf_base_addr:10000000
[DRVSTUB_LOG] driver_api.c:551 sendSwapBuf:sq:0 swapbuf_addr:10000000
[DRVSTUB_LOG] driver_api.c:550 sendSwapBuf:swapbuf_base_addr:10000000
[DRVSTUB_LOG] driver_api.c:551 sendSwapBuf:sq:1 swapbuf_addr:10000040
[DRVSTUB_LOG] driver_api.c:550 sendSwapBuf:swapbuf_base_addr:10000000
[DRVSTUB_LOG] driver_api.c:551 sendSwapBuf:sq:2 swapbuf_addr:10000080
[INFO]  Input preparation success.
[INFO]  Output preparation success.
[INFO] <ProfInit> Start profiling on kernel: AddExample_a1532827238e1555db7b997c7bce2928_high_performance_0
2025-11-19 16:23:15 [INFO]  Extract 722 relations from kernel
2025-11-19 16:23:15 [WARN]  Kernel missed debug_line information. If you need code call stack, please recompile kernel with -g option
[DRVSTUB_LOG] driver_api.c:2213 send_stars_interrupt:get cq_0 base_addr: 10020000
[INFO]  Write output success.
[INFO] Model Stop Time: 2025-11-19 16:23:21
Model RUN TIME: 8923.83 ms
[INFO] Total tick: 42142
[INFO] Model stopped successfully.
[INFO]  Successfully generated output for 'Test_001' !
2025-11-19 16:23:25 [WARN]  Code call stack is empty
2025-11-19 16:23:25 [WARN]  Lack of code info of files
2025-11-19 16:23:25 [INFO]  Core operator results run in simulator as follow:
core_name           duration_time(us)   running_time(us)    
core0.veccore0      7.56                7.13                
core1.veccore0      7.55                7.14                
core2.veccore0      7.56                7.14                
core3.veccore0      7.55                7.14                
core4.veccore0      7.55                7.13                
core5.veccore0      7.56                7.13                
core6.veccore0      7.55                7.13                
core7.veccore0      7.33                7.07                
2025-11-19 16:23:26 [INFO]  Profiling running finished. All task success.
2025-11-19 16:23:26 [INFO]  Start parse dump file
2025-11-19 16:23:26 [INFO]  Profiling results saved in /home/ma-user/work/ascendoptest/msprof/Test_001_20251119162308/OPPROF_20251119162310_JNDLRIBCWWWNCXVK
2025-11-19 16:23:26 [INFO]  Profiling data parse finished.
2025-11-19 16:23:26 [INFO]  Op profiling finish. Welcome to next use.
case_name: Test_001, output_name: z  compare passed
************************************************************
************************************************************
run case Test_001 result:

case_name,name,data_path,golden_path,compare_result
Test_001, x, /home/ma-user/work/ascendoptest/op_test/addexample_test_001_20251119162308/input/x.bin,,
Test_001, y, /home/ma-user/work/ascendoptest/op_test/addexample_test_001_20251119162308/input/y.bin,,
Test_001, z, op_test/addexample_test_001_20251119162308/output/z.bin, op_test/addexample_test_001_20251119162308/output/golden_z.bin,pass
************************************************************
************************************************************
end run case Test_001
"""
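The per-core table in the simulator output can be summarized to see how evenly the work is spread across vector cores. The sketch below parses the eight rows shown above; it assumes the three-column, whitespace-separated layout, and parse_core_table and summarize are hypothetical helpers.

```python
# Rows copied from the simulator output above.
SAMPLE_TABLE = """\
core_name           duration_time(us)   running_time(us)
core0.veccore0      7.56                7.13
core1.veccore0      7.55                7.14
core2.veccore0      7.56                7.14
core3.veccore0      7.55                7.14
core4.veccore0      7.55                7.13
core5.veccore0      7.56                7.13
core6.veccore0      7.55                7.13
core7.veccore0      7.33                7.07
"""


def parse_core_table(text: str) -> list:
    """Return (core_name, duration_us, running_us) tuples, skipping the header."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0].startswith("core") and parts[0] != "core_name":
            rows.append((parts[0], float(parts[1]), float(parts[2])))
    return rows


def summarize(rows: list) -> dict:
    """Core count plus min, max and mean of the duration column."""
    durations = [duration for _, duration, _ in rows]
    return {
        "cores": len(rows),
        "min_duration_us": min(durations),
        "max_duration_us": max(durations),
        "mean_duration_us": sum(durations) / len(durations),
    }


if __name__ == "__main__":
    print(summarize(parse_core_table(SAMPLE_TABLE)))
```

For the table above, the durations range from 7.33 us to 7.56 us across the eight cores, so the load is close to evenly balanced.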

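References

For details on performance analysis, see the official manual "Tool Overview - CANN Commercial Edition 8.3.RC1", published on the Ascend community site.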
