Create a LessEqual operator that matches torch.le
https://pytorch.org/docs/1.5.0/torch.html?highlight=torch%20le#torch.le
Download the portable (no-install) version of MindStudio
https://www.hiascend.com/software/mindstudio/download
Clone canndev and build it
cd ~
git clone https://gitee.com/ascend_wuyongkang/canndev.git
cd canndev
./build.sh --aicpu -u -j100
The build fails with:
CMake 3.14 or higher is required. You are running version 3.10.2
sudo apt remove --purge cmake
hash -r
sudo snap install cmake --classic
cmake --version
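The version requirement from the error message can also be checked in a script before starting a long build. The following is a minimal sketch, assuming the first line printed by `cmake --version` has the form `cmake version X.Y.Z`; the helper names `parse_cmake_version` and `meets_minimum` are made up for illustration. Versions are compared as integer tuples, because a plain string comparison would rank 3.10 above 3.14 incorrectly in some cases.

```python
import re


def parse_cmake_version(text):
    """Extract (major, minor, patch) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no CMake version found in: %r" % text)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def meets_minimum(text, minimum=(3, 14, 0)):
    """Return True if the reported version is at least `minimum`."""
    return parse_cmake_version(text) >= minimum


if __name__ == "__main__":
    # The version from the error message above is too old for the build.
    print(meets_minimum("cmake version 3.10.2"))
    # An arbitrary newer version, used only as an example.
    print(meets_minimum("cmake version 3.22.1"))
```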
export ASCEND_CUSTOM_PATH=$HOME/Ascend/ascend-toolkit/latest
Re-run ./build.sh --aicpu -u -j100
find …/…/…/ -name "*"
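Writing an operator from scratch is still too hard at this point.
So let's first follow this guide and make a single-operator call: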
https://gitee.com/ascend/samples/wikis/%E8%AE%AD%E7%BB%83%E8%90%A5/CANN%E8%AE%AD%E7%BB%83%E8%90%A5–%E5%8D%95%E7%AE%97%E5%AD%90%E8%B0%83%E7%94%A8
Conv2D operator verification
https://gitee.com/ascend/samples/tree/master/cplusplus/level1_single_api/4_op_dev/2_verify_op/acl_execute_conv2d
cd samples/cplusplus/level1_single_api/4_op_dev/2_verify_op/acl_execute_conv2d
export DDK_PATH=$HOME/Ascend/ascend-toolkit/latest
export NPU_HOST_LIB=$DDK_PATH/acllib/lib64/stub
# Set the library path for Python 3.7.5
export LD_LIBRARY_PATH=/usr/local/python3.7.5/lib:$LD_LIBRARY_PATH
# If several python3 versions are installed, select Python 3.7.5 explicitly
export PATH=/usr/local/python3.7.5/bin:$PATH
cd run/out/
atc --singleop=test_data/config/conv2d_tik_op.json --soc_version=Ascend310 --output=op_models
This then fails with:
EZ3003: No supported Ops kernel and engine are found for [Conv2DTik], optype [Conv2DTik].
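I searched around and found nothing, so I set this aside for now and moved on to the next step.
I had skipped a step earlier, and the acl files are missing. Could that be because aclLite was never built?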
cd ${HOME}/samples/cplusplus/common/acllite
make
make install
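It does look like this was the first build; otherwise it would not have printed this much output.
In the end it still failed, and I was out of ideas.
Undeterred, let's look at the high-resolution image inpainting sample below, which uses the matmul_27648.json operator: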
https://gitee.com/ascend/samples/tree/master/python/level2_simple_inference/6_other/imageinpainting_hifill
First check the versions; ours meet the requirements.
Install the third-party dependencies:
https://gitee.com/ascend/samples/tree/master/python/environment
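This operator, by contrast, converted successfully.
Unlike the official documentation, here python3.7.5 can be invoked directly.
The result is very good and genuinely high-resolution, though I do wonder how the algorithm recognized the figure in the upper-right corner as the main subject and kept it.
Back to the earlier problem:
could it be that the operator was renamed but the documentation has not been updated yet?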
cd samples/cplusplus/level1_single_api/4_op_dev/2_verify_op/acl_execute_conv2d
cd run/out/
cp test_data/config/conv2d_tik_op.json test_data/config/Conv2D.json
vi test_data/config/Conv2D.json
atc --singleop=test_data/config/Conv2D.json --soc_version=Ascend310 --output=op_models
It finally succeeded.
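The notes do not record what was changed in vi. Assuming the edit replaced the operator type `Conv2DTik` with `Conv2D` in the `op` field of the single-operator description file (a JSON array of operator entries), the same change could be scripted roughly as follows. The function name `rename_op` and the trimmed sample entry are illustrative only.

```python
import json


def rename_op(entries, old, new):
    """Return a copy of the single-op entries with op type `old` renamed to `new`."""
    renamed = []
    for entry in entries:
        entry = dict(entry)  # shallow copy so the input is left untouched
        if entry.get("op") == old:
            entry["op"] = new
        renamed.append(entry)
    return renamed


if __name__ == "__main__":
    # Trimmed, hypothetical entry; a real file also carries tensor descriptions.
    sample = json.loads('[{"op": "Conv2DTik", "input_desc": [], "output_desc": []}]')
    print(json.dumps(rename_op(sample, "Conv2DTik", "Conv2D"), indent=2))
```

Working on a copy keeps the original conv2d_tik_op.json content available for comparison, which mirrors the cp step above.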
But the earlier problem is still unsolved.
Reference: https://www.hiascend.com/document/detail/zh/mindstudio/50RC1/msug/msug_000215.html
Implement the LessEqual operator using the AI CPU operator development flow, matching torch.le
torch.le: https://pytorch.org/docs/stable/generated/torch.le.html
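As a reference for what the operator has to compute: torch.le compares its inputs element by element and returns a boolean result, and the second argument may be a tensor or a single number. A minimal pure-Python sketch for flat lists follows; `less_equal` is an illustrative name, and full tensor broadcasting is deliberately left out.

```python
def less_equal(x1, x2):
    """Element-wise x1 <= x2 for a flat list and either a flat list or a scalar."""
    if isinstance(x2, (int, float)):
        # Scalar second operand: compare every element against it.
        return [a <= x2 for a in x1]
    if len(x1) != len(x2):
        raise ValueError("inputs must have the same length")
    return [a <= b for a, b in zip(x1, x2)]


if __name__ == "__main__":
    print(less_equal([1, 2, 3, 4], [1, 1, 4, 4]))
    print(less_equal([0.5, 2.0], 1))
```

A reference like this is mainly useful for producing expected outputs when checking the real kernel.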
Operator prototype definition
/**
 * Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved.
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the Apache License Version 2.0. You may not use this file except in compliance with the License.
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * Apache License for more details at
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * @file add_dsl.h
 *
 * @brief
 *
 * @version 1.0
 *
 */
#ifndef GE_OPS_OP_PROTO_ADDDSL_H_
#define GE_OPS_OP_PROTO_ADDDSL_H_
#include "graph/operator_reg.h"
namespace ge {
REG_OP(AddDsl)
    .INPUT(x1, TensorType({DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_INT16,
                           DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
                           DT_COMPLEX64, DT_STRING}))
    .INPUT(x2, TensorType({DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_INT16,
                           DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
                           DT_COMPLEX64, DT_STRING}))
    .OUTPUT(y, TensorType({DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_INT16,
                           DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
                           DT_COMPLEX64, DT_STRING}))
    .OP_END_FACTORY_REG(AddDsl)
}  // namespace ge
#endif  // GE_OPS_OP_PROTO_ADDDSL_H_
lessequal.cc implementation
cpukernel/impl/lessequal_kernel.h
cpukernel/impl/lessequal_kernel.cc
cpukernel/op_info_cfg/aicpu_kernel/reshape_cust.ini
framework/tf_plugin/tensorflow_lessequal_plugin.cc
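I ran into a problem here: following the documentation, the right-click menu had no New Cases > AI CPU UT Case entry.
Even without the auto-generated template, though, the test files can be written by hand: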
testcases/ut/aicpu_test/lessequal/test_lessequal_impl.cc
testcases/ut/aicpu_test/lessequal/test_lessequal_proto.cc
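Hand-written test cases mostly come down to fixed inputs paired with expected outputs. The sketch below lists a few such cases for LessEqual, including equal values, signed zero, infinities, and NaN, and checks any candidate implementation against them. It is plain Python rather than the C++ test framework used by the project, and the names `GOLDEN_CASES` and `failed_cases` are hypothetical.

```python
import math

# (x1, x2, expected) triples; NaN compares as False against everything.
GOLDEN_CASES = [
    ([1, 2, 3], [3, 2, 1], [True, True, False]),
    ([-1.5, 0.0], [-1.5, -0.0], [True, True]),
    ([math.nan, 1.0], [1.0, math.nan], [False, False]),
    ([math.inf, -math.inf], [math.inf, 0.0], [True, True]),
    ([], [], []),
]


def failed_cases(kernel):
    """Run `kernel(x1, x2)` on every golden case and return the mismatching indices."""
    return [
        index
        for index, (x1, x2, expected) in enumerate(GOLDEN_CASES)
        if list(kernel(x1, x2)) != expected
    ]


if __name__ == "__main__":
    # A straightforward reference implementation should fail no case.
    reference = lambda x1, x2: [a <= b for a, b in zip(x1, x2)]
    print(failed_cases(reference))
```

The equal-value cases matter most here, since they are what distinguishes a less-or-equal kernel from a strict less-than one.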
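After connecting to the remote cloud server successfully, build the project.
Older versions required adding code after line 130; the new version used here does not.
Open the operator project folder on its own,
then build.
My Ascend toolkit is installed at /home/HwHiAiUser/Ascend/ascend-toolkit, so the environment variables are set as follows: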
ASCEND_OPP_PATH=/home/HwHiAiUser/Ascend/ascend-toolkit/latest/opp;
TOOLCHAIN_DIR=/home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/toolchain/hcc;
ASCEND_TENSOR_COMPILER_INCLUDE=/home/HwHiAiUser/Ascend/ascend-toolkit/latest/include;
ASCEND_AICPU_PATH=/home/HwHiAiUser/Ascend/ascend-toolkit/latest
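All four variables are derived from the same toolkit root, so they can be generated instead of typed out. A small sketch, using the install path given above; `ascend_env` is an illustrative helper, not part of the toolkit.

```python
import posixpath


def ascend_env(toolkit_root):
    """Build the four build-time variables from the toolkit install directory."""
    latest = posixpath.join(toolkit_root, "latest")
    return {
        "ASCEND_OPP_PATH": posixpath.join(latest, "opp"),
        "TOOLCHAIN_DIR": posixpath.join(latest, "toolkit", "toolchain", "hcc"),
        "ASCEND_TENSOR_COMPILER_INCLUDE": posixpath.join(latest, "include"),
        "ASCEND_AICPU_PATH": latest,
    }


if __name__ == "__main__":
    env = ascend_env("/home/HwHiAiUser/Ascend/ascend-toolkit")
    # Print in the semicolon-separated NAME=value form used above.
    print(";\n".join("%s=%s" % item for item in env.items()))
```

With a different install location, only the argument to `ascend_env` changes.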
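The project seems to sit under a path containing Chinese characters, which does not work well, so let's move it to the root of drive I.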
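If non-ASCII characters in the project path are indeed the cause (the notes only suspect this), a quick check before opening the project can save a failed build. A sketch, with `has_non_ascii` as a made-up helper name and both sample paths purely hypothetical:

```python
def has_non_ascii(path):
    """Return True if the path contains any character outside the ASCII range."""
    return any(ord(ch) > 127 for ch in path)


if __name__ == "__main__":
    # Hypothetical project location at a drive root.
    print(has_non_ascii("I:/lessequal"))
    # Hypothetical location with Chinese characters in a folder name.
    print(has_non_ascii("D:/算子工程/lessequal"))
```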
With that, the operator's functionality is largely implemented.