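2.6. Cloud Space Development and Testing
This section uses running sophgo/sophon-demo in a BM1684X-PCIE general-purpose cloud development space as an example.
2.6.1. Downloading the sophon-demo Source Code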
apt update
git clone https://github.com/sophgo/sophon-demo.git
For details, see the GitHub open-source repository: sophgo/sophon-demo
Last login: Thu May 9 12:07:21 2024 from 10.133.16.11
root@7562a0b9528a:~# apt update
Get:1 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB]
Hit:2 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:3 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [119 kB]
Get:4 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages [1789 kB]
Get:5 http://archive.ubuntu.com/ubuntu jammy-backports InRelease [109 kB]
Get:6 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 Packages [1373 kB]
Get:7 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages [2069 kB]
Get:8 http://archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 Packages [2382 kB]
Get:9 http://archive.ubuntu.com/ubuntu jammy-updates/multiverse amd64 Packages [51.1 kB]
Get:10 http://archive.ubuntu.com/ubuntu jammy-backports/universe amd64 Packages [31.9 kB]
Get:11 http://archive.ubuntu.com/ubuntu jammy-backports/main amd64 Packages [81.0 kB]
Get:12 http://security.ubuntu.com/ubuntu jammy-security/universe amd64 Packages [1078 kB]
Get:13 http://security.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages [2308 kB]
Get:14 http://security.ubuntu.com/ubuntu jammy-security/multiverse amd64 Packages [44.7 kB]
Fetched 11.5 MB in 5s (2524 kB/s)
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
65 packages can be upgraded. Run 'apt list --upgradable' to see them.
root@7562a0b9528a:~# git clone https://github.com/sophgo/sophon-demo.git
Cloning into 'sophon-demo'...
remote: Enumerating objects: 7262, done.
remote: Counting objects: 100% (3965/3965), done.
remote: Compressing objects: 100% (1483/1483), done.
remote: Total 7262 (delta 2620), reused 3510 (delta 2439), pack-reused 3297
Receiving objects: 100% (7262/7262), 49.31 MiB | 11.70 MiB/s, done.
Resolving deltas: 100% (4303/4303), done.
root@7562a0b9528a:~# ls
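2.6.2. YOLOv7 Models and Dataset
This example uses the download script download.sh provided in the scripts directory to fetch the required models and datasets.
# Install unzip and 7z; skip this step if they are already installed. On non-Ubuntu systems, use yum or another package manager as appropriate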
apt install unzip
apt install p7zip
apt install p7zip-full
cd sophon-demo/sample/YOLOv7
chmod -R +x scripts/
./scripts/download.sh
The downloaded models include:
./models
├── BM1684
│   ├── yolov7_v0.1_3output_fp32_1b.bmodel # FP32 BModel for BM1684, compiled with TPU-MLIR, batch_size=1
│   ├── yolov7_v0.1_3output_fp32_4b.bmodel # FP32 BModel for BM1684, compiled with TPU-MLIR, batch_size=4
│   ├── yolov7_v0.1_3output_int8_1b.bmodel # INT8 BModel for BM1684, compiled with TPU-MLIR, batch_size=1
│   └── yolov7_v0.1_3output_int8_4b.bmodel # INT8 BModel for BM1684, compiled with TPU-MLIR, batch_size=4
├── BM1684X
│   ├── yolov7_v0.1_3output_fp16_1b.bmodel # FP16 BModel for BM1684X, compiled with TPU-MLIR, batch_size=1
│   ├── yolov7_v0.1_3output_fp16_4b.bmodel # FP16 BModel for BM1684X, compiled with TPU-MLIR, batch_size=4
│   ├── yolov7_v0.1_3output_fp32_1b.bmodel # FP32 BModel for BM1684X, compiled with TPU-MLIR, batch_size=1
│   ├── yolov7_v0.1_3output_fp32_4b.bmodel # FP32 BModel for BM1684X, compiled with TPU-MLIR, batch_size=4
│   ├── yolov7_v0.1_3output_int8_1b.bmodel # INT8 BModel for BM1684X, compiled with TPU-MLIR, batch_size=1
│   └── yolov7_v0.1_3output_int8_4b.bmodel # INT8 BModel for BM1684X, compiled with TPU-MLIR, batch_size=4
├── BM1688
│   ├── yolov7_v0.1_3output_fp16_1b_2core.bmodel # FP16 BModel for BM1688, compiled with TPU-MLIR, batch_size=1, num_core=2
│   ├── yolov7_v0.1_3output_fp16_1b.bmodel # FP16 BModel for BM1688, compiled with TPU-MLIR, batch_size=1
│   ├── yolov7_v0.1_3output_fp32_1b_2core.bmodel # FP32 BModel for BM1688, compiled with TPU-MLIR, batch_size=1, num_core=2
│   ├── yolov7_v0.1_3output_fp32_1b.bmodel # FP32 BModel for BM1688, compiled with TPU-MLIR, batch_size=1
│   ├── yolov7_v0.1_3output_int8_1b_2core.bmodel # INT8 BModel for BM1688, compiled with TPU-MLIR, batch_size=1, num_core=2
│   ├── yolov7_v0.1_3output_int8_1b.bmodel # INT8 BModel for BM1688, compiled with TPU-MLIR, batch_size=1
│   ├── yolov7_v0.1_3output_int8_4b_2core.bmodel # INT8 BModel for BM1688, compiled with TPU-MLIR, batch_size=4, num_core=2
│   └── yolov7_v0.1_3output_int8_4b.bmodel # INT8 BModel for BM1688, compiled with TPU-MLIR, batch_size=4
├── CV186X
│   ├── yolov7_v0.1_3output_fp16_1b.bmodel # FP16 BModel for CV186X, compiled with TPU-MLIR, batch_size=1
│   ├── yolov7_v0.1_3output_fp32_1b.bmodel # FP32 BModel for CV186X, compiled with TPU-MLIR, batch_size=1
│   ├── yolov7_v0.1_3output_int8_1b.bmodel # INT8 BModel for CV186X, compiled with TPU-MLIR, batch_size=1
│   └── yolov7_v0.1_3output_int8_4b.bmodel # INT8 BModel for CV186X, compiled with TPU-MLIR, batch_size=4
├── onnx
│   ├── yolov7_qtable
│   ├── yolov7_v0.1_3output_1b.onnx # Exported ONNX model, batch_size=1
│   └── yolov7_v0.1_3output_4b.onnx # Exported ONNX model, batch_size=4
└── torch
    └── yolov7_v0.1_3outputs.torchscript.pt # Traced TorchScript model
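The BModel file names in the listing above appear to encode the precision, batch size, and (for BM1688) the core count. The following is a minimal sketch that extracts those fields from a file name. The pattern is inferred only from the names shown here and is not a naming rule documented by sophon-demo; `parse_bmodel_name` is a hypothetical helper, and treating a missing `_Ncore` suffix as one core is an assumption.

```python
import re

# Pattern inferred from the listed file names (assumption, not an official rule).
BMODEL_NAME = re.compile(
    r"^(?P<base>.+)_(?P<precision>fp32|fp16|int8)_(?P<batch>\d+)b"
    r"(?:_(?P<cores>\d+)core)?\.bmodel$"
)


def parse_bmodel_name(filename):
    """Return the fields encoded in a BModel file name as a dict."""
    match = BMODEL_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized BModel file name: {filename}")
    return {
        "base": match.group("base"),
        "precision": match.group("precision").upper(),
        "batch_size": int(match.group("batch")),
        # No "_Ncore" suffix is treated as a single core (assumption).
        "num_core": int(match.group("cores") or 1),
    }


for name in (
    "yolov7_v0.1_3output_fp32_1b.bmodel",
    "yolov7_v0.1_3output_int8_4b_2core.bmodel",
):
    print(name, "->", parse_bmodel_name(name))
```

A helper like this can be handy when scripting runs over several models, for example to select only the batch_size=1 variants for a given chip directory.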
The downloaded data includes:
./datasets
├── test # Test images
├── test_car_person_1080P.mp4 # Test video
├── coco.names # COCO class-name file
├── coco128 # coco128 dataset, used for model quantization
└── coco
    ├── val2017_1000 # coco val2017_1000 dataset: 1000 samples randomly drawn from coco val2017
    └── instances_val2017_1000.json # coco val2017 dataset label file, used to compute accuracy metrics
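Before running the tests, it can help to confirm that download.sh produced the layout listed above. The sketch below checks for each listed entry and reports what is missing. `missing_entries` is a hypothetical helper, and the entry list is copied from the listing rather than from the download script itself.

```python
from pathlib import Path

# Entries copied from the listing above, relative to the datasets directory.
EXPECTED_ENTRIES = (
    "test",
    "test_car_person_1080P.mp4",
    "coco.names",
    "coco128",
    "coco/val2017_1000",
    "coco/instances_val2017_1000.json",
)


def missing_entries(dataset_root, expected=EXPECTED_ENTRIES):
    """Return the expected entries that do not exist under dataset_root."""
    root = Path(dataset_root)
    return [entry for entry in expected if not (root / entry).exists()]


if __name__ == "__main__":
    # Run from sophon-demo/sample/YOLOv7 so that "datasets" resolves correctly.
    missing = missing_entries("datasets")
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all listed dataset entries are present")
```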
Model information:

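2.6.3. Testing the YOLOv7 Sample
Because the downloaded BModels are used, the model compilation step can be skipped. Some additional third-party libraries may also need to be installed:
pip3 install 'opencv-python-headless<4.3'
The Python samples need no compilation and can be run directly. Their parameters are described below.
yolov7_opencv.py and yolov7_bmcv.py take the same parameters; yolov7_opencv.py is used as the example: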
usage: yolov7_opencv.py [--input INPUT_PATH] [--bmodel BMODEL] [--dev_id DEV_ID]
[--conf_thresh CONF_THRESH] [--nms_thresh NMS_THRESH]
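--input: path to the test data; either a folder of images or a video file;
--bmodel: path to the bmodel used for inference; the stage 0 network is used by default;
--dev_id: ID of the TPU device used for inference;
--conf_thresh: confidence threshold;
--nms_thresh: NMS threshold.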
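The parameter list above maps naturally onto Python's argparse module. The sketch below shows one way such a command line could be declared; it is not the parser used by yolov7_opencv.py. The default values are assumptions taken from the example commands in this section, and the range check on the thresholds (`unit_interval`) is an illustrative addition.

```python
import argparse


def unit_interval(text):
    """Parse a float and require it to lie in [0, 1] (illustrative check)."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(f"{text} is not in [0, 1]")
    return value


def build_parser():
    parser = argparse.ArgumentParser(prog="yolov7_opencv.py")
    # Defaults below mirror the example commands in this section (assumption).
    parser.add_argument("--input", type=str, default="datasets/test",
                        help="image folder or video path")
    parser.add_argument("--bmodel", type=str,
                        default="models/BM1684X/yolov7_v0.1_3output_fp32_1b.bmodel",
                        help="path of the bmodel used for inference")
    parser.add_argument("--dev_id", type=int, default=0, help="TPU device id")
    parser.add_argument("--conf_thresh", type=unit_interval, default=0.5,
                        help="confidence threshold")
    parser.add_argument("--nms_thresh", type=unit_interval, default=0.5,
                        help="NMS threshold")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["--input", "datasets/test_car_person_1080P.mp4", "--dev_id", "0"]
)
print(args.input, args.dev_id, args.conf_thresh, args.nms_thresh)
```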
An image test example follows; testing a whole folder of images is supported.
python3 python/yolov7_opencv.py --input datasets/test --bmodel models/BM1684X/yolov7_v0.1_3output_fp32_1b.bmodel --dev_id 0 --conf_thresh 0.5 --nms_thresh 0.5
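The image test produces the following result:
After the test finishes, the predicted images are saved under results/images and the prediction results are saved to results/yolov7_v0.1_3output_fp32_1b.bmodel_test_opencv_python_result.json. The predictions, inference time, and other information are also printed.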
root@7562a0b9528a:~/sophon-demo/sample/YOLOv7# python3 python/yolov7_opencv.py --input datasets/test --bmodel models/BM1684X/yolov7_v0.1_3output_fp32_1b.bmodel --dev_id 0 --conf_thresh 0.5 --nms_thresh 0.5
[/workspace/middleware-soc/bm_opencv/modules/core/src/cv_bmcpu.cpp:49->InternalBMCpuRegister]total 16 devices need to enable on-chip CPU. It may need serveral minutes for loading, please be patient....
...................................................................................................
[BMRT][bmcpu_setup:435] INFO:cpu_lib 'libcpuop.so' is loaded.
bmcpu init: skip cpu_user_defined
open usercpu.so, init user_cpu_init
[BMRT][BMProfile:59] INFO:Profile For arch=3
[BMRT][BMProfileDeviceBase:190] INFO:gdma=0, tiu=0, mcu=0
[BMRT][load_bmodel:1594] INFO:Loading bmodel from [models/BM1684X/yolov7_v0.1_3output_fp32_1b.bmodel]. Thanks for your patience...
[BMRT][load_bmodel:1503] INFO:pre net num: 0, load net num: 1
[BMRT][load_tpu_module:1575] INFO:loading firmare in bmodel
INFO:root:load models/BM1684X/yolov7_v0.1_3output_fp32_1b.bmodel success!
INFO:root:1, img_file: datasets/test/zidane.jpg
Open /dev/bm-sophon0 successfully, device index = 0, jpu fd = 13, vpp fd = 13
INFO:root:2, img_file: datasets/test/000000547383.jpg
INFO:root:3, img_file: datasets/test/3.jpg
INFO:root:4, img_file: datasets/test/dog.jpg
INFO:root:result saved in ./results/yolov7_v0.1_3output_fp32_1b.bmodel_test_opencv_python_result.json
INFO:root:------------------ Predict Time Info ----------------------
INFO:root:decode_time(ms): 3.02
INFO:root:preprocess_time(ms): 8.10
INFO:root:inference_time(ms): 104.29
INFO:root:postprocess_time(ms): 13.54
all done.
[/workspace/middleware-soc/bm_opencv/modules/core/src/cv_bmcpu.cpp:115->~InternalBMCpuRegister]deconstructor function is called
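The "Predict Time Info" lines in the log above can be collected into a single per-image figure. The sketch below parses the four stage times from the log text and sums them. It assumes that the stages run one after another and that each value is an average per image, which the log itself does not state; `parse_stage_times` is a hypothetical helper.

```python
import re

# Matches lines such as "INFO:root:decode_time(ms): 3.02".
TIME_LINE = re.compile(r"INFO:root:(\w+)_time\(ms\):\s*([0-9.]+)")


def parse_stage_times(log_text):
    """Return a dict mapping stage name to time in milliseconds."""
    return {name: float(value) for name, value in TIME_LINE.findall(log_text)}


# Timing lines copied from the image test log above.
LOG = """\
INFO:root:------------------ Predict Time Info ----------------------
INFO:root:decode_time(ms): 3.02
INFO:root:preprocess_time(ms): 8.10
INFO:root:inference_time(ms): 104.29
INFO:root:postprocess_time(ms): 13.54
"""

times = parse_stage_times(LOG)
total_ms = sum(times.values())
# Sequential execution of the four stages is assumed here.
print(f"total per image: {total_ms:.2f} ms, about {1000 / total_ms:.2f} images/s")
# prints: total per image: 128.95 ms, about 7.75 images/s
```

Under these assumptions, inference with the FP32 batch_size=1 BModel accounts for most of the per-image time.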
A video test example follows; testing a video stream is supported.
python3 python/yolov7_opencv.py --input datasets/test_car_person_1080P.mp4 --bmodel models/BM1684X/yolov7_v0.1_3output_fp32_1b.bmodel --dev_id 0 --conf_thresh 0.5 --nms_thresh 0.5
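The video test produces the following result:
After the test finishes, the predictions are drawn into results/test_car_person_1080P.avi, and the predictions, inference time, and other information are printed.
yolov7_bmcv.py draws the predictions on images and saves them in results/images.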
root@7562a0b9528a:~/sophon-demo/sample/YOLOv7# python3 python/yolov7_opencv.py --input datasets/test_car_person_1080P.mp4 --bmodel models/BM1684X/yolov7_v0.1_3output_fp32_1b.bmodel --dev_id 0 --conf_thresh 0.5 --nms_thresh 0.5
.......................part of the output omitted................................................
[BMRT][bmcpu_setup:435] INFO:cpu_lib 'libcpuop.so' is loaded.
bmcpu init: skip cpu_user_defined
open usercpu.so, init user_cpu_init
[BMRT][BMProfile:59] INFO:Profile For arch=3
[BMRT][BMProfileDeviceBase:190] INFO:gdma=0, tiu=0, mcu=0
[BMRT][load_bmodel:1594] INFO:Loading bmodel from [models/BM1684X/yolov7_v0.1_3output_fp32_1b.bmodel]. Thanks for your patience...
[BMRT][load_bmodel:1503] INFO:pre net num: 0, load net num: 1
[BMRT][load_tpu_module:1575] INFO:loading firmare in bmodel
INFO:root:load models/BM1684X/yolov7_v0.1_3output_fp32_1b.bmodel success!
BMvidDecCreateW5 board id 0 coreid 0
[VDI] Open board 0, core 0, fd 14, dev /dev/bm-sophon0
libbmvideo.so addr : /opt/sophon/libsophon-current/lib/libbmvideo.so.0, name_len: 34
vpu firmware addr: /opt/sophon/libsophon-current/lib/vpu_firmware/chagall_dec.bin
[VDI] Open board 0, core 0, fd 15, dev /dev/bm-sophon0
VERSION=0, REVISION=213135
.......................part of the output omitted................................................
maybe grab ends normally, retry count = 513
INFO:root:result saved in ./results/test_car_person_1080P.avi
INFO:root:------------------ Predict Time Info ----------------------
INFO:root:decode_time(ms): 5.86
INFO:root:preprocess_time(ms): 7.45
INFO:root:inference_time(ms): 104.00
INFO:root:postprocess_time(ms): 14.06
all done.
[/workspace/middleware-soc/bm_opencv/modules/core/src/cv_bmcpu.cpp:115->~InternalBMCpuRegister]deconstructor function is called
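The image and video runs report the same four stages, so their timings can be compared directly. The sketch below uses the values printed in the two logs in this section to compute the per-stage difference and the share of time spent in inference. `stage_deltas` and `stage_share` are hypothetical helpers, and the comparison assumes both runs report per-frame averages.

```python
# Stage times in milliseconds, copied from the two logs in this section.
IMAGE_RUN = {"decode": 3.02, "preprocess": 8.10, "inference": 104.29, "postprocess": 13.54}
VIDEO_RUN = {"decode": 5.86, "preprocess": 7.45, "inference": 104.00, "postprocess": 14.06}


def stage_deltas(baseline, other):
    """Return other minus baseline for every stage, rounded to 0.01 ms."""
    return {stage: round(other[stage] - baseline[stage], 2) for stage in baseline}


def stage_share(times, stage):
    """Return the percentage of the summed stage time spent in one stage."""
    return 100.0 * times[stage] / sum(times.values())


for stage, delta in stage_deltas(IMAGE_RUN, VIDEO_RUN).items():
    print(f"{stage:<12} video - image = {delta:+.2f} ms")

print(f"inference share (image): {stage_share(IMAGE_RUN, 'inference'):.1f}%")
print(f"inference share (video): {stage_share(VIDEO_RUN, 'inference'):.1f}%")
```

In both runs inference takes roughly four fifths of the summed stage time, and the largest difference between the two runs is in decoding.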