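TIPC Linux Benchmark Test Documentation

This document describes the benchmark test. The main program of the benchmark test is benchmark_train.sh, which is used to verify and monitor model training performance.

1. Test procedure

1.1 Prepare the data and install the environment

Run test_tipc/prepare.sh to prepare the training data and install the environment.

# Usage: bash test_tipc/prepare.sh train_benchmark.txt mode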
bash test_tipc/prepare.sh test_tipc/configs/det_mv3_db_v2_0/train_infer_python.txt benchmark_train

1.2 Functional test

Run test_tipc/benchmark_train.sh to train the model and parse the logs.

# Usage: bash test_tipc/benchmark_train.sh train_benchmark.txt mode
bash test_tipc/benchmark_train.sh test_tipc/configs/det_mv3_db_v2_0/train_infer_python.txt benchmark_train

test_tipc/benchmark_train.sh can also run a single training configuration, selected by an optional third argument:

# Usage: bash test_tipc/benchmark_train.sh train_benchmark.txt mode [training_config]
bash test_tipc/benchmark_train.sh test_tipc/configs/det_mv3_db_v2_0/train_infer_python.txt benchmark_train  dynamic_bs8_fp32_DP_N1C1

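dynamic_bs8_fp32_DP_N1C1 is the argument passed to test_tipc/benchmark_train.sh. Its format is ${modeltype}_${batch_size}_${fp_item}_${run_mode}_${device_num}, which encodes the model type, the batch size, the training precision (for example fp32 or fp16), the distributed run mode, and the machines and cards used for distributed training (for example N1C1 for one machine with one card).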
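
As an illustration of the format above, the following sketch splits such a string into its five fields. The function name parse_train_config is hypothetical and is not part of benchmark_train.sh; the sketch also assumes that no field contains an underscore, which holds for the example shown here.

```shell
# Hypothetical helper, not taken from benchmark_train.sh.
# Splits ${modeltype}_${batch_size}_${fp_item}_${run_mode}_${device_num}
# on "_" and prints one key=value pair per line.
parse_train_config() {
    # Pipe the string into a group so that read and printf share one shell.
    printf '%s\n' "$1" | {
        IFS=_ read -r model_type batch_size fp_item run_mode device_num
        printf 'model_type=%s\n' "$model_type"
        printf 'batch_size=%s\n' "${batch_size#bs}"   # strip the "bs" prefix
        printf 'fp_item=%s\n' "$fp_item"
        printf 'run_mode=%s\n' "$run_mode"
        printf 'device_num=%s\n' "$device_num"
    }
}

parse_train_config dynamic_bs8_fp32_DP_N1C1
```

For the example string this prints model_type=dynamic, batch_size=8, fp_item=fp32, run_mode=DP and device_num=N1C1, one per line.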

2. Log output

After the run, the training logs and the parsed logs are saved. With the parameter file test_tipc/configs/det_mv3_db_v2_0/train_infer_python.txt, the parsed result of the training log is:

{"model_branch": "dygaph", "model_commit": "7c39a1996b19087737c05d883fd346d2f39dbcc0", "model_name": "det_mv3_db_v2_0_bs8_fp32_SingleP_DP", "batch_size": 8, "fp_item": "fp32", "run_process_type": "SingleP", "run_mode": "DP", "convergence_value": "5.413110", "convergence_key": "loss:", "ips": 19.333, "speed_unit": "samples/s", "device_num": "N1C1", "model_run_time": "0", "frame_commit": "8cc09552473b842c651ead3b9848d41827a3dbab", "frame_version": "0.0.0"}
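
To read a single value such as ips out of a result line like the one above, a small helper can be enough. The sketch below is an assumption-laden shortcut rather than a JSON parser: the name json_field is hypothetical, and it only handles flat, single-line objects with simple string or number values, as in this example. Where jq is installed, jq -r '.ips' is the more robust choice.

```shell
# Hypothetical helper: print the value of one key from a flat,
# single-line JSON object read on stdin. Not a general JSON parser.
json_field() {
    sed -n 's/.*"'"$1"'": *"\{0,1\}\([^",}]*\)"\{0,1\}.*/\1/p'
}

# The result line shown above, copied verbatim.
result_json='{"model_branch": "dygaph", "model_commit": "7c39a1996b19087737c05d883fd346d2f39dbcc0", "model_name": "det_mv3_db_v2_0_bs8_fp32_SingleP_DP", "batch_size": 8, "fp_item": "fp32", "run_process_type": "SingleP", "run_mode": "DP", "convergence_value": "5.413110", "convergence_key": "loss:", "ips": 19.333, "speed_unit": "samples/s", "device_num": "N1C1", "model_run_time": "0", "frame_commit": "8cc09552473b842c651ead3b9848d41827a3dbab", "frame_version": "0.0.0"}'

printf '%s\n' "$result_json" | json_field model_name   # det_mv3_db_v2_0_bs8_fp32_SingleP_DP
printf '%s\n' "$result_json" | json_field ips          # 19.333
printf '%s\n' "$result_json" | json_field speed_unit   # samples/s
```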

The training logs and the parsed results are saved under the benchmark_log directory, organized as follows:

train_log/
├── index
│   ├── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C1_speed
│   └── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C4_speed
├── profiling_log
│   └── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C1_profiling
└── train_log
    ├── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C1_log
    └── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C4_log

3. Single-card performance of each model

*Note: all speed metrics in this section were measured on a single card (one Nvidia V100 16G GPU).

| Model name | Config file | Large dataset float32 fps | Small dataset float32 fps | float32 diff | Large dataset float16 fps | Small dataset float16 fps | float16 diff | Large dataset size | Small dataset size |
|---|---|---|---|---|---|---|---|---|---|
| ch_ppocr_mobile_v2.0_det | config | 53.836 | 53.343 / 53.914 / 52.785 | 0.020940758 | 45.574 | 45.57 / 46.292 / 46.213 | 0.015596647 | 10,000 | 2,000 |
| ch_ppocr_mobile_v2.0_rec | config | 2083.311 | 2043.194 / 2066.372 / 2093.317 | 0.023944295 | 2153.261 | 2167.561 / 2165.726 / 2155.614 | 0.005511725 | 600,000 | 160,000 |
| ch_ppocr_server_v2.0_det | config | 20.716 | 20.739 / 20.807 / 20.755 | 0.003268131 | 20.592 | 20.498 / 20.993 / 20.75 | 0.023579288 | 10,000 | 2,000 |
| ch_ppocr_server_v2.0_rec | config | 528.56 | 528.386 / 528.991 / 528.391 | 0.001143687 | 1189.788 | 1190.007 / 1176.332 / 1192.084 | 0.013213834 | 600,000 | 160,000 |
| ch_PP-OCRv2_det | config | 13.87 | 13.386 / 13.529 / 13.428 | 0.010569887 | 17.847 | 17.746 / 17.908 / 17.96 | 0.011915367 | 10,000 | 2,000 |
| ch_PP-OCRv2_rec | config | 109.248 | 106.32 / 106.318 / 108.587 | 0.020895687 | 117.491 | 117.62 / 117.757 / 117.726 | 0.001163413 | 140,000 | 40,000 |
| det_mv3_db_v2.0 | config | 61.802 | 62.078 / 61.802 / 62.008 | 0.00444602 | 82.947 | 84.294 / 84.457 / 84.005 | 0.005351836 | 10,000 | 2,000 |
| det_r50_vd_db_v2.0 | config | 29.955 | 29.092 / 29.31 / 28.844 | 0.015899011 | 51.097 | 50.367 / 50.879 / 50.227 | 0.012814717 | 10,000 | 2,000 |
| det_r50_vd_east_v2.0 | config | 42.485 | 42.624 / 42.663 / 42.561 | 0.00239083 | 67.61 | 67.825 / 68.299 / 68.51 | 0.00999854 | 10,000 | 2,000 |
| det_r50_vd_pse_v2.0 | config | 16.455 | 16.517 / 16.555 / 16.353 | 0.012201752 | 27.02 | 27.288 / 27.152 / 27.408 | 0.009340339 | 10,000 | 2,000 |
| rec_mv3_none_bilstm_ctc_v2.0 | config | 2288.358 | 2291.906 / 2293.725 / 2290.05 | 0.001602197 | 2336.17 | 2327.042 / 2328.093 / 2344.915 | 0.007622025 | 600,000 | 160,000 |
| layoutxlm_ser | config | 18.001 | 18.114 / 18.107 / 18.307 | 0.010924783 | 21.982 | 21.507 / 21.116 / 21.406 | 0.018180127 | 1,490 | 1,490 |
| PP-Structure-table | config | 14.151 | 14.077 / 14.23 / 14.25 | 0.012140351 | 16.285 | 16.595 / 16.878 / 16.531 | 0.020559308 | 20,000 | 5,000 |
| det_r50_dcn_fce_ctw_v2.0 | config | 14.057 | 14.029 / 14.02 / 14.014 | 0.001069214 | 18.298 | 18.411 / 18.376 / 18.331 | 0.004345228 | 10,000 | 2,000 |
| ch_PP-OCRv3_det | config | 8.622 | 8.431 / 8.423 / 8.479 | 0.006604552 | 14.203 | 14.346 / 14.468 / 14.23 | 0.016450097 | 10,000 | 2,000 |
| PP-OCRv3_mobile_rec | config | 90.239 | 90.077 / 91.513 / 91.325 | 0.01569176 | | | | 160,000 | 40,000 |
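
The document does not define how the diff values are computed. The numbers in the table are, however, consistent with (max - min) / max taken over the three small-dataset runs of the same precision. The sketch below reproduces the value under that assumption; the function name rel_spread is hypothetical.

```shell
# Hypothetical helper: relative spread (max - min) / max of three fps values.
# The formula is inferred from the table values, not stated in the source.
rel_spread() {
    awk -v a="$1" -v b="$2" -v c="$3" 'BEGIN {
        a += 0; b += 0; c += 0          # force numeric comparison
        max = a; min = a
        if (b > max) max = b
        if (c > max) max = c
        if (b < min) min = b
        if (c < min) min = c
        printf "%.6f\n", (max - min) / max
    }'
}

# ch_ppocr_mobile_v2.0_det, small dataset, float32 (table value: 0.020940758)
rel_spread 53.343 53.914 52.785   # prints 0.020941
```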