CN114821660A

CN114821660A - Pedestrian detection inference method based on an embedded device

Info

Publication number: CN114821660A
Application number: CN202210512803.XA
Authority: CN
Inventors: 陈其宾; 李锐; 张晖
Original assignee: Shandong Inspur Science Research Institute Co Ltd
Current assignee: Shandong Inspur Science Research Institute Co Ltd
Priority date: 2022-05-12
Filing date: 2022-05-12
Publication date: 2022-07-29

Abstract

The invention provides a pedestrian detection inference method based on an embedded device, used to run a computation-intensive pedestrian detection model on a low-power embedded device. It uses an edge deep-learning device based on the RISC-V architecture: an MCU development board serves as the hardware platform, the T-Head (Pingtouge) wujian100 open-source IP serves as the MCU core, and a serial port, an HDMI interface, and an OV5640 camera are on board. The method comprises: obtaining training data and training the pedestrian detection model MobileNetV1-SSD; calculating the quantization factor of the model weights; calculating the activation quantization factor of each layer by minimizing the mean square error; quantizing each operator in the model, with floating-point model weights quantized to the int8 data type and activation values quantized to the uint8 data type; performing model inference and inverse quantization, with inference results inverse-quantized to the int32 data type; and compiling and running the model on the MCU development board.

Description

Pedestrian detection inference method based on an embedded device

Technical Field

The invention relates to a pedestrian detection inference method based on an embedded device, and belongs to the technical field of pedestrian detection.

Background

In recent years, neural network models have been widely applied in many fields with very good results, especially in pedestrian detection. However, pedestrian detection networks are complex and large, so inference is inefficient and slow, particularly on low-performance mobile devices and low-power devices. Designing a model that consumes few resources, predicts in real time, and still preserves prediction accuracy has therefore become a practical problem. Low-power devices such as MCUs require models with low resource consumption, and many MCUs do not support floating-point arithmetic, which further limits where models can be deployed. Model quantization addresses these problems well: quantizing a model from floating-point to fixed-point types reduces model size, speeds up inference, and widens the range of embedded devices that can run the model.

Summary of the Invention

The purpose of the present invention is to provide a pedestrian detection inference method based on an embedded device that preserves model accuracy while improving inference speed by calculating the quantization factors in advance.

To achieve this purpose, the present invention adopts the following technical solution:

A pedestrian detection inference method based on an embedded device, characterized by comprising the following steps:

1) Obtain training data and train a pedestrian detection model;

2) Calculate the quantization factor of the model weights: compute the maximum absolute value of the model weights, then derive the weight quantization factor from it and the quantization range;

3) Calculate the activation quantization factor of each layer by minimizing the mean square error: using part of the test data set, compute the mean square error between each layer's quantized output and its unquantized output, and take the activation quantization factor that minimizes this error;

4) Quantize each operator in the model using asymmetric quantization: floating-point model weights are quantized to the int8 data type and activation values to the uint8 data type;

5) Perform model inference and inverse quantization: run inference with the quantized fixed-point weights and activation values, and inverse-quantize the inference results to the int32 data type;

6) Compile and run the model on the MCU development board.
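As an illustration of step 2) and the weight part of step 4), the following minimal Python sketch derives a weight quantization factor from the maximum absolute weight and quantizes the weights to int8. The function names, the use of NumPy, the choice of 127 as the divisor, the scale-only mapping (no zero point), and the sample weights are assumptions for illustration; the source states only that the factor is derived from the maximum absolute weight and the quantization range.

```python
import numpy as np


def weight_quant_factor(weights: np.ndarray, qmax: int = 127) -> float:
    """Return a scale such that the largest |weight| maps to qmax.

    Assumption: the factor is max(|w|) / 127; the source names the inputs
    (maximum absolute value, quantization range) but not the exact formula.
    """
    max_abs = float(np.max(np.abs(weights)))
    # Guard against an all-zero tensor, where any scale would do.
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize_weights(weights: np.ndarray, scale: float) -> np.ndarray:
    """Round weights to the nearest integer step and store them as int8."""
    q = np.rint(weights / scale)
    return np.clip(q, -128, 127).astype(np.int8)


# Hypothetical weights, for illustration only.
w = np.array([-0.50, 0.10, 0.25, 1.27], dtype=np.float32)
scale = weight_quant_factor(w)
q_w = quantize_weights(w, scale)
print(scale, q_w)
```

Because the factor depends only on the trained weights, it can be computed once before deployment, which is consistent with the stated aim of calculating quantization factors in advance.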

Preferably, the mean square error is computed using the following definitions:

α = r/255

where y_i and ŷ_i denote the unquantized output and the quantized output, respectively; r > 0 is the quantization range; α is the quantization factor; clip denotes clipping the activation values to the range [-r, r]; and rounding denotes rounding a floating-point number to the nearest integer.
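One way to read these definitions is sketched below in Python: for each candidate range r, activations are clipped, divided by α = r/255, rounded, and scaled back, and the r with the smallest mean square error against the unquantized values is kept. The combined quantize-then-rescale expression, the grid search over candidates, the function names, and the synthetic data are all assumptions; the source gives only α = r/255 and the clip and rounding operations.

```python
from typing import Tuple

import numpy as np


def fake_quantize(y: np.ndarray, r: float) -> np.ndarray:
    """Clip to [-r, r], round to integer steps of alpha = r / 255, rescale.

    For non-negative activations the integer codes fall in [0, 255],
    which matches the uint8 activation type named in the source.
    """
    alpha = r / 255.0
    q = np.rint(np.clip(y, -r, r) / alpha)
    return q * alpha


def search_activation_range(y: np.ndarray,
                            num_candidates: int = 100) -> Tuple[float, float]:
    """Grid-search r in (0, max|y|] for the smallest mean square error.

    Returns (r, alpha). The grid search itself is an assumed strategy.
    """
    max_abs = float(np.max(np.abs(y)))
    if max_abs == 0.0:
        raise ValueError("activations are all zero; no range to search")
    best_r, best_mse = max_abs, float("inf")
    for k in range(1, num_candidates + 1):
        r = max_abs * k / num_candidates
        mse = float(np.mean((y - fake_quantize(y, r)) ** 2))
        if mse < best_mse:
            best_r, best_mse = r, mse
    return best_r, best_r / 255.0


# Synthetic stand-in for one layer's outputs on a calibration subset.
rng = np.random.default_rng(0)
y = np.abs(rng.normal(size=10_000))
r, alpha = search_activation_range(y)
print(r, alpha)
```

In a full pipeline this search would run once per layer on the calibration subset, and only the resulting factors would be shipped to the device.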

Preferably, the pedestrian detection model adopts the lightweight network MobileNetV1-SSD.

Preferably, the quantization range used for the quantization factor is [-128, 127].

The advantage of the present invention is that the activation quantization factor of each layer is obtained by minimizing the mean square error. This preserves model accuracy, and because the quantization factors are calculated in advance, inference is faster; the method can be applied to pedestrian detection inference. In addition, running the pedestrian detection inference model on an MCU development board reduces power consumption.

Description of the Drawings

The accompanying drawings provide a further understanding of the present invention and constitute a part of the specification. Together with the embodiments, they serve to explain the present invention and do not limit it.

FIG. 1 is a schematic flow diagram of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

A pedestrian detection inference method based on an embedded device, used to run a computation-intensive pedestrian detection model on a low-power embedded device. The method uses an edge deep-learning device based on the RISC-V architecture, with a Xilinx MCU development board as the hardware platform and the T-Head (Pingtouge) wujian100 open-source IP as the MCU core. The board carries a serial port, an HDMI interface, and an OV5640 camera. The camera captures image data in real time, the pedestrian detection inference model runs on the MCU, and the detection results are output to peripheral devices through the serial port and HDMI.

1) Obtain training data and train the pedestrian detection model. The model adopts the lightweight network MobileNetV1-SSD.

2) Calculate the quantization factor of the model weights: compute the maximum absolute value of the model weights, then derive the weight quantization factor from it and the quantization range. The model weights are quantized to the int8 type, so the quantization range is [-128, 127].

3) Calculate the activation quantization factor of each layer by minimizing the mean square error: using part of the test data set, compute the mean square error between each layer's quantized output and its unquantized output, and take the activation quantization factor that minimizes this error. The mean square error uses the following definitions: y_i and ŷ_i denote the unquantized output and the quantized output, respectively; r (r > 0) is the quantization range; α is the quantization factor; clip denotes clipping the activation values to the range [-r, r]; and rounding denotes rounding a floating-point number to the nearest integer.

α = r/255

4) Quantize each operator in the model using asymmetric quantization: floating-point model weights are quantized to the int8 data type and activation values to the uint8 data type.

5) Perform model inference and inverse quantization: run inference with the quantized fixed-point weights and activation values, and inverse-quantize the inference results to the int32 data type. The activation quantization factor and the inverse quantization factor enter the computation as bit shifts, which avoids floating-point arithmetic.

6) Compile and run the model on the MCU development board. Image data captured by the camera is preprocessed and passed to the model, and the pedestrian detection results are output to peripheral devices through the serial port and HDMI.
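Steps 4) and 5) can be illustrated with a small integer-only sketch in Python: int8 weights and uint8 activations are multiplied and accumulated in a 32-bit integer, and the floating-point rescale factor is replaced by an integer multiplier followed by a right shift. The 16-bit shift, the multiplier representation, the function names, and the sample values (including the factor 0.0123) are hypothetical; the source says only that the factors participate in the computation by shifting.

```python
import numpy as np


def to_fixed_point(factor: float, shift: int = 16) -> int:
    """Offline step: approximate `factor` as multiplier / 2**shift."""
    return int(round(factor * (1 << shift)))


def int_dot(q_w: np.ndarray, q_x: np.ndarray) -> int:
    """Multiply int8 weights by uint8 activations, accumulating in int32."""
    acc = np.dot(q_w.astype(np.int32), q_x.astype(np.int32))
    return int(acc)


def rescale(acc: int, multiplier: int, shift: int = 16) -> int:
    """On-device step: integer multiply, add half a step so the shift
    rounds to nearest, then arithmetic right shift. No floats involved.

    In C on an MCU the product would need a 64-bit intermediate.
    """
    return (acc * multiplier + (1 << (shift - 1))) >> shift


# Hypothetical quantized tensors and combined scale factor.
q_w = np.array([-50, 10, 25, 127], dtype=np.int8)
q_x = np.array([255, 0, 128, 64], dtype=np.uint8)
acc = int_dot(q_w, q_x)
multiplier = to_fixed_point(0.0123)  # computed ahead of time, off-device
out = rescale(acc, multiplier)
print(acc, multiplier, out)
```

Only `int_dot` and `rescale` would need a counterpart in the firmware; `to_fixed_point` uses floating-point arithmetic but runs ahead of time, so a core without floating-point support never executes it.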

Claims

1. A pedestrian detection inference method based on an embedded device, characterized by comprising the following steps:

1) Obtain training data and train a pedestrian detection model;

2) Calculate the quantization factor of the model weights: compute the maximum absolute value of the model weights, then derive the weight quantization factor from it and the quantization range;

3) Calculate the activation quantization factor of each layer by minimizing the mean square error: using part of the test data set, compute the mean square error between each layer's quantized output and its unquantized output, and take the activation quantization factor that minimizes this error;

4) Quantize each operator in the model using asymmetric quantization: floating-point model weights are quantized to the int8 data type and activation values to the uint8 data type;

5) Perform model inference and inverse quantization: run inference with the quantized fixed-point weights and activation values, and inverse-quantize the inference results to the int32 data type;

6) Compile and run the model on the MCU development board.

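2. The pedestrian detection inference method based on an embedded device according to claim 1, wherein the mean square error is computed using the following definitions:

α = r/255

where y_i and ŷ_i denote the unquantized output and the quantized output, respectively; r > 0 is the quantization range; α is the quantization factor; clip denotes clipping the activation values to the range [-r, r]; and rounding denotes rounding a floating-point number to the nearest integer.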

3. The pedestrian detection inference method based on an embedded device according to claim 1, wherein the pedestrian detection model adopts the lightweight network MobileNetV1-SSD.

4. The pedestrian detection inference method based on an embedded device according to claim 1, wherein the quantization range used for the quantization factor is [-128, 127].