CN114821660A - A pedestrian detection and reasoning method based on embedded devices - Google Patents
A pedestrian detection and reasoning method based on embedded devices
- Publication number
- CN114821660A (application number CN202210512803.XA)
- Authority
- CN
- China
- Prior art keywords
- model
- quantization
- pedestrian detection
- calculate
- method based
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
The invention provides a pedestrian detection inference method based on an embedded device, used to run a compute-intensive pedestrian detection model on a low-power embedded device. It uses an edge deep learning device based on the RISC-V architecture, with an MCU development board as the hardware platform and the T-Head (平头哥) wujian100 open-source IP as the MCU core; the board carries a serial port, an HDMI interface and an OV5640 camera. The method obtains training data and trains the pedestrian detection model MobileNetV1-SSD; computes the quantization factor of the model weights; computes the activation quantization factor of each layer by minimizing the mean square error; quantizes each operator in the model, quantizing the floating-point weights to the int8 data type and the activations to the uint8 data type; performs model inference and inverse quantization, inverse-quantizing the inference result to the int32 data type; and compiles and runs the model on the MCU development board.
Description
Technical Field
The invention relates to a pedestrian detection inference method based on an embedded device, and belongs to the technical field of pedestrian detection.
Background
In recent years, neural network models have been widely applied in many fields with very good results, particularly in pedestrian detection. However, pedestrian detection networks are complex and large, so inference is inefficient and slow, especially on low-performance mobile devices and low-power devices. Designing a model that consumes few resources, predicts in real time, and still preserves prediction accuracy has therefore become a practical problem. Low-power devices such as MCUs need models with low resource consumption, and many MCUs do not support floating-point arithmetic, which limits where such models can be deployed. Model quantization addresses these problems well: quantizing a model from floating-point to fixed-point types reduces model size, speeds up inference, and widens the range of embedded devices that can run the model.
Summary of the Invention
The purpose of the invention is to provide a pedestrian detection inference method based on an embedded device that preserves model accuracy and, by computing the quantization factors in advance, improves inference speed.
To achieve this purpose, the invention adopts the following technical solution:
1. A pedestrian detection inference method based on an embedded device, characterized by comprising the following steps:
1) Obtain training data and train a pedestrian detection model;
2) Compute the quantization factor of the model weights: find the maximum absolute value of the model weights and compute the weight quantization factor from the quantization range;
3) Compute the activation quantization factor of each layer by minimizing the mean square error: using part of the test dataset, compute the mean square error between each layer's quantized output and its unquantized output, and take the quantization factor that minimizes this error;
4) Quantize each operator in the model using asymmetric quantization: the floating-point model weights are quantized to the int8 data type and the activations to the uint8 data type;
5) Model inference and inverse quantization: run inference with the quantized fixed-point weights and activations, and inverse-quantize the inference result to the int32 data type;
6) Compile and run the model on the MCU development board.
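Step 2 can be illustrated with a minimal Python sketch. It assumes a single per-tensor scale equal to max|w| divided by 127 with no zero point, which is one common reading of "maximum absolute value plus quantization range"; the function names `weight_scale` and `quantize_weights` are hypothetical and do not come from the patent.

```python
def weight_scale(weights, qmax=127):
    """Scale that maps the largest |w| onto the int8 limit (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize_weights(weights, scale, qmin=-128, qmax=127):
    """Round w / scale to the nearest integer and clamp into [qmin, qmax]."""
    return [max(qmin, min(qmax, round(w / scale))) for w in weights]


weights = [0.5, -1.27, 0.02, 1.0]      # illustrative float weights
scale = weight_scale(weights)          # roughly 0.01
q = quantize_weights(weights, scale)   # [50, -127, 2, 100]
```

Because the scale depends only on the trained weights, it can be computed once offline and stored with the model, which matches the stated goal of computing quantization factors in advance.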
Preferably, the mean square error formula is as follows:
α = r/255
where y_i and ŷ_i denote the unquantized output and the quantized output respectively, r > 0 is the quantization range, α is the quantization factor, clip clips the activation values to the range [-r, r], and rounding rounds a floating-point number to the nearest integer.
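Only the expression for α survives in this text. One form of the mean square error that is consistent with the symbols defined above is sketched below; the sample count N and the exact arrangement of clip and rounding are assumptions, not a quotation of the patent's formula.

```latex
\mathrm{MSE}(r) = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2,
\qquad
\hat{y}_i = \alpha \cdot \operatorname{round}\!\left(\frac{\operatorname{clip}(y_i,\,-r,\,r)}{\alpha}\right),
\qquad
\alpha = \frac{r}{255}
```

Under this reading, the calibration picks the range r (and hence α) for which MSE(r) is smallest on the calibration samples.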
Preferably, the pedestrian detection model uses the lightweight network MobileNetV1-SSD.
Preferably, the quantization range of the quantization factor is [-128, 127].
The advantage of the invention is that the activation quantization factor of each layer is obtained by minimizing the mean square error. This preserves model accuracy, and because the quantization factors are computed in advance, inference is faster; the method is applicable to pedestrian detection inference. In addition, running the pedestrian detection inference model on an MCU development board reduces power consumption.
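The calibration idea can be sketched in Python as follows. The patent does not say how the minimization is carried out, so the grid search over candidate ranges, the candidate count, and the names `fake_quantize`, `mse` and `calibrate_range` are all assumptions made for illustration; the clip range [-r, r] and α = r/255 follow the definitions given with the formula.

```python
def fake_quantize(y, r, levels=255):
    """Clip y to [-r, r], round onto a grid of step alpha = r / levels, map back."""
    alpha = r / levels
    clipped = max(-r, min(r, y))
    return alpha * round(clipped / alpha)


def mse(outputs, r):
    """Mean square error between unquantized outputs and their quantized versions."""
    return sum((y - fake_quantize(y, r)) ** 2 for y in outputs) / len(outputs)


def calibrate_range(outputs, num_candidates=100):
    """Grid-search r in (0, max|y|] and return (best_r, alpha) minimizing the MSE."""
    max_abs = max(abs(y) for y in outputs)
    if max_abs == 0:
        raise ValueError("all outputs are zero; no range to calibrate")
    candidates = [max_abs * k / num_candidates for k in range(1, num_candidates + 1)]
    best_r = min(candidates, key=lambda r: mse(outputs, r))
    return best_r, best_r / 255


# Illustrative layer outputs: many small activations plus one outlier.
outputs = [0.1 * k for k in range(50)] + [40.0]
best_r, alpha = calibrate_range(outputs)
```

The search runs once per layer on a calibration subset, so its cost is paid offline and only the resulting α needs to be stored on the device.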
Description of the Drawings
The accompanying drawings provide a further understanding of the invention and form part of the specification. Together with the embodiments they serve to explain the invention and do not limit it.
FIG. 1 is a schematic diagram of the process flow of the invention.
Detailed Description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
A pedestrian detection inference method based on an embedded device, used to run a compute-intensive pedestrian detection model on a low-power embedded device. The method uses an edge deep learning device based on the RISC-V architecture, with a Xilinx MCU development board as the hardware platform and the T-Head (平头哥) wujian100 open-source IP as the MCU core. The board carries a serial port, an HDMI interface and an OV5640 camera. The camera captures image data in real time, the pedestrian detection inference model runs on the MCU, and the detection results are output to peripheral devices through the serial port and HDMI.
1) Obtain training data and train the pedestrian detection model; the model uses the lightweight network MobileNetV1-SSD.
2) Compute the quantization factor of the model weights: find the maximum absolute value of the model weights and compute the weight quantization factor from the quantization range. The weights are quantized to the int8 type, so the quantization range is [-128, 127].
3) Compute the activation quantization factor of each layer by minimizing the mean square error: using part of the test dataset, compute the mean square error between each layer's quantized output and its unquantized output, and take the quantization factor that minimizes this error. The mean square error formula is as follows, where y_i and ŷ_i denote the unquantized output and the quantized output respectively, r (r > 0) is the quantization range, α is the quantization factor, clip clips the activation values to the range [-r, r], and rounding rounds a floating-point number to the nearest integer.
α = r/255
4) Quantize each operator in the model using asymmetric quantization: the floating-point model weights are quantized to the int8 data type and the activations to the uint8 data type.
5) Model inference and inverse quantization: run inference with the quantized fixed-point weights and activations, and inverse-quantize the inference result to the int32 data type. The activation quantization factor and the inverse quantization factor enter the computation as bit shifts, which avoids floating-point arithmetic.
6) Compile and run the model on the MCU development board: image data captured by the camera is preprocessed and passed to the model, and the pedestrian detection results are output to peripheral devices through the serial port and HDMI.
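Step 5 says the quantization factors take part in the computation through shifts so that no floating-point arithmetic is needed on the MCU. The Python sketch below shows one common way to do this, a fixed-point multiplier followed by a rounding right shift. This specific scheme, the 15-bit shift, the example multiplier 0.0123, and the names `to_fixed_point`, `rescale` and `int_dot` are assumptions for illustration, not details taken from the patent.

```python
def to_fixed_point(multiplier, shift=15):
    """Offline step: approximate a positive real multiplier as m / 2**shift."""
    return round(multiplier * (1 << shift)), shift


def rescale(acc, m, shift):
    """On-device step: integer multiply, add half for rounding, arithmetic right shift."""
    return (acc * m + (1 << (shift - 1))) >> shift


def int_dot(weights_q, activations_q):
    """Integer dot product of int8 weights and uint8 activations.

    On an MCU the running sum would live in an int32 accumulator.
    """
    return sum(w * a for w, a in zip(weights_q, activations_q))


weights_q = [50, -127, 2, 100]            # int8 values
activations_q = [10, 3, 255, 0]           # uint8 values
acc = int_dot(weights_q, activations_q)   # 500 - 381 + 510 + 0 = 629
m, shift = to_fixed_point(0.0123)         # m = 403, shift = 15
out = rescale(acc, m, shift)              # 8, same as round(629 * 0.0123)
```

Only `to_fixed_point` touches a floating-point value, and it can run on the host before deployment; the two on-device functions use integer multiplies, adds and shifts only.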
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210512803.XA CN114821660A (en) | 2022-05-12 | 2022-05-12 | A pedestrian detection and reasoning method based on embedded devices |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210512803.XA CN114821660A (en) | 2022-05-12 | 2022-05-12 | A pedestrian detection and reasoning method based on embedded devices |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114821660A | 2022-07-29 |
Family
ID=82513753
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210512803.XA Withdrawn CN114821660A (en) | 2022-05-12 | 2022-05-12 | A pedestrian detection and reasoning method based on embedded devices |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114821660A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113947177A (en) * | 2020-07-15 | 2022-01-18 | 安徽寒武纪信息科技有限公司 | Quantization calibration method, calculation device and computer readable storage medium |
CN111950715A (en) * | 2020-08-24 | 2020-11-17 | 云知声智能科技股份有限公司 | 8-bit integer full-quantization inference method and device based on self-adaptive dynamic shift |
CN111950716A (en) * | 2020-08-25 | 2020-11-17 | 云知声智能科技股份有限公司 | Quantification method and system for optimizing int8 |
CN112926415A (en) * | 2021-02-05 | 2021-06-08 | 西安电子科技大学 | Pedestrian avoiding system and pedestrian monitoring method |
CN114021691A (en) * | 2021-10-13 | 2022-02-08 | 山东浪潮科学研究院有限公司 | Neural network model quantification method, system, device and computer readable medium |
CN114418062A (en) * | 2021-12-25 | 2022-04-29 | 山东云海国创云计算装备产业创新中心有限公司 | Method, system, device and storage medium for deep convolutional neural network quantization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
WW01 | Invention patent application withdrawn after publication |
Application publication date: 20220729 |