CN115082751A - Improved YOLOv4-based mobile robot target detection method

Improved YOLOv4-based mobile robot target detection method

Info

Publication number
CN115082751A
CN115082751A
Authority
CN
China
Prior art keywords
target detection
mobile robot
model
improved
yolov4
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210496122.9A
Other languages
Chinese (zh)
Inventor
刘钢
胡艳鑫
郭建伟
陈志雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changchun University of Technology
Original Assignee
Changchun University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changchun University of Technology filed Critical Changchun University of Technology
Priority to CN202210496122.9A priority Critical patent/CN115082751A/en
Publication of CN115082751A publication Critical patent/CN115082751A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a mobile robot target detection method based on an improved YOLOv4, in which the target detection model is obtained as follows: (1) acquire the DJI ROCO data set, apply data enhancement processing to it, and divide it into a training set and a test set; (2) input the training set into the improved YOLOv4 network model for model training; (3) quantize the trained optimal model; (4) input the test set into the quantized optimal model for testing, obtaining the trained mobile robot target detection model. When performing target detection, pictures of moving robots are collected in real time, and robot targets are detected in these pictures with the trained mobile robot target detection model based on the improved YOLOv4.

Description

Improved YOLOv4-based mobile robot target detection method
Technical Field
The invention relates to the fields of computer vision and image processing, in particular to a mobile robot target detection method based on an improved YOLOv4.
Background
Since 2017, the RoboMaster competition committee of DJI has organized the RoboMaster AI Challenge, a competition aimed specifically at the field of mobile robots as a new robotics platform. The challenge brings together enthusiasts worldwide to study robot technology based on deep neural networks, in the hope of applying the results to industries such as field rescue, autonomous driving and automated logistics, to the benefit of human life. In the DJI-organized RoboMaster AI Challenge, a single robot must be able to detect, track and shoot at a target, which requires that the mobile robot detect target objects correctly and quickly. How to realize, on a wheeled mobile robot, a target detection algorithm that combines detection quality with runtime performance has become key to winning. Since the target detection algorithm is deployed on the wheeled mobile robot's on-board device, limited computing power becomes the key factor limiting detection performance.
Target detection algorithms have developed rapidly in recent years and can be roughly divided into two categories. One is the two-stage target detection algorithms, represented mainly by R-CNN, Fast R-CNN and Faster R-CNN. These algorithms first generate a series of candidate boxes from a region proposal network; the candidate boxes frame the locations to be examined, and a convolutional neural network then precisely localizes and classifies the framed regions. Because a candidate region is generated before classification and regression, detection accuracy is higher but detection time is longer. The other is the single-stage target detection algorithms, represented mainly by the YOLO series and SSD. These need no candidate regions: a convolutional neural network directly extracts features from the input image and predicts the position of the detection target. This end-to-end method shortens detection time but sacrifices some detection accuracy.
Both two-stage and single-stage target detection algorithms are well developed on high-performance devices, but running them requires large computing resources, so they are very inefficient on some mobile robot devices. Some lightweight target detection algorithms, such as YOLOv3_tiny and YOLOv4_tiny, have been proposed; these lightweight networks run well on mobile robot devices, but their recognition accuracy is much lower than that of non-lightweight network models, and they cannot complete complex target detection tasks.
Disclosure of Invention
In view of the shortcomings of the prior art, the present invention provides in a first aspect a mobile robot target detection model based on improved YOLOv4, comprising:
(1) acquiring a DJI ROCO data set, performing data enhancement processing on the data set, and dividing the data set into a training set and a test set;
(2) inputting the training set into an improved YOLOv4 network model for model training;
(3) carrying out quantization processing on the trained optimal model;
(4) inputting the test set into the quantized optimal model for testing to obtain the trained mobile robot target detection model.
A second aspect of the invention provides a mobile robot target detection method based on the improved YOLOv4, which comprises: acquiring pictures of moving robots in real time, and carrying out robot target detection on the pictures according to the trained mobile robot target detection model based on the improved YOLOv4.
A third aspect of the invention provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the improved YOLOv4-based mobile robot object detection model.
A fourth aspect of the invention provides a mobile robot, wherein the mobile robot target detection model based on the improved YOLOv4 is mounted on a miniPC device of the mobile robot;
in the moving process of the mobile robot, the mobile robot target detection method based on the improved YOLOv4 is adopted to realize target detection.
Compared with the prior art, the invention has the following characteristics:
(1) data enhancement expands the amount of data, so that the trained target detection model is more robust;
(2) GhostNet replaces the YOLOv4 backbone feature extraction network, compressing the model size, and GhostNet feature layers are introduced into the enhanced feature extraction network to preserve accuracy;
(3) depthwise separable convolutions replace the original standard convolutions in the enhanced feature extraction network, greatly reducing the number of parameters and the amount of computation, improving model performance while retaining detection accuracy;
(4) post-training static quantization further reduces floating-point operations and speeds up inference, making deployment on robot devices friendly.
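Feature (3) above trades a standard convolution for a depthwise convolution followed by a 1x1 pointwise convolution. The parameter saving can be checked with simple arithmetic; the sketch below is illustrative only (the 3x3, 256-to-512-channel layer is an assumed example, not a layer taken from the patent):

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weight count of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def dsconv_params(k: int, c_in: int, c_out: int) -> int:
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1x1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

# Assumed example: a 3x3 convolution mapping 256 -> 512 channels.
std = conv_params(3, 256, 512)    # 1,179,648 weights
dsc = dsconv_params(3, 256, 512)  #   133,376 weights
print(std, dsc, round(std / dsc, 1))  # 1179648 133376 8.8
```

For a k x k kernel the reduction approaches a factor of k*k as the channel counts grow, which is where the "greatly reduced parameter quantity" claimed above comes from.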
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a diagram of the improved YOLOv4 model structure.
FIG. 3 is a diagram illustrating the detection results of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1
The training environment selected in this embodiment runs the Ubuntu 18.04 operating system; the deep learning framework of the network model is PyTorch; the hardware is an Intel(R) Core(TM) i5-11260H @ 2.60GHz with an Nvidia GTX 1070 graphics card, configured with 6GB of video memory and 16GB of RAM. The selected test environment is a Dell OptiPlex 3060, a miniPC device that can be mounted on a mobile robot; its CPU is an Intel(R) Pentium(R) Gold G5400T CPU @ 3.10GHz, and it is deployed with the Windows 10 operating system and the PyTorch framework.
As shown in fig. 1, a mobile robot object detection model based on improved YOLOv4 includes:
(1) acquiring a DJI ROCO data set, performing data enhancement processing on the data set, and dividing the data set into a training set and a test set;
specifically, the DJI ROCO data set is expanded threefold by applying randomly combined augmentations to its pictures: adding Gaussian noise, changing brightness, and translating, rotating, flipping and scaling the images;
the expanded data set is divided into a training set and a test set at a ratio of 9:1.
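The augmentation-and-split step can be sketched as follows. This is a minimal NumPy illustration under assumptions of our own: the augmentation magnitudes, the reading of "expanded threefold" as originals plus two augmented copies each, and the helper names are all hypothetical, not from the patent. In real use the bounding-box labels would have to be transformed together with each image.

```python
import numpy as np

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random combination of the six augmentations named in step (1)."""
    ops = [
        lambda x: np.clip(x + rng.normal(0.0, 10.0, x.shape), 0, 255),  # add Gaussian noise
        lambda x: np.clip(x * rng.uniform(0.6, 1.4), 0, 255),           # change brightness
        lambda x: np.roll(x, int(rng.integers(-8, 9)), axis=1),         # translate
        lambda x: np.rot90(x, k=int(rng.integers(1, 4))).copy(),        # rotate
        lambda x: x[:, ::-1].copy(),                                    # flip
        lambda x: np.repeat(np.repeat(x, 2, 0), 2, 1)[: x.shape[0], : x.shape[1]],  # scale (2x zoom crop)
    ]
    for i in rng.choice(len(ops), size=int(rng.integers(1, 4)), replace=False):
        img = ops[i](img)
    return img

def expand_and_split(images, ratio=0.9, seed=0):
    """Expand the set threefold (originals + two augmented copies each),
    then shuffle and split into training and test sets at a 9:1 ratio."""
    rng = np.random.default_rng(seed)
    expanded = list(images) + [augment(im, rng) for im in images for _ in range(2)]
    order = rng.permutation(len(expanded))
    cut = int(ratio * len(expanded))
    return [expanded[i] for i in order[:cut]], [expanded[i] for i in order[cut:]]

train, test = expand_and_split([np.full((64, 64), 128.0) for _ in range(10)])
print(len(train), len(test))  # 27 3
```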
(2) Inputting the training set into an improved YOLOv4 network model for model training;
as shown in fig. 2, the method for improving the YOLOv4 network model is as follows:
deleting the original backbone feature extraction network CSPDarknet53 of the YOLOv4 network model and replacing it with GhostNet; specifically, first find the effective feature layers in GhostNet whose heights and widths are the same as those of CSPDarknet53, then substitute these layers for the original CSPDarknet53 feature layers;
and introducing a GhostNet feature layer into PANet, the enhanced feature extraction network of the YOLOv4 network model.
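The backbone swap amounts to feeding PANet from the three GhostNet stages whose output sizes match what CSPDarknet53 produced: for a 416x416 input, feature maps of 52x52, 26x26 and 13x13 (strides 8, 16 and 32). The PyTorch sketch below uses a small stand-in backbone just to show the shape matching; the stage channel widths (40, 112, 160) follow GhostNet's published configuration, but the module itself is a toy, not the patent's network:

```python
import torch
import torch.nn as nn

class StandInBackbone(nn.Module):
    """Toy backbone whose three outputs have the heights/widths (and, for
    illustration, the channel widths) of the GhostNet effective feature
    layers that replace CSPDarknet53's outputs to PANet."""
    def __init__(self):
        super().__init__()
        down = lambda ci, co: nn.Conv2d(ci, co, 3, stride=2, padding=1)
        self.to_s8 = nn.Sequential(down(3, 16), down(16, 24), down(24, 40))  # stride 8
        self.to_s16 = down(40, 112)                                          # stride 16
        self.to_s32 = down(112, 160)                                         # stride 32

    def forward(self, x):
        p3 = self.to_s8(x)    # 52x52 -> first PANet input
        p4 = self.to_s16(p3)  # 26x26 -> second PANet input
        p5 = self.to_s32(p4)  # 13x13 -> third PANet input
        return p3, p4, p5

p3, p4, p5 = StandInBackbone()(torch.zeros(1, 3, 416, 416))
print(tuple(p3.shape), tuple(p4.shape), tuple(p5.shape))
# (1, 40, 52, 52) (1, 112, 26, 26) (1, 160, 13, 13)
```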
(3) Carrying out quantization processing on the trained optimal model:
step 3.1, add QuantStub and DeQuantStub modules to specify where the activation values are explicitly quantized and dequantized;
step 3.2, fuse the Ghost module operation and the Batch Normalization operation in the G-bneck (Ghost bottleneck):
firstly, the BN layer normalizes its input value and then scales and shifts the normalized result, so the BN layer can be regarded as a 1x1 convolutional layer; secondly, the input values and weight values of all the convolutional layers are explicitly quantized, mapped to integers in the numerical range (0, 255), and the results are dequantized after the convolution calculation is completed; finally, the weight of the quantized convolutional layer is multiplied by the weight of the quantized BN layer to form the fused weight, and the weight of the BN layer multiplied by the bias of the Ghost module, plus the bias of the BN layer, forms the fused bias, which completes the fusion operation;
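The fusion in step 3.2 folds the BN scale into the convolution weight and the BN shift into the bias; since a 1x1 convolution acts at each position like a matrix multiply, the identity can be checked numerically. A NumPy sketch (the shapes and values are arbitrary test data, not the patent's layers):

```python
import numpy as np

rng = np.random.default_rng(0)
c_in, c_out, n = 4, 3, 8
x = rng.normal(size=(n, c_in))            # n spatial positions of a 1x1 conv input
w = rng.normal(size=(c_out, c_in))        # 1x1 conv weight
b = rng.normal(size=c_out)                # conv bias (the "Ghost module" bias)
gamma, beta = rng.normal(size=c_out), rng.normal(size=c_out)   # BN scale / shift
mean, var, eps = rng.normal(size=c_out), rng.uniform(0.5, 2.0, c_out), 1e-5

# Unfused: convolution, then BN (normalize, then scale and shift).
y_unfused = gamma * ((x @ w.T + b) - mean) / np.sqrt(var + eps) + beta

# Fused: one convolution with folded weight and bias.
s = gamma / np.sqrt(var + eps)            # the BN layer seen as a 1x1 conv weight
w_fused = w * s[:, None]
b_fused = s * (b - mean) + beta
y_fused = x @ w_fused.T + b_fused

print(np.allclose(y_unfused, y_fused))  # True
```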
step 3.3, calibrate using asymmetric quantization and the L2Norm calibration technique;
step 3.4, convert the model using the torch.quantization.convert() module; specifically, each activation tensor is quantized from the FP32 type to an INT type using the corresponding scale and zero-point values.
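Steps 3.3 and 3.4 rest on the asymmetric (affine) mapping of floats to the integer range (0, 255) via a scale and a zero-point. The sketch below derives the parameters from an observed min/max range; note this simple observer is a stand-in for the L2Norm calibration named above, which instead searches for the clipping range minimizing the L2 error (an assumption of this sketch, not the patent's exact procedure):

```python
import numpy as np

def asym_qparams(x_min: float, x_max: float, qmin: int = 0, qmax: int = 255):
    """Scale and zero-point for asymmetric quantization of [x_min, x_max]."""
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)  # range must contain 0.0
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(np.clip(round(qmin - x_min / scale), qmin, qmax))
    return scale, zero_point

def quantize(x, scale, zp):
    return np.clip(np.round(x / scale) + zp, 0, 255).astype(np.uint8)

def dequantize(q, scale, zp):
    return scale * (q.astype(np.float64) - zp)

x = np.linspace(-1.0, 3.0, 101)           # a mock calibration activation tensor
scale, zp = asym_qparams(x.min(), x.max())
x_hat = dequantize(quantize(x, scale, zp), scale, zp)
print(zp, float(np.abs(x_hat - x).max()) <= scale / 2)  # 64 True
```

Because the zero-point is an exact integer, the value 0.0 round-trips with no error, which is why affine schemes suit zero-padded activations.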
(4) Inputting the test set into the quantized optimal model for testing to obtain a trained mobile robot target detection model;
For training, the input picture size is 416, the maximum number of epochs is 300, and the initial learning rate is 8e-3; the learning rate is decayed by a factor of 10 every 30 epochs so that the loss function converges further, and the best-performing parameters are selected through repeated training.
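The schedule described above (initial rate 8e-3 decayed tenfold every 30 epochs over a 300-epoch run) can be written as a step function; a small sketch, with the tenfold factor read from the text:

```python
def lr_at(epoch: int, base_lr: float = 8e-3, step: int = 30, factor: float = 0.1) -> float:
    """Step learning-rate schedule: multiply by `factor` every `step` epochs."""
    return base_lr * factor ** (epoch // step)

# First epoch of each 30-epoch stage over the 300-epoch run.
for e in range(0, 300, 30):
    print(e, lr_at(e))
```

In PyTorch, the document's framework, the equivalent built-in would be `torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)`.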
In order to fully analyze the detection performance of the model, comparison experiments were performed on the trained improved YOLOv4 model and the YOLOv4_tiny and YOLOv4 models under the same experimental conditions; the detection results of the different models are shown in Table 1:
[Table 1: detection results of the improved YOLOv4, YOLOv4_tiny and YOLOv4 models; published as an image in the original document]
Experimental comparison shows that the improved YOLOv4 model has good overall performance: with the model size compressed to only about 1/8 that of YOLOv4, the mAP value is not lost much and remains far higher than that of YOLOv4_tiny, and on the mAP(0.75) index it even exceeds YOLOv4; meanwhile, the runtime performance of the improved YOLOv4 is far better than that of YOLOv4 and on the same order of magnitude as YOLOv4_tiny. After quantization, the inference speed clearly approaches that of the lightweight YOLOv4_tiny network. Taken together, the improved YOLOv4 greatly compresses the model size while substantially retaining accuracy, making the model convenient to deploy on a robot.
Fig. 3 shows the detection and identification results of the improved YOLOv4 model on test-set pictures from the DJI ROCO data set. The figure shows that the invention has good detection performance on small targets, under dim illumination, and with multiple targets.
Example 2
The embodiment provides a mobile robot target detection method based on improved YOLOv 4:
acquiring pictures of moving robots in real time, and carrying out robot target detection on the pictures according to the trained mobile robot target detection model based on the improved YOLOv4 of embodiment 1.
Example 3
The present embodiment provides a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the improved YOLOv4-based mobile robot object detection model of embodiment 1.
Example 4
The present embodiment provides a mobile robot: the mobile robot target detection model based on the improved YOLOv4 and described in embodiment 1 is mounted on a miniPC device of a mobile robot;
in the moving process of the mobile robot, the mobile robot target detection method based on the improved YOLOv4 described in embodiment 2 is adopted to realize target detection.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-non-transitory readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (7)

1. A mobile robot target detection model based on improved YOLOv4, comprising:
(1) acquiring a DJI ROCO data set, performing data enhancement processing on the data set, and dividing the data set into a training set and a test set;
(2) inputting the training set into an improved YOLOv4 network model for model training;
(3) carrying out quantization processing on the trained optimal model;
(4) inputting the test set into the quantized optimal model for testing to obtain the trained mobile robot target detection model.
2. The improved YOLOv4-based mobile robot target detection model of claim 1, wherein the data set is divided into a training set and a test set after data enhancement processing as follows:
the DJI ROCO data set is expanded threefold by applying randomly combined augmentations to its pictures: adding Gaussian noise, changing brightness, and translating, rotating, flipping and scaling the images;
the expanded data set is divided into a training set and a test set at a ratio of 9:1.
3. The improved YOLOv4-based mobile robot target detection model according to claim 1, wherein the improvement to the YOLOv4 network model is as follows:
deleting the original backbone feature extraction network CSPDarknet53 of the YOLOv4 network model and replacing it with GhostNet;
and introducing a GhostNet feature layer into PANet, the enhanced feature extraction network of the YOLOv4 network model.
4. The improved YOLOv4-based mobile robot target detection model according to claim 1, wherein the trained optimal model is quantized as follows:
step 3.1, add QuantStub and DeQuantStub modules to specify where the activation values are explicitly quantized and dequantized;
step 3.2, fuse the Ghost module operation and the Batch Normalization operation in the G-bneck (Ghost bottleneck):
firstly, the BN layer normalizes its input value and then scales and shifts the normalized result, so the BN layer can be regarded as a 1x1 convolutional layer; secondly, the input values and weight values of all the convolutional layers are explicitly quantized, mapped to integers in the numerical range (0, 255), and the results are dequantized after the convolution calculation is completed; finally, the weight of the quantized convolutional layer is multiplied by the weight of the quantized BN layer to form the fused weight, and the weight of the BN layer multiplied by the bias of the Ghost module, plus the bias of the BN layer, forms the fused bias, which completes the fusion operation;
step 3.3, calibrate using asymmetric quantization and the L2Norm calibration technique;
step 3.4, convert the model using the torch.quantization.convert() module.
5. A mobile robot target detection method based on improved YOLOv4 is characterized in that:
acquiring a moving robot picture in real time, and carrying out robot target detection on the robot picture according to the trained mobile robot target detection model based on improved YOLOv4 of any one of claims 1-4.
6. A non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the improved YOLOv4-based mobile robot target detection model of any one of claims 1-4.
7. A mobile robot, characterized in that the improved YOLOv4-based mobile robot target detection model of any one of claims 1-4 is mounted on a miniPC device of the mobile robot;
during its moving process, the mobile robot adopts the mobile robot target detection method based on the improved YOLOv4 as claimed in claim 5 to realize target detection.
CN202210496122.9A 2022-05-07 2022-05-07 Improved YOLOv4-based mobile robot target detection method Pending CN115082751A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210496122.9A CN115082751A (en) 2022-05-07 2022-05-07 Improved YOLOv4-based mobile robot target detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210496122.9A CN115082751A (en) 2022-05-07 2022-05-07 Improved YOLOv4-based mobile robot target detection method

Publications (1)

Publication Number Publication Date
CN115082751A true CN115082751A (en) 2022-09-20

Family

ID=83247118

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210496122.9A Pending CN115082751A (en) 2022-05-07 2022-05-07 Improved YOLOv4-based mobile robot target detection method

Country Status (1)

Country Link
CN (1) CN115082751A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668712A (en) * 2020-12-17 2021-04-16 云知声智能科技股份有限公司 Quantization optimization method based on 8-bit training
CN113449654A (en) * 2021-07-01 2021-09-28 南京航空航天大学 Intelligent canteen food detection method based on depth model and quantification technology
CN113822889A (en) * 2021-11-24 2021-12-21 江苏金恒信息科技股份有限公司 Method for detecting surface defects of hot-rolled steel plate
CN113837029A (en) * 2021-09-06 2021-12-24 苏州大学 Object identification method, system, terminal device and storage medium
CN113888477A (en) * 2021-09-13 2022-01-04 浙江大学 Network model training method, metal surface defect detection method and electronic equipment
CN114120206A (en) * 2021-12-08 2022-03-01 武汉中海庭数据技术有限公司 Vehicle end target detection method and device based on mobile end
CN114283321A (en) * 2021-12-24 2022-04-05 公安部道路交通安全研究中心 Target vehicle detection method and device and computer

Similar Documents

Publication Publication Date Title
US11216971B2 (en) Three-dimensional bounding box from two-dimensional image and point cloud data
CN110378222B (en) Method and device for detecting vibration damper target and identifying defect of power transmission line
CN109118504B (en) Image edge detection method, device and equipment based on neural network
CN111461212A (en) Compression method for point cloud target detection model
CN111931720B (en) Method, apparatus, computer device and storage medium for tracking image feature points
CN110633738B (en) Rapid classification method for industrial part images
CN111191739B (en) Wall surface defect detection method based on attention mechanism
Glegoła et al. MobileNet family tailored for Raspberry Pi
CN112926595A (en) Training device for deep learning neural network model, target detection system and method
CN115082751A (en) Improved YOLOv4-based mobile robot target detection method
CN116245915A (en) Target tracking method based on video
Korovin et al. A basic algorithm of a target environment analyzer
CN115620054A (en) Defect classification method and device, electronic equipment and storage medium
Tsai et al. Deep Learning Based AOI System with Equivalent Convolutional Layers Transformed from Fully Connected Layers
CN110889453A (en) Target detection and tracking method, device, system, medium and equipment
Castilla-Arquillo et al. Hardware-accelerated mars sample localization via deep transfer learning from photorealistic simulations
Zheng et al. MD-YOLO: Surface Defect Detector for Industrial Complex Environments
CN116030347B (en) High-resolution remote sensing image building extraction method based on attention network
Ginargiros et al. Deep Active Perception for Object Detection using Navigation Proposals
Żorski et al. Object tracking and recognition using massively parallel processing with CUDA
CN113591593B (en) Method, equipment and medium for detecting target in abnormal weather based on causal intervention
Barua et al. Towards Greener and Attention-aware Solutions for Steering Angle Prediction
Zhao et al. Rapid and accurate object detection on drone based embedded devices with dilated, deformable and pyramid convolution
CN116402090B (en) Processing method, device and equipment of neural network calculation graph
CN115880486B (en) Target detection network distillation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination