CN113378748A - Target detection method based on improved algorithm - Google Patents

Target detection method based on improved algorithm

Info

Publication number
CN113378748A
CN113378748A
Authority
CN
China
Prior art keywords: algorithm, improved, iou, yolov5, picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110690835.4A
Other languages
Chinese (zh)
Inventor
段大锴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhongtongji Network Technology Co Ltd
Original Assignee
Shanghai Zhongtongji Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhongtongji Network Technology Co Ltd filed Critical Shanghai Zhongtongji Network Technology Co Ltd
Priority to CN202110690835.4A
Publication of CN113378748A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of artificial intelligence, and in particular to a target detection method based on an improved algorithm. The method comprises: acquiring a picture to be identified; identifying the picture by using an improved yolov5 model, the improved yolov5 model being obtained by changing the G_IOU_loss function of the original yolov5 model into a C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into a D_IOU_nms algorithm; and obtaining the identification result of the improved yolov5 model on the picture. In the scheme provided by the application, the improved yolov5 model takes into account the distance between the bounding-box centers and the scale information of the bounding-box aspect ratio, so that the boundary of each target can be detected more accurately even when several detected targets are close to one another.

Description

Target detection method based on improved algorithm
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a target detection method based on an improved algorithm.
Background
One of the latest open-source target detection algorithms is based on the yolov5 model. The yolov5 model evolved from the yolov3 and yolov4 models by updating some of their technical points and adding several new techniques.
However, the prior-art target detection algorithm based on the yolov5 model detects each target with low accuracy when several detected targets are relatively close to one another.
Disclosure of Invention
In view of the above, a target detection method based on an improved algorithm is provided to solve the problem in the related art that a target detection algorithm based on the yolov5 model detects each target with low accuracy when several detected targets are close to one another.
The invention adopts the following technical scheme:
In a first aspect, an embodiment of the present invention provides a target detection method based on an improved algorithm, where the method includes:
acquiring a picture to be identified;
identifying the picture by using an improved yolov5 model, where the improved yolov5 model is obtained by changing the G_IOU_loss function of the original yolov5 model into a C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into a D_IOU_nms algorithm; and
obtaining the identification result of the improved yolov5 model on the picture.
Optionally, the C_IOU_loss function is a loss function that, on the basis of the G_IOU_loss function, additionally considers the distance between the centers of the bounding boxes and the scale information of the bounding-box aspect ratio.
Optionally, the D_IOU_nms algorithm decides whether a box is deleted by considering, on the basis of the non-maximum suppression algorithm, both the intersection over union and the distance between the center points of the two boxes.
Optionally, before identifying the picture, the method further includes:
removing the black edges of the picture.
Optionally, the method further includes: training the improved yolov5 model to optimize its recognition results.
According to the above technical scheme, a picture to be identified is acquired; the picture is identified by using the improved yolov5 model, which is obtained by changing the G_IOU_loss function of the original yolov5 model into the C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into the D_IOU_nms algorithm; and the identification result of the improved yolov5 model on the picture is obtained. Because the combination of the C_IOU_loss function and the D_IOU_nms algorithm takes into account the distance between the bounding-box centers and the scale information of the bounding-box aspect ratio during recognition, the boundary of each target can be detected more accurately even when several detected targets are close to one another.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a target detection method based on an improved algorithm according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
First, an application scenario of the embodiment of the present invention is explained. One of the latest open-source target detection algorithms is based on the yolov5 model. The yolov5 model evolved from the yolov3 and yolov4 models by updating some of their technical points and adding several new techniques.
However, the prior-art target detection algorithm based on the yolov5 model detects each target with low accuracy when several detected targets are relatively close to one another.
Examples
Fig. 1 is a flowchart of a target detection method based on an improved algorithm according to an embodiment of the present invention. Referring to fig. 1, the method may specifically include the following steps:
S101, acquiring a picture to be identified;
S102, identifying the picture by utilizing an improved yolov5 model; the improved yolov5 model is obtained by changing the G_IOU_loss function of the original yolov5 model into a C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into a D_IOU_nms algorithm;
Specifically, the C_IOU_loss function is a loss function that, on the basis of the G_IOU_loss function, additionally considers the distance between the centers of the bounding boxes and the scale information of the bounding-box aspect ratio.
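The patent gives no implementation, so the following is only a minimal PyTorch-style sketch of a CIoU-type loss for illustration; the function name, the (x1, y1, x2, y2) box format and the epsilon value are assumptions, not details taken from the patent or from the official yolov5 code.

```python
# Minimal sketch of a CIoU loss: IoU penalized by the normalized center distance
# and an aspect-ratio consistency term.
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    # Intersection area
    inter_w = (torch.min(pred[:, 2], target[:, 2]) - torch.max(pred[:, 0], target[:, 0])).clamp(0)
    inter_h = (torch.min(pred[:, 3], target[:, 3]) - torch.max(pred[:, 1], target[:, 1])).clamp(0)
    inter = inter_w * inter_h

    # Union area and plain IoU
    w1, h1 = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    w2, h2 = target[:, 2] - target[:, 0], target[:, 3] - target[:, 1]
    union = w1 * h1 + w2 * h2 - inter + eps
    iou = inter / union

    # Squared distance between box centers (the "center distance" information)
    cx1, cy1 = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cx2, cy2 = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    rho2 = (cx1 - cx2) ** 2 + (cy1 - cy2) ** 2

    # Squared diagonal of the smallest enclosing box
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term (the "scale information of the aspect ratio")
    v = (4 / math.pi ** 2) * (torch.atan(w2 / (h2 + eps)) - torch.atan(w1 / (h1 + eps))) ** 2
    with torch.no_grad():
        alpha = v / (1 - iou + v + eps)

    # Loss = 1 - CIoU
    return (1 - iou + rho2 / c2 + alpha * v).mean()
```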
The D_IOU_nms algorithm decides whether a box is deleted by considering, on the basis of the non-maximum suppression algorithm, both the intersection over union and the distance between the center points of the two boxes.
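Along the same lines, here is a hedged sketch of a DIoU-style NMS: a candidate box is suppressed only when its IoU with a higher-scoring box, minus a center-distance penalty, exceeds the threshold, so two nearby but distinct targets are less likely to be merged into one box. Names and the threshold value are illustrative only.

```python
# Minimal sketch of DIoU-NMS on (x1, y1, x2, y2) boxes with per-box scores.
import torch

def diou_nms(boxes, scores, iou_thres=0.5):
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0]
        keep.append(i.item())
        if order.numel() == 1:
            break
        rest = order[1:]
        # Plain IoU between the current box and the remaining boxes
        x1 = torch.max(boxes[i, 0], boxes[rest, 0])
        y1 = torch.max(boxes[i, 1], boxes[rest, 1])
        x2 = torch.min(boxes[i, 2], boxes[rest, 2])
        y2 = torch.min(boxes[i, 3], boxes[rest, 3])
        inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-7)
        # Center-distance penalty normalized by the enclosing-box diagonal
        cxi, cyi = (boxes[i, 0] + boxes[i, 2]) / 2, (boxes[i, 1] + boxes[i, 3]) / 2
        cxr, cyr = (boxes[rest, 0] + boxes[rest, 2]) / 2, (boxes[rest, 1] + boxes[rest, 3]) / 2
        rho2 = (cxi - cxr) ** 2 + (cyi - cyr) ** 2
        cw = torch.max(boxes[i, 2], boxes[rest, 2]) - torch.min(boxes[i, 0], boxes[rest, 0])
        ch = torch.max(boxes[i, 3], boxes[rest, 3]) - torch.min(boxes[i, 1], boxes[rest, 1])
        diou = iou - rho2 / (cw ** 2 + ch ** 2 + 1e-7)
        # Keep only boxes whose DIoU with the current box is below the threshold
        order = rest[diou <= iou_thres]
    return keep
```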
Further, the scheme provided by the application also includes: training the improved yolov5 model to optimize its recognition results.
Further, before identifying the picture, the method further includes: removing the black edges of the picture to reduce the amount of redundant content that has to be processed during recognition.
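The patent only states that the black edges of the picture are removed before recognition. One plausible way to do this, sketched below purely as an assumption, is to crop away border rows and columns whose pixel values are near zero.

```python
# Hedged sketch: crop away (near-)black border rows and columns before detection.
import numpy as np

def remove_black_edges(img, thresh=10):
    # img: H x W x 3 (or H x W) uint8 image; thresh is an assumed darkness cutoff
    gray = img.mean(axis=2) if img.ndim == 3 else img
    rows = np.where(gray.max(axis=1) > thresh)[0]
    cols = np.where(gray.max(axis=0) > thresh)[0]
    if rows.size == 0 or cols.size == 0:
        return img  # picture is entirely dark; leave it unchanged
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```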
S103, obtaining the recognition result of the improved yolov5 model on the picture.
According to the above technical scheme, a picture to be identified is acquired; the picture is identified by using the improved yolov5 model, which is obtained by changing the G_IOU_loss function of the original yolov5 model into the C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into the D_IOU_nms algorithm; and the identification result of the improved yolov5 model on the picture is obtained. Because the combination of the C_IOU_loss function and the D_IOU_nms algorithm takes into account the distance between the bounding-box centers and the scale information of the bounding-box aspect ratio during recognition, the boundary of each target can be detected more accurately even when several detected targets are close to one another.
To more clearly illustrate the solution provided by the present application, the yolov5 model is first briefly introduced:
Yolov5 is improved on the basis of Yolov4. The official code provides a total of four versions of the target detection network: the Yolov5s, Yolov5m, Yolov5l and Yolov5x models.
The Yolov5 model is divided into four parts: the input end, the Backbone, the Neck and the Prediction head. (1) Input end: Mosaic data enhancement and adaptive anchor box calculation. (2) Backbone: Focus structure and CSP structure. (3) Neck: FPN + PAN structure. (4) Prediction: GIOU_Loss.
Mosaic data enhancement: the input end of Yolov5 uses the same Mosaic data enhancement as Yolov4. Pictures are stitched together by random scaling, random cropping and random arrangement, which gives a good detection effect on small objects.
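As a rough illustration of the stitching itself (label handling and the random cropping of boxes are omitted), four training pictures can be placed around a random center point on one canvas; the canvas size and grey fill value below are assumptions, not values from the yolov5 code.

```python
# Simplified Mosaic stitching sketch: four images arranged around a random center.
import random
import numpy as np
import cv2

def mosaic4(imgs, out_size=640):
    # imgs: list of four H x W x 3 uint8 images
    canvas = np.full((out_size * 2, out_size * 2, 3), 114, dtype=np.uint8)
    xc = random.randint(out_size // 2, out_size * 3 // 2)  # random stitch center
    yc = random.randint(out_size // 2, out_size * 3 // 2)
    regions = [(0, 0, xc, yc), (xc, 0, out_size * 2, yc),
               (0, yc, xc, out_size * 2), (xc, yc, out_size * 2, out_size * 2)]
    for img, (x1, y1, x2, y2) in zip(imgs, regions):
        # Resize each picture to fill its quadrant (real Mosaic crops and rescales instead)
        canvas[y1:y2, x1:x2] = cv2.resize(img, (x2 - x1, y2 - y1))
    return canvas
```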
Adaptive anchor box calculation: in the Yolo algorithms, anchor boxes with initially set widths and heights are defined for different data sets. During network training, the network outputs prediction boxes on the basis of the initial anchor boxes, compares them with the ground-truth boxes, calculates the difference between the two, and then updates the network parameters by back-propagation, so the initial anchor boxes are also an important part. Yolov5 embeds this function in the code and adaptively calculates the best anchor box values for the training set at the start of each training run. Of course, if the automatically computed anchor boxes are not felt to work well, this function can also be turned off in the code.
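For illustration only, initial anchors can be derived from the training labels with a k-means-style clustering on box widths and heights; the actual yolov5 autoanchor routine additionally evolves the k-means seed with a genetic algorithm, which this sketch omits, and the function name is an assumption.

```python
# Sketch of k-means anchor seeding on ground-truth box widths and heights.
import numpy as np

def kmeans_anchors(wh, k=9, iters=100):
    # wh: (N, 2) array of ground-truth box widths and heights
    anchors = wh[np.random.choice(len(wh), k, replace=False)].astype(float)
    for _ in range(iters):
        # IoU between each box and each anchor, with boxes assumed to share a corner
        inter = np.minimum(wh[:, None, 0], anchors[None, :, 0]) * \
                np.minimum(wh[:, None, 1], anchors[None, :, 1])
        union = wh[:, 0:1] * wh[:, 1:2] + anchors[:, 0] * anchors[:, 1] - inter
        assign = (inter / union).argmax(axis=1)
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = wh[assign == j].mean(axis=0)
    # Return anchors sorted by area, small to large
    return anchors[np.argsort(anchors.prod(axis=1))]
```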
Adaptive picture scaling: in common target detection algorithms, input pictures differ in width and height, so the usual approach is to scale the original picture uniformly to a standard size and then feed it to the detection network; for example, sizes of 416 × 416 and 608 × 608 are commonly used in the Yolo algorithms, and an 800 × 600 picture would be scaled to one of these sizes.
The Yolov5 code improves on this, and it is one of the reasons Yolov5 inference can be fast. In practical projects many pictures have different aspect ratios, so after scaling and padding, the black edges at the two ends differ in size; if a large amount of padding is added, there is information redundancy and the inference speed is affected. The letterbox function in datasets.py of the Yolov5 code was therefore modified to adaptively add the fewest possible black edges to the original picture.
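A simplified sketch of that idea follows: scale the long side to the target size and pad the short side only up to the next multiple of the network stride, instead of all the way to a square. The function name, parameter values and grey fill colour are assumptions, not the actual letterbox implementation in datasets.py.

```python
# Sketch of adaptive letterbox resizing with minimal padding.
import cv2

def letterbox_min_pad(img, new_size=640, stride=32, color=(114, 114, 114)):
    h, w = img.shape[:2]
    r = new_size / max(h, w)                       # scale ratio for the long side
    new_w, new_h = int(round(w * r)), int(round(h * r))
    resized = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
    # Pad each side only up to the nearest stride multiple (minimal black edges)
    pad_w = (stride - new_w % stride) % stride
    pad_h = (stride - new_h % stride) % stride
    top, bottom = pad_h // 2, pad_h - pad_h // 2
    left, right = pad_w // 2, pad_w - pad_w // 2
    return cv2.copyMakeBorder(resized, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=color)
```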
Focus structure: this structure does not exist in Yolov3 or Yolov4; its key operation is slicing. For example, a 4 × 4 × 3 image becomes a 2 × 2 × 12 feature map after slicing. Taking the Yolov5s structure as an example, an original 608 × 608 × 3 image is fed into the Focus structure, turned into a 304 × 304 × 12 feature map by the slicing operation, and then passed through a convolution with 32 kernels to become a 304 × 304 × 32 feature map. It should be noted that the Focus structure of Yolov5s uses 32 convolution kernels at this stage, while the other three variants use more.
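The slicing operation itself can be written in a few lines of PyTorch. The sketch below gathers every second pixel along height and width into the channel dimension, which reproduces the 608 × 608 × 3 to 304 × 304 × 12 transformation described above; the exact slice ordering and the subsequent 32-kernel convolution of the real Focus module are not reproduced here.

```python
# Sketch of the Focus slicing operation on an NCHW tensor.
import torch

def focus_slice(x):
    # x: (batch, channels, height, width); output has 4x channels, half H and W
    return torch.cat([x[..., ::2, ::2],     # even rows, even columns
                      x[..., 1::2, ::2],    # odd rows, even columns
                      x[..., ::2, 1::2],    # even rows, odd columns
                      x[..., 1::2, 1::2]],  # odd rows, odd columns
                     dim=1)

# Example: focus_slice(torch.zeros(1, 3, 608, 608)).shape -> torch.Size([1, 12, 304, 304])
```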
CSP structure: the Yolov4 network designs the CSP structure into its backbone network, drawing on the design idea of CSPNet. Yolov5 differs from Yolov4 in that Yolov4 uses the CSP structure only in the backbone network, whereas Yolov5 designs two CSP structures: taking the Yolov5s network as an example, the CSP1_X structure is applied in the Backbone network and the CSP2_X structure is applied in the Neck.
Neck structure: the Neck of Yolov4 uses ordinary convolution operations, whereas the Neck of Yolov5 adopts the CSP2 structure, designed with reference to CSPNet, to strengthen the network's feature-fusion capability.
Bounding box loss function and NMS non-maximum suppression: Yolov5 uses GIOU_Loss as the loss function of the bounding box. In the post-processing stage of target detection, an NMS operation is usually required to filter the many candidate boxes. Yolov4 adopts DIOU_nms on the basis of DIOU_Loss, while Yolov5 still adopts the ordinary NMS.
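For comparison with the C_IOU_loss sketch given earlier, a minimal GIoU computation (the quantity behind the default GIOU_Loss) is shown below: plain IoU minus the fraction of the smallest enclosing box that is not covered by the union. The function name, box format and epsilon are again assumptions.

```python
# Sketch of GIoU on (x1, y1, x2, y2) boxes; the corresponding loss would be 1 - giou.
import torch

def giou(box1, box2, eps=1e-7):
    inter_w = (torch.min(box1[:, 2], box2[:, 2]) - torch.max(box1[:, 0], box2[:, 0])).clamp(0)
    inter_h = (torch.min(box1[:, 3], box2[:, 3]) - torch.max(box1[:, 1], box2[:, 1])).clamp(0)
    inter = inter_w * inter_h
    area1 = (box1[:, 2] - box1[:, 0]) * (box1[:, 3] - box1[:, 1])
    area2 = (box2[:, 2] - box2[:, 0]) * (box2[:, 3] - box2[:, 1])
    union = area1 + area2 - inter + eps
    # Smallest enclosing box
    cw = torch.max(box1[:, 2], box2[:, 2]) - torch.min(box1[:, 0], box2[:, 0])
    ch = torch.max(box1[:, 3], box2[:, 3]) - torch.min(box1[:, 1], box2[:, 1])
    c_area = cw * ch + eps
    return inter / union - (c_area - union) / c_area
```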
In summary, in actual tests Yolov4 has a clear advantage in accuracy, but the various network structures of Yolov5 are more flexible to use, so practitioners can choose among them according to the requirements of different projects and exploit the strengths of the different detection networks.
However, the main disadvantage of the prior art is that, in a scene where several objects are adjacent and very close to one another, the original model tends to merge the closest objects into a single object in its output, which makes the recognition inaccurate.
To overcome this deficiency of the prior art, the invention changes the G_IOU_loss function in the original yolov5 model into C_IOU_loss and changes the NMS in the original yolov5 model into D_IOU_NMS, and the two are combined for the final recognition.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.
It should be noted that the terms "first," "second," and the like in the description of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present invention, the meaning of "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (5)

1. An improved algorithm based target detection method, comprising:
acquiring a picture to be identified;
identifying the picture by utilizing an improved yolov5 model; wherein the improved yolov5 model is obtained by changing the G_IOU_loss function of the original yolov5 model into a C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into a D_IOU_nms algorithm;
and obtaining the identification result of the improved yolov5 model on the picture.
2. The improved algorithm based target detection method of claim 1, wherein the C_IOU_loss function is a loss function that, on the basis of the G_IOU_loss function, considers the distance between the bounding-box centers and the scale information of the bounding-box aspect ratio.
3. The improved algorithm based target detection method of claim 1, wherein the D_IOU_nms algorithm decides whether a box is deleted by considering, on the basis of the non-maximum suppression algorithm, both the intersection over union and the distance between the center points of the two boxes.
4. The improved algorithm based target detection method of claim 1, wherein before the identifying the picture by utilizing the improved yolov5 model, the method further comprises:
removing the black edges of the picture.
5. The improved algorithm based target detection method of claim 1, further comprising: training the improved yolov5 model to optimize its recognition results.
CN202110690835.4A 2021-06-22 2021-06-22 Target detection method based on improved algorithm Pending CN113378748A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110690835.4A CN113378748A (en) 2021-06-22 2021-06-22 Target detection method based on improved algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110690835.4A CN113378748A (en) 2021-06-22 2021-06-22 Target detection method based on improved algorithm

Publications (1)

Publication Number Publication Date
CN113378748A true CN113378748A (en) 2021-09-10

Family

ID=77578267

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110690835.4A Pending CN113378748A (en) 2021-06-22 2021-06-22 Target detection method based on improved algorithm

Country Status (1)

Country Link
CN (1) CN113378748A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112633174A (en) * 2020-12-23 2021-04-09 电子科技大学 Improved YOLOv4 high-dome-based fire detection method and storage medium
CN112686923A (en) * 2020-12-31 2021-04-20 浙江航天恒嘉数据科技有限公司 Target tracking method and system based on double-stage convolutional neural network
CN112819804A (en) * 2021-02-23 2021-05-18 西北工业大学 Insulator defect detection method based on improved YOLOv5 convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
梦坠凡尘: "Interpreting the tricks of YOLOv4, Part 3: object detection post-processing", page 4, Retrieved from the Internet <URL:https://blog.csdn.net/c2250645962/article/details/106210819> *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114120037A (en) * 2021-11-25 2022-03-01 中国农业科学院农业信息研究所 Germinated potato image recognition method based on improved yolov5 model
CN114120037B (en) * 2021-11-25 2022-07-19 中国农业科学院农业信息研究所 Germinated potato image recognition method based on improved yolov5 model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination