CN113378748A - Target detection method based on improved algorithm - Google Patents

Target detection method based on improved algorithm

Info

Publication number
CN113378748A
CN113378748A
Authority
CN
China
Prior art keywords: algorithm, improved, iou, yolov5, picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110690835.4A
Other languages
Chinese (zh)
Inventor
段大锴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhongtongji Network Technology Co Ltd
Original Assignee
Shanghai Zhongtongji Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhongtongji Network Technology Co Ltd filed Critical Shanghai Zhongtongji Network Technology Co Ltd
Priority to CN202110690835.4A
Publication of CN113378748A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of artificial intelligence, and in particular to a target detection method based on an improved algorithm. The method comprises: acquiring a picture to be identified; identifying the picture by using an improved yolov5 model, the improved yolov5 model being obtained by changing the G_IOU_loss function of the original yolov5 model into a C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into a D_IOU_nms algorithm; and obtaining the identification result of the improved yolov5 model on the picture. In the scheme provided by the application, the improved yolov5 model takes into account the distance between the bounding-box centers and the scale information of the bounding-box aspect ratio, so that the boundary of each target can be detected more accurately even when several detected targets are close to one another.

Description

Target detection method based on improved algorithm
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a target detection method based on an improved algorithm.
Background
One of the latest open-source target detection algorithms is based on the yolov5 model. The yolov5 model evolved from the yolov3 and yolov4 models by updating some of their technical points and adding several new techniques.
However, the prior-art target detection algorithm based on the yolov5 model detects each target with low accuracy when several detected targets are relatively close to one another.
Disclosure of Invention
In view of the above, a target detection method based on an improved algorithm is provided to solve the problem in the related art that a target detection algorithm based on the yolov5 model detects each target with low accuracy when several detected targets are close to one another.
The invention adopts the following technical scheme:
In a first aspect, an embodiment of the present invention provides a target detection method based on an improved algorithm, where the method includes:
acquiring a picture to be identified;
identifying the picture by using an improved yolov5 model, where the improved yolov5 model is obtained by changing the G_IOU_loss function of the original yolov5 model into a C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into a D_IOU_nms algorithm; and
obtaining the identification result of the improved yolov5 model on the picture.
Optionally, the C_IOU_loss function is a loss function that, on the basis of the G_IOU_loss function, additionally considers the distance between the centers of the bounding boxes and the scale information of the bounding-box aspect ratio.
Optionally, the D_IOU_nms algorithm decides whether a box is deleted by considering, on the basis of the non-maximum suppression algorithm, both the intersection over union and the distance between the center points of the two boxes.
Optionally, before identifying the picture, the method further includes:
removing the black edges of the picture.
Optionally, the method further includes: training the improved yolov5 model to optimize its recognition results.
According to the above technical scheme, a picture to be identified is acquired; the picture is identified by using the improved yolov5 model, which is obtained by changing the G_IOU_loss function of the original yolov5 model into the C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into the D_IOU_nms algorithm; and the identification result of the improved yolov5 model on the picture is obtained. Because the combination of the C_IOU_loss function and the D_IOU_nms algorithm takes into account the distance between the bounding-box centers and the scale information of the bounding-box aspect ratio during recognition, the boundary of each target can be detected more accurately even when several detected targets are close to one another.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a target detection method based on an improved algorithm according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
First, an application scenario of the embodiment of the present invention is explained. One of the latest open-source target detection algorithms is based on the yolov5 model. The yolov5 model evolved from the yolov3 and yolov4 models by updating some of their technical points and adding several new techniques.
However, the prior-art target detection algorithm based on the yolov5 model detects each target with low accuracy when several detected targets are relatively close to one another.
Examples
Fig. 1 is a flowchart of a target detection method based on an improved algorithm according to an embodiment of the present invention. Referring to fig. 1, the method may specifically include the following steps:
S101, acquiring a picture to be identified;
S102, identifying the picture by utilizing an improved yolov5 model; the improved yolov5 model is obtained by changing the G_IOU_loss function of the original yolov5 model into a C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into a D_IOU_nms algorithm;
Specifically, the C_IOU_loss function is a loss function that, on the basis of the G_IOU_loss function, additionally considers the distance between the centers of the bounding boxes and the scale information of the bounding-box aspect ratio.
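The patent gives no implementation, so the following is only a minimal PyTorch-style sketch of a CIoU-type loss for illustration; the function name, the (x1, y1, x2, y2) box format and the epsilon value are assumptions, not details taken from the patent or from the official yolov5 code.

```python
# Minimal sketch of a CIoU loss: IoU penalized by the normalized center distance
# and an aspect-ratio consistency term.
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    # Intersection area
    inter_w = (torch.min(pred[:, 2], target[:, 2]) - torch.max(pred[:, 0], target[:, 0])).clamp(0)
    inter_h = (torch.min(pred[:, 3], target[:, 3]) - torch.max(pred[:, 1], target[:, 1])).clamp(0)
    inter = inter_w * inter_h

    # Union area and plain IoU
    w1, h1 = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    w2, h2 = target[:, 2] - target[:, 0], target[:, 3] - target[:, 1]
    union = w1 * h1 + w2 * h2 - inter + eps
    iou = inter / union

    # Squared distance between box centers (the "center distance" information)
    cx1, cy1 = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cx2, cy2 = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    rho2 = (cx1 - cx2) ** 2 + (cy1 - cy2) ** 2

    # Squared diagonal of the smallest enclosing box
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term (the "scale information of the aspect ratio")
    v = (4 / math.pi ** 2) * (torch.atan(w2 / (h2 + eps)) - torch.atan(w1 / (h1 + eps))) ** 2
    with torch.no_grad():
        alpha = v / (1 - iou + v + eps)

    # Loss = 1 - CIoU
    return (1 - iou + rho2 / c2 + alpha * v).mean()
```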
The D_IOU_nms algorithm decides whether a box is deleted by considering, on the basis of the non-maximum suppression algorithm, both the intersection over union and the distance between the center points of the two boxes.
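Along the same lines, here is a hedged sketch of a DIoU-style NMS: a candidate box is suppressed only when its IoU with a higher-scoring box, minus a center-distance penalty, exceeds the threshold, so two nearby but distinct targets are less likely to be merged into one box. Names and the threshold value are illustrative only.

```python
# Minimal sketch of DIoU-NMS on (x1, y1, x2, y2) boxes with per-box scores.
import torch

def diou_nms(boxes, scores, iou_thres=0.5):
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0]
        keep.append(i.item())
        if order.numel() == 1:
            break
        rest = order[1:]
        # Plain IoU between the current box and the remaining boxes
        x1 = torch.max(boxes[i, 0], boxes[rest, 0])
        y1 = torch.max(boxes[i, 1], boxes[rest, 1])
        x2 = torch.min(boxes[i, 2], boxes[rest, 2])
        y2 = torch.min(boxes[i, 3], boxes[rest, 3])
        inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-7)
        # Center-distance penalty normalized by the enclosing-box diagonal
        cxi, cyi = (boxes[i, 0] + boxes[i, 2]) / 2, (boxes[i, 1] + boxes[i, 3]) / 2
        cxr, cyr = (boxes[rest, 0] + boxes[rest, 2]) / 2, (boxes[rest, 1] + boxes[rest, 3]) / 2
        rho2 = (cxi - cxr) ** 2 + (cyi - cyr) ** 2
        cw = torch.max(boxes[i, 2], boxes[rest, 2]) - torch.min(boxes[i, 0], boxes[rest, 0])
        ch = torch.max(boxes[i, 3], boxes[rest, 3]) - torch.min(boxes[i, 1], boxes[rest, 1])
        diou = iou - rho2 / (cw ** 2 + ch ** 2 + 1e-7)
        # Keep only boxes whose DIoU with the current box is below the threshold
        order = rest[diou <= iou_thres]
    return keep
```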
Further, the scheme provided by the application also includes: training the improved yolov5 model to optimize its recognition results.
Further, before identifying the picture, the method further includes: removing the black edges of the picture to reduce the amount of redundant content that has to be processed during recognition.
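The patent only states that the black edges of the picture are removed before recognition. One plausible way to do this, sketched below purely as an assumption, is to crop away border rows and columns whose pixel values are near zero.

```python
# Hedged sketch: crop away (near-)black border rows and columns before detection.
import numpy as np

def remove_black_edges(img, thresh=10):
    # img: H x W x 3 (or H x W) uint8 image; thresh is an assumed darkness cutoff
    gray = img.mean(axis=2) if img.ndim == 3 else img
    rows = np.where(gray.max(axis=1) > thresh)[0]
    cols = np.where(gray.max(axis=0) > thresh)[0]
    if rows.size == 0 or cols.size == 0:
        return img  # picture is entirely dark; leave it unchanged
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```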
S103, obtaining the recognition result of the improved yolov5 model on the picture.
According to the above technical scheme, a picture to be identified is acquired; the picture is identified by using the improved yolov5 model, which is obtained by changing the G_IOU_loss function of the original yolov5 model into the C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into the D_IOU_nms algorithm; and the identification result of the improved yolov5 model on the picture is obtained. Because the combination of the C_IOU_loss function and the D_IOU_nms algorithm takes into account the distance between the bounding-box centers and the scale information of the bounding-box aspect ratio during recognition, the boundary of each target can be detected more accurately even when several detected targets are close to one another.
To more clearly illustrate the solution provided by the present application, the yolov5 model is first briefly introduced:
Yolov5 is improved on the basis of Yolov4. The official code provides a total of four versions of the target detection network: the Yolov5s, Yolov5m, Yolov5l and Yolov5x models.
The Yolov5 model is divided into four parts: the input end, the Backbone, the Neck and the Prediction head. (1) Input end: Mosaic data enhancement and adaptive anchor box calculation. (2) Backbone: Focus structure and CSP structure. (3) Neck: FPN + PAN structure. (4) Prediction: GIOU_Loss.
Mosaic data enhancement: the input end of Yolov5 uses the same Mosaic data enhancement as Yolov4. Pictures are stitched together by random scaling, random cropping and random arrangement, which gives a good detection effect on small objects.
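As a rough illustration of the stitching itself (label handling and the random cropping of boxes are omitted), four training pictures can be placed around a random center point on one canvas; the canvas size and grey fill value below are assumptions, not values from the yolov5 code.

```python
# Simplified Mosaic stitching sketch: four images arranged around a random center.
import random
import numpy as np
import cv2

def mosaic4(imgs, out_size=640):
    # imgs: list of four H x W x 3 uint8 images
    canvas = np.full((out_size * 2, out_size * 2, 3), 114, dtype=np.uint8)
    xc = random.randint(out_size // 2, out_size * 3 // 2)  # random stitch center
    yc = random.randint(out_size // 2, out_size * 3 // 2)
    regions = [(0, 0, xc, yc), (xc, 0, out_size * 2, yc),
               (0, yc, xc, out_size * 2), (xc, yc, out_size * 2, out_size * 2)]
    for img, (x1, y1, x2, y2) in zip(imgs, regions):
        # Resize each picture to fill its quadrant (real Mosaic crops and rescales instead)
        canvas[y1:y2, x1:x2] = cv2.resize(img, (x2 - x1, y2 - y1))
    return canvas
```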
Adaptive anchor box calculation: in the Yolo algorithms, anchor boxes with initially set widths and heights are defined for different data sets. During network training, the network outputs prediction boxes on the basis of the initial anchor boxes, compares them with the ground-truth boxes, calculates the difference between the two, and then updates the network parameters by back-propagation, so the initial anchor boxes are also an important part. Yolov5 embeds this function in the code and adaptively calculates the best anchor box values for the training set at the start of each training run. Of course, if the automatically computed anchor boxes are not felt to work well, this function can also be turned off in the code.
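For illustration only, initial anchors can be derived from the training labels with a k-means-style clustering on box widths and heights; the actual yolov5 autoanchor routine additionally evolves the k-means seed with a genetic algorithm, which this sketch omits, and the function name is an assumption.

```python
# Sketch of k-means anchor seeding on ground-truth box widths and heights.
import numpy as np

def kmeans_anchors(wh, k=9, iters=100):
    # wh: (N, 2) array of ground-truth box widths and heights
    anchors = wh[np.random.choice(len(wh), k, replace=False)].astype(float)
    for _ in range(iters):
        # IoU between each box and each anchor, with boxes assumed to share a corner
        inter = np.minimum(wh[:, None, 0], anchors[None, :, 0]) * \
                np.minimum(wh[:, None, 1], anchors[None, :, 1])
        union = wh[:, 0:1] * wh[:, 1:2] + anchors[:, 0] * anchors[:, 1] - inter
        assign = (inter / union).argmax(axis=1)
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = wh[assign == j].mean(axis=0)
    # Return anchors sorted by area, small to large
    return anchors[np.argsort(anchors.prod(axis=1))]
```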
Adaptive picture scaling: in common target detection algorithms, input pictures differ in width and height, so the usual approach is to scale the original picture uniformly to a standard size and then feed it to the detection network; for example, sizes of 416 × 416 and 608 × 608 are commonly used in the Yolo algorithms, and an 800 × 600 picture would be scaled to one of these sizes.
The Yolov5 code improves on this, and it is one of the reasons Yolov5 inference can be fast. In practical projects many pictures have different aspect ratios, so after scaling and padding, the black edges at the two ends differ in size; if a large amount of padding is added, there is information redundancy and the inference speed is affected. The letterbox function in datasets.py of the Yolov5 code was therefore modified to adaptively add the fewest possible black edges to the original picture.
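A simplified sketch of that idea follows: scale the long side to the target size and pad the short side only up to the next multiple of the network stride, instead of all the way to a square. The function name, parameter values and grey fill colour are assumptions, not the actual letterbox implementation in datasets.py.

```python
# Sketch of adaptive letterbox resizing with minimal padding.
import cv2

def letterbox_min_pad(img, new_size=640, stride=32, color=(114, 114, 114)):
    h, w = img.shape[:2]
    r = new_size / max(h, w)                       # scale ratio for the long side
    new_w, new_h = int(round(w * r)), int(round(h * r))
    resized = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
    # Pad each side only up to the nearest stride multiple (minimal black edges)
    pad_w = (stride - new_w % stride) % stride
    pad_h = (stride - new_h % stride) % stride
    top, bottom = pad_h // 2, pad_h - pad_h // 2
    left, right = pad_w // 2, pad_w - pad_w // 2
    return cv2.copyMakeBorder(resized, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=color)
```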
Focus structure: this structure does not exist in Yolov3 or Yolov4; its key operation is slicing. For example, a 4 × 4 × 3 image becomes a 2 × 2 × 12 feature map after slicing. Taking the Yolov5s structure as an example, an original 608 × 608 × 3 image is fed into the Focus structure, turned into a 304 × 304 × 12 feature map by the slicing operation, and then passed through a convolution with 32 kernels to become a 304 × 304 × 32 feature map. It should be noted that the Focus structure of Yolov5s uses 32 convolution kernels at this stage, while the other three variants use more.
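The slicing operation itself can be written in a few lines of PyTorch. The sketch below gathers every second pixel along height and width into the channel dimension, which reproduces the 608 × 608 × 3 to 304 × 304 × 12 transformation described above; the exact slice ordering and the subsequent 32-kernel convolution of the real Focus module are not reproduced here.

```python
# Sketch of the Focus slicing operation on an NCHW tensor.
import torch

def focus_slice(x):
    # x: (batch, channels, height, width); output has 4x channels, half H and W
    return torch.cat([x[..., ::2, ::2],     # even rows, even columns
                      x[..., 1::2, ::2],    # odd rows, even columns
                      x[..., ::2, 1::2],    # even rows, odd columns
                      x[..., 1::2, 1::2]],  # odd rows, odd columns
                     dim=1)

# Example: focus_slice(torch.zeros(1, 3, 608, 608)).shape -> torch.Size([1, 12, 304, 304])
```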
CSP structure: the Yolov4 network designs the CSP structure into its backbone network, drawing on the design idea of CSPNet. Yolov5 differs from Yolov4 in that Yolov4 uses the CSP structure only in the backbone network, whereas Yolov5 designs two CSP structures: taking the Yolov5s network as an example, the CSP1_X structure is applied in the Backbone network and the CSP2_X structure is applied in the Neck.
Neck structure: the Neck of Yolov4 uses ordinary convolution operations, whereas the Neck of Yolov5 adopts the CSP2 structure, designed with reference to CSPNet, to strengthen the network's feature-fusion capability.
Bounding box loss function and NMS non-maximum suppression: Yolov5 uses GIOU_Loss as the loss function of the bounding box. In the post-processing stage of target detection, an NMS operation is usually required to filter the many candidate boxes. Yolov4 adopts DIOU_nms on the basis of DIOU_Loss, while Yolov5 still adopts the ordinary NMS.
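For comparison with the C_IOU_loss sketch given earlier, a minimal GIoU computation (the quantity behind the default GIOU_Loss) is shown below: plain IoU minus the fraction of the smallest enclosing box that is not covered by the union. The function name, box format and epsilon are again assumptions.

```python
# Sketch of GIoU on (x1, y1, x2, y2) boxes; the corresponding loss would be 1 - giou.
import torch

def giou(box1, box2, eps=1e-7):
    inter_w = (torch.min(box1[:, 2], box2[:, 2]) - torch.max(box1[:, 0], box2[:, 0])).clamp(0)
    inter_h = (torch.min(box1[:, 3], box2[:, 3]) - torch.max(box1[:, 1], box2[:, 1])).clamp(0)
    inter = inter_w * inter_h
    area1 = (box1[:, 2] - box1[:, 0]) * (box1[:, 3] - box1[:, 1])
    area2 = (box2[:, 2] - box2[:, 0]) * (box2[:, 3] - box2[:, 1])
    union = area1 + area2 - inter + eps
    # Smallest enclosing box
    cw = torch.max(box1[:, 2], box2[:, 2]) - torch.min(box1[:, 0], box2[:, 0])
    ch = torch.max(box1[:, 3], box2[:, 3]) - torch.min(box1[:, 1], box2[:, 1])
    c_area = cw * ch + eps
    return inter / union - (c_area - union) / c_area
```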
In summary, in actual tests Yolov4 has a clear advantage in accuracy, but the various network structures of Yolov5 are more flexible to use, so practitioners can choose among them according to the requirements of different projects and exploit the strengths of the different detection networks.
However, the main disadvantage of the prior art is that, in a scene where several objects are adjacent and very close to one another, the original model tends to merge the closest objects into a single object in its output, which makes the recognition inaccurate.
To overcome this deficiency of the prior art, the invention changes the G_IOU_loss function in the original yolov5 model into C_IOU_loss and changes the NMS in the original yolov5 model into D_IOU_NMS, and the two are combined for the final recognition.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.
It should be noted that the terms "first," "second," and the like in the description of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present invention, the meaning of "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (5)

1. An improved algorithm based target detection method, comprising:
acquiring a picture to be identified;
identifying the picture by utilizing an improved yolov5 model; wherein the improved yolov5 model is obtained by changing the G_IOU_loss function of the original yolov5 model into a C_IOU_loss function and changing the non-maximum suppression algorithm in the original yolov5 model into a D_IOU_nms algorithm;
and obtaining the identification result of the improved yolov5 model on the picture.
2. The improved algorithm based target detection method of claim 1, wherein the C_IOU_loss function is a loss function that, on the basis of the G_IOU_loss function, considers the distance between the bounding-box centers and the scale information of the bounding-box aspect ratio.
3. The improved algorithm based target detection method of claim 1, wherein the D_IOU_nms algorithm decides whether a box is deleted by considering, on the basis of the non-maximum suppression algorithm, both the intersection over union and the distance between the center points of the two boxes.
4. The improved algorithm based target detection method of claim 1, wherein before the identifying the picture by utilizing the improved yolov5 model, the method further comprises:
removing the black edges of the picture.
5. The improved algorithm based target detection method of claim 1, further comprising: training the improved yolov5 model to optimize its recognition results.
CN202110690835.4A 2021-06-22 2021-06-22 Target detection method based on improved algorithm Pending CN113378748A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110690835.4A CN113378748A (en) 2021-06-22 2021-06-22 Target detection method based on improved algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110690835.4A CN113378748A (en) 2021-06-22 2021-06-22 Target detection method based on improved algorithm

Publications (1)

Publication Number Publication Date
CN113378748A true CN113378748A (en) 2021-09-10

Family

ID=77578267

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110690835.4A Pending CN113378748A (en) 2021-06-22 2021-06-22 Target detection method based on improved algorithm

Country Status (1)

Country Link
CN (1) CN113378748A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112633174A (en) * 2020-12-23 2021-04-09 电子科技大学 Improved YOLOv4 high-dome-based fire detection method and storage medium
CN112686923A (en) * 2020-12-31 2021-04-20 浙江航天恒嘉数据科技有限公司 Target tracking method and system based on double-stage convolutional neural network
CN112819804A (en) * 2021-02-23 2021-05-18 西北工业大学 Insulator defect detection method based on improved YOLOv5 convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
梦坠凡尘: "Interpreting the tricks of YOLOv4, Part 3: object detection post-processing", page 4, Retrieved from the Internet <URL:https://blog.csdn.net/c2250645962/article/details/106210819> *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114120037A (en) * 2021-11-25 2022-03-01 中国农业科学院农业信息研究所 Germinated potato image recognition method based on improved yolov5 model
CN114120037B (en) * 2021-11-25 2022-07-19 中国农业科学院农业信息研究所 Germinated potato image recognition method based on improved yolov5 model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination