CN117315224A - Target detection method, system and medium for improving regression loss of bounding box - Google Patents
Target detection method, system and medium for improving regression loss of bounding box Download PDFInfo
- Publication number
- CN117315224A CN117315224A CN202311147316.9A CN202311147316A CN117315224A CN 117315224 A CN117315224 A CN 117315224A CN 202311147316 A CN202311147316 A CN 202311147316A CN 117315224 A CN117315224 A CN 117315224A
- Authority
- CN
- China
- Prior art keywords
- loss
- boundary
- box
- prediction
- real
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000001514 detection method Methods 0.000 title claims abstract description 133
- 238000012549 training Methods 0.000 claims abstract description 56
- 238000000034 method Methods 0.000 claims abstract description 16
- 230000006870 function Effects 0.000 claims description 139
- 238000004364 calculation method Methods 0.000 claims description 27
- 238000005457 optimization Methods 0.000 claims description 14
- 238000004590 computer program Methods 0.000 claims description 10
- 238000010276 construction Methods 0.000 claims 1
- 230000009286 beneficial effect Effects 0.000 description 8
- 238000011156 evaluation Methods 0.000 description 4
- 238000000638 solvent extraction Methods 0.000 description 3
- 238000013135 deep learning Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 238000013527 convolutional neural network Methods 0.000 description 1
- 238000013136 deep learning model Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/766—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a target detection method, a system and a medium for improving the regression loss of a boundary box, and relates to the technical field of computer vision; the method comprises the following steps: step S1, marking a real boundary box and a category of a target object in advance to form a training data set; s2, inputting pictures in the training data set into a trained detection model for prediction to obtain a prediction type and a prediction boundary frame; s3, constructing a loss function, and calculating positioning loss through the loss function; step S4, optimizing the detection model according to the positioning loss, replacing the trained detection model with an optimized detection model, and repeating the steps S2-S4 until the preset training times are reached; and S5, inputting the picture to be detected into the trained optimized detection model for prediction, and obtaining the target boundary box position and target classification. The prediction frame fitting degree is higher by constructing the boundary frame regression loss, so that the detection model is optimized, the convergence is quickened, and the detection precision is improved.
Description
Technical Field
The invention mainly relates to the technical field of computer vision, in particular to a target detection method, a target detection system and a target detection medium for improving regression loss of a boundary box.
Background
The target detection is an important task in the field of computer vision, is a neighborhood which is mutually promoted and developed with deep learning, and can be applied to the fields of real scenes such as vehicle detection, pedestrian detection, traffic light detection and the like, unmanned driving, security and protection systems and the like. In recent years, with the development of convolutional neural networks and the development of various deep learning models carefully designed for target detection, target detection has been greatly progressed.
The Chinese patent literature discloses a traffic scene target detection method and system based on deep learning, which utilizes an improved YOLOv3 target detection method to detect vehicles and pedestrians, and comprises the steps of establishing an improved YOLOv3 model for realizing traffic scene feature extraction and training, wherein the improved YOLOv3 model models a boundary frame based on the central information and the size information of the boundary frame to obtain a corresponding Gaussian model so as to predict the uncertainty of the boundary frame and correspondingly set a loss function; and decomposing the vehicle-mounted collected traffic video into pictures and marking, inputting the pictures into a modified YOLOv3 model obtained through training, and identifying vehicles and pedestrians in a traffic scene.
Measuring the similarity between two bounding boxes will use the l_n loss function. However, these loss functions may lead to inaccurate bounding box regression, as they do not match the evaluation index IoU (intersection over union), i.e. the value of the l_n loss function may be different for IoU. Therefore, the IoU, GIoU, DIoU three types of loss functions propose direct optimization evaluation IoU, so that the problem that the l_n loss function is not matched with the evaluation index can be solved.
The loss based on IoU can not well regress the length and width of the prediction frame in a state that the prediction frame and the real frame are not overlapped, and in addition, the traditional center point loss can not generate a larger gradient when the target height is overlapped, so that the regression of the center point of the prediction frame is further promoted. Thus resulting in a poor overall final detection accuracy.
Disclosure of Invention
The invention aims to solve the technical problem of providing a target detection method, a target detection system and a target detection medium for improving the regression loss of a boundary box aiming at the defects of the prior art.
The technical scheme for solving the technical problems is as follows:
an object detection method for improving a bounding box regression loss, comprising the steps of:
step S1, obtaining a real scene picture, marking a real boundary frame and a category of a target object in the real scene picture in advance to form a training data set, and training a preset detection model through the training data set;
s2, inputting the pictures in the training data set into a trained detection model for prediction to obtain a prediction type and a prediction boundary box of the target object;
s3, constructing a loss function, and calculating the positioning loss of the prediction boundary frame through the loss function to obtain the positioning loss;
step S4, optimizing the detection model according to the positioning loss to obtain an optimized detection model, replacing the trained detection model with the optimized detection model, and repeating the steps S2-S4 to train the optimized detection model until the preset training times are reached, and finishing the training;
and S5, inputting the picture to be detected into the trained optimal detection model for prediction to obtain the target boundary box position and target classification.
The beneficial effects of the invention are as follows: the detection model is trained, a loss function of the regression loss of the boundary box is constructed, the loss function is optimized, and the prediction box is positioned through the loss function optimization, so that the detection model is optimized to predict the pictures to be detected in batches, and the detection precision is improved.
Further, the constructing a loss function, calculating the positioning loss of the prediction boundary box through the loss function, so as to obtain the positioning loss, specifically:
calculating a center point loss function through the prediction boundary box and the real boundary box to obtain the center point loss function; calculating boundary frame edge ratio loss through the prediction boundary frame and the real boundary frame to obtain the boundary frame edge ratio loss;
calculating the intersection of the prediction boundary box and the real boundary box to obtain the intersection; calculating a union of the prediction boundary box and the real boundary box to obtain the union; dividing the intersection set and the union set to obtain an intersection ratio loss;
wherein the prediction bounding box
The real bounding box
Constructing the loss function through the center point loss function, the boundary box side ratio loss, the cross ratio loss and the weight factor to obtain the loss function;
the expression of the loss function is:
wherein,for the loss function, +.>For the center point loss function, +.>For boundary box edge ratio loss, +.>For the cross-ratio loss, α is the weight factor.
The beneficial effects of adopting the further scheme are as follows: considering the alignment of the center point and the edge of the boundary frame, optimizing the detection model by constructing a loss function, wherein the loss function is a regression function with stronger constraint on one hand, and on the other hand, the loss function can carry out good regression on the prediction boundary frame when the prediction frame and the real frame are not overlapped, and can generate higher gradient to promote the regression of the prediction frame when the prediction frame and the real frame are highly overlapped.
Further, the calculating the center point loss function through the prediction boundary box and the real boundary box, so as to obtain the center point loss function, specifically:
obtaining the center point loss function by calculating the distance between the predicted boundary frame center point and the real boundary frame center point and the distance between the predicted boundary frame vertex and the real boundary frame vertex;
the expression of the center point loss function is:
wherein,as a center point loss function, c g Is the true boundary box center point->c p For predicting the boundary box center point-> For real bounding box vertices-> For predicting bounding box verticesD () is the distance calculated between the two points.
The predicted boundary frame vertex is a opposite vertex angle vertex of the predicted boundary frame;
the real boundary frame vertex is a opposite vertex angle vertex of the real boundary frame;
the distance between the prediction bounding box vertex and the real bounding box vertex is: the distance between the two vertices that are furthest adjacent to the true bounding box and the prediction bounding box.
The beneficial effects of adopting the further scheme are as follows: and the coordinates of the center point are obtained through the coordinates of the real boundary box and the predicted boundary box, so that the center loss function effectively helps the detection model to fit data, and the accuracy of the detection model is improved. The method helps the model to converge more quickly, reduces the calculation complexity of the detection model, improves the generalization capability of the detection model, and prevents overfitting.
Further, the calculating the boundary frame edge ratio loss through the prediction boundary frame and the real boundary frame, so as to obtain the boundary frame edge ratio loss, specifically:
calculating the width ratio and the height ratio of the prediction boundary box and the real boundary box to obtain the boundary box edge ratio loss,
the expression of the boundary box edge ratio loss is as follows:
wherein,for boundary box edge ratio loss, w min For predicting the minimum width of the bounding box and the real bounding box, w max To predict the maximum width value of the bounding box and the true bounding box, h min To predict the minimum height of the bounding box and the true bounding box, h max Is the maximum height value of the prediction boundary box and the real boundary box.
The beneficial effects of adopting the further scheme are as follows: and calculating the minimum width, the maximum width, the minimum height and the maximum height of the prediction boundary frame and the real boundary frame through the coordinates of the real boundary frame and the prediction boundary frame, and then calculating the edge ratio loss to optimize the detection model.
The other technical scheme for solving the technical problems is as follows:
an object detection system that improves bounding box regression loss, comprising: the system comprises a training set acquisition module, a model training module, a model optimization module and a model operation module;
the training set acquisition module is used for acquiring a real scene picture, marking a real boundary frame and a category of a target object in the real scene picture in advance to form a training data set, and training a preset detection model through the training data set;
the model training module is used for inputting pictures in the training data set into a trained detection model to predict, so as to obtain a prediction type and a prediction boundary box of the target object;
the model optimization module is used for constructing a loss function, and calculating the positioning loss of the prediction boundary frame through the loss function to obtain the positioning loss; optimizing the detection model according to the positioning loss to obtain an optimized detection model, replacing the trained detection model with the optimized detection model, and repeating the model training module and the model optimizing module to train the optimized detection model until the preset training times are reached, and finishing the training;
the model running module is used for inputting the picture to be detected into the trained optimized detection model for prediction to obtain the target boundary box position and the target classification.
The beneficial effects of the invention are as follows: the detection model is trained, a loss function of the regression loss of the boundary box is constructed, the loss function is optimized, and the prediction box is positioned through the loss function optimization, so that the detection model is optimized to predict the pictures to be detected in batches, and the detection precision is improved.
Further, the model optimization module includes: the system comprises a center point loss calculation unit, an edge ratio loss calculation unit and an intersection ratio loss calculation unit;
the center point loss calculation unit is used for calculating a center point loss function through the prediction boundary box and the real boundary box to obtain the center point loss function;
the boundary ratio loss calculation unit is used for calculating boundary frame boundary ratio loss through the prediction boundary frame and the real boundary frame to obtain the boundary frame boundary ratio loss;
the intersection ratio loss calculation unit is used for calculating the intersection of the prediction boundary frame and the real boundary frame to obtain the intersection; calculating a union of the prediction boundary box and the real boundary box to obtain the union; dividing the intersection set and the union set to obtain an intersection ratio loss;
wherein the prediction bounding box
The real bounding box
Constructing the loss function through the center point loss function, the boundary box side ratio loss, the cross ratio loss and the weight factor to obtain the loss function;
the expression of the loss function is:
wherein alpha is a weight factor,for the loss function, +.>For the center point loss function, +.>For boundary box edge ratio loss, +.>Is the cross-ratio loss.
The beneficial effects of adopting the further scheme are as follows: considering the alignment of the center point and the edge of the boundary frame, optimizing the detection model by constructing a loss function, wherein the loss function is a regression function with stronger constraint on one hand, and on the other hand, the loss function can carry out good regression on the prediction boundary frame when the prediction frame and the real frame are not overlapped, and can generate higher gradient to promote the regression of the prediction frame when the prediction frame and the real frame are highly overlapped.
Further, in the center point loss calculation unit, the center point loss function is calculated through the prediction bounding box and the real bounding box, so as to obtain the center point loss function, which specifically is:
obtaining the center point loss function by calculating the distance between the predicted boundary frame center point and the real boundary frame center point and the distance between the predicted boundary frame vertex and the real boundary frame vertex;
the expression of the center point loss function is:
wherein,as a center point loss function, c g Is the true boundary box center point->c p For predicting the boundary box center point-> For real bounding box vertices-> For predicting bounding box verticesD () is the distance calculated between the two points.
The beneficial effects of adopting the further scheme are as follows: and the coordinates of the center point are obtained through the coordinates of the real boundary box and the predicted boundary box, so that the center loss function effectively helps the detection model to fit data, and the accuracy of the detection model is improved. The method helps the model to converge more quickly, reduces the calculation complexity of the detection model, improves the generalization capability of the detection model, and prevents overfitting.
Further, in the edge ratio loss calculation unit, the edge ratio loss of the boundary frame is calculated through the prediction boundary frame and the real boundary frame, so as to obtain the edge ratio loss of the boundary frame, specifically:
calculating the width ratio and the height ratio of the prediction boundary box and the real boundary box to obtain the boundary box edge ratio loss,
the expression of the boundary box edge ratio loss is as follows:
wherein,for boundary box edge ratio loss, w min For predicting the minimum width of the bounding box and the real bounding box, w max To predict the maximum width value of the bounding box and the true bounding box, h min To predict the minimum height of the bounding box and the true bounding box, h max Is the maximum height value of the prediction boundary box and the real boundary box.
The beneficial effects of adopting the further scheme are as follows: and calculating the minimum width, the maximum width, the minimum height and the maximum height of the prediction boundary frame and the real boundary frame through the coordinates of the real boundary frame and the prediction boundary frame, and then calculating the edge ratio loss to optimize the detection model.
The other technical scheme for solving the technical problems is as follows:
an object detection system for improving a bounding box regression loss, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, when executing the computer program, performing the steps of the object detection method for improving a bounding box regression loss as described.
The other technical scheme for solving the technical problems is as follows:
a computer readable storage medium storing a computer program which, when executed by a processor, performs the steps of the method for target detection to improve bounding box regression loss as described.
Drawings
FIG. 1 is a flowchart of a target detection method for improving the regression loss of a bounding box according to an embodiment of the present invention;
FIG. 2 is a block diagram of an object detection system for improving bounding box regression loss according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a target detection method for improving a regression loss of a bounding box according to an embodiment of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings, the examples are illustrated for the purpose of illustrating the invention and are not to be construed as limiting the scope of the invention.
As shown in fig. 1, a target detection method for improving a regression loss of a bounding box includes the steps of:
step S1, obtaining a real scene picture, marking a real boundary frame and a category of a target object in the real scene picture in advance to form a training data set, and training a preset detection model through the training data set;
s2, inputting the pictures in the training data set into a trained detection model for prediction to obtain a prediction type and a prediction boundary box of the target object;
s3, constructing a loss function, and calculating the positioning loss of the prediction boundary frame through the loss function to obtain the positioning loss;
step S4, optimizing the detection model according to the positioning loss to obtain an optimized detection model, replacing the trained detection model with the optimized detection model, and repeating the steps S2-S4 to train the optimized detection model until the preset training times are reached, and finishing the training;
and S5, inputting the picture to be detected into the trained optimal detection model for prediction to obtain the target boundary box position and target classification.
According to the scheme, the detection model is trained, the loss function of the regression loss of the boundary box is constructed, the loss function is optimized, and the prediction box is positioned through the loss function optimization, so that the detection model is optimized to predict the pictures to be detected in batches, and the detection precision is improved.
As shown in fig. 3, preferably, the constructing a loss function calculates the positioning loss of the prediction bounding box through the loss function, so as to obtain the positioning loss, specifically:
calculating a center point loss function through the prediction boundary box and the real boundary box to obtain the center point loss function; calculating boundary frame edge ratio loss through the prediction boundary frame and the real boundary frame to obtain the boundary frame edge ratio loss;
calculating the intersection of the prediction boundary box and the real boundary box to obtain the intersection; calculating a union of the prediction boundary box and the real boundary box to obtain the union; dividing the intersection set and the union set to obtain an intersection ratio loss;
wherein the prediction bounding box
The real edgeFrame for border
Constructing the loss function through the center point loss function, the boundary box side ratio loss, the cross ratio loss and the weight factor to obtain the loss function;
the expression of the loss function is:
wherein alpha is a weight factor,for the loss function, +.>For the center point loss function, +.>For boundary box edge ratio loss, +.>Is the cross-ratio loss.
Specifically, the value of the weight factor α is set to 2.5.
In the above embodiment, the detection model is optimized by constructing the loss function in consideration of the alignment of the center point and the edge of the boundary frame, on the one hand, the loss function is a regression function with stronger constraint, and on the other hand, the loss function can perform good regression on the prediction boundary frame when the prediction frame and the real frame are not overlapped, and can generate higher gradient to promote the regression of the prediction frame when the prediction frame and the real frame are highly overlapped.
Preferably, the calculating the center point loss function by using the prediction bounding box and the real bounding box, so as to obtain the center point loss function, specifically:
obtaining the center point loss function by calculating the distance between the predicted boundary frame center point and the real boundary frame center point and the distance between the predicted boundary frame vertex and the real boundary frame vertex;
the expression of the center point loss function is:
wherein,as a center point loss function, c g Is the true boundary box center point->c p For predicting the boundary box center point-> For real bounding box vertices-> For predicting bounding box verticesD () is the distance calculated between the two points.
The distance between the prediction bounding box vertex and the real bounding box vertex is: the distance between the two vertices that are furthest adjacent to the true bounding box and the prediction bounding box.
In the above embodiment, the coordinates of the center point are obtained through the coordinates of the real boundary box and the predicted boundary box, so that the center loss function effectively helps the detection model fit data, thereby improving the accuracy of the detection model. The method helps the model to converge more quickly, reduces the calculation complexity of the detection model, improves the generalization capability of the detection model, and prevents overfitting.
Preferably, the calculating, by the prediction bounding box and the real bounding box, the bounding box edge ratio loss is obtained, specifically:
calculating the width ratio and the height ratio of the prediction boundary box and the real boundary box to obtain the boundary box edge ratio loss,
the expression of the boundary box edge ratio loss is as follows:
wherein,for boundary box edge ratio loss, w min For predicting the minimum width of the bounding box and the real bounding box, w max To predict the maximum width value of the bounding box and the true bounding box, h min To predict the minimum height of the bounding box and the true bounding box, h max Is the maximum height value of the prediction boundary box and the real boundary box.
In the above embodiment, the minimum width, the maximum width, the minimum height and the maximum height of the prediction boundary frame and the real boundary frame are calculated according to the coordinates of the real boundary frame and the prediction boundary frame, and then the edge ratio loss is calculated to optimize the detection model.
To verify the effectiveness of the present invention, the present invention is compared to other current forefront loss functions on a homemade target detection dataset.
Table 1 comparison of the results of 7 loss functions, such as IoU, GIoU, DIoU, CIoU, SIoU, EIoU, WIoU, SCA, SCAIoU and SCPAIoU, were used on the homemade target detection dataset using the YOLOv5 (r 7.0) detection framework, respectively, where SCPAIoU is the loss function of the present invention.
Loss/Evaluation | mAP50 | mAP75 | mAP0.5:0.95 |
IoU | 96.6 | 88.2 | 79.2 |
GIoU | 96.5 | 88.1 | 78.8 |
DIoU | 96.7 | 88 | 79.1 |
CIoU | 96.8 | 87.5 | 78.5 |
SIoU | 96.7 | 87.7 | 78.9 |
EIoU | 97.1 | 88.3 | 79.2 |
WIoU | 96.5 | 88.1 | 78.8 |
SCA | 96.8 | 88.2 | 79 |
SCPAIoU | 97.3 | 89.1 | 79.7 |
Table 1 comparison of loss function results
As can be seen from Table 1, the loss function of the present invention achieves the highest accuracy on the homemade dataset, indicating the superiority of the method.
As shown in fig. 2, an object detection system for improving the regression loss of a bounding box, comprising: the system comprises a training set acquisition module, a model training module, a model optimization module and a model operation module;
the training set acquisition module is used for acquiring a real scene picture, marking a real boundary frame and a category of a target object in the real scene picture in advance to form a training data set, and training a preset detection model through the training data set;
the model training module is used for inputting pictures in the training data set into a trained detection model to predict, so as to obtain a prediction type and a prediction boundary box of the target object;
the model optimization module is used for constructing a loss function, and calculating the positioning loss of the prediction boundary frame through the loss function to obtain the positioning loss; optimizing the detection model according to the positioning loss to obtain an optimized detection model, replacing the trained detection model with the optimized detection model, and repeating the model training module and the model optimizing module to train the optimized detection model until the preset training times are reached, and finishing the training;
the model running module is used for inputting the picture to be detected into the trained optimized detection model for prediction to obtain the target boundary box position and the target classification.
According to the scheme, the detection model is trained, the loss function of the regression loss of the boundary box is constructed, the loss function is optimized, and the prediction box is positioned through the loss function optimization, so that the detection model is optimized to predict the pictures to be detected in batches, and the detection precision is improved.
Preferably, the model optimization module includes: the system comprises a center point loss calculation unit, an edge ratio loss calculation unit and an intersection ratio loss calculation unit;
the center point loss calculation unit is used for calculating a center point loss function through the prediction boundary box and the real boundary box to obtain the center point loss function;
the boundary ratio loss calculation unit is used for calculating boundary frame boundary ratio loss through the prediction boundary frame and the real boundary frame to obtain the boundary frame boundary ratio loss;
the intersection ratio loss calculation unit is used for calculating the intersection of the prediction boundary frame and the real boundary frame to obtain the intersection; calculating a union of the prediction boundary box and the real boundary box to obtain the union; dividing the intersection set and the union set to obtain an intersection ratio loss;
wherein the prediction bounding box
The real bounding box
Constructing the loss function through the center point loss function, the boundary box side ratio loss, the cross ratio loss and the weight factor to obtain the loss function;
the expression of the loss function is:
wherein, the alpha weight factor,for the loss function, +.>For the center point loss function, +.>For boundary box edge ratio loss, +.>Is the cross-ratio loss.
Specifically, the value of the weight factor α is set to 2.5.
In the above embodiment, the detection model is optimized by constructing the loss function in consideration of the alignment of the center point and the edge of the boundary frame, on the one hand, the loss function is a regression function with stronger constraint, and on the other hand, the loss function can perform good regression on the prediction boundary frame when the prediction frame and the real frame are not overlapped, and can generate higher gradient to promote the regression of the prediction frame when the prediction frame and the real frame are highly overlapped.
Preferably, in the center point loss calculation unit, the center point loss function is calculated through the prediction bounding box and the real bounding box, so as to obtain the center point loss function, which specifically is:
obtaining the center point loss function by calculating the distance between the predicted boundary frame center point and the real boundary frame center point and the distance between the predicted boundary frame vertex and the real boundary frame vertex;
the expression of the center point loss function is:
wherein,as a center point loss function, c g Is the true boundary box center point->c p For predicting the boundary box center point-> For real bounding box vertices-> For predicting bounding box verticesD () is the distance calculated between the two points.
In the above embodiment, the coordinates of the center point are obtained through the coordinates of the real boundary box and the predicted boundary box, so that the center loss function effectively helps the detection model fit data, thereby improving the accuracy of the detection model. The method helps the model to converge more quickly, reduces the calculation complexity of the detection model, improves the generalization capability of the detection model, and prevents overfitting.
Preferably, in the edge ratio loss calculation unit, the edge ratio loss of the boundary frame is calculated through the prediction boundary frame and the real boundary frame, so as to obtain the edge ratio loss of the boundary frame, specifically:
calculating the width ratio and the height ratio of the prediction boundary box and the real boundary box to obtain the boundary box edge ratio loss,
the expression of the boundary box edge ratio loss is as follows:
wherein,for boundary box edge ratio loss, w min For predicting the minimum width of the bounding box and the real bounding box, w max To predict the maximum width value of the bounding box and the true bounding box, h min To predict the minimum height of the bounding box and the true bounding box, h max Is the maximum height value of the prediction boundary box and the real boundary box.
In the above embodiment, the minimum width, the maximum width, the minimum height and the maximum height of the prediction boundary frame and the real boundary frame are calculated according to the coordinates of the real boundary frame and the prediction boundary frame, and then the edge ratio loss is calculated to optimize the detection model.
Preferably, the target detection system for improving the regression loss of the boundary box comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the steps of the target detection method for improving the regression loss of the boundary box when executing the computer program.
Preferably, a computer readable storage medium stores a computer program which, when executed by a processor, implements the steps of the target detection method of improving bounding box regression loss as described.
In the several embodiments provided in the present application, it should be understood that the disclosed method, system, and storage medium may be implemented in other manners. For example, the above-described method and system embodiments are merely illustrative, e.g., the partitioning of elements is merely a logical functional partitioning, and there may be additional partitioning in actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed.
The elements described as separate elements may or may not be physically separate, and elements shown as elements may or may not be physically separate. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment of the present invention.
In this document, the portions not described in the specification are all prior art or common general knowledge. Relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.
Claims (10)
1. A target detection method for improving a bounding box regression loss, comprising the steps of:
step S1, obtaining a real scene picture, marking a real boundary frame and a category of a target object in the real scene picture in advance to form a training data set, and training a preset detection model through the training data set;
s2, inputting the pictures in the training data set into a trained detection model for prediction to obtain a prediction type and a prediction boundary box of the target object;
s3, constructing a loss function, and calculating the positioning loss of the prediction boundary frame through the loss function to obtain the positioning loss;
step S4, optimizing the detection model according to the positioning loss to obtain an optimized detection model, replacing the trained detection model with the optimized detection model, and repeating the steps S2-S4 to train the optimized detection model until the preset training times are reached, and finishing the training;
and S5, inputting the picture to be detected into the trained optimal detection model for prediction to obtain the target boundary box position and target classification.
2. The method for detecting the target of improving the regression loss of the boundary box according to claim 1, wherein the construction loss function is used for calculating the positioning loss of the prediction boundary box through the loss function to obtain the positioning loss, specifically:
calculating a center point loss function through the prediction boundary box and the real boundary box to obtain the center point loss function; calculating boundary frame edge ratio loss through the prediction boundary frame and the real boundary frame to obtain the boundary frame edge ratio loss;
calculating the intersection of the prediction boundary box and the real boundary box to obtain the intersection; calculating a union of the prediction boundary box and the real boundary box to obtain the union; dividing the intersection set and the union set to obtain an intersection ratio loss;
wherein the prediction bounding box
The real bounding box
Constructing the loss function through the center point loss function, the boundary box side ratio loss, the cross ratio loss and the weight factor to obtain the loss function;
the expression of the loss function is:
wherein,for the loss function, +.>For the center point loss function, +.>For the boundary box edge ratio loss,for the cross-ratio loss, α is the weight factor.
3. The method for detecting the target of improving the regression loss of the boundary box according to claim 2, wherein the calculating the center point loss function by the prediction boundary box and the real boundary box is performed to obtain the center point loss function, specifically:
obtaining the center point loss function by calculating the distance between the predicted boundary frame center point and the real boundary frame center point and the distance between the predicted boundary frame vertex and the real boundary frame vertex;
the expression of the center point loss function is:
wherein,as a center point loss function, c g Is the true boundary box center point->c p For predicting the boundary box center point-> For real bounding box vertices->For predicting bounding box verticesD () is the distance calculated between the two points.
4. The method for detecting the target for improving the regression loss of the boundary frame according to claim 2, wherein the calculating the boundary frame edge ratio loss by the prediction boundary frame and the real boundary frame is performed to obtain the boundary frame edge ratio loss, specifically:
calculating the width ratio and the height ratio of the prediction boundary box and the real boundary box to obtain the boundary box edge ratio loss,
the expression of the boundary box edge ratio loss is as follows:
wherein,for boundary box edge ratio loss, w min For predicting the minimum width of the bounding box and the real bounding box, w max To predict the maximum width value of the bounding box and the true bounding box, h min To predict the minimum height of the bounding box and the true bounding box, h max Is the maximum height value of the prediction boundary box and the real boundary box.
5. An object detection system for improving bounding box regression loss, comprising: the system comprises a training set acquisition module, a model training module, a model optimization module and a model operation module;
the training set acquisition module is used for acquiring a real scene picture, marking a real boundary frame and a category of a target object in the real scene picture in advance to form a training data set, and training a preset detection model through the training data set;
the model training module is used for inputting pictures in the training data set into a trained detection model to predict, so as to obtain a prediction type and a prediction boundary box of the target object;
the model optimization module is used for constructing a loss function, and calculating the positioning loss of the prediction boundary frame through the loss function to obtain the positioning loss; optimizing the detection model according to the positioning loss to obtain an optimized detection model, replacing the trained detection model with the optimized detection model, and repeating the model training module and the model optimizing module to train the optimized detection model until the preset training times are reached, and finishing the training;
the model running module is used for inputting the picture to be detected into the trained optimized detection model for prediction to obtain the target boundary box position and the target classification.
6. The object detection system for improving bounding box regression loss of claim 5 wherein the model optimization module comprises: the system comprises a center point loss calculation unit, an edge ratio loss calculation unit and an intersection ratio loss calculation unit;
the center point loss calculation unit is used for calculating a center point loss function through the prediction boundary box and the real boundary box to obtain the center point loss function;
the boundary ratio loss calculation unit is used for calculating boundary frame boundary ratio loss through the prediction boundary frame and the real boundary frame to obtain the boundary frame boundary ratio loss;
the intersection ratio loss calculation unit is used for calculating the intersection of the prediction boundary frame and the real boundary frame to obtain the intersection; calculating a union of the prediction boundary box and the real boundary box to obtain the union; dividing the intersection set and the union set to obtain an intersection ratio loss;
wherein the prediction bounding box
The real bounding box
Constructing the loss function through the center point loss function, the boundary box side ratio loss, the cross ratio loss and the weight factor to obtain the loss function;
the expression of the loss function is:
wherein,for the loss function, +.>For the center point loss function, +.>For the boundary box edge ratio loss,for the cross-ratio loss, α is the weight factor.
7. The target detection system for improving a regression loss of a bounding box according to claim 6, wherein the center point loss calculation unit calculates a center point loss function by using the prediction bounding box and the real bounding box, to obtain the center point loss function, specifically:
obtaining the center point loss function by calculating the distance between the predicted boundary frame center point and the real boundary frame center point and the distance between the predicted boundary frame vertex and the real boundary frame vertex;
the expression of the center point loss function is:
wherein,as a center point loss function, c g Is the true boundary box center point->c p For predicting the boundary box center point->For real bounding box vertices->For predicting bounding box verticesD () is the distance calculated between the two points.
8. The target detection system for improving a regression loss of a bounding box according to claim 6, wherein the edge ratio loss calculating unit calculates the edge ratio loss of the bounding box by using the prediction bounding box and the real bounding box, and the edge ratio loss of the bounding box is specifically:
calculating the width ratio and the height ratio of the prediction boundary box and the real boundary box to obtain the boundary box edge ratio loss,
the expression of the boundary box edge ratio loss is as follows:
wherein,for boundary box edge ratio loss, w min For predicting the minimum width of the bounding box and the real bounding box, w max To predict the maximum width value of the bounding box and the true bounding box, h min To predict the minimum height of the bounding box and the true bounding box, h max Is the maximum height value of the prediction boundary box and the real boundary box.
9. An object detection system for improving a bounding box regression loss, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the object detection method for improving a bounding box regression loss according to any of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium storing a computer program, characterized in that the object detection method of improving the regression loss of a bounding box according to any one of claims 1 to 7 is implemented when the computer program is executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311147316.9A CN117315224A (en) | 2023-09-06 | 2023-09-06 | Target detection method, system and medium for improving regression loss of bounding box |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311147316.9A CN117315224A (en) | 2023-09-06 | 2023-09-06 | Target detection method, system and medium for improving regression loss of bounding box |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117315224A true CN117315224A (en) | 2023-12-29 |
Family
ID=89272867
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311147316.9A Pending CN117315224A (en) | 2023-09-06 | 2023-09-06 | Target detection method, system and medium for improving regression loss of bounding box |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117315224A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118298002A (en) * | 2024-04-24 | 2024-07-05 | 国家计算机网络与信息安全管理中心 | Method and device for optimizing network security image target detection model algorithm |
-
2023
- 2023-09-06 CN CN202311147316.9A patent/CN117315224A/en active Pending
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118298002A (en) * | 2024-04-24 | 2024-07-05 | 国家计算机网络与信息安全管理中心 | Method and device for optimizing network security image target detection model algorithm |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110148196B (en) | Image processing method and device and related equipment | |
Soilán et al. | Traffic sign detection in MLS acquired point clouds for geometric and image-based semantic inventory | |
WO2018068653A1 (en) | Point cloud data processing method and apparatus, and storage medium | |
Zhang et al. | When Dijkstra meets vanishing point: a stereo vision approach for road detection | |
CN106446150B (en) | A kind of method and device of vehicle precise search | |
WO2022134996A1 (en) | Lane line detection method based on deep learning, and apparatus | |
Graovac et al. | Detection of road image borders based on texture classification | |
CN108052904B (en) | Method and device for acquiring lane line | |
CN111179152A (en) | Road sign identification method and device, medium and terminal | |
CN111104538A (en) | Fine-grained vehicle image retrieval method and device based on multi-scale constraint | |
CN111460927A (en) | Method for extracting structured information of house property certificate image | |
CN111274926B (en) | Image data screening method, device, computer equipment and storage medium | |
Mei et al. | Scene-adaptive off-road detection using a monocular camera | |
CN105608417A (en) | Traffic signal lamp detection method and device | |
CN112016605A (en) | Target detection method based on corner alignment and boundary matching of bounding box | |
CN113011331B (en) | Method and device for detecting whether motor vehicle gives way to pedestrians, electronic equipment and medium | |
CN111126459A (en) | Method and device for identifying fine granularity of vehicle | |
CN117315224A (en) | Target detection method, system and medium for improving regression loss of bounding box | |
CN111400533A (en) | Image screening method and device, electronic equipment and storage medium | |
CN117949942B (en) | Target tracking method and system based on fusion of radar data and video data | |
CN113255444A (en) | Training method of image recognition model, image recognition method and device | |
CN103745197A (en) | Detection method of license plate and device thereof | |
CN113158895A (en) | Bill identification method and device, electronic equipment and storage medium | |
WO2024140336A1 (en) | Vehicle-mounted ranging method, system and apparatus, electronic apparatus and storage medium | |
Abujayyab et al. | Integrating object-based and pixel-based segmentation for building footprint extraction from satellite images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |