CN112241675A - Object detection model training method and device - Google Patents

Object detection model training method and device

Info

Publication number
CN112241675A
Authority
CN
China
Prior art keywords
bounding box
object detection
loss function
value
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910659672.6A
Other languages
Chinese (zh)
Inventor
周定富
方进
宋希彬
官晨晔
杨睿刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910659672.6A priority Critical patent/CN112241675A/en
Publication of CN112241675A publication Critical patent/CN112241675A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides an object detection model training method and device. The method comprises the following steps: determining an object truth bounding box according to the labeling data of a detection target; inputting the detection target into an object detection model, and obtaining an object detection bounding box according to detection data output by the object detection model; determining the intersection-over-union ratio of the object truth bounding box and the object detection bounding box according to the two boxes; determining a loss function value according to the intersection-over-union ratio; and performing back propagation according to the loss function value to optimize the object detection model. The object detection model training method and device provided by the embodiment of the invention can improve the detection accuracy of the object detection model.

Description

Object detection model training method and device
Technical Field
The invention relates to the technical field of computers, in particular to a training method and a training device for an object detection model.
Background
Object detection is an important subject in the field of computer vision and has wide application in fields such as virtual reality, cultural relic protection, machining, and computer simulation. Unmanned driving is an emerging technology in the transportation field with broad prospects, and object detection plays a very important role in it. To realize object detection in intelligent driving, a deep learning model is usually adopted and trained on existing data, so that the model learns to recognize objects during training. For safety reasons, object detection in driving applications must be extremely accurate.
Disclosure of Invention
The embodiment of the invention provides an object detection model training method, which aims to solve one or more technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides an object detection model training method, including:
determining an object truth bounding box according to the labeling data of a detection target;
inputting the detection target into an object detection model, and obtaining an object detection bounding box according to detection data output by the object detection model;
determining the intersection-over-union ratio of the object truth bounding box and the object detection bounding box according to the object truth bounding box and the object detection bounding box;
determining a loss function value according to the intersection-over-union ratio;
and optimizing the object detection model according to the loss function value.
In one embodiment, in a case where there is an overlap between the object truth bounding box and the object detection bounding box, determining the loss function value according to the intersection-over-union ratio includes:
determining a loss function value according to the following formula:
L=1-IoU;
where L is the loss function value and IoU is the intersection-over-union ratio.
In one embodiment, in a case where there is no overlap between the object truth bounding box and the object detection bounding box, determining the loss function value according to the intersection-over-union ratio comprises:
determining a loss function value according to the following formula:
L=1-GIoU;
wherein:
GIoU = IoU - (Area_C - Area_U) / Area_C;
where Area_C is the area of the smallest enclosing box containing both the object truth bounding box and the object detection bounding box, Area_U is the area of the union of the object truth bounding box and the object detection bounding box, and IoU is the intersection-over-union ratio.
In one embodiment, optimizing the object detection model according to the loss function value comprises:
performing back propagation calculation according to the loss function value to obtain back-propagation gradient values;
and optimizing parameters of the object detection model according to the back-propagation gradient values.
In one embodiment, determining the intersection-over-union ratio of the object truth bounding box and the object detection bounding box according to the object truth bounding box and the object detection bounding box includes:
determining the intersection and union of the object detection bounding box and the object truth bounding box;
and determining the intersection-over-union ratio according to the intersection and the union.
In one embodiment, any edge of the object truth bounding box is non-parallel to any edge of the object detection bounding box.
In a second aspect, an embodiment of the present invention provides an object detection model training apparatus, including:
a truth value module, configured to determine an object truth bounding box according to the labeling data of a detection target;
a detection module, configured to input the detection target into an object detection model and obtain an object detection bounding box according to detection data output by the object detection model;
an intersection-over-union calculation module, configured to determine the intersection-over-union ratio of the object truth bounding box and the object detection bounding box according to the object truth bounding box and the object detection bounding box;
a loss function calculation module, configured to determine a loss function value according to the intersection-over-union ratio;
and an optimization module, configured to optimize the object detection model according to the loss function value.
In one embodiment, in the case that there is an overlap between the object truth bounding box and the object detection bounding box, the loss function calculation module is configured to:
determining a loss function value according to the following formula:
L=1-IoU;
where L is the loss function value and IoU is the intersection-over-union ratio.
In one embodiment, in the case that there is no overlap between the object truth bounding box and the object detection bounding box, the loss function calculation module is configured to:
determining a loss function value according to the following formula:
L=1-GIoU;
wherein:
GIoU = IoU - (Area_C - Area_U) / Area_C;
where Area_C is the area of the smallest enclosing box containing both the object truth bounding box and the object detection bounding box, Area_U is the area of the union of the object truth bounding box and the object detection bounding box, and IoU is the intersection-over-union ratio.
In one embodiment, the optimization module comprises:
a back propagation calculation unit, configured to perform back propagation calculation according to the loss function value to obtain back-propagation gradient values;
and a reverse processing unit, configured to optimize parameters of the object detection model according to the back-propagation gradient values.
In one embodiment, the intersection-over-union calculation module comprises:
an intersection and union calculation unit, configured to determine the intersection and union of the object detection bounding box and the object truth bounding box;
and an intersection and union processing unit, configured to determine the intersection-over-union ratio according to the intersection and the union.
In one embodiment, any edge of the object truth bounding box is non-parallel to any edge of the object detection bounding box.
In a third aspect, an embodiment of the present invention provides an object detection model training apparatus, where functions of the apparatus may be implemented by hardware, or may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above-described functions.
In one possible design, the apparatus includes a processor and a memory, where the memory stores a program supporting the apparatus in executing the above object detection model training method, and the processor is configured to execute the program stored in the memory. The apparatus may further include a communication interface for communicating with other devices or a communication network.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing computer software instructions for an object detection model training apparatus, which includes a program for executing the object detection model training method.
One of the above technical solutions has the following advantages or beneficial effects:
in the embodiment of the invention, the loss function is determined from the intersection-over-union ratio of the object truth bounding box and the object detection bounding box, so that the accuracy of the object detection model in detecting objects can be improved.
The foregoing summary is provided for the purpose of description only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features of the present invention will be readily apparent by reference to the drawings and following detailed description.
Drawings
In the drawings, like reference numerals refer to the same or similar parts or elements throughout the several views unless otherwise specified. The figures are not necessarily to scale. It is appreciated that these drawings depict only some embodiments in accordance with the disclosure and are therefore not to be considered limiting of its scope.
FIG. 1 shows a flow diagram of an object detection model training method according to an embodiment of the invention.
FIG. 2 shows a flow diagram of an object detection model training method according to an embodiment of the invention.
FIG. 3 shows a flow diagram of an object detection model training method according to an embodiment of the invention.
FIG. 4 shows a flow diagram of an object detection model training method according to an embodiment of the invention.
Fig. 5 is a block diagram showing a structure of an object detection model training apparatus according to an embodiment of the present invention.
Fig. 6 is a block diagram showing a structure of an object detection model training apparatus according to an embodiment of the present invention.
Fig. 7 shows a block diagram of the structure of an object detection model training apparatus according to an embodiment of the present invention.
Detailed Description
In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various different ways, all without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
Fig. 1 shows a flow chart of a training method of an object detection model according to an embodiment of the invention. As shown in fig. 1, the training method of the object detection model includes:
step S11: and determining an object true value bounding box according to the labeling data of the detection target.
Step S12: and inputting the detection target into an object detection model, and obtaining an object detection surrounding frame according to detection data output by the object detection model.
Step S13: and determining the intersection and combination ratio of the object true value surrounding frame and the object detection surrounding frame according to the object true value surrounding frame and the object detection surrounding frame.
Step S14: and determining a loss function value according to the intersection ratio.
Step S15: and optimizing the object detection model according to the loss function value.
In object detection technology, an object is usually represented by a 2D (two-dimensional) or 3D (three-dimensional) bounding box with parameters, specifically the center, dimensions, and orientation of the bounding box. The task of the object detection problem is therefore to narrow the difference between the annotation data and the detection data. Compared with the detection data, the annotation data accurately represents information such as the position and angle of the object in the detection target, and can serve as reference data for identifying the object in the detection target. In the embodiment of the present invention, the execution order of step S11 and step S12 is not limited; the two steps may be executed simultaneously or in either order.
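As a concrete illustration of such a parameterized representation, a minimal sketch follows; the class and field names (center coordinates, dimensions, yaw) are assumptions for illustration, not the patent's notation:

```python
from dataclasses import dataclass

@dataclass
class Box3D:
    """Illustrative parameterized 3D bounding box: center, dimensions, orientation."""
    cx: float      # center x-coordinate
    cy: float      # center y-coordinate
    cz: float      # center z-coordinate
    length: float  # dimension along the heading direction
    width: float   # dimension across the heading direction
    height: float  # vertical dimension
    yaw: float     # orientation: heading angle in radians
```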
Typically, object detection frameworks use a squared loss or an absolute loss to optimize the object detection model. However, these general loss functions simply perform direct regression, using object attributes such as the center position, length, width, height, and deflection angle as regression parameters. In the embodiment of the invention, the loss function is instead determined from the intersection-over-union ratio of the object truth bounding box and the object detection bounding box, so that the accuracy of the object detection model in detecting objects can be improved.
In the embodiment of the invention, the detection target can be an image or a three-dimensional point cloud.
In one embodiment, in a case where there is an overlap between the object truth bounding box and the object detection bounding box, determining the loss function value according to the intersection-over-union ratio includes:
determining a loss function value according to the following formula:
L = 1 - IoU (Formula 1);
where L is the loss function value and IoU is the intersection-over-union ratio.
In an embodiment of the present invention, IoU (Intersection over Union) may represent the ratio of the intersection to the union of the object detection bounding box and the truth bounding box in an image. Compared with the squared loss and the absolute loss, determining the loss function value with IoU takes every parameter of the box into account when optimizing the object detection model, so a model optimized with this loss value detects objects more accurately, reducing the difference between the detection data and the true values. Second, the calculation of IoU explicitly involves the relationships between parameters, so the embodiment of the present invention can exploit those relationships when optimizing the model, as opposed to using independent per-parameter loss values. Third, IoU is itself a parameter relevant to object detection, making it well suited as a loss function for an object detection model.
In one embodiment, in a case where there is no overlap between the object truth bounding box and the object detection bounding box, determining the loss function value according to the intersection-over-union ratio comprises:
determining a loss function value according to the following formula:
L = 1 - GIoU (Formula 2);
wherein:
GIoU = IoU - (Area_C - Area_U) / Area_C;
where GIoU is the generalized intersection-over-union ratio, Area_C is the area of the smallest enclosing box containing both the object truth bounding box and the object detection bounding box, Area_U is the area of the union of the object truth bounding box and the object detection bounding box, and IoU is the intersection-over-union ratio.
The optimization method of the object detection model provided by the embodiment of the invention is suitable for 2D object detection and 3D object detection in a detection target.
In the embodiment of the present invention, when there is an intersection between the detection bounding box and the truth bounding box, the IoU value is calculated by the following formula:
IoU = (A∩B) / (A∪B) (Formula 3);
where A denotes the truth bounding box and B denotes the detection bounding box.
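A minimal sketch of Formulas 1-3 and the GIoU variant for axis-aligned 2D boxes follows, assuming boxes are given as (x_min, y_min, x_max, y_max) tuples; the function names are illustrative:

```python
def iou_giou_2d(a, b):
    """IoU and GIoU of two axis-aligned 2D boxes (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy                                   # A ∩ B (Formula 3 numerator)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter                   # A ∪ B (Formula 3 denominator)
    iou = inter / union
    # Smallest enclosing box C containing both A and B.
    area_c = ((max(a[2], b[2]) - min(a[0], b[0])) *
              (max(a[3], b[3]) - min(a[1], b[1])))
    giou = iou - (area_c - union) / area_c
    return iou, giou

def bbox_loss(a, b):
    """L = 1 - IoU when the boxes overlap; L = 1 - GIoU otherwise (Formulas 1 and 2)."""
    iou, giou = iou_giou_2d(a, b)
    return 1.0 - iou if iou > 0.0 else 1.0 - giou
```

For example, bbox_loss((0, 0, 2, 2), (1, 1, 3, 3)) takes the overlapping-box branch (IoU = 1/7), while bbox_loss((0, 0, 1, 1), (2, 2, 3, 3)) falls back to the GIoU branch because the intersection is empty.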
Fig. 2 shows a flow chart of an object detection model training method according to an embodiment of the present invention. In this embodiment, for steps S11-S15, reference may be made to the related descriptions in the above embodiments, which are not repeated here.
The difference from the above embodiment is that, as shown in fig. 2, optimizing the object detection model according to the loss function value includes:
step S21: and carrying out back propagation calculation according to the loss function value to obtain a back propagation calculation gradient value.
Step S22: and calculating gradient values according to the back propagation, and optimizing parameters of the object detection model.
In the back propagation calculation process, the loss function value calculated by the loss function calculation method provided by the embodiment of the invention can be used in any known back propagation procedure.
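As a sketch of how such a loss value can drive one step of a standard back propagation loop (PyTorch is assumed here as the framework; `model`, `optimizer`, and `differentiable_iou` are hypothetical placeholders, not names from the patent):

```python
import torch  # framework assumed for autograd-based back propagation

def train_step(model, optimizer, sample, gt_box):
    """One optimization step: forward pass, IoU loss, back propagation, update."""
    pred_box = model(sample)                    # object detection bounding box (step S12)
    iou = differentiable_iou(pred_box, gt_box)  # hypothetical IoU in torch ops (step S13)
    loss = 1.0 - iou                            # loss function value, Formula 1 (step S14)
    optimizer.zero_grad()
    loss.backward()                             # back propagation computes gradient values (step S21)
    optimizer.step()                            # optimize model parameters with the gradients (step S22)
    return loss.item()
```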
In one embodiment, determining the intersection-over-union ratio of the object truth bounding box and the object detection bounding box according to the object truth bounding box and the object detection bounding box includes:
determining the intersection and union of the object detection bounding box and the object truth bounding box;
and determining the intersection-over-union ratio according to the intersection and the union.
For example, when both the detection bounding box and the truth bounding box are two-dimensional boxes, in Formula 3 above, A∩B denotes the area of the intersection of the detection bounding box and the truth bounding box, and A∪B denotes the area of their union. When the detection bounding box and the truth bounding box are both three-dimensional boxes, in Formula 3 above, A∩B denotes the volume of the intersection of the detection bounding box and the truth bounding box, and A∪B denotes the volume of their union.
In a specific embodiment, when the detection bounding box and the truth bounding box are both 2D boxes, the IoU value may be calculated from the areas of the overlapping and merged portions of the two bounding boxes, that is:
A∩B = Area_overlap, where Area_overlap denotes the area of the overlapping portion of A and B;
A∪B = Area_A + Area_B - Area_overlap, i.e. the area of the union of A and B.
Because the value range of IoU is between 0 and 1, the value range of the loss value is also between 0 and 1.
In a specific embodiment, when both the detection bounding box and the truth bounding box are 3D boxes, the IoU value is calculated from the volumes of the overlapping and merged portions of the two bounding boxes, that is:
A∩B = Area_overlap × h_overlap, where Area_overlap denotes the area of the overlapping portion of A and B, and h_overlap denotes the height of the overlapping portion of A and B;
A∪B = (Area_A + Area_B - Area_overlap) × h_union, i.e. the volume of the union of A and B, where h_union denotes the height of the union of A and B.
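A sketch of this 3D computation for axis-aligned boxes, assuming a (x_min, y_min, z_min, x_max, y_max, z_max) format; the union here is computed as Vol_A + Vol_B - Vol_overlap, which agrees with the factorized formula above when both boxes span the same vertical extent:

```python
def iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes (x_min, y_min, z_min, x_max, y_max, z_max)."""
    ix = max(0.0, min(a[3], b[3]) - max(a[0], b[0]))  # overlap extent along x
    iy = max(0.0, min(a[4], b[4]) - max(a[1], b[1]))  # overlap extent along y
    iz = max(0.0, min(a[5], b[5]) - max(a[2], b[2]))  # h_overlap: overlap along z
    inter = ix * iy * iz                              # Area_overlap x h_overlap
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    union = vol_a + vol_b - inter                     # volume of the union of A and B
    return inter / union
```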
In one embodiment, the annotation data comprises an object angle annotation value and the detection data comprises an object angle detection value.
In a specific embodiment, when the long and wide sides of the detection bounding box and the truth bounding box are parallel to the coordinate axes, as shown in FIG. 3, neither the labeling data nor the detection data contains an angle value for the object, and the shaded part represents the overlapping portion. The labeling data includes the coordinate values of the four corners of the object truth bounding box: C1(x1, y1), D1(x2, y1), E1(x2, y2), F1(x1, y2). The detection data includes the coordinate values of the four corners of the object detection bounding box: C2(x'1, y'1), D2(x'2, y'1), E2(x'2, y'2), F2(x'1, y'2), where x1 ≤ x2, y2 ≤ y1, x'1 ≤ x'2, and y'2 ≤ y'1.
The intersection of A and B is calculated by the following formula:
A∩B = Area_overlap = (min(x2, x'2) - max(x1, x'1)) × (min(y1, y'1) - max(y2, y'2)).
The union of A and B is calculated by the following formula: A∪B = Area_A + Area_B - Area_overlap;
where Area_A = (x2 - x1) × (y1 - y2) and Area_B = (x'2 - x'1) × (y'1 - y'2).
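For illustration, take hypothetical boxes with A having corners (0, 3), (4, 3), (4, 0), (0, 0) and B having corners (2, 5), (6, 5), (6, 1), (2, 1). Then Area_overlap = (min(4, 6) - max(0, 2)) × (min(3, 5) - max(0, 1)) = 2 × 2 = 4, Area_A = 4 × 3 = 12, Area_B = 4 × 4 = 16, A∪B = 12 + 16 - 4 = 24, and IoU = 4/24 ≈ 0.167, giving a loss value of about 0.833.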
In a specific embodiment, when the long and wide sides of the detection bounding box and the truth bounding box are not parallel to the coordinate axes, as shown in FIG. 4, the labeling data includes the coordinate values of the four corners of the object truth bounding box: C1(x1, y1), D1(x2, y1), E1(x2, y2), F1(x1, y2), and the detection data includes the coordinate values of the four corners of the object detection bounding box: C3(x3, y3), D3(x4, y3), E3(x4, y4), F3(x3, y4).
[The intersection and union formulas for this non-axis-aligned case appear only as images in the original publication.]
The object detection model training method provided by the embodiment of the present invention was tested with data from a public data set and shows a marked improvement over prior-art object detection model training methods. Public data from the PointPillars data set were selected for the test and divided into easy, moderate, and hard categories according to detection difficulty. AP (Average Precision) and mAP (mean Average Precision) were used to evaluate the test quality. The test results are shown in Table 1 below:
[Table 1: test results, reproduced only as images in the original publication.]
As can be seen from Table 1, the object detection model training method provided by the embodiment of the present invention, which determines the loss function value from the intersection-over-union ratio and then optimizes the object detection model according to that loss function value, noticeably improves object detection accuracy on the public data set compared with prior-art approaches that optimize the object detection model with other loss function values.
An embodiment of the present invention further provides an object detection model training apparatus, which has a structure shown in fig. 5, and includes:
a truth value module 51, configured to determine an object truth bounding box according to the labeling data of a detection target;
a detection module 52, configured to input the detection target into an object detection model and obtain an object detection bounding box according to detection data output by the object detection model;
an intersection-over-union calculation module 53, configured to determine the intersection-over-union ratio of the object truth bounding box and the object detection bounding box according to the object truth bounding box and the object detection bounding box;
a loss function calculation module 54, configured to determine a loss function value according to the intersection-over-union ratio;
and an optimization module 55, configured to optimize the object detection model according to the loss function value.
In one embodiment, in the case that there is an overlap between the object truth bounding box and the object detection bounding box, the loss function calculation module is configured to:
determining a loss function value according to the following formula:
L=1-IoU;
where L is the loss function value and IoU is the intersection-over-union ratio.
In one embodiment, in the case that there is no overlap between the object truth bounding box and the object detection bounding box, the loss function calculation module is configured to:
determining a loss function value according to the following formula:
L=1-GIoU;
wherein:
GIoU = IoU - (Area_C - Area_U) / Area_C;
where Area_C is the area of the smallest enclosing box containing both the object truth bounding box and the object detection bounding box, Area_U is the area of the union of the object truth bounding box and the object detection bounding box, and IoU is the intersection-over-union ratio.
In one embodiment, as shown in fig. 6, the optimization module comprises:
a back propagation calculation unit 61, configured to perform back propagation calculation according to the loss function value to obtain back-propagation gradient values;
and a reverse processing unit 62, configured to optimize parameters of the object detection model according to the back-propagation gradient values.
In one embodiment, the intersection-over-union calculation module comprises:
an intersection and union calculation unit, configured to determine the intersection and union of the object detection bounding box and the object truth bounding box;
and an intersection and union processing unit, configured to determine the intersection-over-union ratio according to the intersection and the union.
In one embodiment, any edge of the object truth bounding box is non-parallel to any edge of the object detection bounding box.
The functions of each module in each apparatus in the embodiments of the present invention may refer to the corresponding description in the above method, and are not described herein again.
Fig. 7 shows a block diagram of the structure of an apparatus according to an embodiment of the present invention. As shown in fig. 7, the apparatus includes a memory 910 and a processor 920, where the memory 910 stores a computer program operable on the processor 920. When executing the computer program, the processor 920 implements the object detection model training method in the above embodiments. There may be one or more memories 910 and one or more processors 920.
The apparatus/device/terminal/server further comprises:
and a communication interface 930 for communicating with an external device to perform interactive data transmission.
Memory 910 may include high-speed RAM memory, and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
If the memory 910, the processor 920, and the communication interface 930 are implemented independently, the memory 910, the processor 920, and the communication interface 930 may be connected to each other through a bus and communicate with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 7, but this does not mean there is only one bus or one type of bus.
Optionally, in an implementation, if the memory 910, the processor 920 and the communication interface 930 are integrated on a chip, the memory 910, the processor 920 and the communication interface 930 may complete communication with each other through an internal interface.
An embodiment of the present invention provides a computer-readable storage medium, which stores a computer program, and the computer program is used for implementing the method of any one of the above embodiments when being executed by a processor.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium. The storage medium may be a read-only memory, a magnetic or optical disk, or the like.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various changes or substitutions within the technical scope of the present invention, and these should be covered by the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (14)

1. An object detection model training method, comprising:
determining an object truth bounding box according to the labeling data of a detection target;
inputting the detection target into an object detection model, and obtaining an object detection bounding box according to detection data output by the object detection model;
determining the intersection-over-union ratio of the object truth bounding box and the object detection bounding box according to the object truth bounding box and the object detection bounding box;
determining a loss function value according to the intersection-over-union ratio;
and optimizing the object detection model according to the loss function value.
2. The method of claim 1, wherein, in a case where there is an overlap between the object truth bounding box and the object detection bounding box, determining the loss function value according to the intersection-over-union ratio comprises:
determining a loss function value according to the following formula:
L=1-IoU;
where L is the loss function value and IoU is the intersection-over-union ratio.
3. The method of claim 1, wherein, in a case where there is no overlap between the object truth bounding box and the object detection bounding box, determining the loss function value according to the intersection-over-union ratio comprises:
determining a loss function value according to the following formula:
L=1-GIoU;
wherein:
GIoU = IoU - (Area_C - Area_U) / Area_C;
where Area_C is the area of the smallest enclosing box containing both the object truth bounding box and the object detection bounding box, Area_U is the area of the union of the object truth bounding box and the object detection bounding box, and IoU is the intersection-over-union ratio.
4. The method of claim 1, wherein optimizing the object detection model according to the loss function value comprises:
performing back propagation calculation according to the loss function value to obtain back-propagation gradient values;
and optimizing parameters of the object detection model according to the back-propagation gradient values.
5. The method of claim 1, wherein determining the intersection-over-union ratio of the object truth bounding box and the object detection bounding box according to the object truth bounding box and the object detection bounding box comprises:
determining the intersection and union of the object detection bounding box and the object truth bounding box;
and determining the intersection-over-union ratio according to the intersection and the union.
6. The method according to any one of claims 1 to 5, wherein any edge of the object truth bounding box is non-parallel to any edge of the object detection bounding box.
7. An object detection model training device, comprising:
a truth value module, configured to determine an object truth bounding box according to the labeling data of a detection target;
a detection module, configured to input the detection target into an object detection model and obtain an object detection bounding box according to detection data output by the object detection model;
an intersection-over-union calculation module, configured to determine the intersection-over-union ratio of the object truth bounding box and the object detection bounding box according to the object truth bounding box and the object detection bounding box;
a loss function calculation module, configured to determine a loss function value according to the intersection-over-union ratio;
and an optimization module, configured to optimize the object detection model according to the loss function value.
8. The apparatus of claim 7, wherein, in a case where there is an overlap between the object truth bounding box and the object detection bounding box, the loss function calculation module is configured to:
determining a loss function value according to the following formula:
L=1-IoU;
where L is the loss function value and IoU is the intersection-over-union ratio.
9. The apparatus of claim 7, wherein, in a case where there is no overlap between the object truth bounding box and the object detection bounding box, the loss function calculation module is configured to:
determining a loss function value according to the following formula:
L=1-GIoU;
wherein:
GIoU = IoU - (Area_C - Area_U) / Area_C;
where Area_C is the area of the smallest enclosing box containing both the object truth bounding box and the object detection bounding box, Area_U is the area of the union of the object truth bounding box and the object detection bounding box, and IoU is the intersection-over-union ratio.
10. The apparatus of claim 7, wherein the optimization module comprises:
a back propagation calculation unit, configured to perform back propagation calculation according to the loss function value to obtain back-propagation gradient values;
and a reverse processing unit, configured to optimize parameters of the object detection model according to the back-propagation gradient values.
11. The apparatus of claim 7, wherein the intersection-over-union calculation module comprises:
an intersection and union calculation unit, configured to determine the intersection and union of the object detection bounding box and the object truth bounding box;
and an intersection and union processing unit, configured to determine the intersection-over-union ratio according to the intersection and the union.
12. The apparatus according to any one of claims 7-11, wherein any edge of the object truth bounding box is non-parallel to any edge of the object detection bounding box.
13. An object detection model optimization apparatus, comprising:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-6.
14. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 6.
CN201910659672.6A 2019-07-19 2019-07-19 Object detection model training method and device Pending CN112241675A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910659672.6A CN112241675A (en) 2019-07-19 2019-07-19 Object detection model training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910659672.6A CN112241675A (en) 2019-07-19 2019-07-19 Object detection model training method and device

Publications (1)

Publication Number Publication Date
CN112241675A true CN112241675A (en) 2021-01-19

Family

ID=74168009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910659672.6A Pending CN112241675A (en) 2019-07-19 2019-07-19 Object detection model training method and device

Country Status (1)

Country Link
CN (1) CN112241675A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023037156A1 (en) * 2021-09-13 2023-03-16 Sensetime International Pte. Ltd. Data processing methods, apparatuses and systems, media and computer devices

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117831A (en) * 2018-09-30 2019-01-01 北京字节跳动网络技术有限公司 The training method and device of object detection network
CN109145931A (en) * 2018-09-03 2019-01-04 百度在线网络技术(北京)有限公司 object detecting method, device and storage medium
CN109271984A (en) * 2018-07-24 2019-01-25 广东工业大学 A kind of multi-faceted license plate locating method based on deep learning
CN109285180A (en) * 2018-08-31 2019-01-29 电子科技大学 A kind of road vehicle tracking of 3D
CN109472264A (en) * 2018-11-09 2019-03-15 北京字节跳动网络技术有限公司 Method and apparatus for generating object detection model
CN109872366A (en) * 2019-02-25 2019-06-11 清华大学 Object dimensional method for detecting position and device based on depth fitting degree assessment network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109271984A (en) * 2018-07-24 2019-01-25 广东工业大学 A kind of multi-faceted license plate locating method based on deep learning
CN109285180A (en) * 2018-08-31 2019-01-29 电子科技大学 A kind of road vehicle tracking of 3D
CN109145931A (en) * 2018-09-03 2019-01-04 百度在线网络技术(北京)有限公司 object detecting method, device and storage medium
CN109117831A (en) * 2018-09-30 2019-01-01 北京字节跳动网络技术有限公司 The training method and device of object detection network
CN109472264A (en) * 2018-11-09 2019-03-15 北京字节跳动网络技术有限公司 Method and apparatus for generating object detection model
CN109872366A (en) * 2019-02-25 2019-06-11 清华大学 Object dimensional method for detecting position and device based on depth fitting degree assessment network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAMID REZATOFIGHI et al.: "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression", ARXIV:1902.09630V1, pages 1 - 9 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023037156A1 (en) * 2021-09-13 2023-03-16 Sensetime International Pte. Ltd. Data processing methods, apparatuses and systems, media and computer devices

Similar Documents

Publication Publication Date Title
US10872439B2 (en) Method and device for verification
EP3620981B1 (en) Object detection method, device, apparatus and computer-readable storage medium
CN109214980B (en) Three-dimensional attitude estimation method, three-dimensional attitude estimation device, three-dimensional attitude estimation equipment and computer storage medium
CN108629231B (en) Obstacle detection method, apparatus, device and storage medium
US9135710B2 (en) Depth map stereo correspondence techniques
CN109946680B (en) External parameter calibration method and device of detection system, storage medium and calibration system
CN104025180B (en) There are five dimension rasterisations of conserved boundary
US20140152776A1 (en) Stereo Correspondence and Depth Sensors
WO2021051344A1 (en) Method and apparatus for determining lane lines in high-precision map
KR20170068462A (en) 3-Dimensional Model Generation Using Edges
CN110782517B (en) Point cloud labeling method and device, storage medium and electronic equipment
CN111145139A (en) Method, device and computer program for detecting 3D objects from 2D images
CN111179351B (en) Parameter calibration method and device and processing equipment thereof
CN114187589A (en) Target detection method, device, equipment and storage medium
CN112241675A (en) Object detection model training method and device
CN113759348A (en) Radar calibration method, device, equipment and storage medium
CN103837135A (en) Workpiece detecting method and system
CN112446374B (en) Method and device for determining target detection model
EP3605463B1 (en) Crossing point detector, camera calibration system, crossing point detection method, camera calibration method, and recording medium
Wiemann et al. An evaluation of open source surface reconstruction software for robotic applications
CN109583511B (en) Speed fusion method and device
CN114266879A (en) Three-dimensional data enhancement method, model training detection method, three-dimensional data enhancement equipment and automatic driving vehicle
CN113191279A (en) Data annotation method, device, equipment, storage medium and computer program product
EP3467764A1 (en) Image processing method and image processing apparatus
CN113129437B (en) Method and device for determining space coordinates of markers

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210119