CN112053374A - 3D target bounding box estimation system based on GIoU - Google Patents

3D target bounding box estimation system based on GIoU

Info

Publication number
CN112053374A
Authority
CN
China
Prior art keywords
giou
bounding box
max
point cloud
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010805891.3A
Other languages
Chinese (zh)
Inventor
杨武
孟涟肖
唐盖盖
苘大鹏
吕继光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University filed Critical Harbin Engineering University
Priority to CN202010805891.3A priority Critical patent/CN112053374A/en
Publication of CN112053374A publication Critical patent/CN112053374A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10032Satellite or aerial image; Remote sensing
    • G06T2207/10044Radar image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of computer vision and pattern recognition, and particularly relates to a GIoU-based 3D target bounding box estimation system comprising a radar point cloud preprocessing module, a 2D image preprocessing module and a GIoU-based multi-source fusion module. Point cloud features are obtained through the radar point cloud preprocessing module, image features are obtained through the 2D image preprocessing module, the point cloud features and the image features are fused by the GIoU-based multi-source fusion module, and finally the estimation result of the 3D target bounding box is output. The invention solves the problem of low estimation accuracy of existing 3D target bounding boxes, can obviously improve the calibration accuracy of 3D targets, and achieves high-accuracy 3D target bounding box estimation.

Description

3D target bounding box estimation system based on GIoU
Technical Field
The invention belongs to the technical field of computer vision and pattern recognition, and particularly relates to a 3D target bounding box estimation system based on a GIoU.
Background
In recent years, unmanned driving has received wide attention from enterprises, researchers and the general public. At present there are two distinct routes to achieving it. One is the progressive approach adopted by traditional automotive enterprises: starting from existing driver-assistance systems, functions such as automatic steering and active collision avoidance are gradually added to realize conditional autonomy, and full unmanned driving is reached once cost and the related technologies meet the necessary requirements. The other is the "one-step" approach chosen mainly by high-tech IT enterprises, which aims directly at the final goal of unmanned driving: autonomy is not reached through human-machine cooperation, and human participation cannot be relied upon to guarantee the absolute safety of automatic driving. The latter route is more challenging and riskier, and therefore needs innovative algorithms and efficient, robust systems to support it. Against this need, object detection and localization are particularly important, because they determine whether an intelligent unmanned system can accurately "see" the scene in front of it and provide a large amount of useful information for decision making or planning. 3D object detection is an important topic in automatic driving and robotics, and detection accuracy is the current difficulty of 3D object detection technology.
Bounding box regression is one of the most fundamental components in many 2D/3D computer vision tasks: target localization, multi-target detection, target tracking and so on all rely on it. In recent years neural network technology has flourished, and its strong nonlinear fitting capability is well suited to the bounding box regression problem. The main trend for improving application performance with neural networks is to propose a better backbone or better strategies for extracting reliable local features. Although the latest convolutional neural networks have realized 2D target detection in complex environments, in practical application scenarios ordinary 2D target detection cannot provide all the information required for sensing the environment: it only gives the position of a target object in a 2D picture and the confidence of the corresponding category, whereas in the real three-dimensional world objects have three-dimensional shapes, and most applications require information such as the length, width, height and deflection angle of the target object. In recent years researchers have proposed several 3D target bounding box estimation methods, but the accuracy of their results is low because the loss functions of the neural network models they adopt are not defined precisely enough, so effectively improving 3D target bounding box estimation accuracy remains an open challenge.
Disclosure of Invention
The invention aims to solve the problem of low estimation accuracy of existing 3D target bounding boxes, and provides a 3D target bounding box estimation system based on GIoU (Generalized Intersection over Union).
The purpose of the invention is realized by the following technical scheme: the system comprises a radar point cloud preprocessing module, a 2D image preprocessing module and a GIoU-based multi-source fusion module. The radar point cloud preprocessing module converts input radar point cloud data into a digital feature representation of fixed dimensionality and transmits the digital features of the radar point cloud data to the GIoU-based multi-source fusion module. The 2D image preprocessing module converts input 2D image data into a digital feature representation of fixed dimensionality and transmits the digital features of the 2D image data to the GIoU-based multi-source fusion module. The GIoU-based multi-source fusion module fuses the digital features of the radar point cloud data and the digital features of the 2D image data into a 3D target bounding box estimation result, namely the coordinates of the 8 vertices of the predicted 3D bounding box, B_P = (x_1^P, y_1^P, z_1^P, …, x_8^P, y_8^P, z_8^P);
The GIoU-based multi-source fusion module is a Dense neural network model that uses GIoU LOSS as its loss function; GIoU LOSS is calculated as follows:
Step 1: input the coordinates of the 8 vertices of the real 3D bounding box, B_T = (x_1^T, y_1^T, z_1^T, …, x_8^T, y_8^T, z_8^T);
Step 2: calculate the length L_T, width W_T and height H_T of the real 3D bounding box, and the length L_P, width W_P and height H_P of the predicted 3D bounding box;
Step 3: of the real 3D bounding box and the predicted 3D bounding box, take the one whose center point is closer to the origin and obtain the coordinates of its upper-right vertex, MAX = (x_MAX, y_MAX, z_MAX); obtain the coordinates of the lower-left vertex of the 3D bounding box whose center point is farther from the origin, MIN = (x_MIN, y_MIN, z_MIN);
Step 4: over all vertex coordinates of the real 3D bounding box and the predicted 3D bounding box, take the minimum x, y, z values x_MIN, y_MIN, z_MIN and the maximum x, y, z values x_MAX, y_MAX, z_MAX;
Step 5: calculate the length L_C, width W_C and height H_C of the smallest bounding box B_C that can enclose both the real 3D bounding box and the predicted 3D bounding box:
L_C = x_MAX - x_MIN
W_C = y_MAX - y_MIN
H_C = z_MAX - z_MIN
Step 6: calculate the value of GIoU:
IoU = V_I / (V_T + V_P - V_I)
GIoU = IoU - (V_C - (V_T + V_P - V_I)) / V_C
where V_I is the intersection volume of the real and predicted 3D bounding boxes, computed from the MAX and MIN vertices obtained in step 3; V_T is the volume of the real 3D bounding box, V_T = L_T * W_T * H_T; V_P is the volume of the predicted 3D bounding box, V_P = L_P * W_P * H_P; and V_C is the volume of B_C, V_C = L_C * W_C * H_C;
Step 7: calculate the value of the loss function GIoU LOSS:
GIoU LOSS = 1 - GIoU.
The present invention may further comprise:
The Dense neural network model comprises a three-layer structure. The first layer of the Dense neural network model is an input layer, in which the number of neurons equals the dimension of the input features; each neuron corresponds in turn to one dimension of the input vector and passes it directly to the neurons of the second layer, the input features comprising the digital features of the radar point cloud data and the digital features of the 2D image data. The second layer of the Dense neural network model is a Dense layer, which comprises a stack of several Dense (fully connected) layers and realizes the mapping from the input variables to the output variables. The third layer of the Dense neural network model is an output layer, which corresponds to the regression values of the 3D bounding box.
The radar point cloud preprocessing module is a PointNet neural network model; the PointNet network adopts a symmetric function (max-pooling) to achieve permutation invariance over the unordered three-dimensional point set. The 2D image preprocessing module is a Resnet50 neural network model, which learns the residual representation between input and output by using multiple parameter layers.
The invention has the beneficial effects that:
Point cloud features are obtained through the radar point cloud preprocessing module, image features are obtained through the 2D image preprocessing module, the point cloud features and the image features are fused by the GIoU-based multi-source fusion module, and finally the estimation result of the 3D target bounding box is output. The invention solves the problem of low estimation accuracy of existing 3D target bounding boxes, can obviously improve the calibration accuracy of 3D targets, and achieves high-accuracy 3D target bounding box estimation.
Drawings
Fig. 1 is a general framework schematic of the present invention.
FIG. 2 is a schematic diagram of a GIoU-based multi-source fusion module structure according to the present invention.
Fig. 3 is a general operational flow diagram of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention discloses a GIoU-based 3D target bounding box estimation system comprising a radar point cloud preprocessing module, a 2D image preprocessing module and a GIoU-based multi-source fusion module. Point cloud features are obtained through the radar point cloud preprocessing module and image features through the 2D image preprocessing module; the point cloud features and the image features are then fused by the GIoU-based multi-source fusion module, and finally the estimation result of the 3D target bounding box is output. The invention solves the problem of low estimation accuracy of existing 3D target bounding boxes, can obviously improve the calibration accuracy of 3D targets, and achieves high-accuracy 3D target bounding box estimation.
A GIoU-based 3D target bounding box estimation system comprises a radar point cloud preprocessing module, a 2D image preprocessing module and a GIoU-based multi-source fusion module. The radar point cloud preprocessing module converts input radar point cloud data into a digital feature representation of fixed dimensionality and transmits the digital features of the radar point cloud data to the GIoU-based multi-source fusion module. The 2D image preprocessing module converts input 2D image data into a digital feature representation of fixed dimensionality and transmits the digital features of the 2D image data to the GIoU-based multi-source fusion module. The GIoU-based multi-source fusion module fuses the digital features of the radar point cloud data and the digital features of the 2D image data into a 3D target bounding box estimation result, namely the coordinates of the 8 vertices of the predicted 3D bounding box, B_P = (x_1^P, y_1^P, z_1^P, …, x_8^P, y_8^P, z_8^P);
The multi-source fusion module based on the GIoU is a Dense neural network model, and the GIoU LOSS is used as a LOSS function; the GIoU LOSS calculation method comprises the following steps:
step 1: inputting 8 vertex coordinates B of a real 3D bounding boxT=(xT 1,yT 1,zT 1,…,xT 8,yT 8,zT 8);
Step 2: calculating the length L of the real 3D bounding boxTWidth WTAnd height HT(ii) a Calculating the predicted 3D bounding box length LPWidth WPAnd height HP
And step 3: selecting a real 3D boundary frame and a 3D boundary frame with a center point closer to an original point from the predicted 3D boundary frame, and acquiring a vertex coordinate MAX (x) of the upper right corner of the 3D boundary frameMAX,yMAX,zMAX) And acquiring the coordinates MIN (x) of the vertex at the lower left corner of the 3D bounding box with the central point far away from the originMIN,yMIN,zMIN);
And 4, step 4: selecting the minimum x, y and z values x in the coordinates of all the vertexes of the real 3D bounding box and the predicted 3D bounding boxMIN、yMIN、zMINAnd the maximum x, y, Z values xMAX、yMAX、ZMAX
And 5: calculating a minimum bounding box B that can enclose the true 3D bounding box and the predicted 3D bounding boxcLength L ofCWidth WCAnd height HC
LC=xMAX-xMIN
WC=yMAX-yMIN
HC=zMAX-ZNIN
Step 6: calculating the value of the GIoU;
Figure BDA0002629104380000041
Figure BDA0002629104380000042
wherein, VTVolume of true 3D bounding box, VT=LT*WT*HT;VPFor the predicted volume of the 3D bounding box, VP=LP*WP*HP;VcIs BcVolume of (V)c=Lc*Wc*Hc
And 7: calculating the value of the LOSS function GIoU LOSS;
GIoU LOSS=1-GIoU。
Example 1:
The technical scheme of the invention comprises the following steps.
With reference to Fig. 3, the overall process of the invention is as follows:
First, the GIoU-based 3D target bounding box estimation device is built. The system consists of a radar point cloud preprocessing module, a 2D image preprocessing module and a GIoU-based multi-source fusion module. The radar point cloud preprocessing module is a PointNet neural network model, the 2D image preprocessing module is a Resnet50 neural network model, and the GIoU-based multi-source fusion module is a Dense neural network model.
The radar point cloud preprocessing module can convert point cloud data into a digital feature representation of fixed dimensions.
The 2D image preprocessing module can convert 2D image data into a fixed-dimension digital feature representation.
The GIoU-based multi-source fusion module can fuse the digital features of the point cloud data with the digital features of the 2D image data, and finally output the 3D target bounding box estimate.
Preferably, the PointNet network of the radar point cloud preprocessing module adopts a symmetric function (max-pooling) to achieve permutation invariance over the unordered three-dimensional point set, which enables high-precision fusion of the point cloud data.
Preferably, the Resnet50 network of the 2D image preprocessing module learns the residual representation between input and output by using a number of parameter layers, instead of directly trying to learn the input-to-output mapping with those layers as a general CNN does. Learning the residual in this way converges faster and is more effective than directly learning the mapping between input and output.
Preferably, the Dense network of the GIoU-based multi-source fusion module uses GIoU as its loss function; compared with the conventional mean squared error or mean absolute error loss, this guides the network more accurately, during training, in the direction that improves the 3D target bounding box estimation accuracy.
Second, the GIoU-based 3D target bounding box estimation device is initialized. The method comprises the following steps:
1) Radar point cloud data are input into the radar point cloud preprocessing module to obtain the digital feature representation of the point cloud.
2) The 2D image data are input into the 2D image preprocessing module to obtain the digital feature representation of the 2D image.
3) The digital feature representation of the point cloud and the digital feature representation of the 2D image are combined and input into the GIoU-based multi-source fusion module.
Third, the GIoU-based 3D target bounding box estimation device receives the source data file. The method comprises the following steps: the corresponding digital feature representation of a sample for 3D target detection is obtained through steps 1) and 2) of the second stage, and this representation is then input into the trained GIoU-based multi-source fusion module, whose output is the 3D target bounding box estimation result.
Fourth, one 3D target bounding box estimation is completed.
With reference to fig. 1, the estimation framework of the GIoU-based 3D target object bounding box according to the present invention includes a radar point cloud preprocessing module, a 2D image preprocessing module, and a GIoU-based multi-source fusion module.
The radar point cloud preprocessing module is a PointNet neural network model that can convert point cloud data into a digital feature representation of fixed dimensionality. The PointNet network adopts a symmetric function (max-pooling) to achieve permutation invariance over the unordered three-dimensional point set, enabling high-precision fusion of the point cloud data.
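The following minimal sketch (a PyTorch illustration under the assumption that per-point features have already been extracted, not the patent's PointNet code; sizes are illustrative) shows why a symmetric max-pooling function is insensitive to the ordering of the points:

import torch

# 1024 per-point feature vectors of dimension 64 (sizes are illustrative)
points = torch.rand(1024, 64)
shuffled = points[torch.randperm(points.shape[0])]   # same set, different order

# symmetric max-pooling over the point dimension gives one global descriptor
global_feat = points.max(dim=0).values
assert torch.equal(global_feat, shuffled.max(dim=0).values)  # order does not matter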
The 2D image preprocessing module is a Resnet50 neural network model that can convert 2D image data into a fixed-dimension digital feature representation. It learns residual representations between input and output by using multiple parameter layers, instead of directly attempting to learn the input-to-output mapping with those layers as a general CNN does. Learning the residual in this way converges faster and is more effective than directly learning the mapping between input and output.
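As a minimal sketch of this residual idea (a simplified basic block in PyTorch, not Resnet50's actual bottleneck design; the channel count is an illustrative assumption), the parameter layers learn only the residual F(x) and the block outputs F(x) + x:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Parameter layers learn the residual F(x); the block outputs F(x) + x."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.F = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # identity shortcut plus the learned residual
        return torch.relu(self.F(x) + x)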
The GIoU-based multi-source fusion module is a Dense neural network model that can fuse the digital features of the point cloud data with the digital features of the 2D image data and finally output the 3D target bounding box estimate. In particular, the Dense network takes GIoU as its loss function; compared with the conventional mean squared error or mean absolute error loss, this guides the network more accurately, during training, in the direction that improves the 3D target bounding box estimation accuracy.
Referring to fig. 2, the GIoU-based multi-source fusion module of the present invention includes a three-layer structure, in which,
The first layer is an input layer, in which the number of neurons equals the dimension of the input features; each neuron corresponds in turn to one dimension of the input vector and passes it directly to the neurons of the second layer. The input features include the digital features of the point cloud and the digital features of the 2D image.
The second layer is a Dense layer, comprising a stack of several Dense (fully connected) layers, and realizes the mapping from the input variables to the output variables;
The third layer is an output layer, which corresponds to the regression values of the 3D bounding box, specifically the center point coordinates and the length, width and height of the 3D bounding box.
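A minimal sketch of this three-layer structure, assuming PyTorch; the feature dimensions, hidden width and the 24-value (8-vertex) output are illustrative assumptions, and the center-plus-length/width/height parameterization mentioned above could be regressed instead:

import torch
import torch.nn as nn

class GIoUFusionHead(nn.Module):
    def __init__(self, point_feat_dim: int = 1024, image_feat_dim: int = 2048, hidden: int = 512):
        super().__init__()
        in_dim = point_feat_dim + image_feat_dim        # input layer: one neuron per feature dimension
        self.dense = nn.Sequential(                     # second layer: a stack of Dense (fully connected) layers
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.out = nn.Linear(hidden, 24)                # output layer: 8 vertices x (x, y, z)

    def forward(self, point_feat: torch.Tensor, image_feat: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([point_feat, image_feat], dim=-1)
        return self.out(self.dense(fused)).view(-1, 8, 3)

During training, the 8 predicted vertices would be compared against the real vertices with the GIoU LOSS, a sketch of which follows the pseudo code below.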
In the model training process, the GIoU loss is calculated after the output layer at every iteration, which guides the network more accurately in the direction that improves the 3D target bounding box estimation precision.
The pseudo code to compute GIoU is as follows:
1) The coordinates of the 8 vertices of the two 3D bounding boxes are known: B_T = (x_1^T, y_1^T, z_1^T, …, x_8^T, y_8^T, z_8^T) and B_P = (x_1^P, y_1^P, z_1^P, …, x_8^P, y_8^P, z_8^P), where B_T denotes the real bounding box and B_P the predicted bounding box.
2) Calculate the length, width and height of the two bounding boxes, obtaining L_T, W_T, H_T and L_P, W_P, H_P.
3) Calculate the volumes of the two bounding boxes, obtaining V_T = L_T * W_T * H_T and V_P = L_P * W_P * H_P.
4) Of B_T and B_P, take the upper-right vertex coordinates MAX = (x_MAX, y_MAX, z_MAX) of the bounding box whose center point is closer to the origin, and the lower-left vertex coordinates MIN = (x_MIN, y_MIN, z_MIN) of the other bounding box.
5) Calculate the differences between the corresponding coordinates of MIN and MAX: X_I = x_MIN - x_MAX, Y_I = y_MIN - y_MAX, Z_I = z_MIN - z_MAX.
6) Calculate the intersection of B_T and B_P, V_I = X_I * Y_I * Z_I. If V_I ≤ 0, there is no intersection; let V_I = 0.
7) Over all vertex coordinates of B_T and B_P, take the minimum x, y, z values x_MIN, y_MIN, z_MIN and the maximum x, y, z values x_MAX, y_MAX, z_MAX.
8) Calculate the length, width and height of the smallest bounding box B_C that can enclose B_T and B_P: L_C = x_MAX - x_MIN, W_C = y_MAX - y_MIN, H_C = z_MAX - z_MIN; the volume of B_C is V_C = L_C * W_C * H_C.
9) Calculate
GIoU = V_I / (V_T + V_P - V_I) - (V_C - (V_T + V_P - V_I)) / V_C
The value range of GIoU is [-1, 1].
10) Calculate GIoU LOSS: GIoU LOSS = 1 - GIoU. The value range of GIoU LOSS is [0, 2].
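A minimal runnable sketch of this pseudo code, assuming PyTorch; it treats both boxes as axis-aligned (consistent with the coordinate-wise operations above) and clamps the overlap on each axis at zero, a slight strengthening of the single V_I ≤ 0 check in step 6; the function name and tensor layout are illustrative:

import torch

def giou_loss_3d(B_T: torch.Tensor, B_P: torch.Tensor) -> torch.Tensor:
    """B_T, B_P: (..., 8, 3) tensors holding the 8 vertex coordinates of each box."""
    t_min, t_max = B_T.min(dim=-2).values, B_T.max(dim=-2).values
    p_min, p_max = B_P.min(dim=-2).values, B_P.max(dim=-2).values

    V_T = (t_max - t_min).prod(dim=-1)          # volume of the real box
    V_P = (p_max - p_min).prod(dim=-1)          # volume of the predicted box

    # intersection volume, with the overlap on each axis clamped at zero
    overlap = (torch.minimum(t_max, p_max) - torch.maximum(t_min, p_min)).clamp(min=0)
    V_I = overlap.prod(dim=-1)

    # smallest enclosing box B_C and its volume
    V_C = (torch.maximum(t_max, p_max) - torch.minimum(t_min, p_min)).prod(dim=-1)

    union = V_T + V_P - V_I
    giou = V_I / union - (V_C - union) / V_C    # value range [-1, 1]
    return (1.0 - giou).mean()                  # GIoU LOSS, value range [0, 2]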
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (3)

1. A GIoU-based 3D object bounding box estimation system, characterized in that: the system comprises a radar point cloud preprocessing module, a 2D image preprocessing module and a GIoU-based multi-source fusion module; the radar point cloud preprocessing module converts input radar point cloud data into a digital feature representation of fixed dimensionality and transmits the digital features of the radar point cloud data to the GIoU-based multi-source fusion module; the 2D image preprocessing module converts input 2D image data into a digital feature representation of fixed dimensionality and transmits the digital features of the 2D image data to the GIoU-based multi-source fusion module; the GIoU-based multi-source fusion module fuses the digital features of the radar point cloud data and the digital features of the 2D image data into a 3D target bounding box estimation result, namely the coordinates of the 8 vertices of the predicted 3D bounding box, B_P = (x_1^P, y_1^P, z_1^P, …, x_8^P, y_8^P, z_8^P);
The multi-source fusion module based on the GIoU is a Dense neural network model, and the GIoU LOSS is used as a LOSS function; the GIoU LOSS calculation method comprises the following steps:
step 1: inputting 8 vertex coordinates B of a real 3D bounding boxT=(xT 1,yT 1,zT 1,…,xT 8,yT 8,zT 8);
Step 2: calculating the length L of the real 3D bounding boxTWidth WTAnd height HT(ii) a Calculating the predicted 3D bounding box length LPWidth WPAnd height HP
And step 3: selecting a real 3D boundary frame and a 3D boundary frame with a center point closer to an original point from the predicted 3D boundary frame, and acquiring a vertex coordinate MAX (x) of the upper right corner of the 3D boundary frameMAX,yMAX,zMAX) And acquiring the coordinates MIN (x) of the vertex at the lower left corner of the 3D bounding box with the central point far away from the originMIN,yMIN,zMIN);
And 4, step 4: selecting the minimum x, y and z values x in the coordinates of all the vertexes of the real 3D bounding box and the predicted 3D bounding boxMIN、yMIN、zMINAnd the maximum x, y, z values xMAX、yMAX、zMAX
And 5: calculating a minimum bounding box B that can enclose the true 3D bounding box and the predicted 3D bounding boxcLength L ofCWidth WCAnd height HC
LC=xMAX-xMIN
WC=yMAX-yMIN
HC=zMAX-zMIN
Step 6: calculating the value of the GIoU;
Figure FDA0002629104370000011
Figure FDA0002629104370000012
wherein, VTVolume of true 3D bounding box, VT=LT*WT*HT;VPFor the predicted volume of the 3D bounding box, VP=LP*WP*HP;VcIs BcVolume of (V)c=Lc*Wc*Hc
And 7: calculating the value of the LOSS function GloU LOSS;
GIoU LOSS=1-GIoU。
2. The GIoU-based 3D object bounding box estimation system as claimed in claim 1, wherein: the Dense neural network model comprises a three-layer structure; the first layer of the Dense neural network model is an input layer, in which the number of neurons equals the dimension of the input features, each neuron corresponding in turn to one dimension of the input vector and passing it directly to the neurons of the second layer, the input features comprising the digital features of the radar point cloud data and the digital features of the 2D image data; the second layer of the Dense neural network model is a Dense layer, which comprises a stack of several Dense (fully connected) layers and realizes the mapping from the input variables to the output variables; the third layer of the Dense neural network model is an output layer, which corresponds to the regression values of the 3D bounding box.
3. The GIoU-based 3D object bounding box estimation system as claimed in claim 1 or 2, wherein: the radar point cloud preprocessing module is a PointNet neural network model, and the PointNet network adopts a symmetric function (max-pooling) to achieve permutation invariance over the unordered three-dimensional point set; the 2D image preprocessing module is a Resnet50 neural network model that learns the residual representation between input and output by using multiple parameter layers.
CN202010805891.3A 2020-08-12 2020-08-12 3D target bounding box estimation system based on GIoU Pending CN112053374A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010805891.3A CN112053374A (en) 2020-08-12 2020-08-12 3D target bounding box estimation system based on GIoU

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010805891.3A CN112053374A (en) 2020-08-12 2020-08-12 3D target bounding box estimation system based on GIoU

Publications (1)

Publication Number Publication Date
CN112053374A true CN112053374A (en) 2020-12-08

Family

ID=73601727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010805891.3A Pending CN112053374A (en) 2020-08-12 2020-08-12 3D target bounding box estimation system based on GIoU

Country Status (1)

Country Link
CN (1) CN112053374A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190147245A1 (en) * 2017-11-14 2019-05-16 Nuro, Inc. Three-dimensional object detection for autonomous robotic systems using image proposals
CN108171217A (en) * 2018-01-29 2018-06-15 深圳市唯特视科技有限公司 A kind of three-dimension object detection method based on converged network
US20200160559A1 (en) * 2018-11-16 2020-05-21 Uatc, Llc Multi-Task Multi-Sensor Fusion for Three-Dimensional Object Detection
CN111027401A (en) * 2019-11-15 2020-04-17 电子科技大学 End-to-end target detection method with integration of camera and laser radar
CN110988912A (en) * 2019-12-06 2020-04-10 中国科学院自动化研究所 Road target and distance detection method, system and device for automatic driving vehicle
CN111242041A (en) * 2020-01-15 2020-06-05 江苏大学 Laser radar three-dimensional target rapid detection method based on pseudo-image technology
CN111339880A (en) * 2020-02-19 2020-06-26 北京市商汤科技开发有限公司 Target detection method and device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JUN XU et al.: "3D-GIoU: 3D Generalized Intersection over Union for Object Detection in Point Cloud", Sensors *
张爱武 (ZHANG Aiwu) et al.: "Multi-feature convolutional neural network semantic segmentation method for road 3D point clouds" (道路三维点云多特征卷积神经网络语义分割方法), Chinese Journal of Lasers (中国激光) *

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201208

RJ01 Rejection of invention patent application after publication