CN115546295A - Training method for a target 6D pose estimation model and target 6D pose estimation method - Google Patents
- Publication number
- CN115546295A (application CN202211030694.4A)
- Authority
- CN
- China
- Prior art keywords
- key point
- coordinate
- training
- target
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000012549 training Methods 0.000 title claims abstract description 220
- 238000000034 method Methods 0.000 title claims abstract description 47
- 238000013508 migration Methods 0.000 claims abstract description 78
- 230000005012 migration Effects 0.000 claims abstract description 78
- 238000010586 diagram Methods 0.000 claims description 165
- 230000006870 function Effects 0.000 claims description 115
- 230000003042 antagonistic effect Effects 0.000 claims description 30
- 230000000694 effects Effects 0.000 abstract description 5
- 238000012546 transfer Methods 0.000 abstract description 3
- 230000009286 beneficial effect Effects 0.000 abstract description 2
- 238000012545 processing Methods 0.000 description 3
- 239000002131 composite material Substances 0.000 description 2
- 238000006467 substitution reaction Methods 0.000 description 2
- 230000003190 augmentative effect Effects 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000006073 displacement reaction Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/75—Determining position or orientation of objects or cameras using feature-based methods involving models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/006—Mixed reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computer Graphics (AREA)
- Computer Hardware Design (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
The application relates to a training method for a target 6D pose estimation model, comprising the following steps: training the target 6D pose estimation model on source-domain training images to update its parameters, obtaining an initially trained model; adding an adversarial regressor to the initially trained model to form a transfer training model; training the transfer training model on source-domain and target-domain training images to update its parameters, obtaining a trained transfer training model; and determining the final trained target 6D pose estimation model from the trained transfer training model. The training method has at least the following beneficial technical effect: by combining basic training with transfer training, and by using an adversarial regressor during transfer training, the trained target 6D pose estimation model estimates poses more accurately and reliably.
Description
Technical Field
The application relates to the field of image processing, and in particular to a training method for a target 6D pose estimation model and a target 6D pose estimation method.
Background
The purpose of target 6D pose estimation is to detect a target in a given image and estimate its pose; target 6D pose estimation is widely used in computer vision applications such as augmented reality, virtual reality, and autonomous driving. Existing target 6D pose estimation methods usually train the estimation model on a dataset with real labels; however, in practice, extensive training on a labeled dataset causes the model to overfit the training set, so its performance degrades in real application scenarios.
Disclosure of Invention
To overcome at least one defect of the prior art, embodiments of the present application provide a training method for a target 6D pose estimation model and a target 6D pose estimation method.
In a first aspect, an embodiment of the present application provides a method for training a target 6D pose estimation model, the method comprising the following steps:
training the target 6D pose estimation model on source-domain training images to update its parameters, obtaining an initially trained model, where the target 6D pose estimation model comprises a feature extractor, a feature regressor, and a scale regressor;
adding an adversarial regressor to the initially trained model to form a transfer training model;
training the transfer training model on source-domain and target-domain training images to update its parameters, obtaining a trained transfer training model;
and determining the final trained target 6D pose estimation model from the trained transfer training model.
In one embodiment, training the target 6D pose estimation model on the source-domain training images to update its parameters and obtain an initially trained model comprises:
inputting a source-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the scale regressor, respectively, to obtain a coordinate heatmap and a scale heatmap for each of a plurality of key points of the source-domain training image;
obtaining a first loss function from the coordinate heatmaps and scale heatmaps of the plurality of key points;
and updating the parameters of the feature extractor, the feature regressor, and the scale regressor based on the first loss function to obtain the initially trained model.
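As a rough illustration of this basic training stage, the sketch below builds a minimal model with a shared feature extractor and two heatmap heads, one output channel per key point. All layer sizes and module names are assumptions for illustration; the patent does not specify a network architecture:

```python
import torch
import torch.nn as nn

class PoseModel(nn.Module):
    """Shared feature extractor with a coordinate-heatmap head (feature
    regressor) and a scale-heatmap head (scale regressor)."""
    def __init__(self, num_keypoints=8, feat_ch=16):
        super().__init__()
        self.extractor = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU())
        self.feature_regressor = nn.Conv2d(feat_ch, num_keypoints, 1)
        self.scale_regressor = nn.Conv2d(feat_ch, num_keypoints, 1)

    def forward(self, img):
        f = self.extractor(img)                    # shared feature map
        return self.feature_regressor(f), self.scale_regressor(f)

model = PoseModel()
img = torch.randn(2, 3, 64, 64)                    # a batch of source-domain images
coord_maps, scale_maps = model(img)                # one heatmap per keypoint
print(coord_maps.shape)                            # torch.Size([2, 8, 64, 64])
```

A training step would then compute the first loss function from `coord_maps`, `scale_maps`, and the real heatmaps, and back-propagate it through all three modules.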
In one embodiment, obtaining the first loss function from the coordinate heatmaps and scale heatmaps of the plurality of key points comprises:
determining a keypoint coordinate loss function for each key point from the coordinate heatmap and the real heatmap of that key point;
accumulating the keypoint coordinate loss functions of all key points to obtain a keypoint coordinate loss accumulated value;
determining a scale factor for each key point from the scale heatmap and the real heatmap of that key point;
determining a keypoint scale factor loss function from the scale factor and the real scale factor of each key point;
and determining the first loss function from the keypoint coordinate loss accumulated value and the keypoint scale factor loss function.
In one embodiment, training the transfer training model on the source-domain and target-domain training images to update its parameters and obtain a trained transfer training model comprises:
inputting a source-domain training image into the transfer training model for training, and updating the parameters of the feature extractor, the scale regressor, the feature regressor, and the adversarial regressor to obtain a primary transfer training model;
inputting a target-domain training image into the primary transfer training model for training, and updating the parameters of the adversarial regressor to obtain a secondary transfer training model;
and inputting the target-domain training image into the secondary transfer training model for training, and updating the parameters of the feature extractor to obtain the trained transfer training model.
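The three stages above can be sketched with toy modules as follows. Only the pattern of which parameters are updated at each stage comes from the text; the module shapes and the placeholder losses (which stand in for the second, third, and fourth loss functions described later) are assumptions:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the four modules; real networks would be convolutional.
extractor = nn.Linear(4, 8)
feat_reg = nn.Linear(8, 2)
scale_reg = nn.Linear(8, 2)
adv_reg = nn.Linear(8, 2)   # same structure as feat_reg, different parameters

opt_all = torch.optim.SGD(
    [*extractor.parameters(), *feat_reg.parameters(),
     *scale_reg.parameters(), *adv_reg.parameters()], lr=0.1)
opt_adv = torch.optim.SGD(adv_reg.parameters(), lr=0.1)
opt_ext = torch.optim.SGD(extractor.parameters(), lr=0.1)

src, tgt = torch.randn(5, 4), torch.randn(5, 4)

# Stage 1: a source-domain batch updates all four modules (second loss).
opt_all.zero_grad()
f = extractor(src)
placeholder_loss2 = (feat_reg(f)**2 + scale_reg(f)**2 + adv_reg(f)**2).mean()
placeholder_loss2.backward()
opt_all.step()

# Stage 2: a target-domain batch updates only the adversarial regressor (third loss).
opt_adv.zero_grad()
f = extractor(tgt).detach()          # no gradient into the extractor here
placeholder_loss3 = (adv_reg(f)**2).mean()
placeholder_loss3.backward()
opt_adv.step()

# Stage 3: the target-domain batch updates only the feature extractor (fourth loss).
opt_ext.zero_grad()
f = extractor(tgt)
placeholder_loss4 = ((feat_reg(f) - adv_reg(f))**2).mean()
placeholder_loss4.backward()
opt_ext.step()                       # only the extractor's parameters are stepped
```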
In one embodiment, inputting the source-domain training image into the transfer training model for training, and updating the parameters of the feature extractor, the scale regressor, the feature regressor, and the adversarial regressor to obtain a primary transfer training model, comprises:
inputting the source-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor, the scale regressor, and the adversarial regressor, respectively, to obtain a coordinate heatmap, a scale heatmap, and an adversarial coordinate heatmap for each of a plurality of key points of the source-domain training image;
obtaining a second loss function from the coordinate heatmaps, scale heatmaps, and adversarial coordinate heatmaps of the plurality of key points;
and updating the parameters of the feature extractor, the feature regressor, the scale regressor, and the adversarial regressor based on the second loss function to obtain the primary transfer training model.
In one embodiment, obtaining the second loss function from the coordinate heatmaps, scale heatmaps, and adversarial coordinate heatmaps of the plurality of key points comprises:
determining a keypoint coordinate loss function for each key point from the coordinate heatmap and the real heatmap of that key point;
accumulating the keypoint coordinate loss functions of all key points to obtain a keypoint coordinate loss accumulated value;
determining a scale factor for each key point from the scale heatmap and the real heatmap of that key point;
determining a keypoint scale factor loss function from the scale factor and the real scale factor of each key point;
determining an adversarial keypoint coordinate loss function for each key point from the adversarial coordinate heatmap and the real heatmap of that key point;
accumulating the adversarial keypoint coordinate loss functions of all key points to obtain an adversarial keypoint coordinate loss accumulated value;
and determining the second loss function from the keypoint coordinate loss accumulated value, the keypoint scale factor loss function, and the adversarial keypoint coordinate loss accumulated value.
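Under the assumption that each per-heatmap term is a mean squared error (the exact per-term form is not given in this passage), the composition of the second loss might be sketched as:

```python
import numpy as np

def mse(a, b):
    # Assumed per-heatmap loss; the patent does not specify its exact form here.
    return float(np.mean((a - b) ** 2))

def second_loss(coord_maps, adv_maps, real_maps, scale_facs, real_scales,
                alpha=1.0, beta=1.0):
    """Accumulated coordinate loss + scale-factor loss + accumulated adversarial
    coordinate loss. The weights alpha and beta are illustrative assumptions."""
    loss_uv = sum(mse(c, r) for c, r in zip(coord_maps, real_maps))
    loss_s = float(np.mean((np.asarray(scale_facs) - np.asarray(real_scales)) ** 2))
    loss_adv = sum(mse(a, r) for a, r in zip(adv_maps, real_maps))
    return loss_uv + alpha * loss_s + beta * loss_adv
```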
In one embodiment, inputting the target-domain training image into the primary transfer training model for training, and updating the parameters of the adversarial regressor to obtain a secondary transfer training model, comprises:
inputting the target-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor, respectively, to obtain a coordinate heatmap and an adversarial coordinate heatmap for each of a plurality of key points of the target-domain training image;
obtaining a third loss function from the coordinate heatmaps and adversarial coordinate heatmaps of the plurality of key points;
and updating the parameters of the adversarial regressor based on the third loss function to obtain the secondary transfer training model.
In one embodiment, obtaining the third loss function from the coordinate heatmaps and adversarial coordinate heatmaps of the plurality of key points comprises:
for each key point, determining a combination map of the coordinate heatmaps of all key points other than the current key point;
determining a keypoint coordinate loss function for the current key point from the combination map of the current key point and the adversarial coordinate heatmap of the current key point;
and accumulating the keypoint coordinate loss functions of all key points to obtain the third loss function.
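A sketch of this third loss in NumPy. The way the other key points' heatmaps are combined (an element-wise maximum here) and the per-map distance (mean squared error here) are assumptions, since this passage does not specify them:

```python
import numpy as np

def third_loss(coord_maps, adv_maps):
    """coord_maps, adv_maps: arrays of shape (M, H, W), one heatmap per keypoint.
    For each keypoint, combine the coordinate heatmaps of the *other* keypoints
    and compare the combination with that keypoint's adversarial heatmap."""
    M = coord_maps.shape[0]
    total = 0.0
    for k in range(M):
        others = np.delete(coord_maps, k, axis=0)
        combined = others.max(axis=0)                    # assumed combination: max
        total += np.mean((adv_maps[k] - combined) ** 2)  # assumed distance: MSE
    return total
```

Intuitively, this trains the adversarial regressor toward the wrong key points' evidence, which is what makes its target-domain predictions deliberately inaccurate.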
In one embodiment, inputting the target-domain training image into the secondary transfer training model for training, and updating the parameters of the feature extractor to obtain the trained transfer training model, comprises:
inputting the target-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor, respectively, to obtain a coordinate heatmap and an adversarial coordinate heatmap for each of a plurality of key points of the target-domain training image;
obtaining a fourth loss function from the coordinate heatmaps and adversarial coordinate heatmaps of the plurality of key points;
and updating the parameters of the feature extractor based on the fourth loss function to obtain the trained transfer training model.
In a second aspect, an embodiment of the present application provides a target 6D pose estimation method, comprising:
inputting a target image into the feature extractor of a target 6D pose estimation model to obtain a feature map, where the target 6D pose estimation model comprises a feature extractor, a feature regressor, and a scale regressor;
inputting the feature map into the feature regressor and the scale regressor, respectively, to obtain a coordinate heatmap and a scale heatmap for each of a plurality of key points of the target image;
determining the keypoint coordinates of each key point from its coordinate heatmap;
calculating the scale factor of each key point from its coordinate heatmap and scale heatmap;
determining the three-dimensional coordinates of each key point of the target from the keypoint coordinates and the scale factor of each key point;
and obtaining the 6D pose of the target from the three-dimensional coordinates of each key point of the target and the three-dimensional coordinates of the key points of the target's three-dimensional model;
where the target 6D pose estimation model is obtained by the target 6D pose estimation model training method described above.
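The per-key-point steps above (a heatmap arg-max for the pixel coordinates, a weighted sum for the scale factor, then lifting to 3-D) might be sketched as follows. Treating the scale factor as a depth for pin-hole back-projection with camera intrinsics K is an assumption for illustration; the patent does not give the lifting formula in this passage:

```python
import numpy as np

def lift_keypoints(coord_maps, scale_maps, K):
    """coord_maps, scale_maps: (M, H, W) heatmaps; K: 3x3 camera intrinsics.
    Returns pixel coordinates (M, 2) and assumed back-projected 3-D points (M, 3)."""
    M, H, W = coord_maps.shape
    uv = np.zeros((M, 2))
    pts3d = np.zeros((M, 3))
    K_inv = np.linalg.inv(K)
    for k in range(M):
        v, u = np.unravel_index(coord_maps[k].argmax(), (H, W))  # heatmap peak
        uv[k] = (u, v)
        p = np.exp(coord_maps[k] - coord_maps[k].max())          # softmax weights
        p /= p.sum()
        s = float((p * scale_maps[k]).sum())                     # weighted scale factor
        pts3d[k] = s * (K_inv @ np.array([u, v, 1.0]))           # assumed back-projection
    return uv, pts3d
```

Given the lifted 3-D key points and the corresponding key points of the target's 3-D model, the 6D pose could then be recovered with a rigid 3D-3D alignment such as the Kabsch/Umeyama algorithm (again an assumption; this passage does not name the solver).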
Compared with the prior art, the method has the following beneficial effects. Basic training is combined with transfer training, and an adversarial regressor is used during transfer training: training on source-domain training images keeps all three regressors accurate in the source domain, while training on target-domain training images makes the adversarial regressor deliberately inaccurate in the target domain. This pushes the feature regressor's predictions as far as possible from the adversarial regressor's predictions, that is, toward correct predictions, so the trained target 6D pose estimation model estimates poses more accurately and reliably.
Drawings
The present application may be better understood by reference to the following description taken in conjunction with the accompanying drawings, which are incorporated in and form a part of this specification, along with the detailed description below. In the drawings:
FIG. 1 is a block flow diagram illustrating a method for training a target 6D pose estimation model according to an embodiment of the application;
FIG. 2 shows a block flow diagram of a target 6D pose estimation method according to an embodiment of the application.
Detailed Description
Exemplary embodiments of the present application will be described hereinafter with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual embodiment are described in the specification. It will of course be appreciated that in the development of any such actual embodiment, numerous implementation-specific decisions may be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Here, it should be further noted that, in order to avoid obscuring the present application with unnecessary details, only the device structure closely related to the solution according to the present application is shown in the drawings, and other details not so related to the present application are omitted.
It is to be understood that the application is not limited to the embodiments described with reference to the drawings. In this context, embodiments may be combined with each other, features may be replaced or borrowed between different embodiments, and one or more features may be omitted from an embodiment, where feasible.
Target 6D pose estimation refers to detecting a target in a given image and estimating its pose; a 6D pose has 6 degrees of freedom: 3 degrees of freedom of translation and 3 degrees of freedom of spatial rotation. The application provides a target 6D pose estimation method for RGB images. The target 6D pose is estimated with a target 6D pose estimation model, which is trained first; the training process comprises a basic training stage and a transfer training stage, and a model trained in this way can greatly improve the accuracy of target 6D pose estimation.
FIG. 1 shows a block flow diagram of a method for training a target 6D pose estimation model according to an embodiment of the application. The method starts with step S110: training the target 6D pose estimation model on source-domain training images to update its parameters and obtain an initially trained model; the target 6D pose estimation model comprises a feature extractor, a feature regressor, and a scale regressor. Here, the training images are taken from the Linemod dataset, which contains real images and synthetic images: a real image may be, for example, an image of the target captured with an image acquisition device such as a camera, and a synthetic image may be, for example, an image synthesized by computer software from a three-dimensional model of the target. The source-domain training images may be the synthetic images in the Linemod dataset. In this step, the entire target 6D pose estimation model is trained, i.e., the parameters of the feature extractor, the feature regressor, and the scale regressor are all updated. This step is the basic training stage.
Then, in step S120, an adversarial regressor is added to the initially trained model to form a transfer training model. The transfer training model thus comprises the feature extractor, feature regressor, and scale regressor of the initially trained model, plus the added adversarial regressor. The adversarial regressor has the same structure as the feature regressor; only their parameters differ.
Then, in step S130, the transfer training model is trained on source-domain and target-domain training images to update its parameters and obtain a trained transfer training model. Here, the target-domain training images may be the real images in the Linemod dataset.
Then, in step S140, the final trained target 6D pose estimation model is determined from the trained transfer training model. The trained transfer training model comprises a feature extractor, a feature regressor, a scale regressor, and an adversarial regressor; the final trained target 6D pose estimation model keeps only the feature extractor, the feature regressor, and the scale regressor of the trained transfer training model.
In this embodiment, the training of the target 6D pose estimation model combines basic training with transfer training, and uses an adversarial regressor during transfer training. Training on source-domain training images keeps all three regressors accurate in the source domain, while training on target-domain training images makes the adversarial regressor deliberately inaccurate in the target domain; this pushes the feature regressor's predictions away from the adversarial regressor's predictions, that is, toward correct predictions, so the trained target 6D pose estimation model estimates poses more accurately and reliably.
In one embodiment, training the target 6D pose estimation model on the source-domain training images to update its parameters and obtain an initially trained model comprises:
inputting the source-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the scale regressor, respectively, to obtain a coordinate heatmap and a scale heatmap for each of a plurality of key points of the source-domain training image. Here, the key points of the source-domain training image can be obtained from the Linemod dataset, and after the feature map is fed into the feature regressor and the scale regressor, a coordinate heatmap and a scale heatmap are obtained for each key point. In a coordinate heatmap, the value of an element reflects the likelihood that the current key point lies at that element's position, and the element with the largest value marks the position of the key point; in a scale heatmap, the value of an element gives the scale factor the current key point would have if it were located at that element's position.
obtaining a first loss function from the coordinate heatmaps and scale heatmaps of the plurality of key points. Here, the first loss function is used to train the model and update its parameters; a model trained with this first loss function predicts keypoint positions accurately.
and updating the parameters of the feature extractor, the feature regressor, and the scale regressor based on the first loss function to obtain the initially trained model.
In one embodiment, obtaining the first loss function from the coordinate heatmaps and scale heatmaps of the plurality of key points comprises:
Step S210: determine the keypoint coordinate loss function loss_uv′ for each key point from the coordinate heatmap and the real heatmap of that key point.
Here, the real heatmap of each key point can be obtained from the Linemod dataset.
The keypoint coordinate loss function loss_uv′ of each key point compares the two heatmaps element by element; written with the variables defined in this passage, a mean-squared-error form would be loss_uv′ = (1/N) Σᵢ (xᵢ − yᵢ)², where xᵢ is the value of the i-th element of the key point's coordinate heatmap, yᵢ is the value of the i-th element of its real heatmap, and N is the number of elements in the coordinate heatmap and the real heatmap.
Step S220: accumulate the keypoint coordinate loss functions of all key points to obtain the keypoint coordinate loss accumulated value loss_uv.
Step S230: determine the scale factor of each key point from the scale heatmap and the real heatmap of that key point.
In one implementation, the scale factor of each key point may be determined as follows:
determining a probability distribution map from the real heatmap of the key point, by applying softmax to the real heatmap to obtain a pixel-level probability distribution map;
and multiplying the value of each element of the key point's scale heatmap by the value of the corresponding element of the probability distribution map, and accumulating the products to obtain the scale factor of the key point.
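In NumPy, this softmax-weighted accumulation might look like the following minimal sketch (function and variable names are illustrative):

```python
import numpy as np

def scale_factor(real_map, scale_map):
    """Softmax the real heatmap into a pixel-level probability map, then
    accumulate the probability-weighted elements of the scale heatmap."""
    p = np.exp(real_map - real_map.max())   # numerically stable softmax
    p /= p.sum()
    return float((p * scale_map).sum())
```

When the real heatmap has a single dominant peak, the result is close to the scale-heatmap value at that peak.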
Next, determine the keypoint scale factor loss function loss_s from the scale factor of each key point and the real scale factor of each key point; the real scale factor of each key point can be obtained from the Linemod dataset.
Written with the variables defined in this passage, a squared-error form of the keypoint scale factor loss would be loss_s = (1/M) Σⱼ (sⱼ − ŝⱼ)², where sⱼ is the scale factor of the j-th key point, ŝⱼ is its real scale factor, and M is the number of key points.
Step S240: determine the first loss function loss1 from the keypoint coordinate loss accumulated value loss_uv and the keypoint scale factor loss function loss_s.
Here, the first loss function loss1 may be calculated as:
loss1 = α · loss_uv + loss_s
where α is a constant and may be set to 10.
In one embodiment, training the transfer training model on the source-domain and target-domain training images to update its parameters and obtain a trained transfer training model comprises:
Step S310: input the source-domain training image into the transfer training model for training, and update the parameters of the feature extractor, the scale regressor, the feature regressor, and the adversarial regressor to obtain a primary transfer training model. In this step, the parameters of the entire network are updated so that the predictions of the scale regressor Z, the feature regressor H, and the adversarial regressor H′ all incur as little loss as possible in the source domain; training the model on the source-domain training image ensures that the three regressors are accurate in the source domain.
In one implementation, step S310 may specifically comprise:
inputting the source-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor, the scale regressor, and the adversarial regressor, respectively, to obtain a coordinate heatmap, a scale heatmap, and an adversarial coordinate heatmap for each of a plurality of key points of the source-domain training image;
obtaining a second loss function from the coordinate heatmaps, scale heatmaps, and adversarial coordinate heatmaps of the plurality of key points;
and updating the parameters of the feature extractor, the feature regressor, the scale regressor, and the adversarial regressor based on the second loss function to obtain the primary transfer training model.
Step S320: input the target-domain training image into the primary transfer training model for training, and update the parameters of the adversarial regressor to obtain a secondary transfer training model. In this step the model is trained on target-domain training images and only the parameters of the adversarial regressor are updated, making the adversarial regressor deliberately inaccurate in the target domain so that the feature regressor's predictions are pushed as far as possible from the adversarial regressor's predictions, that is, toward correct predictions.
In one implementation, step S320 may specifically comprise:
inputting the target-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor, respectively, to obtain a coordinate heatmap and an adversarial coordinate heatmap for each of a plurality of key points of the target-domain training image;
obtaining a third loss function from the coordinate heatmaps and adversarial coordinate heatmaps of the plurality of key points;
and updating the parameters of the adversarial regressor based on the third loss function to obtain the secondary transfer training model.
Step S330, inputting the target domain training image into the secondary transfer training model for training, and updating the parameters of the feature extractor to obtain a trained transfer training model. In this step only the feature extractor is updated, so that the error deliberately introduced in step S320 stays confined to the adversarial regressor and does not corrupt the feature extractor.
In one implementation, step S330 may specifically include:
inputting the target domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor respectively, to obtain a coordinate heatmap and an adversarial coordinate heatmap corresponding to each of a plurality of key points of the target domain training image;
obtaining a fourth loss function according to the coordinate heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points;
and updating the parameters of the feature extractor based on the fourth loss function to obtain the trained transfer training model.
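The three phases above (steps S310, S320, S330) differ only in which parameter groups are updated. A minimal, framework-agnostic sketch of that update schedule — the `grad_step` stand-in and toy scalar parameters are illustrative assumptions, not part of the patent:

```python
# Illustrative sketch of the three-phase adversarial training schedule.
# Parameter groups: F = feature extractor, R = feature regressor,
# S = scale regressor, A = adversarial regressor.

def train_phases(params, grad_step):
    """Apply the per-phase update schedule of steps S310-S330.

    params    : dict mapping group name -> parameter value (toy scalars here)
    grad_step : callable(name, value) -> updated value (stands in for an
                optimizer step on that phase's loss)
    """
    # Phase 1 (S310): source domain, second loss updates F, R, S and A.
    for g in ("F", "R", "S", "A"):
        params[g] = grad_step(g, params[g])
    # Phase 2 (S320): target domain, third loss updates only A.
    params["A"] = grad_step("A", params["A"])
    # Phase 3 (S330): target domain, fourth loss updates only F.
    params["F"] = grad_step("F", params["F"])
    return params

params = {"F": 0.0, "R": 0.0, "S": 0.0, "A": 0.0}
updated = train_phases(params, lambda name, v: v + 1.0)
print(updated)  # F and A are stepped twice, R and S once
```

The point of the schedule is that the adversarial regressor is the only branch trained to be wrong on the target domain (phase 2), and the feature extractor is then the only component adapted to escape it (phase 3).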
In one embodiment, obtaining the second loss function according to the coordinate heatmaps, the scale heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points may include:
Step S410, determining a key point coordinate loss function loss_uv1′ corresponding to each key point according to the coordinate heatmap and the true heatmap corresponding to that key point.
Here, the key point coordinate loss function loss_uv1′ corresponding to each key point can be expressed by the following formula:
loss_uv1′ = (1/N) · Σ_{i=1}^{N} (x_1i − y_1i)²
where x_1i is the element value of the ith element in the coordinate heatmap corresponding to the key point, y_1i is the element value of the ith element in the true heatmap corresponding to the key point, and N is the number of elements in the coordinate heatmap and the true heatmap corresponding to the key point.
Step S420, accumulating the key point coordinate loss functions corresponding to all the key points to obtain a key point coordinate loss function accumulated value loss_uv1;
Step S430, determining a scale factor corresponding to each key point according to the scale heatmap and the true heatmap corresponding to that key point; the specific manner of determining the scale factor is the same as described above and is not repeated here;
Step S440, determining a key point scale factor loss function according to the scale factor corresponding to each key point and the true scale factor corresponding to each key point. The key point scale factor loss function loss_s1 can be determined by the following formula:
loss_s1 = (1/M) · Σ_{j=1}^{M} (s_1j − s_1j*)²
where s_1j is the scale factor corresponding to the jth key point, s_1j* is the true scale factor of the jth key point, and M is the number of key points.
Step S450, determining an adversarial key point coordinate loss function loss_uv1_adv′ corresponding to each key point according to the adversarial coordinate heatmap and the true heatmap corresponding to that key point. Here, the adversarial key point coordinate loss function loss_uv1_adv′ corresponding to each key point can be expressed by the following formula:
loss_uv1_adv′ = (1/N) · Σ_{i=1}^{N} (x_adv1i − y_1i)²
where x_adv1i is the element value of the ith element in the adversarial coordinate heatmap corresponding to the key point, y_1i is the element value of the ith element in the true heatmap corresponding to the key point, and N is the number of elements in the adversarial coordinate heatmap and the true heatmap.
Step S460, accumulating the adversarial key point coordinate loss functions corresponding to all the key points to obtain an adversarial key point coordinate loss function accumulated value loss_uv1_adv;
Step S470, determining the second loss function loss2 according to the key point coordinate loss function accumulated value loss_uv1, the key point scale factor loss function loss_s1, and the adversarial key point coordinate loss function accumulated value loss_uv1_adv.
Here, the second loss function loss2 may be determined by the following formula:
loss2 = loss_uv1 + β·loss_uv1_adv + γ·loss_s1
where β may be set to 1 and γ may be set to 20.
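As a toy illustration of how the per-key-point terms combine into loss2 — the mean-squared-error form of the heatmap losses is an assumption made for this sketch — a NumPy version might look like:

```python
import numpy as np

def mse(a, b):
    # Element-wise mean squared error between two heatmaps.
    return float(np.mean((a - b) ** 2))

def second_loss(coord_maps, scale_factors, adv_maps, true_maps,
                true_scales, beta=1.0, gamma=20.0):
    """loss2 = loss_uv1 + beta * loss_uv1_adv + gamma * loss_s1.

    coord_maps, adv_maps, true_maps : lists of per-key-point heatmaps
    scale_factors, true_scales      : per-key-point scalars
    """
    loss_uv1 = sum(mse(c, t) for c, t in zip(coord_maps, true_maps))
    loss_uv1_adv = sum(mse(a, t) for a, t in zip(adv_maps, true_maps))
    M = len(scale_factors)
    loss_s1 = sum((s - st) ** 2
                  for s, st in zip(scale_factors, true_scales)) / M
    return loss_uv1 + beta * loss_uv1_adv + gamma * loss_s1

# Toy example: two key points with 2x2 heatmaps.
t = [np.zeros((2, 2)), np.ones((2, 2))]
c = [np.zeros((2, 2)), np.ones((2, 2))]   # coordinate branch is perfect
a = [np.ones((2, 2)), np.ones((2, 2))]    # adversarial branch wrong on kp 0
print(second_loss(c, [1.0, 2.0], a, t, [1.0, 2.0]))  # 0 + 1*1.0 + 20*0 = 1.0
```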
In one embodiment, obtaining the third loss function according to the coordinate heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points includes:
for each key point, determining a combined map of the coordinate heatmaps corresponding to the key points other than the current key point;
determining a key point coordinate loss function corresponding to the current key point according to the combined map corresponding to the current key point and the adversarial coordinate heatmap corresponding to the current key point;
here, the key point coordinate loss function loss_uv2′ corresponding to each key point can be expressed by the following formula:
loss_uv2′ = (1/N) · Σ_{i=1}^{N} (x_adv2i − y_2i)²
where x_adv2i is the element value of the ith element in the adversarial coordinate heatmap corresponding to the key point, y_2i is the element value of the ith element in the combined map corresponding to the key point, and N is the number of elements in the adversarial coordinate heatmap and the combined map.
The key point coordinate loss functions corresponding to all the key points are accumulated to obtain the third loss function loss3.
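As a sketch of this "deliberately wrong target" trick — each adversarial heatmap is pulled toward a combination of the other key points' coordinate heatmaps — assuming an element-wise sum as the combination rule and a mean-squared-error distance (neither is fixed by the text above):

```python
import numpy as np

def third_loss(coord_maps, adv_maps):
    """Sum over key points of the MSE between each adversarial heatmap and
    the combination (here: element-wise sum) of the OTHER key points'
    coordinate heatmaps."""
    total = 0.0
    for k, adv in enumerate(adv_maps):
        others = [c for j, c in enumerate(coord_maps) if j != k]
        combined = np.sum(others, axis=0)   # combined map for key point k
        total += float(np.mean((adv - combined) ** 2))
    return total

c = [np.zeros((2, 2)), np.ones((2, 2))]
a = [np.ones((2, 2)), np.zeros((2, 2))]
print(third_loss(c, a))  # each adversarial map already matches the other
                         # key point's coordinate map, so the loss is 0.0
```

Minimizing this loss trains the adversarial regressor to predict the wrong key point locations on the target domain, which phase S330 then exploits.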
In one embodiment, obtaining the fourth loss function according to the coordinate heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points includes:
determining a key point coordinate loss function corresponding to each key point according to the coordinate heatmap and the adversarial coordinate heatmap corresponding to that key point;
here, the key point coordinate loss function loss_uv3′ corresponding to each key point can be expressed by the following formula:
loss_uv3′ = (1/N) · Σ_{i=1}^{N} (x_2i − y_adv2i)²
where x_2i is the element value of the ith element in the coordinate heatmap corresponding to the key point, y_adv2i is the element value of the ith element in the adversarial coordinate heatmap corresponding to the key point, and N is the number of elements in the coordinate heatmap and the adversarial coordinate heatmap.
The key point coordinate loss functions corresponding to all the key points are accumulated to obtain the fourth loss function loss4.
Fig. 2 shows a flow diagram of a target 6D pose estimation method according to an embodiment of the application. The target 6D pose estimation method comprises:
Step S510, inputting a target image into the feature extractor of a target 6D pose estimation model to obtain a feature map. Here, the target image may be an image of the target captured by an image acquisition device such as a camera. The target 6D pose estimation model is obtained with the training method of the above embodiments, and comprises a feature extractor, a feature regressor and a scale regressor.
Step S520, inputting the feature map into the feature regressor and the scale regressor of the target 6D pose estimation model respectively, to obtain a coordinate heatmap and a scale heatmap corresponding to each of a plurality of key points of the target image. Here, the key points of the target image are generally chosen at salient or marked points on the target surface, and may be determined according to the Linemod dataset.
Step S530, determining the key point coordinates (u, v) corresponding to each key point according to the coordinate heatmap corresponding to that key point. Here, the coordinates of the element with the largest element value in the coordinate heatmap corresponding to the key point may be taken as the key point coordinates.
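Picking the arg-max element as the key point coordinate can be sketched as follows (taking u as the column index and v as the row index is an assumption; the text does not fix the convention):

```python
import numpy as np

def keypoint_uv(coord_heatmap):
    # (u, v) = (column, row) of the largest element in the coordinate heatmap.
    v, u = np.unravel_index(np.argmax(coord_heatmap), coord_heatmap.shape)
    return int(u), int(v)

h = np.zeros((4, 5))
h[2, 3] = 1.0          # peak at row 2, column 3
print(keypoint_uv(h))  # (3, 2)
```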
Step S540, calculating a scale factor s corresponding to each key point according to the coordinate heatmap and the scale heatmap corresponding to that key point.
In one implementation, the scale factor corresponding to each key point may be determined as follows:
determining a probability distribution map according to the coordinate heatmap corresponding to the key point, i.e. applying softmax to the coordinate heatmap to obtain a pixel-level probability distribution map;
multiplying the element value of each element in the scale heatmap corresponding to the key point by the element value of the corresponding element in the probability distribution map, and accumulating the products to obtain the scale factor corresponding to the key point.
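This softmax-weighted accumulation is a soft expectation of the scale heatmap under the coordinate heatmap's probability map; a NumPy sketch:

```python
import numpy as np

def scale_factor(coord_heatmap, scale_heatmap):
    """softmax(coord_heatmap) gives a pixel-level probability map; the scale
    factor is the probability-weighted sum over the scale heatmap."""
    logits = coord_heatmap.ravel()
    p = np.exp(logits - logits.max())   # stable softmax
    p /= p.sum()                        # pixel-level probability distribution
    return float(np.sum(p * scale_heatmap.ravel()))

coord = np.full((3, 3), -100.0)
coord[1, 1] = 100.0                     # probability mass at the center pixel
scale = np.arange(9.0).reshape(3, 3)    # center element is 4.0
print(scale_factor(coord, scale))       # close to 4.0
```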
Step S550, determining the three-dimensional coordinates of each key point of the target according to the key point coordinates (u, v) and the scale factor s corresponding to each key point.
Here, the obtained (u, v) and s are substituted into the formula s·[u, v, 1]ᵀ = K·[x, y, z]ᵀ, which is solved for the three-dimensional coordinates (x, y, z) of the key point of the target in the camera coordinate system, where K is the camera intrinsic matrix.
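Assuming the pinhole relation s·[u, v, 1]ᵀ = K·[x, y, z]ᵀ, the solution is the back-projection [x, y, z]ᵀ = s·K⁻¹·[u, v, 1]ᵀ; a sketch with toy intrinsics (the matrix values are illustrative, not from the patent):

```python
import numpy as np

def backproject(u, v, s, K):
    # [x, y, z]^T = s * K^{-1} [u, v, 1]^T  (camera coordinate system)
    return s * np.linalg.inv(K) @ np.array([u, v, 1.0])

K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])   # toy intrinsic matrix
xyz = backproject(420.0, 340.0, 2.0, K)
print(xyz)  # x = 2*(420-320)/500 = 0.4, y = 2*(340-240)/500 = 0.4, z = 2.0
```

Under this relation the scale factor s plays the role of the key point's depth along the optical axis.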
Step S560, obtaining the 6D pose of the target according to the three-dimensional coordinates of each key point of the target and the three-dimensional coordinates of the corresponding key points of the target three-dimensional model. Here, the three-dimensional coordinates of the key points of the target three-dimensional model may be obtained from the Linemod dataset.
Here, the 6D pose of the target is obtained by fitting, with a least-squares method, the three-dimensional coordinates of the key points of the target to the three-dimensional coordinates of the key points of the target three-dimensional model.
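One standard least-squares route for this 3D-3D alignment is the SVD-based Kabsch solution; the text only names a least-squares fit, so the following is one possible implementation, not necessarily the one intended:

```python
import numpy as np

def fit_pose(model_pts, cam_pts):
    """Least-squares rigid transform: cam_pts ~= R @ model_pts + t.
    Both inputs are (N, 3) arrays of corresponding key points."""
    mu_m = model_pts.mean(axis=0)
    mu_c = cam_pts.mean(axis=0)
    H = (model_pts - mu_m).T @ (cam_pts - mu_c)   # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_c - R @ mu_m
    return R, t

# Toy check: rotate four model key points 90 degrees about z and shift them.
model = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], float)
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
cam = model @ R_true.T + np.array([0.1, 0.2, 0.3])
R, t = fit_pose(model, cam)
print(np.round(R, 6), np.round(t, 6))  # recovers R_true and the shift
```

The recovered (R, t) is exactly the 6D pose: three rotational and three translational degrees of freedom.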
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative and, for example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on such understanding, the part of the technical solution of the present application that in essence contributes over the prior art may be embodied in the form of a software product; the software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description is only for various embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present application, and all such changes or substitutions are included in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A method for training a target 6D pose estimation model, comprising:
training a target 6D pose estimation model based on a source domain training image to update parameters in the target 6D pose estimation model, to obtain a primarily trained model, wherein the target 6D pose estimation model comprises a feature extractor, a feature regressor and a scale regressor;
adding an adversarial regressor to the primarily trained model to form a transfer training model;
training the transfer training model based on the source domain training image and a target domain training image to update parameters in the transfer training model, to obtain a trained transfer training model;
and determining a final trained target 6D pose estimation model based on the trained transfer training model.
2. The method of claim 1, wherein the training the target 6D pose estimation model based on the source domain training image to update parameters in the target 6D pose estimation model, to obtain the primarily trained model, comprises:
inputting the source domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the scale regressor respectively, to obtain a coordinate heatmap and a scale heatmap corresponding to each of a plurality of key points of the source domain training image;
obtaining a first loss function according to the coordinate heatmaps and the scale heatmaps corresponding to the plurality of key points;
and updating the parameters of the feature extractor, the feature regressor and the scale regressor based on the first loss function to obtain the primarily trained model.
3. The method of claim 2, wherein the obtaining the first loss function according to the coordinate heatmaps and the scale heatmaps corresponding to the plurality of key points comprises:
determining a key point coordinate loss function corresponding to each key point according to the coordinate heatmap and the true heatmap corresponding to that key point;
accumulating the key point coordinate loss functions corresponding to all the key points to obtain a key point coordinate loss function accumulated value;
determining a scale factor corresponding to each key point according to the scale heatmap and the true heatmap corresponding to that key point;
determining a key point scale factor loss function according to the scale factor corresponding to each key point and the true scale factor corresponding to each key point;
and determining the first loss function according to the key point coordinate loss function accumulated value and the key point scale factor loss function.
4. The method of claim 1, wherein the training the transfer training model based on the source domain training image and the target domain training image to update parameters in the transfer training model, to obtain the trained transfer training model, comprises:
inputting the source domain training image into the transfer training model for training, and updating parameters of the feature extractor, the scale regressor, the feature regressor and the adversarial regressor to obtain a primary transfer training model;
inputting the target domain training image into the primary transfer training model for training, and updating parameters of the adversarial regressor to obtain a secondary transfer training model;
and inputting the target domain training image into the secondary transfer training model for training, and updating parameters of the feature extractor to obtain the trained transfer training model.
5. The method of claim 4, wherein the inputting the source domain training image into the transfer training model for training and updating parameters of the feature extractor, the scale regressor, the feature regressor and the adversarial regressor to obtain the primary transfer training model comprises:
inputting the source domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor, the scale regressor and the adversarial regressor respectively, to obtain a coordinate heatmap, a scale heatmap and an adversarial coordinate heatmap corresponding to each of a plurality of key points of the source domain training image;
obtaining a second loss function according to the coordinate heatmaps, the scale heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points;
and updating the parameters of the feature extractor, the feature regressor, the scale regressor and the adversarial regressor based on the second loss function to obtain the primary transfer training model.
6. The method of claim 5, wherein the obtaining the second loss function according to the coordinate heatmaps, the scale heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points comprises:
determining a key point coordinate loss function corresponding to each key point according to the coordinate heatmap and the true heatmap corresponding to that key point;
accumulating the key point coordinate loss functions corresponding to all the key points to obtain a key point coordinate loss function accumulated value;
determining a scale factor corresponding to each key point according to the scale heatmap and the true heatmap corresponding to that key point;
determining a key point scale factor loss function according to the scale factor corresponding to each key point and the true scale factor corresponding to each key point;
determining an adversarial key point coordinate loss function corresponding to each key point according to the adversarial coordinate heatmap and the true heatmap corresponding to that key point;
accumulating the adversarial key point coordinate loss functions corresponding to all the key points to obtain an adversarial key point coordinate loss function accumulated value;
and determining the second loss function according to the key point coordinate loss function accumulated value, the key point scale factor loss function and the adversarial key point coordinate loss function accumulated value.
7. The method of claim 4, wherein the inputting the target domain training image into the primary transfer training model for training and updating parameters of the adversarial regressor to obtain the secondary transfer training model comprises:
inputting the target domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor respectively, to obtain a coordinate heatmap and an adversarial coordinate heatmap corresponding to each of a plurality of key points of the target domain training image;
obtaining a third loss function according to the coordinate heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points;
and updating the parameters of the adversarial regressor based on the third loss function to obtain the secondary transfer training model.
8. The method of claim 7, wherein the obtaining the third loss function according to the coordinate heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points comprises:
for each key point, determining a combined map of the coordinate heatmaps corresponding to the key points other than the current key point;
determining a key point coordinate loss function corresponding to the current key point according to the combined map corresponding to the current key point and the adversarial coordinate heatmap corresponding to the current key point;
and accumulating the key point coordinate loss functions corresponding to all the key points to obtain the third loss function.
9. The method of claim 4, wherein the inputting the target domain training image into the secondary transfer training model for training and updating parameters of the feature extractor to obtain the trained transfer training model comprises:
inputting the target domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor respectively, to obtain a coordinate heatmap and an adversarial coordinate heatmap corresponding to each of a plurality of key points of the target domain training image;
obtaining a fourth loss function according to the coordinate heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points;
and updating the parameters of the feature extractor based on the fourth loss function to obtain the trained transfer training model.
10. A method for estimating a 6D pose of a target, comprising:
inputting a target image into a feature extractor of a target 6D pose estimation model to obtain a feature map, wherein the target 6D pose estimation model comprises the feature extractor, a feature regressor and a scale regressor;
inputting the feature map into the feature regressor and the scale regressor respectively, to obtain a coordinate heatmap and a scale heatmap corresponding to each of a plurality of key points of the target image;
determining key point coordinates corresponding to each key point according to the coordinate heatmap corresponding to that key point;
calculating a scale factor corresponding to each key point according to the coordinate heatmap and the scale heatmap corresponding to that key point;
determining three-dimensional coordinates of each key point of the target according to the key point coordinates and the scale factor corresponding to each key point;
and obtaining the 6D pose of the target according to the three-dimensional coordinates of each key point of the target and the three-dimensional coordinates of the key points of a target three-dimensional model;
wherein the target 6D pose estimation model is trained by the method of any one of claims 1-9.
Priority Applications (1)
CN202211030694.4A (CN115546295B), priority date 2022-08-26, filing date 2022-08-26: Target 6D pose estimation model training method and target 6D pose estimation method
Publications (2)
CN115546295A, published 2022-12-30
CN115546295B, published 2023-11-07
Family ID: 84726482
Family Applications (1): CN202211030694.4A (CN115546295B), filed 2022-08-26, status Active
Country Status (1): CN — CN115546295B
Citations (4)
CN109063301A, priority 2018-07-24, published 2018-12-21: Indoor object pose estimation method from a single image based on heatmaps
CN113095129A, priority 2021-03-01, published 2021-07-09: Pose estimation model training method, pose estimation method and device, and electronic device
CN113283598A, priority 2021-06-11, published 2021-08-20: Model training method and device, storage medium and electronic device
CN114742890A, priority 2022-03-16, published 2022-07-12: 6D pose estimation dataset migration method based on decoupling of image content and style
Non-Patent Citations (1)
JUNGUANG JIANG et al., "Regressive Domain Adaptation for Unsupervised Keypoint Detection", arXiv:2103.06175v2 [cs.CV], page 3
Also Published As
CN115546295B, published 2023-11-07
Legal Events
PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant