CN115546295A - Target 6D pose estimation model training method and target 6D pose estimation method - Google Patents

Target 6D pose estimation model training method and target 6D pose estimation method

Info

Publication number: CN115546295A
Application number: CN202211030694.4A
Authority: CN (China)
Prior art keywords: key point, coordinate, training, target, model
Legal status: Granted; Active
Other languages: Chinese (zh)
Other versions: CN115546295B (granted publication)
Inventors: 彭进业, 寇希栋, 赵万青, 张少博, 彭先霖, 汪霖, 张晓丹
Original and current assignee: Northwest University
Application filed by Northwest University
Priority to CN202211030694.4A
Publication of CN115546295A; application granted; publication of CN115546295B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA], independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; mappings, e.g. subspace methods
    • G06V20/00 Scenes; scene-specific elements
    • G06V20/60 Type of objects


Abstract

The application relates to a method for training a target 6D pose estimation model, comprising the following steps: training the target 6D pose estimation model on source-domain training images to update its parameters and obtain an initially trained model; adding an adversarial regressor to the initially trained model to form a transfer training model; training the transfer training model on source-domain and target-domain training images to update its parameters and obtain a trained transfer training model; and determining the final trained target 6D pose estimation model from the trained transfer training model. The training method has at least the following beneficial technical effect: by combining basic training with transfer training, and by using an adversarial regressor during transfer training, the trained target 6D pose estimation model estimates poses more accurately and reliably.

Description

Target 6D pose estimation model training method and target 6D pose estimation method
Technical Field
The application relates to the field of image processing, and in particular to a target 6D pose estimation model training method and a target 6D pose estimation method.
Background
The purpose of target 6D pose estimation is to detect a target in a given image and estimate its position and orientation. Target 6D pose estimation is widely used in computer vision applications such as augmented reality, virtual reality, and autonomous driving. When training the estimation model, existing target 6D pose estimation methods usually train on a dataset with real labels; in practice, however, extensive training on a labeled dataset causes the model to overfit the training set, so its performance degrades in real application scenarios.
Disclosure of Invention
To overcome at least one defect of the prior art, embodiments of the present application provide a target 6D pose estimation model training method and a target 6D pose estimation method.
In a first aspect, an embodiment of the present application provides a method for training a target 6D pose estimation model, comprising the following steps:
training the target 6D pose estimation model on source-domain training images to update parameters in the model and obtain an initially trained model, the target 6D pose estimation model comprising a feature extractor, a feature regressor, and a scale regressor;
adding an adversarial regressor to the initially trained model to form a transfer training model;
training the transfer training model on source-domain and target-domain training images to update parameters in the transfer training model and obtain a trained transfer training model;
and determining the final trained target 6D pose estimation model from the trained transfer training model.
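The four steps above can be sketched schematically. Every class and function name below is hypothetical, standing in for the actual networks and training routines of the patent:

```python
# Schematic of the four-step training procedure (names illustrative).

class PoseModel:
    """Feature extractor + feature regressor + scale regressor."""
    def __init__(self):
        self.parts = {"extractor": {}, "feature_reg": {}, "scale_reg": {}}

class TransferModel(PoseModel):
    """Initially trained model plus an added adversarial regressor."""
    def __init__(self, base):
        self.parts = dict(base.parts)
        self.parts["adversarial_reg"] = {}  # same structure as feature_reg

def train_pose_model(source_images):
    model = PoseModel()      # step 1: basic training on the source domain
    return model

def transfer_train(model, source_images, target_images):
    tm = TransferModel(model)  # step 2: add the adversarial regressor
    # step 3: alternating source-domain and target-domain updates
    return tm

def finalize(tm):
    # step 4: keep only extractor, feature regressor and scale regressor
    final = PoseModel()
    for name in final.parts:
        final.parts[name] = tm.parts[name]
    return final
```

The adversarial regressor exists only during training; the deployed model never carries it.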
In one embodiment, training the target 6D pose estimation model on source-domain training images to update parameters in the model and obtain an initially trained model comprises:
inputting a source-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the scale regressor respectively, to obtain a coordinate heatmap and a scale heatmap for each of a plurality of key points of the source-domain training image;
obtaining a first loss function from the coordinate heatmaps and scale heatmaps of the plurality of key points;
and updating the parameters of the feature extractor, the feature regressor, and the scale regressor based on the first loss function to obtain the initially trained model.
In one embodiment, obtaining the first loss function from the coordinate heatmaps and scale heatmaps of the plurality of key points comprises:
determining a key point coordinate loss for each key point from the coordinate heatmap and the ground-truth heatmap of that key point;
accumulating the key point coordinate losses of all key points to obtain an accumulated key point coordinate loss;
determining a scale factor for each key point from the scale heatmap and the ground-truth heatmap of that key point;
determining a key point scale factor loss from the scale factor and the ground-truth scale factor of each key point;
and determining the first loss function from the accumulated key point coordinate loss and the key point scale factor loss.
In one embodiment, training the transfer training model on source-domain and target-domain training images to update parameters in the transfer training model and obtain a trained transfer training model comprises:
inputting source-domain training images into the transfer training model for training, and updating the parameters of the feature extractor, the scale regressor, the feature regressor, and the adversarial regressor to obtain a primary transfer training model;
inputting target-domain training images into the primary transfer training model for training, and updating the parameters of the adversarial regressor to obtain a secondary transfer training model;
and inputting target-domain training images into the secondary transfer training model for training, and updating the parameters of the feature extractor to obtain the trained transfer training model.
In one embodiment, inputting source-domain training images into the transfer training model for training and updating the parameters of the feature extractor, the scale regressor, the feature regressor, and the adversarial regressor to obtain a primary transfer training model comprises:
inputting a source-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor, the scale regressor, and the adversarial regressor respectively, to obtain a coordinate heatmap, a scale heatmap, and an adversarial coordinate heatmap for each of a plurality of key points of the source-domain training image;
obtaining a second loss function from the coordinate heatmaps, scale heatmaps, and adversarial coordinate heatmaps of the plurality of key points;
and updating the parameters of the feature extractor, the feature regressor, the scale regressor, and the adversarial regressor based on the second loss function to obtain the primary transfer training model.
In one embodiment, obtaining the second loss function from the coordinate heatmaps, scale heatmaps, and adversarial coordinate heatmaps of the plurality of key points comprises:
determining a key point coordinate loss for each key point from the coordinate heatmap and the ground-truth heatmap of that key point;
accumulating the key point coordinate losses of all key points to obtain an accumulated key point coordinate loss;
determining a scale factor for each key point from the scale heatmap and the ground-truth heatmap of that key point;
determining a key point scale factor loss from the scale factor and the ground-truth scale factor of each key point;
determining an adversarial key point coordinate loss for each key point from the adversarial coordinate heatmap and the ground-truth heatmap of that key point;
accumulating the adversarial key point coordinate losses of all key points to obtain an accumulated adversarial key point coordinate loss;
and determining the second loss function from the accumulated key point coordinate loss, the key point scale factor loss, and the accumulated adversarial key point coordinate loss.
In one embodiment, inputting target-domain training images into the primary transfer training model for training and updating the parameters of the adversarial regressor to obtain a secondary transfer training model comprises:
inputting a target-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor respectively, to obtain a coordinate heatmap and an adversarial coordinate heatmap for each of a plurality of key points of the target-domain training image;
obtaining a third loss function from the coordinate heatmaps and adversarial coordinate heatmaps of the plurality of key points;
and updating the parameters of the adversarial regressor based on the third loss function to obtain the secondary transfer training model.
In one embodiment, obtaining the third loss function from the coordinate heatmaps and adversarial coordinate heatmaps of the plurality of key points comprises:
for each key point, determining a combined map of the coordinate heatmaps of all key points other than the current key point;
determining a key point coordinate loss for the current key point from the combined map of the current key point and the adversarial coordinate heatmap of the current key point;
and accumulating the key point coordinate losses of all key points to obtain the third loss function.
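The combined-map construction in the embodiment above can be sketched as follows. The combination rule (element-wise maximum over the other key points' heatmaps) and the per-key-point loss (mean-squared error) are assumptions made for illustration, since this passage does not fix either choice:

```python
import numpy as np

def third_loss(coord_maps, adv_maps):
    """coord_maps, adv_maps: (M, H, W) heatmaps for M key points.

    For each key point k, combine the coordinate heatmaps of the OTHER
    key points (element-wise maximum, an assumed choice) and compare the
    combination with key point k's adversarial heatmap (MSE, also an
    assumed choice). The per-key-point losses are accumulated.
    """
    total = 0.0
    for k in range(coord_maps.shape[0]):
        others = np.delete(coord_maps, k, axis=0)   # drop the current key point
        combined = others.max(axis=0)               # combined map of the rest
        total += np.mean((combined - adv_maps[k]) ** 2)
    return total
```

Minimizing this in the adversarial regressor pushes its prediction for a key point toward the other key points' locations, i.e., away from the correct one.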
In one embodiment, inputting target-domain training images into the secondary transfer training model for training and updating the parameters of the feature extractor to obtain the trained transfer training model comprises:
inputting a target-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor respectively, to obtain a coordinate heatmap and an adversarial coordinate heatmap for each of a plurality of key points of the target-domain training image;
obtaining a fourth loss function from the coordinate heatmaps and adversarial coordinate heatmaps of the plurality of key points;
and updating the parameters of the feature extractor based on the fourth loss function to obtain the trained transfer training model.
In a second aspect, an embodiment of the present application provides a target 6D pose estimation method, comprising:
inputting a target image into the feature extractor of a target 6D pose estimation model to obtain a feature map, the target 6D pose estimation model comprising a feature extractor, a feature regressor, and a scale regressor;
inputting the feature map into the feature regressor and the scale regressor respectively, to obtain a coordinate heatmap and a scale heatmap for each of a plurality of key points of the target image;
determining the key point coordinates of each key point from its coordinate heatmap;
calculating the scale factor of each key point from its coordinate heatmap and scale heatmap;
determining the three-dimensional coordinates of each key point of the target from the key point coordinates and the scale factor of each key point;
and obtaining the 6D pose of the target from the three-dimensional coordinates of each key point of the target and the three-dimensional coordinates of the key points of the target three-dimensional model;
wherein the target 6D pose estimation model is obtained by the target 6D pose estimation model training method described above.
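The estimation steps above can be sketched as argmax decoding of the coordinate heatmaps, back-projection of each key point using its scale factor, and a rigid alignment of the lifted key points against the model key points. The patent does not name the lifting convention or the alignment algorithm; here the scale factor is assumed to act as pinhole depth and a Kabsch (SVD) alignment is used, so all of this is illustrative:

```python
import numpy as np

def keypoints_from_heatmaps(coord_maps):
    """Argmax decoding: the element with the largest value gives (u, v)."""
    pts = []
    for hm in coord_maps:
        v, u = np.unravel_index(np.argmax(hm), hm.shape)
        pts.append((u, v))
    return np.array(pts, dtype=float)

def backproject(uv, scale, K):
    """Lift pixel (u, v) to camera coordinates; the scale factor is
    interpreted here (an assumption) as depth in the pinhole model K."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    u, v = uv
    return np.array([(u - cx) * scale / fx, (v - cy) * scale / fy, scale])

def rigid_pose(model_pts, cam_pts):
    """Kabsch alignment: R, t with cam_pts ~= model_pts @ R.T + t."""
    mc, cc = model_pts.mean(0), cam_pts.mean(0)
    H = (model_pts - mc).T @ (cam_pts - cc)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T   # reflection-safe rotation
    t = cc - R @ mc
    return R, t
```

The rotation R and translation t together are the 6 degrees of freedom of the estimated pose.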
Compared with the prior art, the application has the following beneficial effects: basic training is combined with transfer training, and an adversarial regressor is used during transfer training. Transfer training on source-domain training images keeps all three regressors accurate in the source domain, while transfer training on target-domain training images makes the adversarial regressor predict inaccurately in the target domain; the feature regressor's predictions are then pushed as far as possible from the adversarial regressor's predictions, i.e., toward correct predictions, so the trained target 6D pose estimation model estimates poses more accurately and reliably.
Drawings
The present application may be better understood by reference to the following description taken in conjunction with the accompanying drawings, which are incorporated in and form a part of this specification, along with the detailed description below. In the drawings:
FIG. 1 is a block flow diagram illustrating a method for training a target 6D pose estimation model according to an embodiment of the application;
FIG. 2 shows a block flow diagram of a target 6D pose estimation method according to an embodiment of the application.
Detailed Description
Exemplary embodiments of the present application are described below with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in this specification. It will of course be appreciated that in developing any such actual implementation, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
It should also be noted that, to avoid obscuring the present application with unnecessary detail, only the structures closely related to the solution of the present application are shown in the drawings, and other details of little relevance are omitted.
The application is not limited to the embodiments described below with reference to the drawings. Where feasible, embodiments may be combined with each other, features may be replaced or borrowed between different embodiments, and one or more features may be omitted from an embodiment.
Target 6D pose estimation refers to detecting a target in a given image and estimating its pose; a 6D pose has 6 degrees of freedom, comprising 3 degrees of freedom of translation and 3 degrees of freedom of spatial rotation. The application provides a target 6D pose estimation method for RGB images. The 6D pose is estimated with a target 6D pose estimation model, which is trained first; the training process comprises a basic training stage and a transfer training stage, and a model obtained by this training process greatly improves the accuracy of target 6D pose estimation.
FIG. 1 shows a block flow diagram of a method for training a target 6D pose estimation model according to an embodiment of the application. The method starts with step S110: training the target 6D pose estimation model on source-domain training images to update its parameters and obtain an initially trained model; the target 6D pose estimation model comprises a feature extractor, a feature regressor, and a scale regressor. The training images are taken from the Linemod dataset, which contains real images and synthetic images: a real image may be, for example, an image of the target captured with an image acquisition device such as a camera, and a synthetic image may be, for example, an image synthesized by computer software from a three-dimensional model of the target. The source-domain training images may be the synthetic images of the Linemod dataset. In this step, the target 6D pose estimation model is trained on the source-domain training images to update the parameters of the entire model, i.e., of the feature extractor, the feature regressor, and the scale regressor. This step is the basic training stage.
Then, in step S120, an adversarial regressor is added to the initially trained model to form a transfer training model. The transfer training model comprises the feature extractor, feature regressor, and scale regressor of the initially trained model, plus the added adversarial regressor. The feature regressor and the adversarial regressor have the same structure and differ only in their parameters.
Then, in step S130, the transfer training model is trained on source-domain and target-domain training images to update its parameters and obtain a trained transfer training model. The target-domain training images may be the real images of the Linemod dataset.
Then, in step S140, the final trained target 6D pose estimation model is determined from the trained transfer training model. The trained transfer training model comprises a feature extractor, a feature regressor, a scale regressor, and an adversarial regressor; the final trained target 6D pose estimation model is obtained from it and comprises only its feature extractor, feature regressor, and scale regressor.
In this embodiment, the target 6D pose estimation model is trained by combining basic training with transfer training, using an adversarial regressor during transfer training. Training on source-domain training images keeps the three regressors accurate in the source domain; training on target-domain training images makes the adversarial regressor predict inaccurately in the target domain, so that the feature regressor's predictions are pushed as far as possible from the adversarial regressor's predictions, i.e., toward correct predictions. The trained target 6D pose estimation model therefore estimates poses more accurately and reliably.
In one embodiment, training the target 6D pose estimation model on source-domain training images to update its parameters and obtain an initially trained model comprises:
inputting a source-domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the scale regressor respectively, to obtain a coordinate heatmap and a scale heatmap for each of a plurality of key points of the source-domain training image. The source-domain training image has a plurality of key points, which can be obtained from the Linemod dataset; after the feature map is fed to the feature regressor and the scale regressor, one coordinate heatmap and one scale heatmap are obtained for each key point. The value of an element of the coordinate heatmap reflects the likelihood that the current key point lies at that element's position, and the position of the element with the largest value is taken as the position of the current key point; the value of an element of the scale heatmap gives the scale factor the current key point would have if it lay at that element's position.
obtaining a first loss function from the coordinate heatmaps and scale heatmaps of the plurality of key points. The first loss function is used to train the model and update its parameters; a model trained with the first loss function of this embodiment predicts key point positions accurately.
and updating the parameters of the feature extractor, the feature regressor, and the scale regressor based on the first loss function to obtain the initially trained model.
In one embodiment, obtaining the first loss function from the coordinate heatmaps and scale heatmaps of the plurality of key points comprises:
Step S210: determining a key point coordinate loss loss_uv′ for each key point from the coordinate heatmap and the ground-truth heatmap of that key point.
The ground-truth heatmap of each key point can be obtained from the Linemod dataset.
The key point coordinate loss loss_uv′ of a key point can be expressed, for example, in the mean-squared-error form

loss_uv′ = (1/N) Σ_{i=1}^{N} (x_i − y_i)²

where x_i is the value of the i-th element of the key point's coordinate heatmap, y_i is the value of the i-th element of the key point's ground-truth heatmap, and N is the number of elements in the coordinate heatmap and the ground-truth heatmap.
Step S220: accumulating the key point coordinate losses of all key points to obtain an accumulated key point coordinate loss loss_uv.
Step S230: determining a scale factor for each key point from the scale heatmap and the ground-truth heatmap of that key point.
In one implementation, the scale factor of each key point may be determined as follows:
determining a probability distribution map from the ground-truth heatmap of the key point: the ground-truth heatmap of the key point is processed with softmax to obtain a pixel-level probability distribution map;
and multiplying the value of each element of the key point's scale heatmap by the value of the corresponding element of the probability distribution map, then accumulating the products to obtain the scale factor of the key point.
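The softmax-weighted read-out just described can be written out directly; the function name is illustrative, with `true_map` the ground-truth heatmap of one key point and `scale_map` its predicted scale heatmap:

```python
import numpy as np

def scale_factor(scale_map, true_map):
    """Softmax the ground-truth heatmap into a pixel-level probability
    map, then take the probability-weighted sum of the scale heatmap."""
    w = np.exp(true_map - true_map.max())  # numerically stable softmax
    w /= w.sum()
    return float((scale_map * w).sum())
```

When the ground-truth heatmap is sharply peaked, the result is essentially the scale-heatmap value at the key point's true position; a flatter heatmap averages over the neighborhood.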
Determining a key point scale factor loss loss_s from the scale factor and the ground-truth scale factor of each key point. The ground-truth scale factor of each key point can be obtained from the Linemod dataset.
The key point scale factor loss loss_s can be determined, for example, in the mean-squared-error form

loss_s = (1/M) Σ_{j=1}^{M} (s_j − s_j*)²

where s_j is the scale factor of the j-th key point, s_j* is the ground-truth scale factor of the j-th key point, and M is the number of key points.
Step S240: determining a first loss function loss1 from the accumulated key point coordinate loss loss_uv and the key point scale factor loss loss_s.
The first loss function loss1 may be calculated as

loss1 = α · loss_uv + loss_s

where α is a constant and may be set to 10.
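Assembling loss1 from its two parts with α = 10 can be sketched as follows; the per-key-point losses are written here in assumed mean-squared-error forms for illustration:

```python
import numpy as np

ALPHA = 10.0  # the constant α from the text

def first_loss(coord_maps, true_maps, scales, true_scales):
    """loss1 = α · loss_uv + loss_s, where loss_uv accumulates a
    per-key-point heatmap loss (MSE assumed) over all key points and
    loss_s compares predicted and ground-truth scale factors (MSE
    assumed as well)."""
    loss_uv = sum(np.mean((c - t) ** 2) for c, t in zip(coord_maps, true_maps))
    loss_s = np.mean((np.asarray(scales) - np.asarray(true_scales)) ** 2)
    return ALPHA * loss_uv + loss_s
```

The weighting α balances the heatmap term, which is accumulated over key points, against the single scale-factor term.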
In one embodiment, training a migration training model based on a source domain training image and a target domain training image to update parameters in the migration training model to obtain a trained migration training model, includes:
step S310, inputting the source domain training image into a migration training model for training, and updating parameters in a feature extractor, a proportional regression, a feature regression and a counterregression to obtain a primary migration training model; in this step, the parameters of the entire network are updated during the training process so that the predictions of the proportional regressor Z, the feature regressor H, and the counterregressor H' are lost as little as possible in the source domain. In the step, the model is trained by using the source domain training image, so that the accuracy of the three regressors in the source domain is ensured.
In one implementation, step 310 may specifically include:
inputting the source domain training image into a feature extractor to obtain a feature map;
respectively inputting the feature maps into a feature regressor, a proportional regressor and an antagonistic regressor to respectively obtain a coordinate thermodynamic diagram, a proportional thermodynamic diagram and an antagonistic coordinate thermodynamic diagram corresponding to a plurality of key points of the source domain training image;
obtaining a second loss function according to the coordinate thermodynamic diagrams, the proportional thermodynamic diagrams and the confrontation coordinate thermodynamic diagrams corresponding to the plurality of key points;
and updating parameters of the feature extractor, the feature regressor, the proportional regressor and the counterregression based on the second loss function to obtain a primary migration training model.
Step S320, inputting the target domain training image into the primary migration training model for training, and updating parameters in the counterregression device to obtain a secondary migration training model; in the step, a target domain training image is adopted to train the model, only the parameters of the countercheck regressor are updated, the effect that the countercheck regressor measures inaccurately in the target domain is ensured, and the prediction of the characteristic regressor is far away from the prediction of the countercheck regressor as far as possible, namely the prediction of the characteristic regressor is correct.
In one implementation, step S320 may specifically include:
inputting the target domain training image into a feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor respectively, to obtain the coordinate heatmaps and adversarial coordinate heatmaps corresponding to a plurality of key points of the target domain training image;
obtaining a third loss function according to the coordinate heatmaps and adversarial coordinate heatmaps corresponding to the plurality of key points;
and updating the parameters of the adversarial regressor based on the third loss function to obtain the secondary migration training model.
Step S330, inputting the target domain training image into the secondary migration training model for training, and updating the parameters of the feature extractor to obtain the trained migration training model. In this step only the feature extractor is updated, using the target domain training image, so that the erroneous predictions encouraged in step S320 are confined to the adversarial regressor and do not propagate into the feature extractor.
In one implementation, step S330 may specifically include:
inputting the target domain training image into a feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor respectively, to obtain the coordinate heatmaps and adversarial coordinate heatmaps corresponding to a plurality of key points of the target domain training image;
obtaining a fourth loss function according to the coordinate heatmaps and adversarial coordinate heatmaps corresponding to the plurality of key points;
and updating the parameters of the feature extractor based on the fourth loss function to obtain the trained migration training model.
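The three training stages above differ only in which parameter groups are updated while the rest stay frozen. The toy sketch below illustrates this freezing discipline with plain gradient-descent updates; it is an illustration only, not the patent's actual networks, and the shorthand names F/R/S/A (feature extractor, feature regressor, scale regressor, adversarial regressor) are our own:

```python
def train_stage(params, grads, update_keys, lr=0.1):
    """Apply one gradient step, but only to the parameter groups named
    in update_keys; every other group is left frozen (unchanged)."""
    return {k: (v - lr * grads[k]) if k in update_keys else v
            for k, v in params.items()}

# F = feature extractor, R = feature regressor,
# S = scale regressor,  A = adversarial regressor
params = {"F": 1.0, "R": 1.0, "S": 1.0, "A": 1.0}
grads = {k: 1.0 for k in params}

params = train_stage(params, grads, {"F", "R", "S", "A"})  # step S310: source domain, update all
params = train_stage(params, grads, {"A"})                 # step S320: target domain, adversarial regressor only
params = train_stage(params, grads, {"F"})                 # step S330: target domain, feature extractor only
```

After the three stages, F and A have each received two updates and R and S one, mirroring which steps touch which sub-networks.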
In one embodiment, obtaining the second loss function according to the coordinate heatmaps, scale heatmaps and adversarial coordinate heatmaps corresponding to the plurality of key points may include:
step S410, determining a key point coordinate loss function loss corresponding to each key point according to the coordinate thermodynamic diagram and the real thermodynamic diagram corresponding to each key point uv1 ′;
Here, the key point coordinate loss function loss corresponding to each key point uv1 ', can be expressed by the following formula:
Figure BDA0003817074250000141
wherein x is 1i Element value, y, of the ith element in a coordinate thermodynamic diagram corresponding to a key point 1i And N is the number of the elements in the coordinate thermodynamic diagram and the real thermodynamic diagram corresponding to the key points.
Step S420, accumulating the key point coordinate loss functions of all key points to obtain the accumulated key point coordinate loss loss_uv1.
Step S430, determining the scale factor corresponding to each key point according to the scale heatmap and the ground-truth heatmap corresponding to that key point; the scale factor is determined in the manner described earlier and is not repeated here.
Step S440, determining the key point scale factor loss function according to the scale factor and the true scale factor corresponding to each key point. The key point scale factor loss function loss_s1 can be determined, for example, by the following formula:

loss_s1 = (1/M) * sum_{j=1..M} (s_1j - ŝ_1j)^2

wherein s_1j is the scale factor corresponding to the j-th key point, ŝ_1j is the true scale factor of the j-th key point, and M is the number of key points.
Step S450, determining the adversarial key point coordinate loss function loss_uv1_adv' corresponding to each key point according to the adversarial coordinate heatmap and the ground-truth heatmap corresponding to that key point. Here, loss_uv1_adv' can be expressed, analogously to loss_uv1', by the following formula:

loss_uv1_adv' = (1/N) * sum_{i=1..N} (x_adv1i - y_1i)^2

wherein x_adv1i is the element value of the i-th element in the adversarial coordinate heatmap corresponding to the key point, y_1i is the element value of the i-th element in the ground-truth heatmap, and N is the number of elements in the adversarial coordinate heatmap (equal to that in the ground-truth heatmap).
Step S460, accumulating the adversarial key point coordinate loss functions of all key points to obtain the accumulated adversarial key point coordinate loss loss_uv1_adv.
Step S470, determining the second loss function loss2 according to the accumulated key point coordinate loss loss_uv1, the key point scale factor loss function loss_s1 and the accumulated adversarial key point coordinate loss loss_uv1_adv.
Here, the second loss function loss2 may be determined using the following formula:

loss2 = loss_uv1 + β * loss_uv1_adv + γ * loss_s1
where β may be set to 1 and γ may be set to 20.
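Taking each per-keypoint heatmap term as a mean squared error (an assumption; the patent's formula images are not reproduced here), the combination loss2 = loss_uv1 + β·loss_uv1_adv + γ·loss_s1 can be sketched as:

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two heatmaps of equal shape."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

def second_loss(coord_hms, adv_hms, true_hms, scales, true_scales,
                beta=1.0, gamma=20.0):
    """loss2 = loss_uv1 + beta * loss_uv1_adv + gamma * loss_s1, with the
    per-keypoint heatmap losses accumulated over all keypoints."""
    loss_uv1 = sum(mse(x, y) for x, y in zip(coord_hms, true_hms))
    loss_uv1_adv = sum(mse(x, y) for x, y in zip(adv_hms, true_hms))
    loss_s1 = float(np.mean([(s - t) ** 2
                             for s, t in zip(scales, true_scales)]))
    return loss_uv1 + beta * loss_uv1_adv + gamma * loss_s1
```

The default beta=1.0 and gamma=20.0 mirror the values stated above.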
In one embodiment, obtaining the third loss function according to the coordinate heatmaps and adversarial coordinate heatmaps corresponding to the plurality of key points includes:
for each key point, determining a combination map of the coordinate heatmaps corresponding to all key points other than the current key point;
determining the key point coordinate loss function corresponding to the current key point according to the combination map corresponding to the current key point and the adversarial coordinate heatmap corresponding to the current key point;
Here, the key point coordinate loss function loss_uv2' corresponding to each key point can be expressed by the following formula:

loss_uv2' = (1/N) * sum_{i=1..N} (x_adv2i - y_2i)^2

wherein x_adv2i is the element value of the i-th element in the adversarial coordinate heatmap corresponding to the key point, y_2i is the element value of the i-th element in the combination map corresponding to the key point, and N is the number of elements in the adversarial coordinate heatmap (equal to that in the combination map).
The key point coordinate loss functions of all key points are accumulated to obtain the third loss function loss3.
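The "combination map" that serves as the adversarial regressor's target for a given key point can be sketched as an element-wise combination of every other key point's coordinate heatmap. An element-wise maximum is assumed here; the patent does not spell out the combination rule:

```python
import numpy as np

def combination_map(coord_heatmaps, k):
    """Target heatmap for the adversarial regressor at keypoint k:
    the element-wise maximum of the coordinate heatmaps of all
    keypoints EXCEPT k, i.e. high wherever the point is NOT."""
    others = [h for j, h in enumerate(coord_heatmaps) if j != k]
    return np.max(np.stack(others), axis=0)
```

Training the adversarial regressor toward this map on the target domain makes it a detector of confusable, incorrect locations that the feature regressor is then pushed away from.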
In one embodiment, obtaining the fourth loss function according to the coordinate heatmaps and adversarial coordinate heatmaps corresponding to the plurality of key points includes:
determining the key point coordinate loss function corresponding to each key point according to the coordinate heatmap and the adversarial coordinate heatmap corresponding to that key point;
Here, the key point coordinate loss function loss_uv3' corresponding to each key point can be expressed by the following formula:

loss_uv3' = (1/N) * sum_{i=1..N} (x_2i - y_adv2i)^2

wherein x_2i is the element value of the i-th element in the coordinate heatmap corresponding to the key point, y_adv2i is the element value of the i-th element in the adversarial coordinate heatmap, and N is the number of elements in the coordinate heatmap (equal to that in the adversarial coordinate heatmap).
The key point coordinate loss functions of all key points are accumulated to obtain the fourth loss function loss4.
Fig. 2 shows a flow diagram of a target 6D pose estimation method according to an embodiment of the application. The target 6D pose estimation method includes:
Step S510, inputting a target image into the feature extractor of a target 6D pose estimation model to obtain a feature map. Here, the target image may be an image of the target captured by an image acquisition device such as a camera. The target 6D pose estimation model is obtained by the training method of the foregoing embodiments, and comprises a feature extractor, a feature regressor and a scale regressor.
Step S520, inputting the feature map into the feature regressor and the scale regressor of the target 6D pose estimation model respectively, to obtain the coordinate heatmaps and scale heatmaps corresponding to a plurality of key points of the target image. Here, the key points of the target image are generally chosen as prominent or distinctive points on the target surface, and may be determined from the Linemod data set.
Step S530, determining the key point coordinates (u, v) of each key point from its coordinate heatmap. Here, the coordinates of the element with the largest element value in the coordinate heatmap may be taken as the key point coordinates.
Step S540, calculating the scale factor s of each key point from its coordinate heatmap and scale heatmap.
In one implementation, the scale factor of each key point may be determined as follows:
determining a probability distribution map from the coordinate heatmap of the key point, by applying softmax to the coordinate heatmap to obtain a pixel-level probability distribution map;
multiplying the element value of each element in the scale heatmap of the key point by the element value of the corresponding element in the probability distribution map, and summing, to obtain the scale factor of the key point.
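The softmax-weighted read-out just described can be sketched in a few lines of NumPy:

```python
import numpy as np

def keypoint_scale_factor(coord_heatmap, scale_heatmap):
    """Turn the coordinate heatmap into a pixel-level probability map
    with softmax, then take the probability-weighted sum of the scale
    heatmap: the expected scale factor at the predicted location."""
    logits = coord_heatmap.ravel()
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return float(np.sum(probs * scale_heatmap.ravel()))
```

When the coordinate heatmap is sharply peaked, the result approaches the scale-heatmap value at the peak; when it is flat, the result approaches the mean of the scale heatmap.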
Step S550, determining the three-dimensional coordinates of each key point of the target according to the key point coordinates (u, v) and the scale factor s of each key point.
Here, the obtained values are substituted into the pinhole camera model

s * [u, v, 1]^T = K * [X, Y, Z]^T

and the three-dimensional coordinates (X, Y, Z) of the key point of the target in the camera coordinate system are solved for, where K is the camera intrinsic matrix.
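Solving the pinhole relation for the camera-frame coordinates gives xyz = s * K⁻¹ * [u, v, 1]^T, which can be sketched as:

```python
import numpy as np

def backproject(u, v, s, K):
    """Recover the camera-frame 3D point from pixel coordinates (u, v)
    and scale factor s via  s * [u, v, 1]^T = K @ [X, Y, Z]^T."""
    return s * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
```

With a typical intrinsic matrix K = [[f, 0, cx], [0, f, cy], [0, 0, 1]], this inverts the projection u = f·X/Z + cx, v = f·Y/Z + cy, with s playing the role of the depth Z.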
Step S560, obtaining the 6D pose of the target according to the three-dimensional coordinates of each key point of the target and the three-dimensional coordinates of the corresponding key points of the target three-dimensional model. Here, the three-dimensional coordinates of the key points of the target three-dimensional model may be obtained from the Linemod data set.
The three-dimensional coordinates of the key points of the target and those of the target three-dimensional model are aligned by least squares, yielding the 6D pose of the target.
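The least-squares step is a 3D-3D rigid registration between model keypoints and the recovered camera-frame keypoints; the standard SVD (Kabsch) solution, offered here as one concrete way to realize it, is:

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rigid transform (R, t) minimizing
    sum_i || R @ P[i] + t - Q[i] ||^2  for Nx3 arrays P (model
    keypoints) and Q (camera-frame keypoints)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T   # guard against reflection
    t = cQ - R @ cP
    return R, t
```

The rotation R and translation t together are the 6D pose of the target relative to its three-dimensional model.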
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative and, for example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, and various media capable of storing program codes.
The above description is only for various embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present application, and all such changes or substitutions are included in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for training a target 6D pose estimation model, characterized by comprising the following steps:
training a target 6D pose estimation model based on a source domain training image to update parameters in the target 6D pose estimation model, obtaining a model after primary training; the target 6D pose estimation model comprises a feature extractor, a feature regressor and a scale regressor;
adding an adversarial regressor to the model after the primary training to form a migration training model;
training the migration training model based on the source domain training image and a target domain training image to update parameters in the migration training model, obtaining a trained migration training model;
and determining a final trained target 6D pose estimation model based on the trained migration training model.
2. The method of claim 1, wherein the training of the target 6D pose estimation model based on the source domain training image to update parameters in the target 6D pose estimation model, obtaining a model after primary training, comprises:
inputting the source domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the scale regressor respectively, to obtain a coordinate heatmap and a scale heatmap corresponding to a plurality of key points of the source domain training image;
obtaining a first loss function according to the coordinate heatmaps and the scale heatmaps corresponding to the plurality of key points;
and updating the parameters of the feature extractor, the feature regressor and the scale regressor based on the first loss function to obtain the model after the primary training.
3. The method of claim 2, wherein the obtaining a first loss function according to the coordinate heatmaps and the scale heatmaps corresponding to the plurality of key points comprises:
determining a key point coordinate loss function corresponding to each key point according to the coordinate heatmap and the ground-truth heatmap corresponding to that key point;
accumulating the key point coordinate loss functions of all key points to obtain an accumulated key point coordinate loss;
determining a scale factor corresponding to each key point according to the scale heatmap and the ground-truth heatmap corresponding to that key point;
determining a key point scale factor loss function according to the scale factor and the true scale factor corresponding to each key point;
and determining the first loss function according to the accumulated key point coordinate loss and the key point scale factor loss function.
4. The method of claim 1, wherein the training of the migration training model based on the source domain training image and the target domain training image to update parameters in the migration training model, obtaining a trained migration training model, comprises:
inputting the source domain training image into the migration training model for training, and updating the parameters of the feature extractor, the scale regressor, the feature regressor and the adversarial regressor to obtain a primary migration training model;
inputting the target domain training image into the primary migration training model for training, and updating the parameters of the adversarial regressor to obtain a secondary migration training model;
and inputting the target domain training image into the secondary migration training model for training, and updating the parameters of the feature extractor to obtain the trained migration training model.
5. The method of claim 4, wherein the inputting the source domain training image into the migration training model for training, updating the parameters of the feature extractor, the scale regressor, the feature regressor and the adversarial regressor to obtain a primary migration training model, comprises:
inputting the source domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor, the scale regressor and the adversarial regressor respectively, to obtain a coordinate heatmap, a scale heatmap and an adversarial coordinate heatmap corresponding to a plurality of key points of the source domain training image;
obtaining a second loss function according to the coordinate heatmaps, the scale heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points;
and updating the parameters of the feature extractor, the feature regressor, the scale regressor and the adversarial regressor based on the second loss function to obtain the primary migration training model.
6. The method of claim 5, wherein the obtaining a second loss function according to the coordinate heatmaps, the scale heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points comprises:
determining a key point coordinate loss function corresponding to each key point according to the coordinate heatmap and the ground-truth heatmap corresponding to that key point;
accumulating the key point coordinate loss functions of all key points to obtain an accumulated key point coordinate loss;
determining a scale factor corresponding to each key point according to the scale heatmap and the ground-truth heatmap corresponding to that key point;
determining a key point scale factor loss function according to the scale factor and the true scale factor corresponding to each key point;
determining an adversarial key point coordinate loss function corresponding to each key point according to the adversarial coordinate heatmap and the ground-truth heatmap corresponding to that key point;
accumulating the adversarial key point coordinate loss functions of all key points to obtain an accumulated adversarial key point coordinate loss;
and determining the second loss function according to the accumulated key point coordinate loss, the key point scale factor loss function and the accumulated adversarial key point coordinate loss.
7. The method of claim 4, wherein the inputting the target domain training image into the primary migration training model for training, updating the parameters of the adversarial regressor, and obtaining a secondary migration training model comprises:
inputting the target domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor respectively, to obtain a coordinate heatmap and an adversarial coordinate heatmap corresponding to a plurality of key points of the target domain training image;
obtaining a third loss function according to the coordinate heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points;
and updating the parameters of the adversarial regressor based on the third loss function to obtain the secondary migration training model.
8. The method of claim 7, wherein the obtaining a third loss function according to the coordinate heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points comprises:
for each key point, determining a combination map of the coordinate heatmaps corresponding to all key points other than the current key point;
determining a key point coordinate loss function corresponding to the current key point according to the combination map corresponding to the current key point and the adversarial coordinate heatmap corresponding to the current key point;
and accumulating the key point coordinate loss functions of all key points to obtain the third loss function.
9. The method of claim 4, wherein the inputting the target domain training image into the secondary migration training model for training, updating the parameters of the feature extractor, and obtaining a trained migration training model comprises:
inputting the target domain training image into the feature extractor to obtain a feature map;
inputting the feature map into the feature regressor and the adversarial regressor respectively, to obtain a coordinate heatmap and an adversarial coordinate heatmap corresponding to a plurality of key points of the target domain training image;
obtaining a fourth loss function according to the coordinate heatmaps and the adversarial coordinate heatmaps corresponding to the plurality of key points;
and updating the parameters of the feature extractor based on the fourth loss function to obtain the trained migration training model.
10. A method for estimating the 6D pose of a target, characterized by comprising:
inputting a target image into the feature extractor of a target 6D pose estimation model to obtain a feature map, wherein the target 6D pose estimation model comprises a feature extractor, a feature regressor and a scale regressor;
inputting the feature map into the feature regressor and the scale regressor respectively, to obtain a coordinate heatmap and a scale heatmap corresponding to a plurality of key points of the target image;
determining the key point coordinates of each key point according to its coordinate heatmap;
calculating the scale factor of each key point according to its coordinate heatmap and scale heatmap;
determining the three-dimensional coordinates of each key point of the target according to the key point coordinates and the scale factor of each key point;
obtaining the 6D pose of the target according to the three-dimensional coordinates of each key point of the target and the three-dimensional coordinates of the key points of the target three-dimensional model;
wherein the target 6D pose estimation model is obtained by applying the method of any one of claims 1-9.
CN202211030694.4A 2022-08-26 2022-08-26 Target 6D pose estimation model training method and target 6D pose estimation method Active CN115546295B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211030694.4A CN115546295B (en) 2022-08-26 2022-08-26 Target 6D pose estimation model training method and target 6D pose estimation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211030694.4A CN115546295B (en) 2022-08-26 2022-08-26 Target 6D pose estimation model training method and target 6D pose estimation method

Publications (2)

Publication Number Publication Date
CN115546295A true CN115546295A (en) 2022-12-30
CN115546295B CN115546295B (en) 2023-11-07

Family

ID=84726482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211030694.4A Active CN115546295B (en) Target 6D pose estimation model training method and target 6D pose estimation method

Country Status (1)

Country Link
CN (1) CN115546295B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109063301A (en) * 2018-07-24 2018-12-21 杭州师范大学 Gestures of object estimation method in a kind of single image room based on thermodynamic chart
CN113095129A (en) * 2021-03-01 2021-07-09 北京迈格威科技有限公司 Attitude estimation model training method, attitude estimation device and electronic equipment
CN113283598A (en) * 2021-06-11 2021-08-20 清华大学 Model training method and device, storage medium and electronic equipment
CN114742890A (en) * 2022-03-16 2022-07-12 西北大学 6D attitude estimation data set migration method based on image content and style decoupling


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JUNGUANG JIANG ET AL.: "Regressive Domain Adaptation for Unsupervised Keypoint Detection", arXiv:2103.06175v2 [cs.CV], page 3 *

Also Published As

Publication number Publication date
CN115546295B (en) 2023-11-07

Similar Documents

Publication Publication Date Title
CN109643383B (en) Domain split neural network
US8755630B2 (en) Object pose recognition apparatus and object pose recognition method using the same
EP3373248A1 (en) Method, control device, and system for tracking and photographing target
KR100816607B1 (en) Image collation system and image collation method
CN104346811B (en) Object real-time tracking method and its device based on video image
CN111667001B (en) Target re-identification method, device, computer equipment and storage medium
KR102557049B1 (en) Image Feature Matching Method and System Using The Labeled Keyframes In SLAM-Based Camera Tracking
CN109063549B (en) High-resolution aerial video moving target detection method based on deep neural network
JP6946255B2 (en) Learning device, estimation device, learning method and program
US11974050B2 (en) Data simulation method and device for event camera
CN111914878A (en) Feature point tracking training and tracking method and device, electronic equipment and storage medium
CN111160229A (en) Video target detection method and device based on SSD (solid State disk) network
WO2012133371A1 (en) Image capture position and image capture direction estimation device, image capture device, image capture position and image capture direction estimation method and program
CN110111341B (en) Image foreground obtaining method, device and equipment
CN114742112A (en) Object association method and device and electronic equipment
JP2016129309A (en) Object linking method, device and program
CN113361582A (en) Method and device for generating countermeasure sample
CN111753729B (en) False face detection method and device, electronic equipment and storage medium
CN111612827B (en) Target position determining method and device based on multiple cameras and computer equipment
JP6713422B2 (en) Learning device, event detection device, learning method, event detection method, program
CN112990009A (en) End-to-end-based lane line detection method, device, equipment and storage medium
CN115546295B (en) Target 6D gesture estimation model training method and target 6D gesture estimation method
CN110414845B (en) Risk assessment method and device for target transaction
US10896333B2 (en) Method and device for aiding the navigation of a vehicle
CN117197193B (en) Swimming speed estimation method, swimming speed estimation device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant