CN115049737A - Pose marking method, device and system and storage medium - Google Patents


Info

Publication number
CN115049737A
CN115049737A (application CN202110219672.1A)
Authority
CN
China
Prior art keywords
pose
frame image
target object
coordinate system
conversion relation
Prior art date
Legal status
Pending
Application number
CN202110219672.1A
Other languages
Chinese (zh)
Inventor
蒋星
石瑞宇
Current Assignee
Guangdong Bozhilin Robot Co Ltd
Original Assignee
Guangdong Bozhilin Robot Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Bozhilin Robot Co Ltd
Priority to CN202110219672.1A
Publication of CN115049737A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Mathematics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Operations Research (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses a pose marking method, apparatus, and system, and a storage medium. In the method, an acquired sequence of frame images, each frame containing a reference object and a target object, includes a reference frame image and a current frame image. A first pose conversion relation is obtained based on the feature points of the reference object and their projection points in the reference frame image and in the current frame image; a second pose conversion relation is then determined based on the feature points of the target object, their projection points in the reference frame image, and the first pose conversion relation. Because the feature points of the reference object and of the target object are stable and reliable, the pose conversion matrices can be calculated accurately. Based on the calculated second pose conversion relation and the projection points of the target object in any frame image, the poses of the feature points of the target object are determined and labeled automatically, without manual labeling. When the target object is then located and analyzed based on its pose, for example when a robot grasps the target object, grasping efficiency and precision are improved.

Description

Pose marking method, device and system and storage medium
Technical Field
The embodiments of the invention relate to intelligent robotic detection technology, and in particular to a pose marking method, device, and system and a storage medium.
Background
The pose of a target object is a key piece of information in the field of intelligent robotic detection. At present, the pose data of a target object are usually labeled manually. During robot operation, a large amount of coordinate data of the target object must be acquired; for example, when a robot identifies a target object and grasps it, massive coordinate data of the target object are collected. Manually labeling the pose of the target object from such massive coordinate data consumes a great deal of labor and time, and the labeling accuracy of the pose data is poor.
Disclosure of Invention
The embodiment of the invention provides a pose marking method, a pose marking device, a pose marking system and a storage medium, and aims to achieve the effect of improving pose marking efficiency and accuracy.
In a first aspect, an embodiment of the present invention provides a pose labeling method, where the method includes:
receiving sequence frame images acquired by a shooting device at different angles, and determining a reference frame image and a current frame image in the sequence frame images, wherein each frame image of the sequence frame images comprises a reference object and a target object;
determining a first pose conversion relation between a current frame image coordinate system and a reference frame image coordinate system based on the characteristic points of the reference object, the projection points of the characteristic points of the reference object in the reference frame image and the projection points of the characteristic points of the reference object in the current frame image;
determining a second pose conversion relation between a target object coordinate system where the target object is located and a current frame image coordinate system based on the feature points of the target object, the projection points of the feature points of the target object in the reference frame image and the first pose conversion relation;
and carrying out pose labeling on the target object based on the second pose transformation relation and the projection point of the target object in any frame of image shot by the shooting device.
In a second aspect, an embodiment of the present invention further provides a pose labeling apparatus, where the apparatus includes:
the image determining module is used for receiving sequence frame images acquired by a shooting device at different angles and determining a reference frame image and a current frame image in the sequence frame images, wherein each frame image of the sequence frame images comprises a reference object and a target object;
the first pose conversion relation determining module is used for determining a first pose conversion relation between a current frame image coordinate system and a reference frame image coordinate system based on the feature points of the reference object, the projection points of the feature points of the reference object in the reference frame image and the projection points of the feature points of the reference object in the current frame image;
the second pose conversion relation determining module is used for determining a second pose conversion relation between a target object coordinate system where the target object is located and a current frame image coordinate system based on the feature point of the target object, the projection point of the feature point of the target object in the reference frame image and the first pose conversion relation;
and the pose marking module is used for marking the pose of the target object based on the second pose transformation relation and the projection point of the target object in any frame of image shot by the shooting device.
In a third aspect, an embodiment of the present invention further provides a pose marking system, including a robot, a target object, and a reference object, where the robot includes a mechanical arm, a shooting device, a memory, and a processor; the mechanical arm drives the shooting device to move so that the shooting device can acquire sequence frame images at a plurality of angles; each frame of image of the sequence frame image comprises a reference object and a target object, the target object comprises a plurality of characteristic points, and the reference object comprises a plurality of characteristic points;
wherein a computer program is present in the memory and is executable on the processor, and the processor implements the pose labeling method according to any one of the first aspect when executing the computer program.
In a fourth aspect, the embodiment of the present invention further provides a storage medium containing computer executable instructions, where the computer executable instructions, when executed by a computer processor, implement the pose annotation method according to any one of the first aspect.
According to the technical scheme of the embodiment of the invention, sequence frame images acquired by a shooting device at different angles are received, and each frame image of the sequence frame images comprises a reference object and a target object, so that the reference object assists the shooting device in image acquisition; a reference frame image and a current frame image are determined in the sequence frame images, and a first pose conversion relation between the current frame image coordinate system and the reference frame image coordinate system is determined based on the feature points of the reference object, the projection points of the feature points of the reference object in the reference frame image, and the projection points of the feature points of the reference object in the current frame image; a second pose conversion relation between the target object coordinate system where the target object is located and the current frame image coordinate system is then determined based on the feature points of the target object, the projection points of the feature points of the target object in the reference frame image, and the first pose conversion relation.
The feature points of the reference object and of the target object are stable and reliable, so the first and second pose conversion relations can be calculated accurately. When any frame image shot by the shooting device is obtained, the poses of the feature points of the target object are determined and labeled automatically based on the calculated second pose conversion relation and the projection points of the target object in that frame image; manual labeling is not needed, labor cost is reduced, and the reliability of the pose calculation is improved. Furthermore, when the target object is located and analyzed based on its pose, positioning precision is improved; in particular, when the robot is applied to grasping the target object, grasping efficiency and precision can be greatly improved.
Drawings
Fig. 1 is a schematic flow chart of a pose marking method according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a pose marking method according to a second embodiment of the present invention;
fig. 3 is a schematic diagram of pose transformation relationships between coordinate systems according to a second embodiment of the present invention;
fig. 4 is a schematic logical diagram of pose labeling according to a second embodiment of the present invention;
fig. 5 is a schematic structural diagram of a pose marking apparatus provided in the third embodiment of the present invention;
fig. 6 is a schematic structural diagram of a pose marking system according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some structures related to the present invention are shown in the drawings, not all of them.
Example one
Fig. 1 is a schematic flowchart of a pose labeling method according to an embodiment of the present invention. The pose labeling method is applicable to cases where poses are to be labeled automatically, and the method can be executed by a pose labeling apparatus, which can be implemented in software and/or hardware and is generally integrated in a robot or an electronic device with a pose labeling function; this embodiment is explained taking a robot as an example. Referring specifically to fig. 1, the method may include the following steps:
and S110, receiving sequence frame images acquired by the shooting device at different angles, and determining a reference frame image and a current frame image in the sequence frame images.
The shooting device is installed on a mechanical arm of the robot and used for collecting sequence frame images in the operation process of the robot. The shooting device can be a camera, a camera or a laser, and can also be other equipment with an image acquisition function. It should be noted that, when the sequence frame images are acquired by using the shooting device, the mechanical arm drives the shooting device to move to different shooting points according to a pre-planned path, so that the shooting device acquires the sequence frame images at different angles.
In one possible embodiment, the pre-planned path may be determined based on the working path of the robot, the length of the robot arm, the mounting location of the camera, and/or the distance the robot moves between adjacent working points. In another possible embodiment, the path may be determined from coordinate data of a plurality of feature points in the target object to be labeled and/or coordinate data of a plurality of feature points in the reference object. The planning method of the path of the robot arm is not limited to the above two methods, and may be generated based on other methods.
A sequence frame image is understood to be a sequence of frames consisting of several images in time. Each frame image of the sequence frame images includes a reference object and a target object. The target object may be understood as an object to be grasped or identified, and may be a regular object or an irregular object. For example, in the building field, when a wall brick or wallpaper is laid by a robot, the target object may be a wall brick or wallpaper to be grabbed; in the logistics transportation field, the robot is used for classifying the express, and the target object can be the express to be identified. One or more stable feature points may be included on the target object.
It should be noted that the reference object is generally located under a world coordinate system and is stable, and the reference object includes one or more stable feature points for assisting the camera to acquire the sequential frame images at different angles, so that each acquired frame image includes the reference object and the target object. Optionally, the reference object is a two-dimensional code or a bar code.
Before acquiring the sequence frame images, at least one feature point of the reference object and at least one feature point of the target object are respectively determined, so that each frame image of the acquired sequence frame images comprises projection points of the reference object and projection points of the target object. In an alternative embodiment, the feature points of the reference object may be determined from the pre-planned path of the mechanical arm. In another alternative embodiment, the feature points of the reference object may be randomly distributed, or determined according to other rules. The feature points of the target object may include the vertices of the target object, the center point of each edge, and the center point of each face, and may further include other feature points, such as the trisection points of each edge or non-central points on each face.
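As an illustration of such a feature-point set, the following numpy sketch (illustrative, not from the patent) enumerates the vertices, edge midpoints, and face centers of an axis-aligned cuboid target object:

```python
import numpy as np
from itertools import product

def cuboid_feature_points(w, h, d):
    """Stable feature points of an axis-aligned cuboid centered at the
    origin: 8 vertices, 12 edge midpoints, and 6 face centers."""
    hw, hh, hd = w / 2, h / 2, d / 2
    # 8 vertices: every sign combination of the three half-extents.
    verts = np.array(list(product((-hw, hw), (-hh, hh), (-hd, hd))), float)
    # 12 edge midpoints: midpoints of vertex pairs sharing two coordinates.
    mids = []
    for i in range(len(verts)):
        for j in range(i + 1, len(verts)):
            if np.sum(verts[i] == verts[j]) == 2:  # adjacent vertices
                mids.append((verts[i] + verts[j]) / 2)
    mids = np.array(mids)
    # 6 face centers: one half-extent at a time, other axes zero.
    faces = np.array([[s * e if k == a else 0.0
                       for k, e in enumerate((hw, hh, hd))]
                      for a in range(3) for s in (-1, 1)])
    return verts, mids, faces

verts, mids, faces = cuboid_feature_points(2.0, 1.0, 0.5)
print(len(verts), len(mids), len(faces))  # 8 12 6
```

For irregular target objects, the patent leaves the choice of feature points open; the cuboid above is only the simplest regular case.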
The reference frame and the current frame can be two arbitrarily selected frame images in the sequence frame images. As can be seen from the foregoing description, the current frame image and the reference frame image each include a reference object and a target object. The current frame image comprises a projection point of a reference object and a projection point of a target object, and the reference frame image comprises the projection point of the reference object and the projection point of the target object.
And S120, determining a first pose conversion relation between the coordinate system of the current frame image and the coordinate system of the reference frame image based on the feature points of the reference object, the projection points of the feature points of the reference object in the reference frame image, and the projection points of the feature points of the reference object in the current frame image.
Before the first pose conversion relation is calculated, a coordinate system where a reference object is located, namely a world coordinate system, is determined, the coordinate system where a reference frame image is located is defined as a reference frame image coordinate system, and the coordinate system where a current frame image is located is defined as a current frame image coordinate system.
Optionally, the method for determining the first pose conversion relationship includes: determining a third pose conversion relation between a world coordinate system where the reference object is located and a reference frame image coordinate system based on the feature points of the reference object and the projection points of the feature points of the reference object in the reference frame image; determining a fourth pose conversion relation between a world coordinate system and a current frame image coordinate system based on the feature points of the reference object and the projection points of the feature points of the reference object in the current frame image; and determining the first posture conversion relation according to the third posture conversion relation and the fourth posture conversion relation.
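The composition just described, obtaining the first pose conversion relation from the third and fourth by going through the world coordinate system, can be sketched with 4x4 homogeneous matrices. This is an illustrative numpy sketch, not the patent's implementation; T_w2r is assumed to map world coordinates into the reference frame image (camera) coordinate system and T_w2c into the current one:

```python
import numpy as np

def invert_se3(T):
    """Invert a 4x4 rigid transform without a general matrix inverse."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti

def first_relation(T_w2r, T_w2c):
    """First pose conversion relation T_r2c: reference-frame-image
    coordinates to current-frame-image coordinates, with the world
    coordinate system as the common reference."""
    return T_w2c @ invert_se3(T_w2r)
```

With this convention, composing the result with T_w2r recovers T_w2c, which is the natural consistency check for the chaining.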
It should be noted that the projection point of the reference object refers to a projection of the feature point of the reference object in the image acquired by the photographing device, that is, the feature point of the reference object corresponds to the projection point of each frame of image in the sequence frame of images.
Specifically, the method for determining the third posture-conversion relationship includes: acquiring the pose of the characteristic point of the reference object in a world coordinate system and first coordinate data of the projection point of the reference object in a reference frame image coordinate system; and determining a third pose conversion relation between the world coordinate system and the reference frame image coordinate system according to the pose of the characteristic point of the reference object in the world coordinate system and the first coordinate data.
Specifically, the method for determining the fourth pose conversion relationship includes: acquiring the pose of the characteristic point of the reference object in a world coordinate system and second coordinate data of the projection point of the reference object in a current frame image coordinate system; and determining a fourth pose conversion relation between the world coordinate system and the current frame image coordinate system according to the pose of the characteristic point of the reference object in the world coordinate system and the second coordinate data.
It should be noted that the pose refers to the six-degree-of-freedom pose of the feature point of the reference object in the world coordinate system; the six degrees of freedom comprise translation of the feature point along the three coordinate axes x, y, and z of the world coordinate system and rotation about those three axes. The first coordinate data are the two-dimensional coordinates of the projection point of the reference object in the reference frame image coordinate system, and the second coordinate data are the two-dimensional coordinates of the projection point of the reference object in the current frame image coordinate system. The third pose conversion relation is the conversion between the world coordinate system and the reference frame image coordinate system, and the fourth pose conversion relation is the conversion between the world coordinate system and the current frame image coordinate system. Therefore, with the world coordinate system as the common reference, the first pose conversion relation between the reference frame image coordinate system and the current frame image coordinate system is determined from the third and fourth pose conversion relations.
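A six-degree-of-freedom pose as defined here, translation along the x, y, and z axes plus rotation about them, can be packed into a single homogeneous matrix. The Euler-angle order below is an assumption, since the text does not fix one:

```python
import numpy as np

def pose6dof_to_matrix(x, y, z, roll, pitch, yaw):
    """4x4 homogeneous transform from a six-degree-of-freedom pose.
    Rotation uses the Z-Y-X (yaw-pitch-roll) convention, chosen here
    for illustration only."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx   # rotation about the three coordinate axes
    T[:3, 3] = (x, y, z)       # translation along the three axes
    return T
```

Each pose conversion relation in the text can then be represented and composed as such a matrix.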
S130, determining a second pose conversion relation between a target object coordinate system where the target object is located and the current frame image coordinate system based on the feature points of the target object, the projection points of the feature points of the target object in the reference frame image, and the first pose conversion relation.
Before calculating the second pose conversion relation, a coordinate system in which the target object is located is defined as the target object coordinate system. Optionally, the method for determining the second pose conversion relation includes: determining a fifth pose conversion relation between the target object coordinate system and the reference frame image coordinate system according to the feature points of the target object and the projection points of the feature points of the target object in the reference frame image; and determining the second pose conversion relation based on the first pose conversion relation and the fifth pose conversion relation.
Specifically, the method for determining the fifth pose conversion relation includes: calculating the fifth pose conversion relation between the target object coordinate system and the reference frame image coordinate system based on the pose of the feature point of the target object in the target object coordinate system and the third coordinate data of the projection point of the target object in the reference frame image coordinate system.
It should be noted that the pose refers to the six-degree-of-freedom pose of the feature point of the target object in the target object coordinate system; the six degrees of freedom comprise translation of the feature point along the three coordinate axes x, y, and z of the target object coordinate system and rotation about those three axes. The third coordinate data are the two-dimensional coordinates of the projection point of the target object in the reference frame image coordinate system. The fifth pose conversion relation is the conversion between the reference frame image coordinate system and the target object coordinate system, and the first pose conversion relation is the conversion between the reference frame image coordinate system and the current frame image coordinate system. Therefore, with the reference frame image coordinate system as the common reference, the second pose conversion relation between the target object coordinate system and the current frame image coordinate system is determined from the first and fifth pose conversion relations.
And S140, carrying out pose labeling on the target object based on the second pose conversion relation and the projection point of the target object in any frame of image shot by the shooting device.
Any frame of image shot by the shooting device is located under the current frame image coordinate system. Under the current frame image coordinate system, coordinate data of the projection points of the target object in the image shot by the shooting device are determined, and the pose, in the target object coordinate system, of the feature point of the target object corresponding to each projection point is determined according to the second pose conversion relation and the coordinate data of the projection points of the target object in the shot frame image.
It should be noted that the coordinate data of the projection point of the target object in the captured image refers to a two-dimensional coordinate of the projection point of the target object in the current frame image coordinate system, and the pose refers to a six-degree-of-freedom pose of the feature point of the target object in the target object coordinate system. Therefore, according to the coordinate data of the projection point of the target object in the shot image and the second pose transformation matrix, the six-degree-of-freedom pose of the feature point of the target object in the target object coordinate system can be determined, and the pose of the feature point of the target object can be automatically labeled.
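The correspondence this labeling step relies on, between 3-D feature points and their 2-D projection points, follows the pinhole camera model. The sketch below (illustrative names; the intrinsic matrix K and the second pose conversion relation T_t2c are assumed known) computes the forward direction of that correspondence, projecting target-object feature points into current frame image coordinates:

```python
import numpy as np

def label_pose_projections(K, T_t2c, P_t):
    """Project target-object feature points P_t (Nx3, target object
    coordinates) into the current frame image through the second pose
    conversion relation T_t2c, giving the 2-D annotation points."""
    P_h = np.hstack([P_t, np.ones((len(P_t), 1))])  # homogeneous 3-D points
    P_c = (T_t2c @ P_h.T).T[:, :3]                  # camera coordinates
    uv = (K @ P_c.T).T
    return uv[:, :2] / uv[:, 2:3]                   # perspective division
```

The patent's labeling direction (recovering the 6-DOF pose from observed projections) inverts this model by nonlinear optimization, as detailed in Example two.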
According to the technical scheme provided by this embodiment, sequence frame images acquired by a shooting device at different angles are received, and each frame image of the sequence frame images comprises a reference object and a target object, so that the reference object assists the shooting device in image acquisition; a reference frame image and a current frame image are determined in the sequence frame images, and a first pose conversion relation between the current frame image coordinate system and the reference frame image coordinate system is determined based on the feature points of the reference object and their projection points in the reference frame image and in the current frame image; a second pose conversion relation between the target object coordinate system where the target object is located and the current frame image coordinate system is determined based on the feature points of the target object, their projection points in the reference frame image, and the first pose conversion relation.
The feature points of the reference object and of the target object are stable and reliable, so the first and second pose conversion relations can be calculated accurately. When any frame image shot by the shooting device is acquired, the pose of each feature point of the target object is determined and labeled automatically based on the calculated second pose conversion relation and the projection points of the target object in that frame image; no manual labeling is needed, labor cost is reduced, and the reliability of the pose calculation is improved. Furthermore, when the target object is located and analyzed based on its pose, positioning precision is improved; in particular, in robotic grasping of the target object, grasping efficiency and precision can be greatly improved.
Example two
Fig. 2 is a schematic flow chart of a pose marking method according to a second embodiment of the present invention. The technical scheme of the embodiment is refined on the basis of the embodiment. Specifically, a determination method of each pose conversion relation and a pose marking method are refined. In the method, reference is made to the above-described embodiments for those parts which are not described in detail. Referring specifically to fig. 2, the method may include the steps of:
s210, receiving sequence frame images acquired by a shooting device at different angles, and determining a reference frame image and a current frame image in the sequence frame images.
And S220, determining a third pose conversion relation between the world coordinate system of the reference object and the coordinate system of the reference frame image based on the feature points of the reference object and the projection points of the feature points of the reference object in the reference frame image.
Optionally, the method for determining the third posture-conversion relationship includes: obtaining an internal reference matrix obtained by pre-calibrating a shooting device; calculating a first relation equation according to the internal reference matrix, the three-dimensional characteristic points of the reference object, the two-dimensional projection points of the characteristic points of the reference object in the reference frame image and the current pose conversion relation between the world coordinate system and the reference frame image coordinate system; and performing iterative solution on the first relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as a third pose conversion relation.
It should be noted that the internal reference matrix may be determined according to parameters in the specification of the shooting device, according to the required accuracy of the pose labeling, or in other manners. Specifically, N two-dimensional projection points of the reference object are extracted from the reference frame image; the i-th two-dimensional projection point is recorded as $p_i$, and the three-dimensional feature point of the reference object in the world coordinate system corresponding to the i-th two-dimensional projection point is recorded as $P_i$. If the current pose conversion relation between the world coordinate system and the reference frame image coordinate system is recorded as T_w2r and the internal reference matrix as K, the first relation equation is the reprojection-error minimization

$$T_{w2r}^{*} = \arg\min_{T_{w2r}} \sum_{i=1}^{N} \left\| p_i - \pi\!\left(K\,T_{w2r}\,P_i\right) \right\|^{2} \qquad (1)$$

where $\pi(\cdot)$ denotes the perspective division from homogeneous camera coordinates to image coordinates.
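The quantity minimized in formula (1) can be evaluated as a stacked residual vector, which is what the iterative solver of the following paragraphs would drive to zero. A numpy sketch under the usual pinhole model, with illustrative names:

```python
import numpy as np

def reprojection_residuals(T_w2r, K, P_w, p_2d):
    """Stacked residuals of formula (1): observed projections p_2d (Nx2)
    minus the reference feature points P_w (Nx3, world coordinates)
    projected through the candidate pose T_w2r and intrinsics K."""
    P_h = np.hstack([P_w, np.ones((len(P_w), 1))])  # homogeneous points
    P_r = (T_w2r @ P_h.T).T[:, :3]                  # reference-frame camera coords
    uv = (K @ P_r.T).T
    proj = uv[:, :2] / uv[:, 2:3]                   # perspective division
    return (p_2d - proj).ravel()
```

At the true pose conversion relation the residual vector vanishes, so the minimization of formula (1) recovers T_w2r from the point correspondences.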
further, solving the formula 1 by using a nonlinear optimization method, and solving the initial value of T _ w2r to be T under the RANSAC (RANdom sample consensus) framework 0 And performing iterative solution on the first relation equation by adopting an LM (Levenberg-Marquardt based on Levenberg-Marquardt) algorithm, namely optimizing T _ w2r, and if the current pose conversion relation under the current iteration times reaches a stable state, taking the current pose conversion relation under the current iteration times as a third pose conversion relation.
Optionally, the calculation formula for optimizing T_w2r by the LM algorithm is:

$$\min_{\Delta x_k} \frac{1}{2}\left\| f(x_k) + J(x_k)\,\Delta x_k \right\|^2, \quad \text{s.t. } \left\| \Delta x_k \right\| \le m \qquad \text{(Equation 2)}$$

where $x_k$ is the current pose conversion relation T_w2r at the k-th iteration (with $x_0 = T_0$ when k = 0), $f(\cdot)$ is the reprojection-error function of Equation 1, $J(x_k)$ is its Jacobian with respect to the pose parameters, $\Delta x_k$ is the variation of T_w2r between the k-th and (k+1)-th iterations, i.e. the optimization variable, and m is the radius of the confidence interval (trust region).

The evaluation index is

$$\rho = \frac{f(x_k + \Delta x_k) - f(x_k)}{J(x_k)^{T}\,\Delta x_k}$$

which compares the actual decrease of the cost with the decrease predicted by the linearized model. If ρ > 0.75, m is set to 2m; if ρ < 0.25, m is set to 0.5m. If ρ is greater than a first threshold, the step Δx_k solved from Equation 2 is accepted and x_{k+1} = x_k + Δx_k. The variation of the current pose conversion relation T_w2r between the k-th and (k+1)-th iterations is then examined: if the variation is greater than or equal to a second threshold, the current pose conversion relation under the current iteration times is determined not to have converged; ρ is recalculated, the variation between the (k+1)-th and (k+2)-th iterations is solved based on Equation 2, and whether the current pose conversion relation under the current iteration times (k+1) converges is determined from the newly obtained variation. If not, the iterative calculation continues until the current pose conversion relation under the current iteration times converges.
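The ρ-based confidence-interval update described above can be sketched as a small helper; the function name, and the use of the 0.25 bound as the patent's unspecified "first threshold", are assumptions of this sketch:

```python
def update_trust_region(rho, m, low=0.25, high=0.75):
    """One radius/acceptance decision of the trust-region (LM) scheme.

    rho : evaluation index (actual cost decrease / predicted decrease)
    m   : current radius of the confidence interval
    Returns (accept_step, new_radius). Reusing `low` as the acceptance
    ("first") threshold is an assumption of this sketch.
    """
    if rho > high:        # linear model predicts well: enlarge the region
        new_m = 2.0 * m
    elif rho < low:       # poor prediction: shrink the region
        new_m = 0.5 * m
    else:
        new_m = m
    return rho > low, new_m
```

The accepted step then updates the pose as x_{k+1} = x_k + Δx_k, and iteration stops once the variation falls below the second threshold.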
And S230, determining a fourth pose conversion relation between the world coordinate system and the current frame image coordinate system based on the feature points of the reference object and the projection points of the feature points of the reference object in the current frame image.
Optionally, the method for determining the fourth pose conversion relationship includes: acquiring an internal reference matrix obtained by pre-calibrating a shooting device; calculating a second relation equation according to the internal reference matrix, the three-dimensional characteristic points of the reference object, the two-dimensional projection points of the characteristic points of the reference object in the current frame image and the current pose conversion relation between the world coordinate system and the current frame image coordinate system; and carrying out iterative solution on the second relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as the fourth pose conversion relation.
The internal reference matrix may be determined from the parameters in the specification of the shooting device, from the required accuracy of the pose labeling, or in other manners. Specifically, N two-dimensional projection points of the reference object are extracted from the current frame image, and the i-th two-dimensional projection point is recorded as $u_i^c$. The corresponding three-dimensional feature point of the reference object in the world coordinate system is recorded as $P_i^w$. If the current pose conversion relation between the world coordinate system and the current frame image coordinate system is recorded as T_w2c, the expression of the second relational equation is:

$$T_{w2c}^{*} = \arg\min_{T_{w2c}} \sum_{i=1}^{N} \left\| u_i^c - \pi\!\left(K \, T_{w2c} \, P_i^w\right) \right\|^2 \qquad \text{(Equation 3)}$$
Similar to the previous steps, Equation 3 is solved by a nonlinear optimization method. Under the RANSAC framework, an initial value T_0 of T_w2c is obtained, and the second relational equation is solved iteratively with the LM algorithm, that is, T_w2c is optimized. If the current pose conversion relation under the current iteration times reaches a stable state, it is taken as the fourth pose conversion relation. It should be noted that the calculation formula for optimizing T_w2c by the LM algorithm is consistent with Equation 2, where $x_k$ in this step is the current pose conversion relation T_w2c at the k-th iteration.
S240, determining a first posture conversion relation according to the third posture conversion relation and the fourth posture conversion relation.
Optionally, the third pose conversion relation and the fourth pose conversion relation are combined by matrix multiplication to obtain the first pose conversion relation.
Fig. 3 is a schematic diagram of the pose conversion relationships between the coordinate systems, where the reference object in fig. 3 is a two-dimensional code, and the coordinate system of the two-dimensional code is the world coordinate system. The third pose conversion relation between the world coordinate system and the reference frame image coordinate system is T_w2r, the fourth pose conversion relation between the world coordinate system and the current frame image coordinate system is T_w2c, and the first pose conversion relation between the current frame image coordinate system and the reference frame image coordinate system is $T_{c2r} = T_{w2r} \cdot (T_{w2c})^{-1}$.
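A minimal sketch of this composition, assuming 4×4 homogeneous pose matrices with the column-vector convention (the helper names are hypothetical):

```python
import numpy as np

def make_pose(R, t):
    """Assemble a 4x4 homogeneous pose from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def compose_c2r(T_w2r, T_w2c):
    """First pose conversion relation: current frame -> reference frame.

    A current-frame point is first taken to the world coordinate system by
    inv(T_w2c) and then to the reference frame by T_w2r.
    """
    return T_w2r @ np.linalg.inv(T_w2c)
```

By construction, applying T_c2r after T_w2c reproduces T_w2r on any world point.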
And S250, determining a fifth pose conversion relation between the target object coordinate system and the reference frame image coordinate system according to the feature points of the target object and the projection points of the feature points of the target object in the reference frame image.
Optionally, the fifth pose transformation relation determining method includes: obtaining an internal reference matrix obtained by pre-calibrating a shooting device; calculating a third relation equation according to the internal reference matrix, the three-dimensional characteristic points of the target object and the two-dimensional projection points of the characteristic points of the target object in the reference frame image; and performing iterative solution on the third relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as the fifth pose conversion relation.
Specifically, M two-dimensional projection points of the target object are extracted from the reference frame image, and the j-th two-dimensional projection point is recorded as $u_j^r$. The corresponding three-dimensional feature point of the target object in the target object coordinate system is recorded as $P_j^o$. If the current pose conversion relation between the target object coordinate system and the reference frame image coordinate system is recorded as T_o2r, the expression of the third relational equation is:

$$T_{o2r}^{*} = \arg\min_{T_{o2r}} \sum_{j=1}^{M} \left\| u_j^r - \pi\!\left(K \, T_{o2r} \, P_j^o\right) \right\|^2 \qquad \text{(Equation 4)}$$
Similar to the previous steps, Equation 4 is solved by a nonlinear optimization method. Under the RANSAC framework, an initial value T_0 of T_o2r is obtained, and the third relational equation is solved iteratively with the LM algorithm, that is, T_o2r is optimized. If the current pose conversion relation under the current iteration times reaches a stable state, it is taken as the fifth pose conversion relation. It should be noted that the calculation formula for optimizing T_o2r by the LM algorithm is consistent with Equation 2, where $x_k$ in this step is the current pose conversion relation T_o2r at the k-th iteration.
And S260, determining a second pose transformation relation based on the first pose transformation relation and the fifth pose transformation relation.
Optionally, determining the second pose conversion relation based on the first pose conversion relation and the fifth pose conversion relation includes: combining the first pose conversion relation and the fifth pose conversion relation by matrix multiplication to obtain the current pose conversion relation between the target object coordinate system and the current frame image coordinate system; calculating a fourth relational equation according to the internal reference matrix obtained by pre-calibrating the shooting device, the three-dimensional feature points of the target object, the two-dimensional projection points of the feature points of the target object in the current frame image, and the current pose conversion relation between the target object coordinate system and the current frame image coordinate system; and performing iterative solution on the fourth relational equation, and if the current pose conversion relation under the current iteration times converges, taking the current pose conversion relation under the current iteration times as the second pose conversion relation.
With reference to fig. 3, the fifth pose conversion relation between the target object coordinate system and the reference frame image coordinate system is T_o2r, and the first pose conversion relation between the current frame image coordinate system and the reference frame image coordinate system is T_c2r. Mapping a point from the object frame to the reference frame via T_o2r and then from the reference frame to the current frame via $(T_{c2r})^{-1}$ gives the second pose conversion relation between the target object coordinate system and the current frame image coordinate system: $T_{o2c} = (T_{c2r})^{-1} \cdot T_{o2r}$.
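Under the column-vector convention used for the earlier compositions, the second pose relation chains object → reference → current; a minimal sketch (the helper name is hypothetical, and the composition order shown is the one consistent with that convention):

```python
import numpy as np

def compose_o2c(T_c2r, T_o2r):
    """Second pose conversion relation: target object frame -> current frame.

    An object point is first taken to the reference frame by T_o2r and then
    to the current frame by inv(T_c2r), so T_o2c = inv(T_c2r) @ T_o2r.
    Note that the order matters: pose matrices do not commute.
    """
    return np.linalg.inv(T_c2r) @ T_o2r
```

Consistency check: object → current → reference must equal object → reference.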
Specifically, M two-dimensional projection points of the target object are extracted from the current frame image, and the j-th two-dimensional projection point is recorded as $u_j^c$. The corresponding three-dimensional feature point of the target object in the target object coordinate system is recorded as $P_j^o$. If the current pose conversion relation between the target object coordinate system and the current frame image coordinate system is recorded as T_o2c, the expression of the fourth relational equation is:

$$T_{o2c}^{*} = \arg\min_{T_{o2c}} \sum_{j=1}^{M} \left\| u_j^c - \pi\!\left(K \, T_{o2c} \, P_j^o\right) \right\|^2 \qquad \text{(Equation 5)}$$

The expression of the fourth relational equation may also take the form obtained by substituting the composition $T_{o2c} = (T_{c2r})^{-1}\,T_{o2r}$:

$$\arg\min \sum_{j=1}^{M} \left\| u_j^c - \pi\!\left(K \, (T_{c2r})^{-1}\,T_{o2r}\, P_j^o\right) \right\|^2 \qquad \text{(Equation 6)}$$
Similar to the previous steps, Equation 5 or Equation 6 is solved by a nonlinear optimization method. Under the RANSAC framework, an initial value T_0 of T_o2c is obtained, and the fourth relational equation is solved iteratively with the LM algorithm, that is, T_o2c is optimized. If the current pose conversion relation under the current iteration times reaches a stable state, it is taken as the second pose conversion relation. It should be noted that the calculation formula for optimizing T_o2c by the LM algorithm is consistent with Equation 2, where $x_k$ in this step is the current pose conversion relation T_o2c at the k-th iteration.
And S270, carrying out pose labeling on the target object based on the second pose conversion relation and the projection point of the target object in any frame of image shot by the shooting device. Optionally, pose labeling is performed on the target object based on the second pose transformation relationship and a projection point of the target object in any frame of image shot by the shooting device, and the pose labeling includes: and obtaining the pose of the characteristic point of the target object in the image shot by the shooting device according to the projection point of the target object in any frame of image shot by the shooting device, the second pose conversion relation and the internal reference matrix obtained by pre-calibration of the shooting device, and carrying out pose marking on the characteristic point of the target object.
Specifically, any frame image captured by the shooting device is located in the current frame image coordinate system. In the current frame image coordinate system, the pose of the feature points of the target object in the captured image is obtained from the projection points of the target object in that frame image, the second pose conversion relation, and the internal reference matrix obtained by pre-calibrating the shooting device; that is, the six-degree-of-freedom pose of the feature points of the target object under the target object coordinate system is obtained, and the pose of the target object is labeled accordingly.
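Pose labeling of a newly captured frame then reduces to projecting the object feature points through T_o2c and the internal reference matrix; a minimal sketch with assumed names:

```python
import numpy as np

def label_feature_points(K, T_o2c, pts_obj):
    """Project target-object feature points into the current frame image.

    K       : 3x3 internal reference matrix from pre-calibration
    T_o2c   : 4x4 second pose conversion relation (object -> current camera)
    pts_obj : (M, 3) feature points in the target object coordinate system
    Returns (M, 2) pixel coordinates used to annotate the pose.
    """
    m = pts_obj.shape[0]
    homo = np.hstack([pts_obj, np.ones((m, 1))])
    cam = (T_o2c @ homo.T).T[:, :3]     # object points in the camera frame
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]       # perspective divide to pixels
```

For example, the object origin placed straight in front of the camera projects to the principal point of K.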
Fig. 4 is a logic diagram of pose labeling; the above process is summarized with reference to fig. 4 as a whole. A path of the mechanical arm is predetermined and the internal reference matrix of the shooting device is obtained; the mechanical arm drives the shooting device to collect sequence frame images along the pre-planned path, and a reference frame image and a current frame image are selected from the collected sequence frame images. The current frame image and the sequence frame images include the target object and the reference object, and the pose conversion relation between the target object coordinate system and the current frame image coordinate system is calculated with the RANSAC framework and the LM algorithm from the projection points of the target object and of the reference object in the current frame image and the sequence frame images, together with the feature points of the target object and of the reference object. Each time the shooting device captures a new image, the six-degree-of-freedom pose of the target object is determined from the projection points in the newly captured image, the internal reference matrix, and the pose conversion relation, until the shooting device finishes collecting all images and the pose labeling is completed.
According to the technical scheme provided by the embodiment, the pose conversion relation between the target object coordinate system and the current frame image coordinate system is accurately calculated according to the internal reference matrix, the characteristic points of the reference object, the characteristic points of the target object and the projection points in the sequence frame image; the six-degree-of-freedom pose of the feature point of the target object is accurately calculated and labeled based on the projection point of the target object in any frame of image shot by the shooting device and according to the internal reference matrix and pose conversion of the shooting device, manual labeling is not needed, and the labor cost is reduced; furthermore, when the target object is positioned and analyzed based on the pose of the target object, the positioning precision can be improved, and particularly when the robot is applied to the field of grabbing the target object, the grabbing efficiency and the grabbing precision of the target object can be greatly improved.
EXAMPLE III
Fig. 5 is a schematic structural diagram of a pose labeling apparatus according to a third embodiment of the present invention. Referring to fig. 5, the apparatus includes: an image determination module 310, a first pose conversion relationship determination module 320, a second pose conversion relationship determination module 330, and a pose annotation module 340.
The image determining module 310 is configured to receive sequence frame images acquired by a camera at different angles, and determine a reference frame image and a current frame image in the sequence frame images, where each frame image of the sequence frame images includes a reference object and a target object;
a first pose conversion relationship determining module 320, configured to determine a first pose conversion relationship between the current frame image coordinate system and the reference frame image coordinate system based on the feature points of the reference object, the projection points of the feature points of the reference object in the reference frame image, and the projection points of the feature points of the reference object in the current frame image;
a second pose conversion relationship determining module 330, configured to determine, based on the feature points of the target object, the projection points of the feature points of the target object in the reference frame image, and the first pose conversion relationship, a second pose conversion relationship between a target object coordinate system where the target object is located and a current frame image coordinate system;
and the pose labeling module 340 is configured to label a pose of the target object based on the second pose conversion relationship and a projection point of the target object in any frame of image captured by the capturing device.
According to the technical scheme provided by the embodiment, the sequence frame images acquired by the shooting device at different angles are received, and each frame image of the sequence frame images comprises a reference object and a target object, so that the reference object assists the shooting device to shoot; determining a reference frame image and a current frame image in the sequence frame images, and determining a first posture conversion relation between a current frame image coordinate system and a reference frame image coordinate system based on the feature points of the reference object, the projection points of the feature points of the reference object in the reference frame image and the projection points of the feature points of the reference object in the current frame image; and determining a second posture conversion relation between a target object coordinate system where the target object is located and a current frame image coordinate system based on the feature points of the target object, the projection points of the feature points of the target object in the reference frame image and the first posture conversion relation. 
The characteristic points of the reference object and the characteristic points of the target object are stable and reliable, so that the first attitude transformation matrix and the second attitude transformation matrix can be accurately calculated; when any frame of image shot by the shooting device is obtained, based on the calculated second pose conversion relation and the projection point of the target object in any frame of image, the poses of the feature points of the target object are automatically determined, and the poses of the feature points of the target object are automatically labeled, so that manual labeling is not needed, the labor cost is reduced, and the reliability of pose calculation is improved; furthermore, when the target object is positioned and analyzed based on the pose of the target object, the positioning precision can be improved, and especially when the method is applied to the field of grabbing the target object by a robot, the grabbing efficiency and the grabbing precision of the target object can be greatly improved.
Optionally, the first pose transformation relation determining module 320 is further configured to determine, based on the feature points of the reference object and the projection points of the feature points of the reference object in the reference frame image, a third pose transformation relation between the world coordinate system where the reference object is located and the coordinate system of the reference frame image;
determining a fourth pose conversion relation between the world coordinate system and the current frame image coordinate system based on the feature points of the reference object and the projection points of the feature points of the reference object in the current frame image;
and determining the first attitude conversion relation according to the third attitude conversion relation and the fourth attitude conversion relation.
Optionally, the first pose conversion relationship determining module 320 is further configured to obtain an internal reference matrix obtained by calibrating the shooting device in advance;
calculating a first relation equation according to the internal reference matrix, the three-dimensional characteristic points of the reference object, the two-dimensional projection points of the characteristic points of the reference object in the reference frame image and the current pose conversion relation between the world coordinate system and the reference frame image coordinate system;
and carrying out iterative solution on the first relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as the third pose conversion relation.
Optionally, the first pose conversion relationship determining module 320 is further configured to obtain an internal reference matrix obtained by calibrating the shooting device in advance;
calculating a second relation equation according to the internal reference matrix, the three-dimensional characteristic points of the reference object, the two-dimensional projection points of the characteristic points of the reference object in the current frame image and the current pose conversion relation between the world coordinate system and the current frame image coordinate system;
and carrying out iterative solution on the second relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as the fourth pose conversion relation.
Optionally, the second pose transformation relation determining module 330 is further configured to determine a fifth pose transformation relation between the target object coordinate system and the reference frame image coordinate system according to the feature points of the target object and the projection points of the feature points of the target object in the reference frame image;
determining the second pose transformation relationship based on the first pose transformation relationship and the fifth pose transformation relationship.
Optionally, the second pose conversion relationship determining module 330 is further configured to obtain an internal reference matrix obtained by calibrating the shooting device in advance;
calculating a third relation equation according to the internal reference matrix, the three-dimensional characteristic points of the target object and the two-dimensional projection points of the characteristic points of the target object in the reference frame image;
and performing iterative solution on the third relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as the fifth pose conversion relation.
Optionally, the second pose conversion relation determining module 330 is further configured to combine the first pose conversion relation and the fifth pose conversion relation by matrix multiplication to obtain the current pose conversion relation between the target object coordinate system and the current frame image coordinate system;
calculating a fourth relational equation according to the internal reference matrix obtained by pre-calibrating the shooting device, the three-dimensional feature points of the target object, the two-dimensional projection points of the feature points of the target object in the current frame image, and the current pose conversion relation between the target object coordinate system and the current frame image coordinate system;
and performing iterative solution on the fourth relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as the second pose conversion relation.
Optionally, the first pose conversion relation determining module 320 is further configured to combine the third pose conversion relation and the fourth pose conversion relation by matrix multiplication to obtain the first pose conversion relation.
Optionally, the pose labeling module 340 is further configured to obtain a pose of the feature point of the target object in the image captured by the capturing device according to the projection point of the target object in any frame of image captured by the capturing device, the second pose conversion relationship, and an internal reference matrix obtained by pre-calibration by the capturing device, and perform pose labeling on the feature point of the target object.
Optionally, the shooting device includes any one of: a camera, a video camera and a laser; the reference object comprises any one of: two-dimensional codes and bar codes.
Example four
Fig. 6 is a schematic structural diagram of a pose marking system according to a fourth embodiment of the present invention. FIG. 6 illustrates a block diagram of an exemplary robot suitable for use in implementing embodiments of the present invention. The robot shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
The pose labeling system shown in fig. 6 includes a robot, a target object, and a reference object. The robot includes a mechanical arm, a shooting device, a memory (not shown), and a processor (not shown). The mechanical arm drives the shooting device to move so that the shooting device can acquire sequence frame images at a plurality of angles; each frame image of the sequence frame images includes the reference object and the target object, and the target object and the reference object each include a plurality of feature points. As shown in fig. 6, the robot further includes a base and an end tool; the mechanical arm is mounted on the base, the end tool is mounted at the end of the mechanical arm, and the shooting device is arranged on one side of the end tool.
The processor executes various functional applications and data processing by running the program stored in the memory, for example, implementing a pose marking method provided by an embodiment of the present invention, the method including:
receiving sequence frame images acquired by a shooting device at different angles, and determining a reference frame image and a current frame image in the sequence frame images, wherein each frame image of the sequence frame images comprises a reference object and a target object;
determining a first pose conversion relation between a current frame image coordinate system and a reference frame image coordinate system based on the characteristic points of the reference object, the projection points of the characteristic points of the reference object in the reference frame image and the projection points of the characteristic points of the reference object in the current frame image;
determining a second pose conversion relation between a target object coordinate system where the target object is located and a current frame image coordinate system based on the feature points of the target object, the projection points of the feature points of the target object in the reference frame image and the first pose conversion relation;
and marking the pose of the target object based on the second pose conversion relation and the projection point of the target object in any frame of image shot by the shooting device.
Of course, those skilled in the art can understand that the processor may also implement the technical solution of the pose marking method provided in any embodiment of the present invention.
EXAMPLE five
The fifth embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a pose annotation method provided in an embodiment of the present invention, the method including:
receiving sequence frame images acquired by a shooting device at different angles, and determining a reference frame image and a current frame image in the sequence frame images, wherein each frame image of the sequence frame images comprises a reference object and a target object;
determining a first pose conversion relation between a current frame image coordinate system and a reference frame image coordinate system based on the feature points of the reference object, the projection points of the feature points of the reference object in the reference frame image and the projection points of the feature points of the reference object in the current frame image;
determining a second pose conversion relation between a target object coordinate system where the target object is located and a current frame image coordinate system based on the feature points of the target object, the projection points of the feature points of the target object in the reference frame image and the first pose conversion relation;
and marking the pose of the target object based on the second pose conversion relation and the projection point of the target object in any frame of image shot by the shooting device.
Of course, the computer program stored on the computer-readable storage medium provided by the embodiments of the present invention is not limited to the above method operations, and may also perform related operations in a pose labeling method provided by any embodiment of the present invention.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example carrying the first pose conversion relation, the second pose conversion relation, the projection points of the target object in the captured images, and the like. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It should be noted that, in the embodiment of the pose marking apparatus, the included modules are divided only according to functional logic, and the division is not limited thereto as long as the corresponding functions can be realized; in addition, the specific names of the functional units are only for convenience of distinguishing them from each other and are not intended to limit the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (14)

1. A pose labeling method is characterized by comprising the following steps:
receiving sequence frame images acquired by a shooting device at different angles, and determining a reference frame image and a current frame image in the sequence frame images, wherein each frame image of the sequence frame images comprises a reference object and a target object;
determining a first pose conversion relation between a current frame image coordinate system and a reference frame image coordinate system based on the feature points of the reference object, the projection points of the feature points of the reference object in the reference frame image and the projection points of the feature points of the reference object in the current frame image;
determining a second pose conversion relation between a target object coordinate system where the target object is located and a current frame image coordinate system based on the feature points of the target object, the projection points of the feature points of the target object in the reference frame image and the first pose conversion relation;
and marking the pose of the target object based on the second pose conversion relation and the projection point of the target object in any frame of image shot by the shooting device.
2. The method according to claim 1, wherein determining the first pose conversion relationship between the coordinate system of the current frame image and the coordinate system of the reference frame image based on the feature points of the reference object, the projection points of the feature points of the reference object in the reference frame image, and the projection points of the feature points of the reference object in the current frame image comprises:
determining a third pose conversion relation between a world coordinate system where the reference object is located and a reference frame image coordinate system based on the feature points of the reference object and the projection points of the feature points of the reference object in the reference frame image;
determining a fourth pose conversion relation between the world coordinate system and the current frame image coordinate system based on the feature points of the reference object and the projection points of the feature points of the reference object in the current frame image;
and determining the first pose conversion relation according to the third pose conversion relation and the fourth pose conversion relation.
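By way of illustration (not part of the claims), the composition in claim 2 can be sketched with 4x4 homogeneous transforms: the third and fourth pose conversion relations each map world coordinates into a camera frame, and chaining the inverse of one with the other yields the reference-to-current relation. The matrix representation and NumPy usage below are assumptions, not recited in the claim:

```python
import numpy as np

def make_pose(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def compose_relative_pose(T_world_to_ref, T_world_to_cur):
    """Chain two world-to-camera poses into the ref-to-cur relative pose.

    Maps a point expressed in the reference frame back to world coordinates
    (inverse of the third relation), then into the current frame (fourth
    relation) -- the first pose conversion relation of claim 2.
    """
    return T_world_to_cur @ np.linalg.inv(T_world_to_ref)
```

Under this sketch, applying the composed transform to a point in the reference frame reproduces the same point as projecting the original world point through the fourth relation directly.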
3. The method according to claim 2, wherein the determining a third pose transformation relationship between the world coordinate system of the reference object and the reference frame image coordinate system based on the feature points of the reference object and the projection points of the feature points of the reference object in the reference frame image comprises:
obtaining an internal reference matrix obtained by pre-calibrating a shooting device;
calculating a first relation equation according to the internal reference matrix, the three-dimensional characteristic points of the reference object, the two-dimensional projection points of the characteristic points of the reference object in the reference frame image and the current pose conversion relation between the world coordinate system and the reference frame image coordinate system;
and carrying out iterative solution on the first relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as the third pose conversion relation.
4. The method according to claim 2, wherein the determining a fourth pose conversion relationship between the world coordinate system and the current frame image coordinate system based on the feature points of the reference object and the projection points of the feature points of the reference object in the current frame image comprises:
acquiring an internal reference matrix obtained by pre-calibrating a shooting device;
calculating a second relation equation according to the internal reference matrix, the three-dimensional characteristic points of the reference object, the two-dimensional projection points of the characteristic points of the reference object in the current frame image and the current pose conversion relation between the world coordinate system and the current frame image coordinate system;
and carrying out iterative solution on the second relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as the fourth pose conversion relation.
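As an illustrative sketch of the iterative solution described in claims 3 and 4 (not part of the claims), the example below refines a pose estimate by repeatedly re-evaluating the reprojection relation until the estimate converges. It is deliberately simplified: only the translation component is optimized, with the rotation fixed to identity, and a Gauss-Newton step is assumed as the update rule — none of these choices is recited in the claims:

```python
import numpy as np

def project(K, pts3d, t):
    """Project 3D points to pixels under intrinsic matrix K and translation t
    (rotation fixed to identity in this simplified sketch)."""
    p = pts3d + t                       # camera-frame coordinates
    u = K @ p.T                         # 3xN homogeneous pixels
    return (u[:2] / u[2]).T             # Nx2 pixel coordinates

def solve_translation(K, pts3d, pts2d, max_iters=50, tol=1e-8):
    """Iteratively solve the reprojection relation for the translation.

    Mirrors the claimed loop: recompute the relation equation under the
    current pose estimate and stop once the estimate has converged.
    """
    t = np.zeros(3)
    fx, fy = K[0, 0], K[1, 1]
    for _ in range(max_iters):
        p = pts3d + t
        r = (project(K, pts3d, t) - pts2d).ravel()   # stacked 2D residuals
        # Jacobian of each (u, v) residual with respect to t.
        J = np.zeros((2 * len(pts3d), 3))
        J[0::2, 0] = fx / p[:, 2]
        J[0::2, 2] = -fx * p[:, 0] / p[:, 2] ** 2
        J[1::2, 1] = fy / p[:, 2]
        J[1::2, 2] = -fy * p[:, 1] / p[:, 2] ** 2
        dt = np.linalg.lstsq(J, -r, rcond=None)[0]   # Gauss-Newton step
        t += dt
        if np.linalg.norm(dt) < tol:                 # convergence check
            break
    return t
```

A full solver would optimize rotation and translation jointly (e.g. a standard PnP formulation); this sketch only illustrates the iterate-until-convergence structure of the claims.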
5. The method according to claim 1, wherein the determining a second pose transformation relationship between the target object coordinate system in which the target object is located and the current frame image coordinate system based on the feature points of the target object, the projection points of the feature points of the target object in the reference frame image, and the first pose transformation relationship comprises:
determining a fifth pose conversion relation between the target object coordinate system and the reference frame image coordinate system according to the feature points of the target object and the projection points of the feature points of the target object in the reference frame image;
determining the second pose transformation relationship based on the first pose transformation relationship and the fifth pose transformation relationship.
6. The method according to claim 5, wherein the determining a fifth pose conversion relationship between the target object coordinate system and the reference frame image coordinate system according to the feature points of the target object and the projection points of the feature points of the target object in the reference frame image comprises:
acquiring an internal reference matrix obtained by pre-calibrating a shooting device;
calculating a third relation equation according to the internal reference matrix, the three-dimensional characteristic points of the target object and the two-dimensional projection points of the characteristic points of the target object in the reference frame image;
and carrying out iterative solution on the third relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as the fifth pose conversion relation.
7. The method of claim 5, wherein the determining the second pose translation relationship based on the first pose translation relationship and the fifth pose translation relationship comprises:
performing vector multiplication on the first pose conversion relation and the fifth pose conversion relation to obtain a current pose conversion relation between a target object coordinate system and a current frame image coordinate system;
calculating a fourth relation equation according to the internal reference matrix obtained by pre-calibrating the shooting device, the three-dimensional characteristic points of the target object, the two-dimensional projection points of the characteristic points of the target object in the current frame image, and the current pose conversion relation between the target object coordinate system and the current frame image coordinate system;
and carrying out iterative solution on the fourth relation equation, and if the current pose conversion relation under the current iteration times is converged, taking the current pose conversion relation under the current iteration times as the second pose conversion relation.
8. The method of claim 2, wherein the determining the first pose translation relationship based on the third pose translation relationship and the fourth pose translation relationship comprises:
and carrying out vector multiplication on the third pose conversion relation and the fourth pose conversion relation to obtain the first pose conversion relation.
9. The method according to claim 1, wherein the pose labeling of the target object based on the second pose transformation relation and the projection point of the target object in any frame of image captured by the capturing device comprises:
and obtaining the pose of the characteristic point of the target object in the image shot by the shooting device according to the projection point of the target object in any frame of image shot by the shooting device, the second pose transformation relation and the internal reference matrix obtained by pre-calibration of the shooting device, and carrying out pose marking on the characteristic point of the target object.
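A hedged sketch (not part of the claims) of the marking step in claim 9: given the pre-calibrated intrinsic matrix and the second pose conversion relation, each 3D feature point of the target object can be projected into a captured frame and paired with its pixel position as a label. The 4x4 matrix representation of the pose is an assumption the claim does not specify:

```python
import numpy as np

def label_feature_points(K, T_obj_to_img, pts_obj):
    """Annotate the target object's 3D feature points with pixel positions.

    K: 3x3 intrinsic matrix from pre-calibration of the shooting device.
    T_obj_to_img: 4x4 pose (the second pose conversion relation) mapping the
        target object coordinate system into the frame's camera system.
    pts_obj: Nx3 feature points in the target object coordinate system.
    Returns a list of (3D point, 2D pixel) label pairs.
    """
    pts_h = np.hstack([pts_obj, np.ones((len(pts_obj), 1))])  # homogeneous
    cam = (T_obj_to_img @ pts_h.T)[:3]      # camera-frame coordinates
    pix = K @ cam                           # homogeneous pixel coordinates
    pix = (pix[:2] / pix[2]).T              # Nx2 pixel coordinates
    return list(zip(pts_obj, pix))
```

For example, with an identity pose, a point on the optical axis at depth 2 lands at the principal point of the image.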
10. The method of claim 1, wherein the shooting device comprises any one of: a camera, a video camera and a laser; and/or the reference object comprises any one of: a two-dimensional code and a bar code.
11. A pose labeling apparatus, comprising:
the image determining module is used for receiving sequence frame images acquired by a shooting device under different angles and determining a reference frame image and a current frame image in the sequence frame images, wherein each frame image of the sequence frame images comprises a reference object and a target object;
the first pose conversion relation determining module is used for determining a first pose conversion relation between a current frame image coordinate system and a reference frame image coordinate system based on the feature points of the reference object, the projection points of the feature points of the reference object in the reference frame image and the projection points of the feature points of the reference object in the current frame image;
the second pose conversion relation determining module is used for determining a second pose conversion relation between a target object coordinate system where the target object is located and a current frame image coordinate system based on the feature point of the target object, the projection point of the feature point of the target object in the reference frame image and the first pose conversion relation;
and the pose marking module is used for marking the pose of the target object based on the second pose conversion relation and the projection point of the target object in any frame of image shot by the shooting device.
12. A pose marking system, comprising a robot, a target object and a reference object, wherein the robot comprises a mechanical arm, a shooting device, a memory and a processor; the system is characterized in that the mechanical arm drives the shooting device to move so that the shooting device acquires sequence frame images at a plurality of angles; each frame of the sequence frame images comprises the reference object and the target object, the target object comprises a plurality of characteristic points, and the reference object comprises a plurality of characteristic points;
wherein the memory stores a computer program executable on the processor, and the processor implements the pose labeling method according to any one of claims 1 to 10 when executing the computer program.
13. The system of claim 12, wherein the robot further comprises: a base and a tip tool;
wherein the mechanical arm is installed on the base, the end tool is installed at the end of the mechanical arm, and the shooting device is arranged on one side of the end tool.
14. A storage medium containing computer-executable instructions, which when executed by a computer processor implement the pose annotation method according to any one of claims 1-10.
CN202110219672.1A 2021-02-26 2021-02-26 Pose marking method, device and system and storage medium Pending CN115049737A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110219672.1A CN115049737A (en) 2021-02-26 2021-02-26 Pose marking method, device and system and storage medium

Publications (1)

Publication Number Publication Date
CN115049737A true CN115049737A (en) 2022-09-13

Family

ID=83156748

Country Status (1)

Country Link
CN (1) CN115049737A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination