CN113012226A - Camera pose estimation method and device, electronic equipment and computer storage medium - Google Patents


Info

Publication number
CN113012226A
Authority
CN
China
Prior art keywords
camera
coordinates
vector
unit vector
dimensional
Prior art date
Legal status
Pending
Application number
CN202110304172.8A
Other languages
Chinese (zh)
Inventor
王求元
王楠
Current Assignee
Zhejiang Shangtang Technology Development Co., Ltd. (Zhejiang SenseTime Technology Development Co., Ltd.)
Original Assignee
Zhejiang Shangtang Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Shangtang Technology Development Co., Ltd.
Priority to CN202110304172.8A
Publication of CN113012226A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 — Image analysis
    • G06T 7/70 — Determining position or orientation of objects or cameras
    • G06T 7/73 — Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 — Image acquisition modality
    • G06T 2207/10004 — Still image; Photographic image

Abstract

The application discloses a camera pose estimation method and device, electronic equipment and a computer storage medium. The method comprises the following steps: acquiring position information of a projected image in an input image shot by a camera, wherein the position information comprises projection corner coordinates which are coordinates of a vertex of a registered image projected onto the input image; determining the coordinates of two vanishing points of the registered image according to the projection corner coordinates; based on geometric constraints between the vanishing points, a camera pose of the camera relative to the registered image is determined. By means of the method, the camera pose can be estimated quickly and accurately.

Description

Camera pose estimation method and device, electronic equipment and computer storage medium
Technical Field
The present disclosure relates to the field of computer vision technologies, and in particular, to a camera pose estimation method, a camera pose estimation device, an electronic device, and a computer storage medium.
Background
Computer vision is a rapidly developing branch of artificial intelligence, and real-time localization and reconstruction is an important research hotspot within it. Real-time localization and reconstruction has wide applications, such as robotics, unmanned aerial vehicles, and augmented/virtual reality. The technology obtains real-time self-localization information and a digital reconstruction of the surrounding environment from the input of a monocular or binocular camera, giving a machine the ability to perceive its surroundings.
The core problem of real-time localization and reconstruction is solving the camera pose. In the related art, a large number of feature point pairs are needed to compute the camera pose, so the computation is time-consuming and error-prone.
Disclosure of Invention
The present application provides a camera pose estimation method, a camera pose estimation apparatus, an electronic device, and a computer storage medium, so as to solve the problem in the prior art that computing the camera pose is time-consuming and error-prone.
In order to solve the technical problem, the application provides an estimation method of a camera pose. The method comprises the following steps: acquiring position information of a projected image in an input image shot by a camera, wherein the position information comprises projection corner coordinates which are coordinates of a vertex of a registered image projected onto the input image; determining the coordinates of two vanishing points of the registered image according to the projection corner coordinates; based on geometric constraints between the vanishing points, a camera pose of the camera relative to the registered image is determined.
Wherein the camera pose comprises a rotation matrix, the vanishing points comprise a first vanishing point and a second vanishing point, and the camera pose of the camera relative to the registration image is determined based on geometric constraints between the vanishing points, comprising: calculating to obtain a first three-dimensional vector of parallel lines corresponding to the first vanishing point and a second three-dimensional vector of parallel lines corresponding to the second vanishing point based on the internal parameters of the camera and the coordinates of the vanishing points; a rotation matrix is determined based on the first three-dimensional vector and the second three-dimensional vector.
Wherein determining the rotation matrix based on the first three-dimensional vector and the second three-dimensional vector comprises: performing vector normalization on the first three-dimensional vector and the second three-dimensional vector respectively to obtain a first unit vector and a second unit vector; performing two cross-product corrections on the first unit vector and the second unit vector to obtain three mutually perpendicular vectors, namely the first unit vector, a third unit vector, and a fourth unit vector; and taking the first unit vector, the third unit vector, and the fourth unit vector as the rotation matrix.
Wherein performing the two cross-product corrections on the first unit vector and the second unit vector comprises: performing a cross product operation on the first unit vector and the second unit vector to obtain a third unit vector perpendicular to both the first unit vector and the second unit vector; and performing a cross product operation on the first unit vector and the third unit vector to obtain a fourth unit vector perpendicular to both the first unit vector and the third unit vector.
Wherein the camera pose further comprises a translation matrix, and determining the camera pose of the camera relative to the registered image based on geometric constraints between the vanishing points further comprises: and determining a translation matrix of the camera relative to the registration image according to the projection corner coordinates, the three-dimensional point coordinates of the registration corner corresponding to the projection corner coordinates on the registration image and the rotation matrix.
The method for determining the translation matrix of the camera relative to the registration image according to the projection corner coordinates, the three-dimensional point coordinates of the registration corner corresponding to the projection corner coordinates on the registration image and the rotation matrix comprises the following steps: inputting the projection corner point coordinates, the three-dimensional point coordinates and the rotation matrix into a target function of a translation matrix, and solving the target function of the translation matrix to obtain the translation matrix; wherein the objective function of the translation matrix is constructed based on a reprojection constraint between the projection point and the three-dimensional point.
The estimation method of the camera pose further comprises the following steps: and solving an objective function of the translation matrix by using at least two groups of projection corner coordinates and three-dimensional point coordinates to obtain an optimal translation matrix.
In order to solve the above technical problem, the present application provides a camera pose estimation apparatus. The apparatus comprises an acquisition module, a vanishing point calculation module, and a camera pose estimation module. The acquisition module is configured to acquire position information of a projected image in an input image taken by a camera, wherein the position information comprises projection corner coordinates, which are the coordinates of the vertices of a registered image projected onto the input image; the vanishing point calculation module is configured to determine the coordinates of two vanishing points of the registered image from the projection corner coordinates, wherein each vanishing point is the intersection of the two lines of the quadrilateral corresponding to a pair of parallel sides of the rectangle; the camera pose estimation module is configured to calculate the camera pose of the camera relative to the registered image based on the geometric constraint between the vanishing points.
In order to solve the above technical problem, the present application provides an electronic device. The electronic device comprises a processor, a memory, and a camera module; the processor is coupled to the memory and the camera module, and executes program instructions during operation, in cooperation with the memory and the camera module, to implement the above camera pose estimation method.
To solve the above technical problem, the present application provides a computer storage medium. The computer storage medium stores a computer program executed by a processor to implement the steps of the above-described method of estimating the pose of the camera.
According to the method and the device of the present application, the coordinates of the vanishing points in the projected image are determined from the coordinates of the projection corners, and the rotation matrix relative to the registered image is then calculated from the geometric constraint of the vanishing points, which improves the accuracy of camera pose estimation. Calculating the camera pose using only the corner points of the registered and projected images reduces the amount of data involved, improves the calculation efficiency, and reduces the time consumed.
Drawings
Fig. 1 is a schematic flowchart of a first embodiment of a camera pose estimation method provided in the present application;
FIG. 2 is a schematic diagram of a registered image in a world coordinate system projected to an input image plane via a camera pose transformation provided by the present application;
FIG. 3 is a schematic diagram of parallel lines in three-dimensional space intersecting at a vanishing point in an input image as provided by the present application;
fig. 4 is a flowchart illustrating a second embodiment of a camera pose estimation method provided by the present application;
FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for estimating pose of a camera provided by the present application;
FIG. 6 is a schematic structural diagram of an embodiment of an electronic device provided in the present application;
FIG. 7 is a schematic structural diagram of an embodiment of a computer storage medium provided in the present application.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the method for estimating a camera pose, the apparatus for estimating a camera pose, the electronic device, and the computer storage medium provided in the present application are described in further detail below with reference to the accompanying drawings and the detailed description.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a first embodiment of a method for estimating a pose of a camera according to the present application. The embodiment comprises the following steps:
S101: Position information of a projected image in an input image taken by a camera is acquired.
The input image taken by the camera contains the content in the registered image. The registered image is projected to the input image to form a projected image through pose transformation of the camera.
The position information comprises projection corner coordinates, and the projection corner coordinates are coordinates of a vertex of the registered image projected onto the input image.
Referring to fig. 2, fig. 2 is a schematic diagram of a registered image in a world coordinate system projected to an input image plane through pose transformation of a camera according to the present application. Point O in the figure is the optical center of the camera.
Specifically, the registered image in the world coordinate system is converted into a registered image in the camera coordinate system through rotation and translation, the registered image in the camera coordinate system is perspective-projected to a plane where the input image is located through the camera, and the registered image forms a projected image on the input image.
In this embodiment, the registered image may be a rectangle; by the geometric properties of a rectangle, the registered image has two pairs of parallel sides, and the two pairs are perpendicular to each other. The projection corner coordinates may be the coordinates of the 4 vertices of the registered image projected onto the input image. Of course, the registered image may also be another polygon having at least two pairs of mutually perpendicular parallel sides, such as a regular octagon or a regular hexadecagon, which is not limited in this application.
The four vertices of the registered image correspond to three-dimensional points A, B, C, and D in the world coordinate system, whose coordinates can be obtained from the actual physical dimensions of the registered image. Let the measured actual physical width of the registered image be w_p and the actual physical height be h_p; the coordinates of the three-dimensional points A, B, C, and D in the world coordinate system are then:
A = (-w_p/2, h_p/2, 0)^T, B = (w_p/2, h_p/2, 0)^T,
C = (w_p/2, -h_p/2, 0)^T, D = (-w_p/2, -h_p/2, 0)^T.
The world coordinate system O_m-x_m-y_m-z_m takes the center of the registered image as the origin O_m; the direction through the origin parallel to segments AB and CD is the x_m axis, the direction through the origin parallel to segments BC and DA is the y_m axis, and the z_m axis is determined by the right-hand rule. By the geometric properties of a rectangle, the registered image contains two sets of parallel lines that are perpendicular to each other; that is, the four sides AB, BC, CD, and DA of the registered image satisfy:
AB ∥ CD, BC ∥ DA, AB ⊥ BC.
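As a minimal sketch of this setup (the sign convention placing A at the top-left corner is an illustrative assumption; the patent only fixes the center as the origin and the axes along the sides), the world coordinates of the four corners can be generated as:

```python
import numpy as np

def registered_corners(w_p: float, h_p: float) -> np.ndarray:
    """World coordinates of rectangle corners A, B, C, D, centered at O_m.

    Assumes AB runs along the x_m axis and BC along the y_m axis,
    with a hypothetical but self-consistent corner labeling.
    """
    return np.array([
        [-w_p / 2,  h_p / 2, 0.0],   # A
        [ w_p / 2,  h_p / 2, 0.0],   # B
        [ w_p / 2, -h_p / 2, 0.0],   # C
        [-w_p / 2, -h_p / 2, 0.0],   # D
    ])

A, B, C, D = registered_corners(2.0, 1.0)
# AB || CD, BC || DA, and AB ⊥ BC hold by construction.
```

The relations AB ∥ CD, BC ∥ DA, and AB ⊥ BC can be checked directly on the returned points.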
the projected image is a quadrangular image corresponding to the rectangle, and each side of the projected image corresponds to the side of the registered image of the three-dimensional space one by one. In particular, the projected image has four edges l2、l1、l0And l3Wherein l is2The corresponding line segment in three-dimensional space is AB, l1The line segment corresponding to the three-dimensional space is BC, l0The line segment corresponding to the three-dimensional space is CD, l3The line segment corresponding to the three-dimensional space is DA. The position information includes projection corner coordinates of the projection image in the input image, the projection corners being, for example, projection points of vertices of the registration image in the input image. Since the registered image is rectangular, the number of projection corner points is four: a isO、bO、cOAnd dO. Wherein, aoThe corresponding three-dimensional points are A, boThe corresponding three-dimensional points are B, coThe corresponding three-dimensional points are C, doThe corresponding three-dimensional point is D. The homogeneous coordinates of the four projection angular points are respectively:
ao=(xao,yao,1)T,bo=(xbo,ybo,1)T,
co=(xco,yco,1)T,do=(xdo,ydo,1)T.
the position information of the projected image in the input image taken by the camera is calculated using, for example, an image registration algorithm. The image registration algorithm is, for example, orb (organized Fast and Rotated brief) algorithm, Fast (features from acquired Segments test) algorithm, brief (binary route Independent element features) algorithm or RANSAC algorithm (RANdom SAmple Consensus algorithm), and the like, and modified algorithms thereof.
S102: and determining the coordinates of two vanishing points of the registered image according to the coordinates of the projection corner points.
Based on the coordinates of the projection corner points, the vanishing points of the registered image on the input image can be calculated.
In three-dimensional space, a set of parallel straight lines never intersect, or, equivalently, they intersect at the same point at infinity. Under the perspective projection of the camera, this point at infinity is imaged in the image plane, and the imaged point is called a vanishing point. The vanishing point may lie inside the image plane, outside it, or even at infinity. A vanishing point in an image corresponds to one set of parallel straight lines; that is, straight lines passing through the same vanishing point in an image must be parallel in space. A vanishing point thus encodes the three-dimensional direction of a set of parallel straight lines.
In general, because the pose of the camera relative to the registered image distorts the projection, the projected image on the input image may be neither a rectangle nor a parallelogram but an irregular quadrilateral. The extensions of the two line segments in the projected image corresponding to each set of parallel lines of the registered image intersect at one point; that is, the intersection of the two lines of the quadrilateral corresponding to a pair of parallel sides of the rectangle is a vanishing point. The rectangular registered image has two sets of parallel lines, so 2 vanishing points can be determined from the corresponding quadrilateral projected image.
The coordinates of the vanishing points can be determined from the projection corner coordinates. Specifically, four line segments corresponding to four edges of the projected image can be determined according to the coordinates of the projection corner points, and then 2 corresponding vanishing points are obtained according to the four line segments.
Referring to fig. 3, fig. 3 is a schematic diagram of parallel lines in a three-dimensional space intersecting at a vanishing point in an input image.
A vector cross product of each pair of adjacent projection corners gives the line segment formed by those two corners, i.e. l2, l1, l0, and l3. The line segments are calculated as:
l2 = a_o × b_o, l1 = b_o × c_o, l0 = c_o × d_o, l3 = d_o × a_o (1)
In formula (1), × denotes the vector cross product.
After the four line segments l2, l1, l0, and l3 are obtained, a vector cross product of the two line segments corresponding to a pair of parallel sides of the registered image gives the coordinates of a vanishing point. The registered image has two pairs of parallel sides, so the first and second vanishing points can be calculated from the two pairs of projected line segments (l0, l2) and (l1, l3).
The coordinates of the first and second vanishing points are calculated as:
v1 = l0 × l2, v2 = l1 × l3 (2)
In formula (2), v1 denotes the first vanishing point, v2 denotes the second vanishing point, and × denotes the vector cross product.
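Formulas (1) and (2) map directly onto homogeneous-coordinate cross products. A minimal numeric sketch (the corner values here are made up for illustration):

```python
import numpy as np

# Hypothetical projected corners a_o, b_o, c_o, d_o in homogeneous form.
a_o = np.array([100.0, 100.0, 1.0])
b_o = np.array([300.0, 110.0, 1.0])
c_o = np.array([290.0, 260.0, 1.0])
d_o = np.array([110.0, 250.0, 1.0])

# Formula (1): each edge is the cross product of its two endpoints.
l2 = np.cross(a_o, b_o)   # image of AB
l1 = np.cross(b_o, c_o)   # image of BC
l0 = np.cross(c_o, d_o)   # image of CD
l3 = np.cross(d_o, a_o)   # image of DA

# Formula (2): each vanishing point is the cross product of the images
# of a pair of parallel sides.
v1 = np.cross(l0, l2)
v2 = np.cross(l1, l3)

# Sanity check: each vanishing point lies on both of its lines.
assert abs(float(v1 @ l0)) < 1e-6 and abs(float(v1 @ l2)) < 1e-6
```

Dividing by the third homogeneous coordinate (when nonzero) recovers the pixel coordinates of each vanishing point, which may fall far outside the image.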
S103: based on geometric constraints between the vanishing points, a camera pose of the camera relative to the registered image is determined.
Wherein the camera pose comprises a rotation matrix.
In this embodiment, the geometric constraint between the vanishing points means that the coordinates of a vanishing point are proportional to the product of the camera intrinsic matrix K and the three-dimensional direction vector V^C of the corresponding parallel lines in the camera coordinate system, i.e. v = K·V^C (up to scale). Moreover, since the parallel lines of the registered image corresponding to the first vanishing point are perpendicular to those corresponding to the second vanishing point, the first three-dimensional vector corresponding to the first vanishing point and the second three-dimensional vector corresponding to the second vanishing point are perpendicular to each other in three-dimensional space.
Thus, the first and second three-dimensional vectors can represent the x_c axis direction and the y_c axis direction of the camera coordinate system.
Specifically, using the camera intrinsic parameters and the relation v = K·V^C, the first three-dimensional vector corresponding to the first vanishing point and the second three-dimensional vector corresponding to the second vanishing point are calculated as:
V1^C = K^(-1)·v1, V2^C = K^(-1)·v2 (3)
In formula (3), V1^C denotes the first three-dimensional vector and V2^C denotes the second three-dimensional vector.
After the first and second three-dimensional vectors are obtained, each is vector-normalized to obtain the first and second unit vectors, from which the rotation matrix of the camera relative to the registered image is determined:
r1 = V1^C / ‖V1^C‖, r2 = V2^C / ‖V2^C‖ (4)
In formula (4), ‖V1^C‖ denotes the norm of the first three-dimensional vector, ‖V2^C‖ denotes the norm of the second three-dimensional vector, r1 denotes the first unit vector, and r2 denotes the second unit vector.
To avoid introducing the sign ambiguity of the z_c axis of the camera coordinate system and to reduce the influence of noise, this embodiment performs two cross-product corrections on the first and second unit vectors, obtaining three mutually perpendicular unit vectors: the first unit vector r1, the third unit vector r3, and the fourth unit vector r4. The first unit vector is aligned with the x_c axis of the camera coordinate system, the third unit vector with the z_c axis, and the fourth unit vector with the y_c axis. Since the world coordinate system takes the center of the registered image as its origin and the registered image lies in the plane z_m = 0, the first, third, and fourth unit vectors describe the relative rotation of the camera with respect to the registered image; that is, they constitute the rotation matrix of the camera relative to the registered image.
Specifically, a cross product of the first and second unit vectors gives the third unit vector, perpendicular to both of them. Because of noise, the first and second unit vectors are not necessarily strictly orthogonal, so a second cross product, of the third and first unit vectors (ordered so that the resulting frame is right-handed), gives the fourth unit vector, perpendicular to both the first and third unit vectors. The formulas are:
r3 = (r1 × r2) / ‖r1 × r2‖, r4 = r3 × r1 (5)
In formula (5), r3 denotes the third unit vector, r4 denotes the fourth unit vector, and × denotes the vector cross product.
The rotation matrix R of the camera relative to the registered image can then be expressed with the first, fourth, and third unit vectors as its columns:
R = [r1, r4, r3] (6)
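Putting the back-projection, normalization, and cross-product corrections together, the rotation recovery can be sketched as follows (the intrinsic matrix K and the vanishing-point coordinates are illustrative values, not from the patent; the vanishing points are chosen so the two back-projected directions are not exactly orthogonal, mimicking noise):

```python
import numpy as np

def rotation_from_vanishing_points(K: np.ndarray,
                                   v1: np.ndarray,
                                   v2: np.ndarray) -> np.ndarray:
    """Recover R from two vanishing points of mutually perpendicular
    parallel-line families (back-project, normalize, cross-correct)."""
    # Back-project each vanishing point to a 3-D direction: V = K^-1 v.
    V1 = np.linalg.solve(K, v1)
    V2 = np.linalg.solve(K, v2)
    # Normalize to the first and second unit vectors.
    r1 = V1 / np.linalg.norm(V1)
    r2 = V2 / np.linalg.norm(V2)
    # Two cross-product corrections restore an orthonormal frame.
    r3 = np.cross(r1, r2)
    r3 /= np.linalg.norm(r3)      # third unit vector (z_c direction)
    r4 = np.cross(r3, r1)         # fourth unit vector (corrected y_c)
    # Columns are the x_c, y_c, z_c directions.
    return np.column_stack([r1, r4, r3])

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
v1 = np.array([2000.0, 240.0, 1.0])    # hypothetical first vanishing point
v2 = np.array([300.0, -1500.0, 1.0])   # hypothetical second vanishing point
R = rotation_from_vanishing_points(K, v1, v2)
assert np.allclose(R.T @ R, np.eye(3), atol=1e-9)  # orthonormal by construction
```

Even though the two back-projected directions are slightly non-orthogonal, the corrected frame is an exact rotation matrix, which is the point of the two cross products.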
of course, it may also be a first three-dimensional vector
Figure BDA0002987440180000083
And a second three-dimensional vector
Figure BDA0002987440180000084
Obtaining a third three-dimensional vector after vector cross product operation
Figure BDA0002987440180000085
Then the first three-dimensional vector is put into
Figure BDA0002987440180000086
And a third three-dimensional vector
Figure BDA0002987440180000087
Performing vector cross product operation to obtain a first three-dimensional vector
Figure BDA0002987440180000088
Third three-dimensional vector
Figure BDA0002987440180000089
And a fourth three-dimensional vector
Figure BDA00029874401800000810
Respectively carrying out vector normalization processing to obtain first unit vectors
Figure BDA00029874401800000811
Third unit vector
Figure BDA00029874401800000812
And a fourth unit vector
Figure BDA00029874401800000813
In this embodiment, the pose is estimated using only the coordinates of the four projection corners of the projected image, which reduces the time consumed by camera pose estimation and improves its efficiency. The vanishing points of the three-dimensional parallel lines in the input image plane are calculated from the projection corners, and the camera rotation matrix is calculated from the vanishing points; since a vanishing point depends only on the direction of its corresponding parallel lines in three-dimensional space and on the camera intrinsic parameters, and is not affected by the positions of the projection corners and three-dimensional points, the robustness of the rotation-matrix calculation is improved. In addition, the geometric constraint of the vanishing points improves both the efficiency and the accuracy of camera pose estimation.
After the rotation matrix is calculated, the translation matrix of the camera can further be calculated using the rotation matrix, the projection corner coordinates, and the three-dimensional point coordinates. Referring to fig. 4, fig. 4 is a flowchart illustrating a second embodiment of a method for estimating a pose of a camera according to the present application. This embodiment builds on the first embodiment of the camera pose estimation method, so the steps they share are not described again here. The embodiment comprises the following steps:
S401: Position information of a projected image in an input image taken by a camera is acquired.
S402: and determining the coordinates of two vanishing points of the registered image according to the coordinates of the projection corner points.
S403: based on geometric constraints between vanishing points, a rotation matrix of the camera relative to the registered image is determined.
S404: and determining a translation matrix of the camera relative to the registration image according to the projection corner coordinates, the three-dimensional point coordinates of the registration corner corresponding to the projection corner coordinates on the registration image and the rotation matrix.
Specifically, an objective function for the translation matrix is constructed based on the reprojection constraint between the projection points and the three-dimensional points; the projection corner coordinates, the three-dimensional point coordinates, and the rotation matrix are substituted into this objective function, which is then solved to obtain the translation matrix. The formulas are as follows.
The objective function of the translation matrix is constructed as:
w_i·(u_i, v_i, 1)^T = K·(R·X_m + t) (7)
In formula (7), w_i is the projection factor, (u_i, v_i) is an observation coordinate whose homogeneous form is (u_i, v_i, 1), the observation coordinates being the projection corners a_o, b_o, c_o, and d_o; X_m is the three-dimensional point corresponding to the observation coordinate (the corresponding one of the three-dimensional points A, B, C, and D in the world coordinate system); R is the rotation matrix and t the translation matrix. The right side of formula (7) rotates and translates a three-dimensional point in the world coordinate system into the camera coordinate system and projects it to the image plane through the camera intrinsic parameters, giving a two-dimensional point in the image coordinate system; the left side converts the corresponding two-dimensional pixel point into the same image-coordinate point through the projection factor, so the two sides of formula (7) always hold.
In the objective function of the translation matrix, the camera intrinsic matrix K, the rotation matrix R, the observation coordinates, and the three-dimensional point coordinates are known quantities, while the projection factor w_i and the translation matrix t are the unknowns to be solved. The translation matrix contains 3 variables: the translation t_x along the x_c axis, the translation t_y along the y_c axis, and the translation t_z along the z_c axis.
The projection factor w_i can be eliminated by substitution. The specific process is as follows:
Expanding the objective function of the translation matrix:
w_i·(u_i, v_i, 1)^T = K·R·X_m + K·t (8)
Writing the constant vector K·R·X_m = (a_1, a_2, a_3)^T and expanding further:
w_i·u_i = a_1 + (K·t)_1, w_i·v_i = a_2 + (K·t)_2, w_i = a_3 + t_z (9)
In formula (9), (K·t)_1 and (K·t)_2 denote the first and second components of the vector K·t; a_1, a_2, and a_3 are constants calculated from the known quantities K, R, and the three-dimensional point coordinates; and the third row uses the fact that the last row of K is (0, 0, 1), which yields the projection factor w_i = a_3 + t_z. Substituting w_i = a_3 + t_z into the first two rows and collecting the unknowns t on one side gives:
(K·t)_1 − u_i·t_z = u_i·a_3 − a_1, (K·t)_2 − v_i·t_z = v_i·a_3 − a_2 (10)
Therefore, in this embodiment, each projection corner point and its corresponding three-dimensional point provide 2 sets of constraints, while the number of variables to be solved in the translation matrix t_w^c is 3. The objective function of the translation matrix is therefore solved with at least two groups of projection corner coordinates and three-dimensional point coordinates to obtain the optimal translation matrix.
2 groups of projection corner and three-dimensional point pairs provide 4 sets of constraints, 3 groups provide 6 sets, and 4 groups provide 8 sets; in every case the result is an overdetermined system of equations, and the optimal translation matrix can be solved by singular value decomposition. The fewer the projection corner and three-dimensional point pairs input into the objective function of the translation matrix, the faster the translation matrix is calculated; conversely, the more pairs, the higher the precision of the calculated translation matrix. In this embodiment, at most 4 groups of projection corners and three-dimensional points are used to estimate the camera pose, which improves the efficiency of camera pose estimation while also improving its accuracy.
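Eliminating the projection factor as described above leaves, for each corner/point pair, 2 linear equations in (t_x, t_y, t_z); stacking two or more pairs gives an overdetermined system that a least-squares solver (which uses singular value decomposition internally) can solve. A minimal sketch under an assumed pinhole intrinsic matrix — the numeric values are illustrative, not from the embodiment:

```python
import numpy as np

def solve_translation(K, R, pts3d, pts2d):
    """Least-squares translation from >= 2 corner / 3-D point pairs, given K and R.

    Eliminating the projection factor w_i from w_i*(u, v, 1)^T = K(R X + t)
    yields two linear equations in (tx, ty, tz) per pair.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    A, b = [], []
    for X, (u, v) in zip(pts3d, pts2d):
        a1, a2, a3 = K @ (R @ X)   # constants from the known K, R and 3-D point
        A.append([fx, 0.0, cx - u]); b.append(u * a3 - a1)
        A.append([0.0, fy, cy - v]); b.append(v * a3 - a2)
    # lstsq solves the stacked (overdetermined) system via SVD internally
    t, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return t

# Synthetic check: project known 3-D points with a ground-truth t, then recover it
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1.0]])
R = np.eye(3)
t_true = np.array([0.1, -0.2, 2.0])
pts3d = [np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]),
         np.array([1.0, 1.0, 0.0]), np.array([0.0, 1.0, 0.0])]
pts2d = []
for X in pts3d:
    p = K @ (R @ X + t_true)
    pts2d.append((p[0] / p[2], p[1] / p[2]))

t_est = solve_translation(K, R, pts3d, pts2d)
assert np.allclose(t_est, t_true)
```

With noisy corner coordinates the same call returns the least-squares optimum rather than an exact solution, which is the sense in which the embodiment's "optimal translation matrix" is obtained.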
In this embodiment, the pose is estimated using the coordinates of the four projection corner points of the projected image, which reduces the time consumed by camera pose estimation and improves its efficiency. The vanishing points, on the input image plane, of parallel lines in three-dimensional space are calculated from the projection corner points, and the camera rotation matrix is calculated from the vanishing points; because a vanishing point depends only on the direction of its corresponding parallel lines in three-dimensional space and on the camera intrinsic parameters, and is not affected by the positions of the projection corner points and three-dimensional points, the robustness of the rotation-matrix calculation is improved. In addition, the geometric constraints of the vanishing points used in this embodiment improve both the efficiency and the precision of camera pose estimation. Furthermore, by constructing the objective function of the translation matrix from at least two groups of projection corner points and their corresponding three-dimensional points, an overdetermined system of equations for the translation matrix can be established, so that the optimal solution of the translation matrix can be calculated and its precision improved.
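The vanishing-point-to-rotation step summarized above can be sketched as follows. Back-projecting each vanishing point through K^-1 gives the three-dimensional direction of its pencil of parallel lines, and the two cross-product corrections of claims 3 and 4 make the three axes mutually perpendicular; the intrinsic values and vanishing-point coordinates below are assumptions for illustration:

```python
import numpy as np

def rotation_from_vanishing_points(K, vp1, vp2):
    """Rotation matrix from two vanishing points given in pixel coordinates.

    r1, r2: unit direction vectors of the parallel lines behind each vanishing
    point; two cross products replace r2 with an exactly perpendicular pair.
    """
    Kinv = np.linalg.inv(K)
    r1 = Kinv @ np.array([vp1[0], vp1[1], 1.0])
    r1 /= np.linalg.norm(r1)          # first unit vector
    r2 = Kinv @ np.array([vp2[0], vp2[1], 1.0])
    r2 /= np.linalg.norm(r2)          # second unit vector (not exactly perpendicular to r1)
    r3 = np.cross(r1, r2)
    r3 /= np.linalg.norm(r3)          # third unit vector: first cross-product correction
    r4 = np.cross(r1, r3)             # fourth unit vector: second correction, already unit length
    # Claim 3: the first, third and fourth unit vectors form the rotation matrix
    return np.column_stack([r1, r3, r4])

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1.0]])
R = rotation_from_vanishing_points(K, (900.0, 240.0), (320.0, -600.0))

# The result is an orthonormal, right-handed basis: R^T R = I and det R = 1
assert np.allclose(R.T @ R, np.eye(3))
assert abs(np.linalg.det(R) - 1.0) < 1e-9
```

Which of the three axes corresponds to which world direction depends on the ordering of the projection corner points; the sketch only demonstrates the orthogonalization, not a particular axis convention.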
Referring to fig. 5, fig. 5 is a schematic structural diagram of an embodiment of an apparatus for estimating a pose of a camera according to the present application. The camera pose estimation apparatus 500 includes:
an obtaining module 501, configured to obtain position information of a projected image in an input image captured by a camera. The position information comprises projection corner coordinates, and the projection corner coordinates are coordinates of the top point of the registered image projected onto the input image.
And a vanishing point calculating module 502, configured to determine coordinates of two vanishing points of the registered image according to the coordinates of the projection corner points.
A camera pose estimation module 503 for calculating a camera pose of the camera with respect to the registered image based on geometric constraints between the vanishing points.
The camera pose estimation module 503 is specifically configured to calculate a first three-dimensional vector of parallel lines corresponding to the first vanishing point and a second three-dimensional vector of parallel lines corresponding to the second vanishing point according to the internal parameters of the camera and the coordinates of the vanishing point, and determine a rotation matrix based on the first three-dimensional vector and the second three-dimensional vector.
In addition, the camera pose estimation module 503 is further configured to determine a translation matrix of the camera with respect to the registered image according to the projection corner coordinates, the three-dimensional point coordinates of the registration corner corresponding to the projection corner coordinates on the registered image, and the rotation matrix.
In some embodiments, the camera pose estimation apparatus may be configured to perform the camera pose estimation method described in the foregoing embodiments, and may of course include units or modules configured to perform any procedures and/or steps of the camera pose estimation method described in the foregoing embodiments, which are not described again for brevity.
The above description of the apparatus embodiments is similar to that of the method embodiments and has similar beneficial effects. For technical details not disclosed in the apparatus embodiments of the present application, refer to the description of the method embodiments of the present application.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an embodiment of an electronic device provided in the present application. The electronic device 600 includes:
a processor 601, a memory 602 and a camera module 603. The processor 601 is coupled to the memory 602 and the camera module 603, and executes the instructions during operation to implement the above-mentioned method for estimating the pose of the camera in cooperation with the memory 602 and the camera module 603.
Optionally, the electronic device is a mobile phone, a camera, a tablet computer, or a mobile robot.
The camera module 603 is used for capturing an input image. The memory 602 is used for storing the input image and storing pose information of the registration image in the input image.
The processor 601 is configured to determine coordinates of two vanishing points of the registered image according to the projection corner coordinates, where each vanishing point is the intersection of the extensions of the two sides of the projected quadrangle that correspond to a pair of parallel sides of the registered rectangle, and to determine a camera pose of the camera with respect to the registered image based on geometric constraints between the vanishing points.
Specifically, the processor 601 is configured to calculate a first three-dimensional vector of a parallel line corresponding to the first vanishing point and a second three-dimensional vector of a parallel line corresponding to the second vanishing point based on the internal parameters of the camera and the coordinates of the vanishing points; a rotation matrix is determined based on the first three-dimensional vector and the second three-dimensional vector. The processor 601 is further configured to determine a translation matrix of the camera with respect to the registered image according to the projection corner coordinates, the three-dimensional point coordinates of the registration corner corresponding to the projection corner coordinates on the registered image, and the rotation matrix.
The processor 601 may be an integrated circuit chip having signal processing capability. The processor 601 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
For the method of the above embodiment, it may exist in the form of a computer program, so that the present application provides a computer storage medium, please refer to fig. 7, and fig. 7 is a schematic structural diagram of an embodiment of the computer storage medium provided in the present application. The computer storage medium 700 of the present embodiment stores therein a computer program 701 that can be executed to implement the method in the above-described embodiments.
The computer storage medium 700 of this embodiment may be a medium that can store program instructions, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, or may also be a server that stores the program instructions, and the server may send the stored program instructions to other devices for operation, or may self-operate the stored program instructions.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a module or a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the purpose of illustrating embodiments of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application or are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (10)

1. A camera pose estimation method is characterized by comprising the following steps:
acquiring position information of a projected image in an input image shot by a camera, wherein the position information comprises projection corner coordinates which are coordinates of a vertex of a registered image projected onto the input image;
determining the coordinates of two vanishing points of the registered image according to the projection corner coordinates;
determining a camera pose of the camera relative to the registration image based on geometric constraints between the vanishing points.
2. The estimation method of claim 1, wherein the camera pose comprises a rotation matrix, wherein the vanishing points comprise a first vanishing point and a second vanishing point, and wherein determining the camera pose of the camera relative to the registration image based on geometric constraints between the vanishing points comprises:
calculating to obtain a first three-dimensional vector of parallel lines corresponding to the first vanishing point and a second three-dimensional vector of parallel lines corresponding to the second vanishing point based on the internal parameters of the camera and the coordinates of the vanishing points;
determining the rotation matrix based on the first three-dimensional vector and the second three-dimensional vector.
3. The estimation method according to claim 2, wherein the determining the rotation matrix based on the first three-dimensional vector and the second three-dimensional vector comprises:
respectively carrying out vector normalization on the first three-dimensional vector and the second three-dimensional vector to obtain a first unit vector and a second unit vector;
performing cross product correction twice on the first unit vector and the second unit vector to obtain three mutually perpendicular unit vectors: the first unit vector, a third unit vector and a fourth unit vector;
and taking the first unit vector, the third unit vector and the fourth unit vector as the rotation matrix.
4. The estimation method according to claim 3, wherein the performing twice cross product corrections on the first unit vector and the second unit vector comprises:
performing cross product operation on the first unit vector and the second unit vector to obtain a third unit vector perpendicular to the first unit vector and the second unit vector;
performing cross product operation on the first unit vector and the third unit vector to obtain a fourth unit vector perpendicular to the first unit vector and the third unit vector.
5. The estimation method according to claim 2, wherein the camera pose further comprises a translation matrix, and wherein determining the camera pose of the camera relative to the registration image based on the geometric constraints between the vanishing points comprises:
and determining a translation matrix of the camera relative to the registration image according to the projection corner point coordinates, the three-dimensional point coordinates of the registration corner point corresponding to the projection corner point coordinates on the registration image and the rotation matrix.
6. The estimation method according to claim 5, wherein the determining a translation matrix of the camera with respect to the registration image according to the projection corner coordinates, three-dimensional point coordinates of registration corners corresponding to the projection corner coordinates on the registration image, and the rotation matrix comprises:
inputting the projection corner point coordinates, the three-dimensional point coordinates and the rotation matrix into an objective function of a translation matrix, and solving the objective function of the translation matrix to obtain the translation matrix; wherein the objective function of the translation matrix is constructed based on a reprojection constraint between the projection corner points and the three-dimensional points.
7. The estimation method according to claim 6, characterized in that the estimation method further comprises:
and solving an objective function of the translation matrix by using at least two groups of the projection corner coordinates and the three-dimensional point coordinates to obtain an optimal translation matrix.
8. An apparatus for estimating a camera pose, the apparatus comprising:
the acquisition module is used for acquiring position information of a projected image in an input image shot by the camera, wherein the position information comprises projection corner coordinates which are coordinates of a vertex of a registered image projected onto the input image;
the vanishing point calculation module is used for determining the coordinates of two vanishing points of the registered image according to the projection corner point coordinates;
a camera pose estimation module to calculate a camera pose of the camera relative to the registration image based on geometric constraints between the vanishing points.
9. An electronic device, comprising a processor, a memory and a camera module; the processor is coupled with the memory and the camera module and executes instructions in work so as to realize the camera pose estimation method according to any one of claims 1 to 7 by matching the memory and the camera module.
10. A computer storage medium characterized in that the computer storage medium stores a computer program executed by a processor to implement the steps of the method of estimating the pose of a camera according to any one of claims 1 to 7.
CN202110304172.8A 2021-03-22 2021-03-22 Camera pose estimation method and device, electronic equipment and computer storage medium Pending CN113012226A (en)


Publications (1)

Publication Number Publication Date
CN113012226A true CN113012226A (en) 2021-06-22



