WO2012160787A1

WO2012160787A1 - Position/posture estimation device, position/posture estimation method and position/posture estimation program

Info

Publication number: WO2012160787A1
Application number: PCT/JP2012/003240
Authority: WO
Inventors: 中野　学
Original assignee: 日本電気株式会社
Priority date: 2011-05-20
Filing date: 2012-05-17
Publication date: 2012-11-29

Abstract

A position/posture estimation device is provided with: an optimum solution candidate calculation means (101) which, with three or more object-related pairs of three-dimensional coordinates and two-dimensional coordinates corresponding to the three-dimensional coordinates on an image as inputs, and with respect to a predetermined error function that indicates the transform relationship between the three-dimensional coordinates and the two-dimensional coordinates with a posture having three degrees of freedom as a variable, calculates all solutions that satisfy a simultaneous polynomial in which the gradient of the error function is zero, using the inputted pairs of the three-dimensional coordinates and the two-dimensional coordinates; and a position/posture calculation means (102) which extracts an optimum solution in which the error function becomes minimum from all the solutions of the simultaneous polynomial calculated by the optimum solution candidate calculation means (101), and on the basis of the extracted optimum solution, calculates the position/posture of a camera which has captured an image of the object.

Description

Position / orientation estimation apparatus, position / orientation estimation method, and position / orientation estimation program

The present invention relates to a position / orientation estimation apparatus, a position / orientation estimation method, and a position / orientation estimation program for estimating the position and orientation of a camera or a subject (hereinafter collectively referred to as “position and orientation”).

To synthesize computer graphics naturally with images taken with the camera, or to restore the 3D shape of the subject and the surrounding environment from the images taken with the camera, the camera position or posture of the subject Must be estimated with high accuracy. Here, estimating the position and orientation of a camera that moves around a stationary subject and estimating the position and orientation of a subject that moves in front of a stationary camera are equivalent problems. Therefore, the former case will be described as an example.

As one method of estimating the position and orientation of the camera when the internal parameters of the camera are known, the known 3D coordinates for a plurality of points and the 2D coordinates corresponding to these 3D coordinates on the image are input. There is a method for obtaining the position and orientation of the camera. This method is called a PnP (Perspective n Points) problem, and the position and orientation of the camera can be obtained from a minimum of three input points. However, it is known that a plurality of solutions can be obtained from three points. Thus, a unique method for calculating a solution using four or more points has been proposed.

For example, Non-Patent Document 1 describes a method for estimating a position and orientation using a DLT (Direct Linear Transform) method. The method described in Non-Patent Document 1 ignores the constraint condition relating to the posture rotation matrix and treats each posture variable independently to convert a non-linear reprojection error into a linear algebraic error. In this method, the position and orientation is obtained as an eigenvector corresponding to the absolute minimum eigenvalue of the coefficient matrix, and post-correction is performed so as to satisfy the constraint condition of the orientation. The reprojection error is the Euclidean distance between the two-dimensional coordinates obtained by projecting the three-dimensional coordinates onto the image based on the estimated position and orientation and the input two-dimensional coordinates. When the distribution of the three-dimensional coordinates is a plane, the plane DLT method using four or more points is used. On the other hand, in the case of a non-planar surface, a non-planar DLT method using 6 points or more is used.

Also, for example, Non-Patent Document 2 describes a method of estimating the distance from the camera position to each point and estimating the position and orientation from the distance. The method described in Non-Patent Document 2 defines a plurality of quaternary expressions related to the depth of an appropriate one-dimensional three-dimensional coordinate. Then, the method treats each term of the quartic equation independently, and estimates the distance from a singular vector obtained by singular value decomposition of the coefficient matrix. Since the three-dimensional coordinates in the camera coordinate system are obtained from the distance, the position and orientation are estimated by the least square method from the input three-dimensional coordinates and the three-dimensional coordinates in the camera coordinate system.

However, the methods described in Non-Patent Document 1 and Non-Patent Document 2 calculate by ignoring part or all of the constraint conditions between variables. Therefore, the obtained position and orientation is not a global optimum solution that minimizes the reprojection error.

As a method for minimizing the reprojection error, Non-Patent Document 3 describes that the solution obtained by the position and orientation estimation method as described above is used as an initial value so that the reprojection error is minimized by the bundle adjustment method. A method of optimization is described. The bundle adjustment method is a method of performing nonlinear optimization of the position and orientation so that the sum of squares of the reprojection error is minimized. As the nonlinear optimization method, Newton method, Levenberg-Marquardt method or the like is used.

Since the position / orientation estimation method described in Non-Patent Document 1 needs to change the calculation method according to the distribution of three-dimensional coordinates and the number of input points, there is a problem that the estimation accuracy of the position / orientation may be lowered. It was. For example, when the distribution of three-dimensional coordinates is close to a plane with 6 points or more, it may be more accurate to use the planar DLT method instead of the non-planar DLT method.

The position / orientation estimation method described in Non-Patent Document 2 does not depend on the distribution of the three-dimensional coordinates, but the numerical calculation becomes unstable when the number of input points is increased, so that the position / orientation estimation accuracy decreases. There was a problem. Furthermore, when the number of input points is increased, there is a problem that the calculation cost for estimation increases.

In addition, when the bundle adjustment method is performed using the method described in Non-Patent Document 3 and the solution obtained by the method described in

Non-Patent Documents

1 and 2 as an initial value, the initial value is a global optimum. If it is not close enough to the solution, there is a problem that it converges or diverges to the local optimal solution. Furthermore, in the nonlinear optimization method, it is necessary to solve the linear equation repeatedly, and there is a problem that the calculation cost is high.

That is, it is difficult to estimate the position and orientation of the camera or subject with high accuracy and stability even when any of the above methods is used.

Therefore, an object of the present invention is to provide a position / orientation estimation apparatus, a position / orientation estimation method, and a position / orientation estimation program capable of estimating the position and orientation of a camera or a subject with high accuracy and stability. More specifically, “stable” means that numerical calculation does not become unstable even if the number of input points is increased (for example, 5 points or more). Also, the calculation cost does not exceed a predetermined range even if the number of input points is increased.

The position / orientation estimation apparatus according to the present invention receives a set of three or more three-dimensional coordinates related to an object and a two-dimensional coordinate corresponding to the three-dimensional coordinate on the image, and inputs the set of the input three-dimensional coordinate and the two-dimensional coordinate. Using the attitude of 3 degrees of freedom as a variable, all solutions satisfying the simultaneous polynomial with the gradient of the error function as zero are calculated for a predetermined error function representing the conversion relationship between the 3D coordinates and the 2D coordinates. Optimal solution candidate calculation means and a camera that has captured the object based on the extracted optimal solution, extracting the optimal solution with the smallest error function from all the solutions of simultaneous polynomials calculated by the optimal solution candidate calculation means And a position / orientation calculating means for calculating the position / orientation.

Also, the position and orientation estimation method according to the present invention receives a set of three or more three-dimensional coordinates related to an object and two-dimensional coordinates corresponding to the three-dimensional coordinates on the image, and inputs the inputted three-dimensional coordinates and two-dimensional coordinates. Using a set of three degrees of freedom as a variable, all solutions satisfying a simultaneous polynomial with the gradient of the error function as zero for a given error function representing the conversion relationship between 3D coordinates and 2D coordinates are expressed. Calculating and extracting the optimal solution that minimizes the error function from all the calculated simultaneous polynomial solutions, and calculating the position and orientation of the camera that captured the object based on the extracted optimal solution To do.

Further, the position / orientation estimation program according to the present invention uses a set of three or more three-dimensional coordinates relating to an input object and a two-dimensional coordinate corresponding to the three-dimensional coordinates on the image to the computer. Processing for calculating all solutions satisfying the simultaneous polynomial with the gradient of the error function as zero, for a predetermined error function representing the conversion relationship between the three-dimensional coordinates and the two-dimensional coordinates, using as a variable, and the calculated simultaneous polynomial An optimal solution that minimizes the error function is extracted from all the solutions, and a process for calculating the position and orientation of the camera that captured the object is executed based on the extracted optimal solution.

According to the present invention, the position and orientation of the camera or subject can be estimated with high accuracy and stability.

It is a block diagram which shows the structural example of the position and orientation estimation apparatus of 1st Embodiment. It is a flowchart which shows an example of operation | movement of the position and orientation estimation apparatus of 1st Embodiment. It is a block diagram which shows the structural example of the position and orientation estimation apparatus of 2nd Embodiment. It is a flowchart which shows an example of operation | movement of the position and orientation estimation apparatus of 2nd Embodiment. It is a block diagram which shows the structural example of the position and orientation estimation apparatus of 3rd Embodiment. It is a flowchart which shows an example of operation | movement of the position and orientation estimation apparatus of 3rd Embodiment. It is a block diagram which shows the structural example of the position and orientation estimation apparatus of 4th Embodiment. It is a flowchart which shows an example of operation | movement of the position and orientation estimation apparatus of 4th Embodiment. It is a block diagram at the time of applying this invention to an information processing system. It is a block diagram which shows the outline | summary of this invention.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

Embodiment 1. FIG.
FIG. 1 is a block diagram illustrating a configuration example of the position and orientation estimation apparatus according to the first embodiment of the present invention. The position / orientation estimation apparatus shown in FIG. 1 includes an optimal solution candidate calculation unit 1 and a position / orientation calculation unit 2.

Optimal solution candidate calculation unit 1 receives three or more three-dimensional coordinates and two-dimensional coordinates corresponding to the three-dimensional coordinates as input, and converts the three-degree-of-freedom camera posture as a variable to convert three-dimensional coordinates and two-dimensional coordinates. A processing unit that outputs a solution of a simultaneous polynomial (hereinafter simply referred to as a simultaneous polynomial) in which the gradient of the error function representing the relationship is zero, and includes a coefficient calculation unit 11 and a simultaneous polynomial solution unit 12 in this example. .

The coefficient calculation unit 11 receives the three-dimensional coordinates and the two-dimensional coordinates as input, calculates the coefficients of the simultaneous polynomial, and outputs them to the simultaneous polynomial solving unit 12.

The simultaneous polynomial solving unit 12 receives the coefficients of the simultaneous polynomial calculated by the coefficient calculating unit 11, solves the simultaneous polynomial, and outputs all the solutions.

The position / orientation calculation unit 2 is a processing unit that receives the solution of simultaneous polynomials calculated by the optimal solution candidate calculation unit 1 and calculates and outputs a position / orientation that minimizes the error function. The position / orientation calculation unit 2 includes a real number solution extraction unit 21, a posture candidate calculation unit 22, a posture candidate number totaling unit 23, an input score totaling unit 24, a minimum error candidate extraction unit 25, and a position candidate calculation unit 26. including. The position / orientation calculation unit 2 outputs a no-solution flag if a position / orientation that minimizes the error function cannot be obtained.

The real number solution extraction unit 21 receives all the solutions of the simultaneous polynomials calculated by the optimum solution candidate calculation unit 1, extracts all the real number solutions from the solutions, and outputs them. If no real number solution can be extracted, the real number solution extraction unit 21 stops the subsequent processing and outputs a no solution flag.

The posture candidate calculation unit 22 receives all real solutions extracted by the real number solution extraction unit 21 as input, and calculates and outputs posture candidates. The posture candidate calculation unit 22 performs a process of converting each real solution into a value that is treated as representing one posture candidate in the subsequent calculations.

The posture candidate number counting unit 23 receives the posture candidates calculated by the posture candidate calculation unit 22, counts the number of posture candidates, and outputs the posture or the posture candidates according to the number of posture candidates.

The input point counting unit 24 receives the three-dimensional coordinates and the posture candidates, counts the number of the three-dimensional coordinates, and outputs the number of input points and the posture or posture candidates. The input point counting unit 24 uses, for example, two-dimensional coordinates in order to count the number of input points.

The minimum error candidate extraction unit 25 receives a posture candidate as an input, calculates a posture that minimizes the error function, and outputs it.

The position candidate calculation unit 26 calculates a position corresponding to the posture when the posture is input, and outputs the position and posture. The position candidate calculation unit 26, when a posture candidate and the number of input points are input, converts the posture candidate into a posture, calculates a position corresponding to each posture, and outputs all the positions and postures.

In the present embodiment, the optimum solution candidate calculation unit 1 (more specifically, the coefficient calculation unit 11 and the simultaneous polynomial solution unit 12), the position and orientation calculation unit 2 (more specifically, the real number solution extraction unit 21, the orientation candidate). The calculation unit 22, the number of posture candidates totaling unit 23, the number of input points totaling unit 24, the minimum error candidate extracting unit 25, and the position candidate calculating unit 26) are, for example, hardware or a program designed to perform specific arithmetic processing or the like It is realized by an information processing apparatus such as a CPU that operates according to the above.

Next, the operation of this embodiment will be described with reference to FIG. FIG. 2 is a flowchart showing an example of the operation of the position / orientation estimation apparatus according to the present embodiment.

In the example illustrated in FIG. 2, first, when the three-dimensional coordinates and the two-dimensional coordinates corresponding to the three-dimensional coordinates are input, the coefficient calculation unit 11 uses the three-degree-of-freedom posture as a variable and the three-dimensional coordinates and the two-dimensional coordinates. Coefficients of simultaneous polynomials with the gradient of the error function representing the coordinate conversion relationship set to zero are calculated and output to the simultaneous polynomial solving unit 12 (step S11). Here, the degree of freedom of the camera posture is an index indicating how many variables each of the nine elements of the 3 × 3 matrix representing the posture of the camera are represented. Limiting the degree of freedom to 3 indicates that the rows (or columns) of the matrix R representing the camera posture are orthogonal to each other, and the L2 norms of the rows (or columns) are equal to each other. In the present embodiment, the simultaneous polynomial for which the coefficient calculation unit 11 calculates the coefficient is uniquely determined by how to express the degree of freedom of the camera posture in the error function. In accordance with a method for expressing the degree of freedom of the camera posture or the type of simultaneous polynomials defined in advance, using a set of three or more input three-dimensional coordinates and two-dimensional coordinates corresponding to the three-dimensional coordinates, a predetermined simultaneous system What is necessary is just to calculate the coefficient in a polynomial. In addition, the coefficient calculation unit 11 predefines a plurality of simultaneous polynomial types corresponding to, for example, a method for expressing the degree of freedom of the camera posture, and uses the simultaneous polynomials that are used according to the values of the setting parameters read at the time of activation or the like. It is also possible to select the type.

Further, it is assumed that the input coordinates are three or more different three-dimensional coordinates that are not in the camera position and degenerate arrangement, and two-dimensional coordinates corresponding to the three-dimensional coordinates. That is, the case where it is theoretically impossible to solve by the PnP problem is excluded on the input side. If the number of three-dimensional coordinates is different from the number of two-dimensional coordinates, the coefficient calculation unit 11 may return with an error without performing the subsequent operation on the assumption that there is no correspondence. Further, a coordinate set input unit (not shown) for inputting a set of 3D coordinates and 2D coordinates from the outside instead of the coefficient calculation unit 11 counts the number of input points, determines an error based on the number of each coordinate, and the like. It is also possible to perform.

Next, when the coefficients of the simultaneous polynomial are input, the simultaneous polynomial solving unit 12 solves the simultaneous polynomial using the three-dimensional coordinates input in step S11 and the corresponding two-dimensional coordinates (step S12). In addition, the simultaneous polynomial solving unit 12 outputs all solutions satisfying the simultaneous polynomial to the real number solution extracting unit 21. The solution of the simultaneous polynomial here includes an inappropriate value as a posture. For example, some or all of them may be complex numbers.

When all the solutions of the simultaneous polynomials solved by the simultaneous polynomial solving unit 12 are input, the real solution extracting unit 21 extracts all the real solutions from them and outputs them to the posture candidate calculating unit 22 (step S13). Yes). When no real number solution can be extracted from all the solutions of the simultaneous polynomials, the real number solution extraction unit 21 outputs a “no solution” flag as the position and orientation estimation result and ends the operation (No in step S13). Step S19). For example, if all the solutions are complex numbers, the real number solution extraction unit 21 outputs a “no solution” flag and ends the operation. The “no solution” flag may be a true / false value, for example, or may output a position / orientation value indicating no solution determined in advance.

If there is even one real number solution in the simultaneous polynomial, the posture candidate calculation unit 22 receives all real number solutions and calculates a posture candidate (step S14). The posture candidates obtained as a result of the calculation are output to the posture candidate number counting unit 23. The posture candidate calculation unit 22 may perform a process of converting the obtained real number solution into values (3 × 3 matrix representing the posture) each of which is a posture candidate.

When the posture candidates obtained as a result of the calculation of the posture candidate calculation unit 22 are input, the posture candidate count totaling unit 23 counts the number of input posture candidates, and the number of posture candidates as a counting result. The subsequent processing is branched according to (step S15).

If the number of posture candidates is one, the posture candidate number counting unit 23 outputs the posture to the position candidate calculation unit 27 (proceeds to step S18). If there are a plurality of posture candidates, the posture candidate count totaling unit 23 outputs all input posture candidates to the input score totaling unit 24 (proceeds to step S16).

When there are a plurality of posture candidates, the input point totaling unit 24, together with the plurality of posture candidates output by the posture candidate number totaling unit 23, the same three-dimensional coordinates as those input in step S11 or / And two-dimensional coordinates are input. The input point totaling unit 24 counts the number of input three-dimensional coordinates or two-dimensional coordinates, and branches the subsequent processing according to the input point number that is the counting result (step S16).

When the number of input points is 3, the input point totaling unit 24 outputs all the input posture candidates and the input point number information to the position candidate calculating unit 27 (proceeds to step S18). This is because when the number of input points is 3, the error in all the posture candidates is 0, and it is impossible to determine which candidate is closest to the correct answer. On the other hand, when the number of input points is 4 or more, since the error in each candidate is different, it is possible to select only one candidate with the smallest error. The posture candidates are output to the minimum error candidate extraction unit 25. The input point information is information indicating the number of input points, and may be numerical information indicating the number of input points themselves or a flag indicating that there are three points.

When there are a plurality of posture candidates and the number of input points is four or more, all posture candidates are input to the minimum error candidate extraction unit 25 by the determination processing in step S16. The minimum error candidate extraction unit 25 calculates one posture that minimizes the error function from all input posture candidates, and outputs the calculated one to the position candidate calculation unit 27 (step S17).

In step S18, if the number of posture candidates is one by the determination processing in step S15, the posture candidate is input to the position candidate calculation unit 27 as the posture with the minimum error. Alternatively, if there are a plurality of posture candidates and the number of input points is three by the determination processing in step S16, all the posture candidates are input to the position candidate calculation unit 27 together with the input point number information as the posture of the minimum error. Alternatively, the one posture whose error function is minimized by the process of step S <b> 17 is input to the position candidate calculation unit 27. When one posture is input, the position candidate calculation unit 27 calculates a position corresponding to the posture and outputs the position and posture as an estimation result. When a plurality of posture candidates and input point number information (in this case, information indicating three points) are input, the position candidate calculation unit 27 converts the input plurality of posture candidates into postures, The position corresponding to the posture is calculated, and all the calculated positions and postures are output as the position and posture of the estimation result.

Hereinafter, the present embodiment will be described using specific examples. In this embodiment, it is assumed that the number of three-dimensional coordinates and two-dimensional coordinates are always equal.

First, the definition of the error function will be described. In the following, the superscript T is a transpose of a matrix and a vector, 0 is a zero matrix and a zero vector, I is a unit matrix, det is a determinant, and ∥ is an L2 norm of a vector. Also, the input number is n points, relative to the i-th three-dimensional coordinates X _i, Qi Tsugika the two-dimensional coordinates of v _i, the camera posture matrix R, the camera position vector t, from t to v _i vector _c i the Do position, the proportional constant of _{v i} represents the lambda _i. Here, R has a constraint condition of det (R) = 1 and RR ^T = I.

Internal parameters in known camera, the relationship between X _i and v _i is expressed by the following equation (1).

λ _i v _i = RX _i + t−c _i (1)

Also, from the equation (1), the estimated error between R and t is expressed by the following equation (2).

In addition, if the result obtained by differentiating equation (2) with respect to λ _i is set to zero and substituted into equation (2), the following equation (3) is obtained.

Here, λ _i is eliminated by differentiation, but v _i may be converted to a distortion symmetric matrix indicating the outer product of the vectors and erased by multiplying both sides of the above equation (1).

Next, when the value obtained by differentiating the expression (3) with respect to t is set to zero, the optimum t is expressed by the following expression (4).

In equation (4), r is a vectorization of R. Further, when the optimum t is substituted into Expression (3), the error function is expressed by the following Expression (5) as a constrained nonlinear optimization problem.

min r ^T M ₁ r + M ₂ ^T r + M ₃ (5)
However, s, t, det (R) = 1, RR ^T = I

M ₁ , M ₂ , and M ₃ are represented by the following formula (6).

In the above formula (5), in order to express R with 3 degrees of freedom, for example, Euler angles with respect to the X axis, the Y axis, and the Z axis may be used, or the normalized quaternion q = [q ₀ , Q ₁ , q ₂ , q ₃ ] (∥q∥ = 1, q ₀ ≧ 0), or an orthogonal matrix with 9 variables and a determinant of 1 may be used. The following may also be used.

That is, in Equation (5), R is represented by an unnormalized quaternion q = [1, a, b, c], and the constraint condition of R is det (R) = ∥q∥ ⁶ , RR ^T = When relaxing with ∥q∥ ⁴ I, the error function is expressed by the following equation (7) as a three-variable unconstrained nonlinear optimization problem.

min r ^T M ₁ r + ∥q∥ ² M ₂ ^T r + ∥q∥ ⁴ M ₃ Formula (7)

R is represented by the following formula (8).

Since the error function of equation (5) and the error function of equation (7) have different constants, the optimum solution of equation (5) can be obtained by multiplying the optimum solution of equation (7) by a constant.

Therefore, the conditions for minimizing the error function defined by Equation (7) will be described below. Since Equation (7) is a non-constrained nonlinear optimization problem, the optimum solution is a solution of simultaneous polynomials with the gradient obtained by partial differentiation of Equation (7) by a, b, and c being zero.

x = [a, b, c] ^T , coefficient of simultaneous polynomial matrix N, vector z = [a ³ , a ² b, ab ² , b ³ , a ² c, abc, b ² c, ac ² , bc ² , c ³ , a ² , ab, b ² , ac, bc, c ² , a, b, c, 1] ^T , the simultaneous polynomial is expressed by the following equation (9).

The term order of z can be set to reverse lexicographic order with order, lexicographic order with order, lexicographic order, etc.

Next, the operation of each part in the present embodiment will be specifically described.

In step S11, the coefficient calculation unit 11 has X _i and v _i (1 ≦ i ≦ n) as inputs, and a coefficient matrix of a simultaneous polynomial in which the gradient obtained by partial differentiation of equation (7) with a, b, and c is zero. N is calculated and output to the simultaneous polynomial solving unit 12.

In step S12, the simultaneous polynomial solving unit 12 receives the coefficient matrix N and outputs a solution of the simultaneous polynomial Nz = 0 to the real number solution extracting unit 21.

The solution of simultaneous polynomials is, for example, to calculate the Gröbner basis according to the z term order. As a method for obtaining the Gröbner basis, there are a Buchberger algorithm, an F4 algorithm, an XL algorithm, and the like. Further, since the term appearing in z does not change every calculation, the method of Kukelova et al. May be used without obtaining the Gröbner basis. The method of Kukelova et al. Is a method for obtaining a term appearing in a Gröbner basis by pre-calculation, calculating Action Matrix that constitutes the Gröbner basis from the coefficient matrix N, and obtaining a solution by eigenvalue decomposition of the Action Matrix. When the simultaneous polynomial Nz = 0 is solved by these methods, a total of 27 solutions including complex solutions can be obtained.

Next, in step S13, the real solution extraction unit 21 receives 27 solutions of the simultaneous polynomial Nz = 0, calculates all real solutions from the 27 solutions, and outputs them to the posture candidate calculation unit 22. If all the solutions are complex numbers, the real number solution extraction unit 21 proceeds to step S19, outputs a no solution flag, and ends the operation. The no solution flag may be a true / false value, for example. The real number solution extraction unit 21 may output a position / orientation value indicating no solution determined in advance.

In step S14, the posture candidate calculation unit 22 receives all real number solutions, calculates posture candidates, and outputs the posture candidates to the posture candidate count totaling unit 23. Posture candidates are given by the following equation (10).

Equation (10) represents a calculation method for all nine elements of the posture (3 × 3 matrix). For example, if there are k real number solutions, it means that there are k sets of [a, b, c], and the posture candidate calculation unit 22 sets one set of k sets of [a, b, c]. Substituting into Equation (10) one by one, k candidate candidates (3 × 3 matrix) may be obtained.

Next, in step S15, the posture candidate number counting unit 23 receives the calculated posture candidates as input and calculates the number of posture candidates. If the number of posture candidates is one, the posture candidates are converted into postures by Expression (11) and output to the position candidate calculator 26. The posture candidate count totaling unit 23 may multiply each of the posture candidates by 1 / (∥q∥ ² ) using the corresponding [a, b, c] when converting the posture candidates into postures. Note that ∥q∥ ² = 1 + a ² + b ² + c ² .

If the number of posture candidates is plural as a result of the counting in step S15, the posture candidate count totaling unit 23 outputs all the posture candidates to the input score totaling unit 24.

In step S16, _{X i} or _{v i (1 ≦ i ≦ n} ) is input to the input number counting unit 24. When n = 3, the input point totaling unit 24 outputs all posture candidates to the position candidate calculating unit 26 together with n as input point information. The input score totaling unit 24 converts the posture candidates into postures using the equation (11), and outputs all postures as conversion results to the position candidate calculation unit 26 together with n as input point number information. May be. The input point totaling unit 24 may output n as the input point information, or may output a flag indicating that n = 3.

In step S16, when n ≧ 4, the input point totaling unit 24 outputs all posture candidates to the minimum error candidate extracting unit 25.

In step S17, the minimum error candidate extraction unit 25 receives a plurality of posture candidates, vectorizes them and substitutes them into equation (7), and uses the posture candidate that minimizes equation (7) as a position candidate. It outputs to the calculation part 26.

In step S18, one or a plurality of postures (or posture candidates) and the number of input points are input to the position candidate calculation unit 26. When a candidate for posture is input, the position candidate calculation unit 26 converts the posture into a posture by Expression (11), and then sets one set based on the input one posture or a plurality of postures and the number of input points. Calculates and outputs the position and orientation of multiple or multiple positions and orientations. The position is calculated by equation (4).

In the present embodiment, the position / orientation calculation unit 2 first extracts all real solutions and narrows down candidates for the optimal solution from the real solutions by batch processing, but sequentially processes the solutions of simultaneous polynomials one by one. Is also possible. For example, every time the simultaneous polynomial solving unit 12 obtains one solution, the solution is output to the real number solution extracting unit 21. If the real solution extraction unit 21 determines that the solution is a real number, conversion processing to a posture candidate by the posture candidate calculation unit 22, input point determination processing by the input point totaling unit 24, and minimum error candidate determination by the minimum error candidate extraction unit 25 Processing is executed by loop processing. The minimum error candidate extraction unit 25 stores, as the minimum error candidate determination process, a candidate with the minimum error each time, and every time a real solution is extracted, an error calculation, a minimum error comparison process, Overwriting processing of the minimum candidate and the minimum value may be performed. Here, the branching process based on the number of posture candidates and the number of input points may be collectively performed by the minimum error candidate extraction unit 25. Note that the branching process based on the number of posture candidates and the number of input points is not limited to sequential processing, and can be performed collectively by the minimum error candidate extraction unit 25. In such a case, the posture candidate calculation unit 22 and the input score totaling unit 24 can be omitted.

In this embodiment, the minimum error candidate extraction unit 25 uses Expression (7) as an error function, but Expression (5) and reprojection error can also be used. When using equation (5), the minimum error candidate extraction unit 25 converts all posture candidates into postures from equation (11) and substitutes them into equation (5). In the case of using a reprojection error, the minimum error candidate extraction unit 25 first converts all posture candidates into postures from Equation (11). Next, the minimum error candidate extraction unit 25 calculates positions for all postures. Next, the minimum error candidate extraction unit 26 receives all postures and positions, and outputs a position and posture that minimizes the reprojection error.

As described above, in this embodiment, the position and orientation of the camera or subject can be estimated with high accuracy and stability.

The reason is as follows. First, since the error function having the three-degree-of-freedom posture as a variable does not depend on the distribution of the three-dimensional coordinates or the number of input points, it is not necessary to apply a different position / posture calculation method. Moreover, since the nonlinear optimization method is not used, the calculation cost is low. Also, solving the simultaneous polynomial with the error function gradient being zero will calculate all the optimal solution candidates and extract the candidate with the smallest error function, so the output position and orientation is the global minimum solution This is because it is guaranteed.

That is, the position and orientation estimation method according to the present embodiment is different from the conventional method in which one method is used to search for one value that seems to be close to the global optimal solution and nonlinear optimization is performed using the value as an initial value. To find all candidates for the minimum value of the error function, and select the candidate with the minimum error as the optimum value.

The solution of the simultaneous polynomial with the error function gradient being zero is called the stop point, and is a necessary condition for the error function to be minimized. However, since the stopping point is not a sufficient condition, it is necessary to narrow down the global optimum solution from all candidates as described above. When nonlinear optimization is used as in a general method, a value that makes the simultaneous polynomial zero is searched for from an initial value that does not become zero even if it is substituted into the simultaneous polynomial. Then, since only one solution of the simultaneous polynomial is searched, there is no guarantee that the searched solution is a global optimal solution. Further, in the method of searching for a solution by optimization, it is difficult to discriminate between a local optimum solution and a global optimum solution.

For example, in the method described in Non-Patent Document 1, a solution can be obtained by a very simple calculation method by ignoring the degree of freedom of the posture. A value with poor accuracy is obtained by comparison. For this reason, it is necessary to perform nonlinear optimization so that the degree of freedom 3 is satisfied by using the method described in Non-Patent Document 3. On the other hand, in this embodiment, all candidates that can be the optimum value can be obtained without performing optimization. That is, a method is adopted in which all values that have the possibility of a global optimal solution are calculated and then narrowed down. Therefore, the position and orientation of the camera or subject can be estimated with high accuracy and stability.

Embodiment 2. FIG.
Next, a second embodiment of the present invention will be described with reference to the drawings. FIG. 3 is a block diagram illustrating a configuration example of the position and orientation estimation apparatus according to the second embodiment of the present invention. The position / orientation estimation apparatus according to this embodiment is different from the first embodiment shown in FIG. 1 in that the position / orientation calculation unit 2 further includes a secondary optimality verification unit 31.

The second-order optimality verification unit 31 receives the real number solution output from the real number solution extraction unit 21 as input, verifies the positive definiteness of the Hessian matrix obtained by second-order differentiation of the error function with a variable, and calculates a real number solution that is a positive definite value The solution is output to the optimal posture candidate unit 23 as a solution satisfying the second-order optimality sufficient condition.

Further, when the verification result shows that there is no real solution that is a positive definite value, the secondary optimality verification unit 31 aborts the subsequent processing and outputs a no solution flag.

In the present embodiment, the secondary optimality verification unit 31 is realized by, for example, hardware designed to perform specific arithmetic processing or the like, or an information processing apparatus such as a CPU that operates according to a program.

Next, the operation of this embodiment will be described with reference to FIG. FIG. 4 is a flowchart showing an example of the operation of the position / orientation estimation apparatus of this embodiment. Since the operations other than step S21 are the same as those in the first embodiment, description thereof will be omitted.

The quadratic optimality verification unit 31 verifies the positive definiteness of the Hessian matrix obtained by second-order differentiation of the error function with respect to the real solution of the simultaneous polynomial extracted by the real number solution extraction unit 21, and is a positive definite value. The real number solution is output to the optimal posture candidate unit 23 as a solution that satisfies the secondary optimality sufficient condition (step S21). When all the real solutions do not satisfy the secondary optimality sufficient condition, the secondary optimality verification unit 31 outputs a no-solution flag and ends the operation (No in step S21, step S19). The no solution flag may be a true / false value, for example. The secondary optimality verification unit 31 may output posture and position values representing no solution determined in advance.

Next, the operation of each part in the present embodiment will be specifically described. The operations other than the secondary optimality verification unit 31 are the same as those in the first embodiment.

The secondary optimality verification unit 31 receives the real number solution output from the real number solution extraction unit 21, and substitutes the real number solution into the Hessian matrix obtained by second-order differentiation of equation (7) by [a, b, c]. If the Hessian matrix is a positive definite value, the secondary optimality verification unit 31 outputs a solution satisfying the secondary optimality sufficient condition to the posture candidate calculation unit 23 (Yes in step S21). When there is no positive definite Hessian matrix, the secondary optimality verification unit 31 outputs a no-solution flag in the same manner as the real solution extraction unit 21, and ends (No in step S21). In order to verify the positive definiteness of the Hessian matrix, the second-order optimality verification unit 31 may verify, for example, whether all the minor determinants of the Hessian matrix are greater than zero, or the eigenvalue of the Hessian matrix may be It may be verified whether all are greater than zero.

In this embodiment, the position and orientation of the camera or subject can be estimated at higher speed. Further, the reliability of the position and orientation output as the estimation result can be further improved. This is because the second-order optimality verification unit 31 eliminates inappropriate solutions that satisfy the simultaneous polynomials but are not locally optimal solutions, and reduce the number of posture candidates.

In addition, the real solution obtained by the secondary optimality verification unit 31 is always a local optimal solution. Therefore, it is guaranteed that the solution having the smallest error among them is a global optimum solution.

Also, when the real solution obtained by the secondary optimality verification unit 31 does not exist, the calculation cost can be reduced by terminating the subsequent processing.

Embodiment 3. FIG.
Next, a third embodiment of the present invention will be described with reference to the drawings. FIG. 5 is a block diagram illustrating a configuration example of the position and orientation estimation apparatus according to the third embodiment of the present invention. In addition to the configuration of the first embodiment shown in FIG. 1 or the second embodiment shown in FIG. 3, this embodiment further includes a flatness verification unit 4.

The frontality verification unit 4 is a processing unit that receives the three-dimensional coordinates, corrects the position and orientation so that the three-dimensional coordinates are in front of the camera, and outputs them. The frontality verification unit 4 includes a plane determination unit 41 and a position / orientation correction unit 42.

The plane discriminating unit 41 receives the three-dimensional coordinates, determines whether the distribution of the three-dimensional coordinates is planar or non-planar, and outputs a flag indicating whether the plane is non-planar.

The position / orientation correction unit 42 receives a flag indicating whether it is a plane or non-planar, a three-dimensional coordinate, and a position / orientation. If it is a plane, the position / orientation correction unit 42 determines whether the three-dimensional coordinate is in front of the camera. If the three-dimensional coordinates are not in front of the camera, the position / orientation correction unit 42 corrects and outputs the position / orientation.

In a perspective projection model camera, if the distribution of the three-dimensional coordinates is a plane, there are two solutions for the position and orientation of the camera due to indefiniteness of the sign. Of the two solutions, one corresponds to the front of the camera and the other corresponds to the back of the camera. It is practically impossible for the three-dimensional coordinates to exist behind the camera. Therefore, in this embodiment, the frontality verification unit 4 determines that a posture in which three-dimensional coordinates do not exist in front of the camera is an inappropriate value as a posture even if the solution satisfying the simultaneous polynomial is a real number solution. The position and orientation are corrected so that the dimensional coordinates are in front of the camera, and processing for removing the indefiniteness of the code is performed.

In this embodiment, the frontality verification unit 4 (more specifically, the plane determination unit 41 and the position / orientation correction unit 42) operates in accordance with, for example, hardware or a program designed to perform specific arithmetic processing or the like. This is realized by an information processing device such as a CPU.

Next, the operation of this embodiment will be described with reference to FIG. FIG. 6 is a flowchart illustrating an example of the operation of the position / orientation estimation apparatus according to this embodiment. Hereinafter, the operations other than Step S31 and Step S32 are the same as those in the second embodiment, and thus the description thereof is omitted.

The plane discriminating unit 41 receives the three-dimensional coordinates, determines whether the distribution of the three-dimensional coordinates is plane or non-plane, and outputs a flag indicating whether the plane is non-plane (step S31).

The position / orientation correction unit 42 receives as input the three-dimensional coordinates input in step S11, the flag indicating whether the plane or non-plane is output from the plane determination unit 41, and the position / orientation output from the position candidate calculation unit 27. If the flag indicates a plane (Yes in step S31), it is determined whether or not the three-dimensional coordinates are in front of the camera. If the three-dimensional coordinates are not in front of the camera, the position / orientation correction unit 42 corrects the position / orientation and outputs the position / orientation as an estimation result (step S32). In the case of a non-planar surface, the position / orientation correction unit 42 outputs the input position / orientation as it is as the position / orientation of the estimation result.

Next, the operation of each part in this embodiment will be specifically described. Operations other than the frontality verification unit 4 are the same as those in the second embodiment. Hereinafter, as an example, the camera is a perspective projection model with c _i = 0, and the three-dimensional coordinate distribution is a plane with all Z coordinates being zero.

First, when the j-th column of R is r _j , the equation (1) is expressed by the following equation (12).

In addition, since the expression (12) is indefinite with respect to the sign, [r ₁ , r ₂ , t] and [−r ₁ , −r ₂ , −t] cannot be distinguished.

The plane discriminating unit 41 receives the three-dimensional coordinates, determines whether the distribution of the three-dimensional coordinates is plane or non-plane, and outputs a flag indicating whether the plane is non-plane (step S31). In order to determine whether the distribution of the three-dimensional coordinates is a plane in which all the Z coordinates are 0 or non-planar, the plane determination unit 41 may calculate the absolute value of the sum of the sign functions of all the Z coordinates, for example. Alternatively, the determination may be made based on whether the determinant of the moment matrix of the three-dimensional coordinates is close to zero. The sign function is a function that returns +1 if the input value is positive, -1 if it is negative, and 0 if it is 0.

The plane discriminating unit 41 sets the plane having the Z coordinate of 0 if the absolute value of the sum of the sign functions of all the Z coordinates is equal to or less than the threshold value. For example, the threshold is determined as 80% of the number of input points.

Also, when using a moment matrix, if all the Z coordinates are planes of zero, one of the eigenvalues of the moment matrix is zero. Therefore, the plane discriminating unit 41 discriminates whether the plane is a zero plane or a non-plane depending on whether the determinant of the moment matrix is close to zero.

The position / orientation correction unit 42 receives a plane / non-planar flag, a three-dimensional coordinate, and a position / orientation. When the plane flag is input (Yes in step S31), whether the three-dimensional coordinate is in front of the camera. Determine. If the three-dimensional coordinates are not in front of the camera, the position / orientation correction unit 42 corrects and outputs the position / orientation (step S32).

When the non-planar flag is input (No in step S31), the position / orientation correction unit 42 outputs the input position / orientation as it is. Note that the position / orientation correction unit 42 counts the number of three-dimensional coordinates existing in front of the camera when a plane flag having a Z coordinate of 0 is input. If it is determined that the input three-dimensional coordinate group is not in front of the camera, the position / orientation correction unit 42 corrects and outputs the position / orientation. As the flag, for example, a true / false value may be used, or a value or symbol representing a plane determined in advance may be used.

The position / orientation correction unit 42 uses, for example, a sign function in order to count the number of three-dimensional coordinates existing on the front surface of the camera. If the k-th row of R is r _k with a bar and sgn is a sign function, the number of three-dimensional coordinates existing on the front surface of the camera is expressed by Expression (13).

The position / orientation correction unit 42 determines that the camera is inverted if Expression (13) is equal to or less than the threshold value. For example, the threshold is determined as 80% of the number of input points. When the camera is inverted, the posture is corrected to [−r ₁ , −r ₂ , r ₃ ] and the position is corrected to −t.

In the present embodiment, it is possible to estimate the position and orientation of the camera or the subject with higher reliability than in the first embodiment and the second embodiment. This is because the plane discriminating unit 41 outputs a position and orientation that are geometrically consistent by correcting the position and orientation so that the three-dimensional coordinates are always in front of the camera.

In this embodiment, the frontality verification unit 4 accepts a flag indicating the distribution of the three-dimensional coordinates as an input, omits the processing when the distribution is non-planar, and the distribution is flat or the distribution is unknown. It is also possible to perform processing in some cases.

Further, when the three-dimensional distribution is unknown or when it is desired to omit the operation of the plane discriminating unit 41, the position / orientation correcting unit 42 determines whether or not to output without performing the correcting operation, and outputs It is also possible to output the position / orientation as the estimation result only when it is determined that it is acceptable. For example, the position / orientation correction unit 42 counts the number of three-dimensional coordinates existing on the front surface of the camera based on the equation (13), and if the number is equal to or greater than the threshold, the input three-dimensional coordinate group is the front surface of the camera. The position / orientation input as the estimation result is output as it is. If the number of three-dimensional coordinates is equal to or less than the threshold, the position / orientation correction unit 42 determines that the input three-dimensional coordinate group is not in front of the camera, and does not output the position / orientation input as an estimation result, but generates an error. May be output.

In the above description, the position / orientation correction unit 42 receives the three-dimensional coordinates and counts the number of the three-dimensional coordinates existing on the front surface of the camera using the sign function, so that the input three-dimensional coordinate group becomes the camera. Although the method for determining whether or not it is in front of the camera has been shown, it is possible to determine whether or not the input three-dimensional coordinate group is in front of the camera by using two-dimensional coordinates as input. It is also possible to determine.

For example, the position / orientation correction unit 42 uses the following equation (14) instead of equation (13) to calculate as follows.

Here, the position / orientation correction unit 42 indicates that the input three-dimensional coordinate group exists in front of the camera if the above expression (14) is positive, and does not exist in front of the camera if it is negative (that is, the camera It may be determined that it exists on the back side.

Embodiment 4 FIG.
Next, a fourth embodiment of the present invention will be described with reference to the drawings. FIG. 7 is a block diagram illustrating a configuration example of the position and orientation estimation apparatus according to the fourth embodiment of the present invention. The present embodiment further includes a normalization unit 51 and a denormalization unit 52 in addition to the third embodiment.

The normalization unit 51 receives the three-dimensional coordinates and normalizes the three-dimensional coordinates based on the normalization parameters determined in advance or the normalization parameters calculated from the three-dimensional coordinates. Then, the normalization unit 51 outputs normalized three-dimensional coordinates and normalization parameters.

The denormalization unit 52 receives the normalization parameter and the position and orientation, denormalizes the position and orientation, and outputs the result.

In the present embodiment, the normalization unit 51 and the denormalization unit 52 are realized by, for example, hardware designed to perform specific arithmetic processing or the like, or an information processing device such as a CPU that operates according to a program.

Next, the operation of this embodiment will be described with reference to FIG. FIG. 8 is a flowchart showing an example of the operation of the position / orientation estimation apparatus of this embodiment. Hereinafter, operations other than step S41 and step S42 are the same as those in the third embodiment, and thus the description thereof is omitted.

The normalization unit 51 first receives the three-dimensional coordinates, and normalizes the three-dimensional coordinates based on the normalization parameters determined in advance or the normalization parameters calculated from the three-dimensional coordinates (step S41). Then, the normalization unit 51 outputs the normalized three-dimensional coordinates to the coefficient calculation unit 11 and outputs the normalization parameters to the inverse normalization unit 42.

The denormalization unit 52 receives the normalization parameter output from the normalization unit 51 and the position and orientation output from the position candidate calculation unit 27, denormalizes the position and orientation, and outputs the result (step S42). .

Next, the operation of each part in the present embodiment will be specifically described. Operations other than the normalization unit 51 and the denormalization unit 52 are the same as those in the third embodiment.

For cameras with known internal parameters, the range of the two-dimensional coordinates is often normalized to an appropriate value such as [−1, +1], so that it is not necessary to normalize. On the other hand, the range of the three-dimensional coordinate is, for example, hundreds or thousands if the unit is millimeters, which causes numerical instability such as overflowing in the middle of calculation.

Therefore, the normalization unit 51 normalizes the three-dimensional coordinates and matches the range with the two-dimensional coordinates to improve numerical stability.

Here, as an example, normalization using a translation vector p and a scale s will be described. The translation vector p may be, for example, the center of gravity of the three-dimensional coordinate or the median value. The scale s may be, for example, a standard deviation of three-dimensional coordinates or a maximum value of three-dimensional coordinates.

When the range of the three-dimensional coordinates is fixed, p or s calculated in advance may be used as the translation vector p. The normalization unit 51 receives the three-dimensional coordinates as input, calculates a predetermined normalization parameter or normalization parameter, and normalizes the three-dimensional coordinates. Then, the normalization unit 51 outputs the normalized three-dimensional coordinates and normalization parameters (Step S41).

Three-dimensional coordinates obtained by normalizing is _{s (X} i -p). Normalization parameters are p and s. The denormalization unit 52 receives the normalization parameter and the position and orientation, denormalizes the position and orientation, and outputs the result (step S413). In this embodiment, since the posture is not affected by normalization, it is output as it is. The position is denormalized to (1 / s) t-Rp and output.

In this embodiment, the position and orientation of the camera or subject can be estimated with higher accuracy. The reason for this is that normalizing the three-dimensional coordinates reduces the order of values in the middle of calculation and improves numerical stability.

Each of the above embodiments can be realized not only by the position / orientation estimation apparatus realized by hardware corresponding to each unit but also by an information processing system as shown in FIG. FIG. 9 is a block diagram when the position / orientation estimation apparatus according to the present invention is implemented in an information processing system. The information processing system shown in FIG. 9 is a general information processing system including a processor 61, a program memory 62, and a storage medium 63.

The storage medium 63 may be a storage area composed of separate storage media, or may be a storage area composed of the same storage medium. As the storage medium, a magnetic storage medium such as a RAM or a hard disk can be used.

The program memory 61 includes the above-described optimal solution candidate calculation unit 1 (more specifically, the coefficient calculation unit 11 and the simultaneous polynomial solution unit 12) and the optimal position / orientation candidate calculation unit 2 (more specifically, a real number). Solution extraction unit 21, posture candidate calculation unit 22, posture candidate calculation unit 23, input point totaling unit 24, minimum error candidate extraction unit 25, position candidate calculation unit 26), and secondary optimality verification unit 31 And the frontality verification unit 4 (more specifically, the plane discriminating unit 41 and the position / orientation correction unit 42), the normalization unit 51, and the denormalization unit 52 are processed by the processor 61. Is stored, and the processor 61 operates in accordance with this program. The processor 61 may be a processor that operates according to a program such as a CPU, for example.

Thus, the present invention can be realized by a computer program. Note that it is not necessary to operate all the parts that can be operated by the program by the program, and a part may be configured by hardware. Moreover, you may implement | achieve as a separate unit, respectively.

Next, the outline of the present invention will be described. FIG. 10 is a block diagram showing an outline of the present invention. The position / orientation estimation apparatus shown in FIG. 9 is an apparatus that receives a set of three or more three-dimensional coordinates related to an object and two-dimensional coordinates corresponding to the three-dimensional coordinates on the image, and includes an optimal solution candidate calculation unit 101. And position / orientation calculation means 102.

The optimum solution candidate calculation means 101 uses a set of the input 3D coordinates and 2D coordinates and uses a set of 3 degrees of freedom as a variable for a predetermined error function representing the conversion relationship between the 3D coordinates and the 2D coordinates. Thus, all solutions satisfying the simultaneous polynomial with the error function gradient set to zero are calculated. In the above embodiment, the optimal solution candidate calculation unit 101 is shown as the optimal solution candidate calculation unit 1, for example.

The position / orientation calculation means 102 extracts the optimal solution that minimizes the error function from all the solutions of the simultaneous polynomials calculated by the optimal solution candidate calculation means 101, and images the object based on the extracted optimal solution. Calculate the camera's position and orientation. The position / orientation calculation means 102 is shown as, for example, the position / orientation calculation unit 2 in the above embodiment.

The optimal solution candidate calculation means 101 receives a set of three-dimensional coordinates and two-dimensional coordinates as input, coefficient calculation means for calculating coefficients of simultaneous polynomials, and coefficients of simultaneous polynomials calculated by the coefficient calculation means as inputs, Simultaneous polynomial solving means for calculating all solutions satisfying the simultaneous polynomials may be included.

Further, the position / orientation calculation means 102 extracts all real solutions from all solutions of the simultaneous polynomials, and outputs a real number solution extraction means for outputting a flag indicating all the extracted real solutions or no solutions, and real number solution extraction Posture candidate calculation means for converting all real solutions extracted by the means into one posture candidate, the number of posture candidates converted by the posture candidate calculation means, the input three-dimensional coordinates and two-dimensional A minimum error candidate extraction unit that extracts one or a plurality of postures having a minimum error function from all the posture candidates converted by the posture candidate calculation unit based on the number of input points that is the number of sets of coordinates; The position corresponding to each posture is calculated based on one or a plurality of postures in which the error function extracted by the minimum error candidate extracting means is minimized, and each position is output as a set of positions and postures. A position candidate calculation unit that may contain. Note that the conversion processing by the posture candidate calculation means and the extraction processing by the minimum error candidate extraction means are not only performed in batch after all the real number solutions and posture candidates to be processed are input, but also the data to be processed It is also possible to process sequentially according to the input method.

The position / orientation calculation means 102 includes secondary optimality verification means for calculating and outputting real solutions satisfying the secondary optimality sufficient condition among all real number solutions extracted by the real number solution extraction means, The posture candidate calculation unit may convert all the real solutions that are determined to satisfy the secondary optimality sufficient condition by the secondary optimality verification unit into one posture candidate.

In addition to the configuration shown in FIG. 1, the position / orientation estimation apparatus determines whether or not the input three-dimensional coordinates are in front of the camera. Frontality verification means for correcting the position and orientation output from the calculation means or outputting a flag indicating an error may be provided.

Further, the frontality verification means determines whether the three-dimensional coordinates are planar or non-planar, outputs a determination result, the three-dimensional coordinates, the determination result, and the position and orientation output from the position and orientation calculation means If the determination result is a plane, it is determined whether or not the three-dimensional coordinate is in front of the camera. May be included.

In addition to the configuration shown in FIG. 1, the position / orientation estimation apparatus receives a set of three-dimensional coordinates and two-dimensional coordinates, when a normalization parameter or a three-dimensional parameter determined in advance is input to the position / orientation estimation apparatus. Based on the normalization parameter calculated by the coordinates, the normalization means for normalizing the three-dimensional coordinates, the normalization parameters, and the position and orientation calculated using the normalized three-dimensional coordinates are input, and the normalization parameters And a denormalization unit that calculates and outputs a position and orientation that have been denormalized based on the above. In such a case, the optimal solution candidate calculation unit 101 calculates all solutions satisfying the simultaneous polynomials using the three-dimensional coordinates normalized by the normalization unit.

The position / orientation calculation means 102 described above receives the real number solution extraction means for outputting a flag indicating all real solutions or no solution from the simultaneous polynomial solution, and the extracted real number solution as inputs, and calculates the attitude candidates. Posture candidate calculation means, posture candidate number input means for calculating the number of posture candidates, input three-dimensional coordinates or two-dimensional coordinates, and posture candidates are input and input. Input point totaling means for calculating the number of three-dimensional coordinates or two-dimensional coordinates, and outputting a plurality of postures for which the error function is minimized, based on the number of input points, or a posture candidate for which the minimum error is to be calculated; The candidate for the posture to be calculated as the minimum error is input, the one posture that minimizes the error function is calculated and output, and the one or more postures that minimize the error function are input. , The positions corresponding to the force posture calculated, and may be configured to include a position candidate calculation unit for outputting.

As mentioned above, although this invention was demonstrated with reference to embodiment and an Example, this invention is not limited to the said embodiment and Example. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.

This application claims priority based on Japanese Patent Application 2011-113969 filed on May 20, 2011, the entire disclosure of which is incorporated herein.

When the internal parameters of the camera are known, the present invention can be suitably applied to a use for estimating the position and orientation of the camera or subject.

DESCRIPTION OF SYMBOLS 1 Optimal solution candidate calculation part 11 Coefficient calculation part 12 Simultaneous polynomial solution part 2 Optimal position and posture candidate calculation part 21 Real number solution extraction part 22 Posture candidate calculation part 23 Posture candidate totaling part 24 Input point totaling part 25 Error phase candidate extraction part 26 Position Candidate calculation unit 31 Secondary optimality verification unit 4 Frontality verification unit 41 Front discrimination unit 42 Position and orientation correction unit 51 Normalization unit 52 Denormalization unit 61 Recording medium 62 Processor 63 Program memory 101 Optimal solution candidate calculation means 102 Position and orientation Calculation means

Claims

A set of three or more three-dimensional coordinates related to the object and two-dimensional coordinates corresponding to the three-dimensional coordinates on the image is input,
Using the set of the input three-dimensional coordinates and the two-dimensional coordinates, the error is determined with respect to a predetermined error function representing the conversion relationship between the three-dimensional coordinates and the two-dimensional coordinates using a posture of three degrees of freedom as a variable. An optimal solution candidate calculation means for calculating all solutions satisfying the simultaneous polynomial with the function gradient set to zero;
From all the solutions of the simultaneous polynomial calculated by the optimal solution candidate calculating means, an optimal solution that minimizes the error function is extracted, and the position of the camera that has photographed the object based on the extracted optimal solution A position / orientation estimation apparatus comprising: a position / orientation calculation means for calculating an attitude.
The optimal solution candidate calculation means includes:
Coefficient calculating means for taking a set of the three-dimensional coordinates and the two-dimensional coordinates as input and calculating a coefficient of the simultaneous polynomial;
The position / orientation estimation apparatus according to claim 1, further comprising: a simultaneous polynomial solving unit that receives the coefficients of the simultaneous polynomials calculated by the coefficient calculating unit and calculates all solutions satisfying the simultaneous polynomials.
The position / orientation calculation means includes:
Real number solution extracting means for extracting all real number solutions from all solutions of the simultaneous polynomial and outputting a flag indicating all extracted real number solutions or no solution;
Posture candidate calculation means for converting all the real solutions extracted by the real number solution extraction means into one posture candidate;
Based on the number of posture candidates converted by the posture candidate calculation means and the number of input points that is the number of sets of the input three-dimensional coordinates and the two-dimensional coordinates, all of the positions converted by the posture candidate calculation means Minimum error candidate extraction means for extracting one or a plurality of postures that minimize the error function from among the posture candidates,
Position candidate calculation means for calculating a position corresponding to each of the attitudes based on one or a plurality of attitudes with which the error function extracted by the minimum error candidate extracting means is minimized, and outputting each position as a set of position and attitude; The position and orientation estimation apparatus according to claim 1 or 2.
The position / orientation calculation means includes a secondary optimality verification means for calculating and outputting a real number solution satisfying a secondary optimality sufficient condition among all the real number solutions extracted by the real number solution extraction means,
The position / orientation estimation according to claim 3, wherein the attitude candidate calculation unit converts all real solutions determined to satisfy the second-order optimality sufficient condition by the second-order optimality verification unit into one attitude candidate. apparatus.
It is determined whether or not the input three-dimensional coordinates are in front of the camera. If it is determined that the input three-dimensional coordinates are not in front, a position / orientation output from the position / orientation calculation means is corrected or a flag indicating an error is displayed. The position and orientation estimation apparatus according to any one of claims 1 to 4, further comprising a frontality verification means for outputting.
The frontality verification means includes
Plane discriminating means for discriminating whether the three-dimensional coordinates are planar or non-planar and outputting a discrimination result;
If the three-dimensional coordinates, the determination result, and the position / orientation output from the position / orientation calculation means are input and the determination result is a plane, it is determined whether the three-dimensional coordinates are in front of the camera. The position / orientation estimation apparatus according to claim 5, further comprising: a position / orientation correction unit that corrects the position / orientation when it is determined that the position / orientation is not present.
When a set of 3D coordinates and 2D coordinates is input to the position and orientation estimation apparatus, the 3D coordinates are normalized based on a predetermined normalization parameter or a normalization parameter calculated from the 3D coordinates. Normalization means,
A denormalization unit that receives the normalization parameter and the position and orientation calculated using the normalized three-dimensional coordinates, and calculates and outputs a denormalized position and orientation based on the normalization parameter; The position / orientation estimation apparatus according to any one of claims 1 to 6.
Three or more three-dimensional coordinates related to the object and a set of two-dimensional coordinates corresponding to the three-dimensional coordinates on the image are input, and the set of the three-dimensional coordinates and the two-dimensional coordinates input is used to make three freedoms. With respect to a predetermined error function that represents the transformation relationship between the three-dimensional coordinates and the two-dimensional coordinates using the degree attitude as a variable, calculate all solutions satisfying the simultaneous polynomial with the gradient of the error function being zero,
An optimal solution that minimizes the error function is extracted from all the solutions of the calculated simultaneous polynomials, and the position and orientation of the camera that captured the object is calculated based on the extracted optimal solution. A position and orientation estimation method.
On the computer,
Using a set of three or more three-dimensional coordinates related to the input object and two-dimensional coordinates corresponding to the three-dimensional coordinates on the image, the three-dimensional coordinates and the two-dimensional coordinates with a posture of three degrees of freedom as a variable A process for calculating all solutions satisfying the simultaneous polynomial with the gradient of the error function being zero, and all of the calculated solutions of the simultaneous polynomial A position and orientation estimation program for executing processing for extracting an optimal solution that minimizes an error function and calculating a position and orientation of a camera that has photographed the object based on the extracted optimal solution.