WO2023130842A1 - Camera pose determining method and apparatus - Google Patents

Camera pose determining method and apparatus

Info

Publication number
WO2023130842A1
Authority
WO
WIPO (PCT)
Prior art keywords
camera
key frames
feature
pose
adjacent
Application number
PCT/CN2022/132927
Other languages
French (fr)
Chinese (zh)
Inventor
赵德力
彭登
陶永康
傅志刚
曾阳露
Original Assignee
广东汇天航空航天科技有限公司
Application filed by 广东汇天航空航天科技有限公司
Publication of WO2023130842A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/20 Image enhancement or restoration using local operators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20021 Dividing image into blocks, subimages or windows

Definitions

  • the present application relates to the technical field of image processing, in particular to a camera pose determining method and a camera pose determining device.
  • Visual odometry estimates mileage based on visual images, and is widely used in the positioning of mobile devices in unknown environments.
  • the visual odometry architecture is usually divided into two parts: the front end and the back end.
  • the front end is responsible for constructing multi-view geometric constraints, and the back end performs nonlinear least squares optimization based on the reprojection error of the front end.
  • the feature point matching method uses matched feature points to construct co-visibility constraints between multiple image frames. It is characterized by high accuracy, but because it involves feature point extraction as well as descriptor computation and matching, it is computationally expensive and prone to matching failures.
  • the optical flow method does not compute or match descriptors, and tracks feature points directly through optical flow, which greatly reduces the computational overhead, but it also suffers from relatively low accuracy and is prone to losing track of points.
  • essentially, both the feature point method and the optical flow method rely on the stability of the feature points in the scene. Therefore, in weakly textured scenes with sparse feature points, both front-end constraint methods have stability problems to a greater or lesser degree.
  • in view of the above problems, the embodiments of the present application are proposed to provide a camera pose determination method and a corresponding camera pose determination device that overcome the above problems or at least partially solve them.
  • in order to solve the above problems, the embodiment of the present application discloses a camera pose determination method, including:
  • acquiring multiple image frames collected by the camera, where the multiple image frames include key frames and non-key frames;
  • dividing the non-key frames into multiple image blocks, extracting corner points satisfying a preset feature point condition from the multiple image blocks as feature points, and determining a first camera relative pose by the optical flow method according to matched feature points between adjacent non-key frames;
  • extracting feature points from the key frames, and determining a second camera relative pose by the feature point matching method according to matched feature points between adjacent key frames;
  • determining an optimized camera pose corresponding to the current key frame according to the first camera relative poses between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames.
  • in some embodiments, the method further includes: extracting line segments satisfying a preset feature line condition from the key frames as feature lines; and determining a third camera pose between adjacent key frames according to matched feature lines between the adjacent key frames.
  • in that case, determining the optimized camera pose corresponding to the current key frame includes: determining the optimized camera pose according to the first camera relative poses between multiple adjacent non-key frames, the second camera relative pose between adjacent key frames, and the third camera pose between adjacent key frames.
  • in some embodiments, this determination includes: performing Kalman filter fusion on the first camera relative poses between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames to obtain an estimated camera pose corresponding to the current key frame; determining a feature point reprojection error for the current key frame according to the estimated camera pose; determining a feature line reprojection error for the current key frame according to the third camera pose between adjacent key frames; and determining the optimized camera pose corresponding to the current key frame based on the feature point reprojection error and the feature line reprojection error.
  • in some embodiments, extracting corner points satisfying the preset feature point condition as feature points from the multiple image blocks includes:
  • performing low-pass filtering on the multiple image blocks; calculating corner response values for the pixels of each low-pass-filtered image block; sorting the corner response values of the pixels in each image block and selecting a preset number of pixels as initial corner points according to the sorting result; determining the degree of dispersion of the initial corner points in each image block; setting a corresponding screening ratio for each image block according to the degree of dispersion and the corner response values; screening candidate corner points from the corresponding initial corner points according to the screening ratio of each image block; and screening target corner points from the candidate corner points with the random sample consensus algorithm.
  • in some embodiments, determining the degree of dispersion of the initial corner points in each image block includes: clustering the initial corner points of the image block to obtain a cluster center; and determining the pixel distance from each initial corner point to the cluster center, the degree of dispersion being determined from these pixel distances.
  • in some embodiments, the degree of dispersion is the sum of the pixel distances of the initial corner points in the image block, and setting a corresponding screening ratio for each image block according to the degree of dispersion and the corner response values includes:
  • computing the mean square error sum of the corner response values of the initial corner points of the image block; calculating an evaluation parameter from the pixel distance sum of the initial corner points and the mean square error sum of the corner response values; and determining the screening ratio of each image block according to the evaluation parameters of the image blocks.
  • in some embodiments, extracting feature points from the key frames and determining the second camera relative pose according to matched feature points between adjacent key frames includes: setting a window of a preset pixel size centered on each feature point in the key frame; determining the absolute differences between the gray value of the feature point and the gray values of the other pixels in the window; generating an adjacency matrix from these absolute gray differences; generating a description vector from the adjacency matrix as the descriptor of the feature point; determining matched feature points between adjacent key frames according to the position information and descriptors of their feature points; and determining the second camera relative pose between the adjacent key frames according to the matched feature points.
  • in some embodiments, extracting line segments satisfying the preset feature line condition from a key frame as feature lines includes: detecting line segments in the key frame, taking two non-parallel line segments as a line segment pair, and selecting the line segment pairs that satisfy the preset feature line condition as feature lines.
  • the preset feature line condition includes: the length of each line segment is greater than or equal to a preset length threshold, the distance between the intersection point of the line segment pair and the line segment pair is less than or equal to a preset distance threshold, and the intersection point of the line segment pair lies within the image.
  • in some embodiments, determining the third camera pose between adjacent key frames includes: taking two non-parallel feature lines in the key frame as a feature line segment pair; determining the descriptor of the feature line segment pair by taking the acute-angle bisector at the intersection point of the pair as the orientation of the descriptor and the pixel size of the pixel block centered on the intersection point as its scale; determining matched feature lines between adjacent key frames according to the position information and descriptors of their feature lines; and determining the third camera pose between the adjacent key frames according to the matched feature lines.
  • the embodiment of the present application also discloses a camera pose determining device, including:
  • the image acquisition module is used to acquire multiple image frames collected by the camera; the multiple image frames include key frames and non-key frames;
  • a first pose determination module, used to divide non-key frames into multiple image blocks, extract corner points satisfying the preset feature point condition from the multiple image blocks as feature points, and determine the first camera relative pose by the optical flow method according to matched feature points between adjacent non-key frames;
  • a second pose determination module, used to extract feature points from key frames, and determine the second camera relative pose by the feature point matching method according to matched feature points between adjacent key frames;
  • an optimized pose determination module, used to determine the optimized camera pose corresponding to the current key frame according to the first camera relative poses between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames.
  • the embodiment of the present application also discloses an electronic device, including a processor, a memory, and a computer program stored on the memory and capable of running on the processor; when the computer program is executed by the processor, the steps of the above camera pose determination method are realized.
  • the embodiment of the present application also discloses a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the above steps of the method for determining the camera pose are realized.
  • through the embodiments of the present application, multiple image frames collected by the camera can be acquired, including key frames and non-key frames; the non-key frames are divided into multiple image blocks, corner points satisfying the preset feature point condition are extracted from the multiple image blocks as feature points, and the first camera relative pose is determined by the optical flow method according to matched feature points between adjacent non-key frames; feature points are extracted from the key frames, and the second camera relative pose is determined by the feature point matching method according to matched feature points between adjacent key frames; and the optimized camera pose corresponding to the current key frame is determined according to the first camera relative poses between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames.
  • the embodiment of the present application integrates the respective advantages of the optical flow tracking algorithm and the feature point matching algorithm, achieving a balance between pose accuracy and processing efficiency; moreover, by determining uniformly distributed feature points for non-key frames and using the optical flow method to determine the first camera relative pose between adjacent non-key frames, the calculation accuracy of the first camera relative pose can be improved, as can the calculation efficiency and stability of the optical flow method.
  • FIG. 1 is a flow chart of the steps of a camera pose determination method provided in an embodiment of the present application;
  • FIG. 2 is a flow chart of Kalman filtering combining the feature point matching method and the optical flow method in an embodiment of the present application;
  • FIG. 3 is a flow chart of the steps of another camera pose determination method provided in an embodiment of the present application;
  • FIG. 4 is a schematic diagram of uniformly screening feature points from each image block in an embodiment of the present application;
  • FIG. 5 is a schematic diagram of determining the reprojection error of a feature line;
  • FIG. 6 is a flow chart of a visual odometry optimization method based on camera pose optimization in an embodiment of the present application;
  • FIG. 7 is a structural block diagram of a camera pose determination device provided by an embodiment of the present application.
  • VIO: Visual-Inertial Odometry, visual-inertial odometer.
  • IMU: Inertial Measurement Unit.
  • the IMU can accurately measure the motion in a short time.
  • the IMU measurement can detect the motion between frames and provide constraints between frames to ensure that the system continues to run.
  • the fusion of vision and IMU will also make the pose estimation more accurate.
  • the core idea of the embodiment of the present application is to integrate the advantages of the optical flow tracking algorithm and the feature point matching algorithm to achieve a balance between pose accuracy and processing efficiency, and to determine uniformly distributed feature points for non-key frames.
  • the optical flow method is then used to determine the first camera relative pose between adjacent non-key frames, which can improve the calculation accuracy of the first camera relative pose and improve the calculation efficiency and stability of the optical flow method.
  • FIG. 1 shows a flow chart of the steps of a method for determining a camera pose provided by an embodiment of the present application.
  • the method may specifically include the following steps:
  • Step 101 acquiring a plurality of image frames collected by a camera; the plurality of image frames include key frames and non-key frames.
  • cameras can be installed on mobile devices such as vehicles and aircraft, and image sequences of the surrounding environment can be collected through the cameras.
  • the image sequence may include multiple image frames; an image frame may be a key frame or a non-key frame, and key frames are representative image frames. A frame may be selected as a key frame, for example, when a selection metric between frames exceeds a certain threshold, such as 0.02.
  • as shown in FIG. 2, which is a flow chart of Kalman filtering combining the feature point matching method and the optical flow method in an embodiment of the present application:
  • the camera pose of a key frame image is estimated by the feature point matching algorithm;
  • the camera pose changes of non-key frames are determined by the optical flow method and accumulated;
  • the camera pose obtained by the feature point matching algorithm and the accumulated camera pose obtained by the optical flow method are then fused with a Kalman filter to obtain an optimized camera pose.
  • Step 102 divide the non-key frame into multiple image blocks, extract the corner points satisfying the preset feature point condition from the multiple image blocks as feature points, and determine the first camera relative pose by the optical flow method according to matched feature points between adjacent non-key frames.
  • the movement of an object in the real scene not only causes the corresponding points on the image to move, but also causes the brightness pattern of the corresponding pixels to change.
  • so-called optical flow is the velocity of the motion of the pixel brightness pattern on the image, that is, the pixel brightness change caused by the motion of points on the object as projected onto the image plane.
  • the optical flow method can determine the relative camera pose between adjacent image frames based on the optical flow constraints of pixels between adjacent image frames.
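  • for reference, the standard brightness-constancy formulation behind these optical flow constraints (not spelled out in the original text) is:

$$
I(x+u,\, y+v,\, t+1) \approx I(x, y, t)
\;\Rightarrow\;
I_x u + I_y v + I_t = 0,
$$

  • where $(u, v)$ is the optical flow of the pixel and $I_x, I_y, I_t$ are the image derivatives; solving this constraint over a small window, as in the Lucas-Kanade method, yields the flow used for tracking.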
  • the optical flow method uses corner points for optical flow tracking, and corner points refer to points in the image where the gray value changes greatly. Corner detection methods may include: a Harris corner detection algorithm, an ORB corner detection algorithm, a FAST detection algorithm, and the like.
  • when the optical flow method is used to track feature points in non-key frames, optical flow tracking tends to concentrate the tracked points in a small image area.
  • therefore, an improved corner detection method can be used to uniformly screen and extract feature points in the image frame, and the random sample consensus (RANSAC) algorithm can then be applied to the homogenized corner points to further reduce possible matching errors.
  • non-key frames can be divided into multiple image blocks of the same size, and a preset number of feature points is extracted from each image block, which ensures that the feature points are evenly distributed over the entire image.
  • the image frame may be divided into 9 image blocks of the same size, and the corner points satisfying the preset feature point conditions are separately extracted as feature points for each image block.
  • the preset feature point condition can be used to evenly filter corner points from the image block as feature points.
  • in this way, the optical flow method determines the first camera relative pose between adjacent non-key frames with improved calculation accuracy, and the calculation efficiency and stability of the optical flow method are also improved (a block-wise extraction sketch follows below).
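  • as a rough illustration of the block-wise extraction, the following sketch splits a grayscale frame into a 3×3 grid and keeps the strongest Harris-style corners of each block; the grid size, per-block corner count, and the OpenCV detector are illustrative assumptions, not fixed by the embodiment:

```python
import cv2
import numpy as np

def extract_uniform_corners(gray, rows=3, cols=3, per_block=20):
    """Split the frame into a rows x cols grid and keep the strongest
    corners of every block so that features stay evenly distributed."""
    h, w = gray.shape
    corners = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            block = gray[y0:y1, x0:x1]
            # Harris-based selection inside this block only
            pts = cv2.goodFeaturesToTrack(
                block, maxCorners=per_block, qualityLevel=0.01,
                minDistance=7, useHarrisDetector=True, k=0.04)
            if pts is not None:
                # shift block-local coordinates back to the full image
                corners.append(pts.reshape(-1, 2) + np.float32([x0, y0]))
    return np.vstack(corners) if corners else np.empty((0, 2), np.float32)
```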
  • Step 103 extract feature points from the key frames, and determine the relative pose of the second camera by using the feature point matching method according to the feature points matched between adjacent key frames.
  • the feature point matching method needs to extract feature points, compute descriptors, and determine matched feature points based on those descriptors. Since this matching is time-consuming, the feature point matching method is applied only to key frames: based on the matched feature points in two key frames, the second camera relative pose between the two key frames is determined.
  • Step 104 Determine an optimized camera pose corresponding to the current key frame according to the first camera relative pose between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames.
  • that is, the optimized camera pose corresponding to the current key frame can be determined according to the first camera relative poses of the multiple adjacent non-key frames lying between adjacent key frames and the second camera relative pose between the adjacent key frames.
  • through the embodiments of the present application, multiple image frames collected by the camera can be acquired, including key frames and non-key frames; the non-key frames are divided into multiple image blocks, corner points satisfying the preset feature point condition are extracted from the multiple image blocks as feature points, and the first camera relative pose is determined by the optical flow method according to matched feature points between adjacent non-key frames; feature points are extracted from the key frames, and the second camera relative pose is determined by the feature point matching method according to matched feature points between adjacent key frames; and the optimized camera pose corresponding to the current key frame is determined according to the first camera relative poses between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames.
  • the embodiment of the present application integrates the respective advantages of the optical flow tracking algorithm and the feature point matching algorithm, achieving a balance between pose accuracy and processing efficiency; moreover, by determining uniformly distributed feature points for non-key frames and using the optical flow method to determine the first camera relative pose between adjacent non-key frames, the calculation accuracy of the first camera relative pose can be improved, as can the calculation efficiency and stability of the optical flow method.
  • FIG. 3 shows a flow chart of the steps of another camera pose determination method provided in the embodiment of the present application.
  • the method may specifically include the following steps:
  • Step 301 acquiring a plurality of image frames collected by a camera; the plurality of image frames include key frames and non-key frames.
  • Step 302 divide the non-key frame into multiple image blocks, extract the corner points satisfying the preset feature point condition from the multiple image blocks as feature points, and determine the first camera relative pose by the optical flow method according to matched feature points between adjacent non-key frames.
  • the improved Harris corner detection algorithm can be used to divide the non-key frame into multiple image blocks, and extract the corner points satisfying the preset feature point conditions as feature points from the multiple image blocks.
  • the step of extracting corner points satisfying preset feature point conditions as feature points from multiple image blocks may include the following sub-steps:
  • Sub-step S11 performing low-pass filtering processing on multiple image blocks.
  • when the traditional Harris corner detection algorithm is used, it first performs Gaussian smoothing on the image, that is, the image is filtered with a Gaussian kernel.
  • however, Gaussian smoothing greatly weakens the high-frequency part of the image (object edges), blurring the edges and compressing the image histogram, which leads to the loss of corner points during non-maximum suppression.
  • the improved Harris corner detection algorithm of the present application can perform low-pass filtering on multiple image blocks first.
  • a cubic B-spline function with a low-pass characteristic can be used instead of a Gaussian function for smoothing.
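  • a minimal sketch of such a substitution, assuming the common discrete cubic B-spline kernel [1, 4, 6, 4, 1]/16 (the patent does not specify the exact kernel):

```python
import numpy as np
import cv2

# Discrete cubic B-spline kernel; low-pass like a Gaussian but with less
# attenuation of the mid/high frequencies that carry edge information.
BSPLINE3 = np.array([1, 4, 6, 4, 1], dtype=np.float32) / 16.0

def bspline_smooth(block):
    """Separable low-pass filtering of one image block with the cubic
    B-spline kernel, used here in place of Gaussian smoothing."""
    return cv2.sepFilter2D(block, -1, BSPLINE3, BSPLINE3)
```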
  • Sub-step S12 calculating the corner response values for the pixels in each low-pass-filtered image block.
  • M is the matrix representation of the autocorrelation function;
  • λ1 and λ2 are the two eigenvalues of the matrix M, which can be interpreted as the first-order curvatures of the autocorrelation function.
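  • in the standard Harris formulation, which the surrounding text appears to follow, the autocorrelation matrix and the corner response over a window $W$ are:

$$
M = \sum_{(x,y)\in W}
\begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix},
\qquad
R = \det(M) - k\,\operatorname{tr}^2(M) = \lambda_1 \lambda_2 - k\,(\lambda_1 + \lambda_2)^2,
$$

  • with the empirical constant $k$ typically around 0.04 to 0.06 (distinct from the preset corner count mentioned in sub-step S13 below).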
  • Sub-step S13 sorting the corner response values of the pixels in each image block, and selecting a preset number of pixels as initial corner points according to the sorting result.
  • the preset number k may differ between image blocks; its value should ensure that each image block retains a considerable number of initial corner points.
  • Sub-step S14 determining the degree of dispersion of the initial corner points in each image block.
  • substep S14 may further include:
  • Sub-step S141 clustering the initial corner points of the image block to obtain the cluster center
  • Sub-step S142 determine the pixel distance from each initial corner point to the cluster center, and determine the dispersion degree according to the pixel distance from each initial corner point to the cluster center.
  • the sum of pixel distances from each initial corner point to the cluster center may be used as the degree of dispersion of the initial corner points in the image block.
  • the average value of pixel distances from each initial corner point to the cluster center may be used as the degree of dispersion of the initial corner points in the image block.
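  • a minimal sketch of sub-steps S141 and S142, assuming a single cluster whose center is the mean of the corner coordinates (the patent does not fix the clustering algorithm):

```python
import numpy as np

def block_dispersion(corners, use_mean=False):
    """Dispersion of one image block's initial corners: distance from each
    corner (N x 2 array of pixel coordinates) to the cluster center,
    aggregated as a sum (or a mean, per the alternative in the text)."""
    center = corners.mean(axis=0)                      # cluster center
    dists = np.linalg.norm(corners - center, axis=1)   # pixel distances
    return float(dists.mean() if use_mean else dists.sum())
```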
  • Sub-step S15 according to the degree of dispersion of the initial corner points in each image block and the corner point response value, set a corresponding screening ratio for each image block.
  • the screening ratio is the proportion of corner points retained from the initial corner points. It is set so that corner points of every image block are retained and utilized, and so that the more scattered the initial corner points and the higher their response values, the more corner points are retained.
  • a corresponding screening ratio can be set for each image block according to the degree of dispersion of the initial corner points in each image block and the corner point response value.
  • substep S15 may further include:
  • Sub-step S151 computing the mean square error sum of the corner response values of the initial corner points of the image block.
  • Sub-step S152 calculating the evaluation parameter according to the pixel distance sum of the initial corner points in the image block and the mean square error sum of the corner response values.
  • Sub-step S153 determining the screening ratio of each image block according to the evaluation parameters of the image blocks.
  • the image blocks may be sorted according to the evaluation parameters of each image block, and the screening ratio of each image block is determined according to the sorting result. Among them, the higher the ranking, the greater the screening ratio.
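  • sub-steps S151 to S153 could look like the following sketch; how the pixel distance sum and the response MSE sum are combined into the evaluation parameter, and the rank-to-ratio mapping, are assumptions, since the patent does not give the exact formula:

```python
import numpy as np

def screening_ratios(dispersion_sums, response_mse_sums):
    """Rank image blocks by an evaluation parameter and assign larger
    screening ratios to higher-ranked (more scattered, stronger) blocks."""
    # Assumed evaluation parameter: product of the two per-block sums.
    score = np.asarray(dispersion_sums) * np.asarray(response_mse_sums)
    order = np.argsort(-score)                 # descending rank
    # Assumed mapping: ratios spaced linearly from 0.9 down to 0.1.
    ratios = np.linspace(0.9, 0.1, num=len(score))
    out = np.empty_like(ratios)
    out[order] = ratios                        # best-ranked block gets 0.9
    return out
```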
  • as shown in FIG. 4, which is a schematic diagram of uniformly screening feature points from each image block: for image blocks with relatively concentrated initial corner points, more initial corner points are deleted; for image blocks with scattered initial corner points, more initial corner points are retained. After screening, each image block retains a considerable number of corner points.
  • Sub-step S16 according to the screening ratio of each image block, select candidate corner points from the corresponding initial corner points.
  • sub-step S17 a random sampling consensus algorithm is used to select target corners from candidate corners.
  • the random sample consensus algorithm can be used to remove outliers from the candidate corner points, and the retained candidate corner points are used as the target corner points.
  • random sample consensus algorithm: given a set P of N data points, it is assumed that most of the points in the set can be generated by a single model, and that at least n points (n < N) suffice to fit the parameters of the model; points that do not conform to the fitted model are outliers.
  • the following iterative method can be used for fitting, with the specific steps: a. randomly select n data points from P; b. fit a model M with these n data points; c. for the remaining data points in P, calculate the distance from each point to the model M; if the distance exceeds a threshold, the point is considered an outlier, otherwise an inlier, and the number m of inliers for this model is recorded; d. after k iterations, select the model M with the largest m as the fitting result.
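  • the iteration a-d above maps directly onto a generic RANSAC loop; the sketch below uses a 2-D line as the fitted model for concreteness (in the corner-screening step the model would instead relate the matched corner sets, e.g. a homography or fundamental matrix):

```python
import numpy as np

def ransac(points, fit_model, point_dist, n, k=100, thresh=1.0):
    """Generic RANSAC loop following steps a-d above."""
    best_model, best_inliers = None, 0
    rng = np.random.default_rng(0)
    for _ in range(k):                                       # d. iterate k times
        sample = points[rng.choice(len(points), size=n, replace=False)]  # a.
        model = fit_model(sample)                            # b. fit model M
        m = int((point_dist(points, model) < thresh).sum())  # c. count inliers
        if m > best_inliers:
            best_model, best_inliers = model, m
    return best_model                                        # model with largest m

# Illustrative model: a 2-D line ax + by + c = 0 fitted from n = 2 points.
def fit_line(two_points):
    (x1, y1), (x2, y2) = two_points
    a, b = y2 - y1, x1 - x2
    c = -(a * x1 + b * y1)
    norm = np.hypot(a, b) or 1.0
    return np.array([a, b, c]) / norm

def line_dist(points, line):
    # points: (N, 2); perpendicular distance since (a, b) is unit-length
    return np.abs(points @ line[:2] + line[2])
```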
  • Step 303 extract feature points from the key frames, and determine the relative pose of the second camera by using a feature point matching method according to the feature points matched between adjacent key frames.
  • step 303 may include the following sub-steps:
  • Sub-step S21 setting a window with a preset pixel size centered on the feature point in the key frame.
  • Sub-step S22 determining the absolute value of the difference between the gray value of the feature point and the gray value of other pixels in the window.
  • Sub-step S23 generating an adjacency matrix according to the absolute gray differences between the feature point and the other pixels in the window.
  • Sub-step S24 according to the adjacency matrix, generate a description vector as a descriptor of the feature point.
  • Sub-step S25 according to the location information and descriptors of the feature points of the adjacent key frames, determine the matching feature points between the adjacent key frames.
  • for example, a 3×3 pixel window can be set centered on the feature point, and the absolute difference between the gray value of the window center and that of each of its 8-neighborhood pixels can be calculated, where I_p represents the gray value of the center point (the feature point) and I_xi is the gray value of the i-th 8-neighborhood pixel.
  • Sub-step S26 determining the relative pose of the second camera between the adjacent key frames according to the matching feature points between the adjacent key frames.
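  • a compact sketch of sub-steps S21 to S25 for a 3×3 window; flattening the absolute gray differences directly into an 8-element vector is an assumed simplification of the adjacency matrix construction, and the matching threshold is illustrative:

```python
import numpy as np

def gray_diff_descriptor(gray, x, y):
    """Descriptor of one feature point: absolute gray differences
    |I_p - I_xi| between the 3x3 window center (x, y) and its
    8-neighborhood pixels (feature point assumed not on the border)."""
    patch = gray[y - 1:y + 2, x - 1:x + 2].astype(np.int32)
    diffs = np.abs(patch - patch[1, 1]).reshape(-1)
    return np.delete(diffs, 4)  # drop the center, keep the 8 neighbors

def match_descriptors(desc1, desc2, max_dist=30.0):
    """Brute-force nearest-neighbor matching between two descriptor sets
    (the position gating of sub-step S25 is omitted for brevity)."""
    pairs = []
    desc2 = np.asarray(desc2, dtype=np.float64)
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j = int(np.argmin(dists))
        if dists[j] < max_dist:
            pairs.append((i, j))
    return pairs
```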
  • Step 304 performing Kalman filter fusion according to the first camera relative poses between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames, and determining the estimated camera pose corresponding to the current key frame.
  • that is, the results of optical flow tracking are fused with the feature point matching results.
  • Kalman filtering is performed at each key frame, and the cumulative error of non-key frame optical flow tracking between the two key frames can be corrected by using the high-precision feature point matching results of the two key frames.
  • Kalman filtering is divided into two steps: prediction and update:
  • in the prediction step, the relative pose accumulated by optical flow tracking between two key frames is used as the estimate, and the camera relative pose obtained by feature point matching is used as the observation:

$$
\hat{x}_k^- = A_k\, \hat{x}_{k-1}, \qquad P_k^- = A_k P_{k-1} A_k^T + R
$$

  • where $A_k$ is the state transition matrix, representing the camera relative pose transformation accumulated by the optical flow method from the previous key frame to the current key frame; $\omega_k \sim N(0, R)$ represents the Gaussian noise of the motion equation; $\hat{x}_{k-1}$ and $P_{k-1}$ are the corrected state estimate and covariance of the previous key frame; $R$ is the noise covariance, set to a constant; and $\hat{x}_k^-$ and $P_k^-$ are the predicted state estimate and covariance of the current key frame.
  • in the update step, $z_k$ is the camera pose obtained by the feature point method at the current moment, $C$ is set to the identity matrix, and $\nu_k \sim N(0, Q)$ represents the observation noise. Since the error of the state equation (the pose estimated by the optical flow method) is larger than that of the feature point method, $Q$ is generally set smaller than $R$.
  • the Kalman gain $K$ is calculated first, and then the state estimate and covariance of the current key frame are corrected to obtain the fused estimated camera pose and covariance:

$$
K = P_k^- C^T \left( C P_k^- C^T + Q \right)^{-1}, \qquad
\hat{x}_k = \hat{x}_k^- + K \left( z_k - C\, \hat{x}_k^- \right), \qquad
P_k = (I - K C)\, P_k^-
$$
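  • a minimal numeric sketch of the prediction and update above, assuming a plain vector-valued pose state with $C = I$ (a real implementation would compose poses on SE(3) rather than in a vector space):

```python
import numpy as np

def kf_predict(x_prev, P_prev, A, R):
    """Prediction: propagate the last key frame's corrected state with the
    optical-flow accumulated transform A; R is the motion noise covariance."""
    x_pred = A @ x_prev
    P_pred = A @ P_prev @ A.T + R
    return x_pred, P_pred

def kf_update(x_pred, P_pred, z, Q):
    """Update: fuse the feature-point pose observation z (C = identity);
    Q is the observation noise covariance, typically set smaller than R."""
    n = len(x_pred)
    C = np.eye(n)
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + Q)  # Kalman gain
    x = x_pred + K @ (z - C @ x_pred)                        # corrected state
    P = (np.eye(n) - K @ C) @ P_pred                         # corrected covariance
    return x, P
```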
  • Step 305 according to the estimated camera pose corresponding to the current key frame, determine the feature point reprojection error for the current key frame.
  • specifically, according to the estimated camera pose, the position P1′ of a feature point P1 of the previous key frame projected into the current key frame can be calculated; the feature point P2 matching P1 can be determined in the current key frame; and the feature point reprojection error can be calculated from the positions of P1′ and P2.
  • feature point reprojection errors can be calculated based on multiple feature points between adjacent keyframes.
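  • written in the usual notation (the patent text gives no explicit formula), the reprojection error for one matched point is:

$$
e_{\text{point}} = \left\| \, p_2 - \pi\!\left( T\, \pi^{-1}(p_1, d_1) \right) \right\|,
$$

  • where $\pi$ is the camera projection, $\pi^{-1}(p_1, d_1)$ back-projects the point $p_1$ with depth $d_1$, $T$ is the estimated relative pose between the two key frames, and $\pi(T\,\pi^{-1}(p_1, d_1))$ is the projected position P1′ compared against the matched point P2.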
  • Step 306 extracting a line segment satisfying a preset feature line condition from the key frame as a feature line.
  • step 306 may include the following sub-steps:
  • Sub-step S31 detecting the line segment in the key frame, using two non-parallel line segments as a line segment pair.
  • Sub-step S32 screening the line segment pairs, and selecting the line segment pairs that meet the preset feature line condition as feature lines;
  • where the preset feature line condition includes: the length of each line segment is greater than or equal to the preset length threshold, the distance between the intersection point of the line segment pair and the line segment pair is less than or equal to the preset distance threshold, and the intersection point of the line segment pair is within the image.
  • for any two non-parallel line segments in the image (or their extensions), there must be an intersection point, and the number of such line segment pairs is usually very large.
  • therefore, the following screening is performed: remove line segment pairs whose intersection point is farther from the pair than the preset distance threshold, and keep those whose distance is less than or equal to the threshold; remove line segment pairs whose intersection point lies outside the image, and keep those whose intersection point lies within the image; remove line segment pairs containing a segment shorter than the preset length threshold, and keep those whose segments are at least as long as the threshold. The line segment pairs obtained after this screening are the valid feature line segments (see the sketch below).
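  • a sketch of this screening for one segment pair; interpreting "the distance between the intersection point and the line segment pair" as the distance from the intersection to each segment's nearest endpoint is an assumption, and the thresholds are illustrative:

```python
import numpy as np

def line_of(seg):
    """Homogeneous line through the two endpoints of a segment."""
    (x1, y1), (x2, y2) = seg
    return np.cross([x1, y1, 1.0], [x2, y2, 1.0])

def keep_pair(seg_a, seg_b, w, h, min_len=30.0, max_dist=50.0):
    """Apply the three preset feature line conditions to one segment pair."""
    for seg in (seg_a, seg_b):
        if np.linalg.norm(np.subtract(seg[1], seg[0])) < min_len:
            return False                  # a segment is too short
    p = np.cross(line_of(seg_a), line_of(seg_b))
    if abs(p[2]) < 1e-9:
        return False                      # (nearly) parallel, no intersection
    x, y = p[0] / p[2], p[1] / p[2]
    if not (0 <= x < w and 0 <= y < h):
        return False                      # intersection outside the image
    for seg in (seg_a, seg_b):            # intersection too far from a segment
        if min(np.hypot(x - ex, y - ey) for ex, ey in seg) > max_dist:
            return False
    return True
```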
  • Step 307 Determine the pose of the third camera between the adjacent key frames according to the matching feature lines between the adjacent key frames.
  • step 307 may include the following sub-steps:
  • Sub-step S41 taking two non-parallel feature lines in the key frame as a feature line segment pair.
  • Sub-step S42 determining the descriptor of the feature line segment pair by taking the acute-angle bisector at the intersection point of the pair as the orientation of the descriptor, and taking the pixel size of the pixel block centered on the intersection point as the scale of the descriptor.
  • the description of line segment features may be based on the intersection points of feature line segment pairs.
  • the multi-scale rotation BRIEF descriptor can be calculated based on the intersection point.
  • that is, the acute-angle bisector at the intersection point of the feature line segment pair serves as the orientation of the descriptor, and the pixel size of the pixel block centered on the intersection point serves as its scale, which together determine the descriptor of the feature line segment pair.
  • Sub-step S43 according to the location information and descriptors of the feature lines of adjacent key frames, determine the matching feature lines between adjacent key frames.
  • Sub-step S44 according to the matching feature lines between adjacent key frames, determine the third camera pose between adjacent key frames.
  • here e represents the reprojection error of a line segment, and d(a, L) represents the distance from an endpoint a of the projected line segment to the observed line L, as written out below.
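  • with $a$ and $b$ the endpoints of the projected segment and $L$ given in homogeneous form $(l_1, l_2, l_3)$, a common form of this error is (the text defines only $d(a, L)$, so summing both endpoint distances is an assumption):

$$
e_{\text{line}} = d(a, L) + d(b, L), \qquad
d(p, L) = \frac{\lvert l_1 x_p + l_2 y_p + l_3 \rvert}{\sqrt{l_1^2 + l_2^2}}
$$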
  • Step 308 according to the third camera pose between adjacent key frames, determine the feature line reprojection error for the current key frame.
  • Step 309 based on the feature point re-projection error and feature line re-projection error for the current key frame, determine the optimal camera pose corresponding to the current key frame.
  • the nonlinear least squares optimization can be performed on the feature point reprojection error and the feature line reprojection error, so as to determine the optimized camera pose corresponding to the current key frame.
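  • this back-end step can be summarized as minimizing both reprojection terms over the key frame pose $T$, optionally under a robust kernel $\rho$ (the relative weighting of the two terms is implementation dependent and not fixed by the text):

$$
T^{*} = \arg\min_{T}\; \sum_{i} \rho\!\left( \big\| e_{\text{point}}^{(i)}(T) \big\|^{2} \right)
      + \sum_{j} \rho\!\left( \big\| e_{\text{line}}^{(j)}(T) \big\|^{2} \right)
$$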
  • through the embodiments of the present application, multiple image frames collected by the camera can be acquired, including key frames and non-key frames; the non-key frames are divided into multiple image blocks, corner points satisfying the preset feature point condition are extracted from the multiple image blocks as feature points, and the first camera relative pose is determined by the optical flow method according to matched feature points between adjacent non-key frames; feature points are extracted from the key frames, and the second camera relative pose is determined by the feature point matching method according to matched feature points between adjacent key frames; Kalman filter fusion is performed on the first camera relative poses between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames to determine the estimated camera pose corresponding to the current key frame; the feature point reprojection error for the current key frame is determined from the estimated camera pose; line segments satisfying the preset feature line condition are extracted from the key frames as feature lines; the third camera pose between adjacent key frames is determined according to matched feature lines between adjacent key frames; the feature line reprojection error for the current key frame is determined from the third camera pose; and the optimized camera pose corresponding to the current key frame is determined based on the feature point and feature line reprojection errors.
  • by processing evenly distributed feature points between adjacent non-key frames with the optical flow method, the first camera relative pose between adjacent non-key frames can be determined with improved calculation accuracy, and the calculation efficiency and stability of the optical flow method are also improved.
  • through Kalman filtering, the advantages of the optical flow tracking algorithm and the feature point matching algorithm are combined to achieve a balance between accuracy and efficiency.
  • line features can increase the robustness of the system in urban scenes, and the stability of pose estimation can be improved by fusing feature points and optical flow and adding line feature constraints.
  • a binocular camera is installed in the vehicle to collect images, and an inertial navigation unit is used to collect motion information.
  • the motion information (including acceleration and angular velocity) is measured by the inertial navigation unit, and multiple motion measurements between image frames are combined into one observation output by pre-integration to obtain the accumulated camera motion; the pre-integration error is determined from the accumulated camera motion; finally, the pre-integration error is sent to the back end for optimization.
  • in the visual part, initialization is first performed to obtain image frames, and each image frame is determined to be a key frame or a non-key frame; for a non-key frame, the improved corner detection method of the embodiment of the present application can be used to detect corner points as feature points, and the optical flow method is used to track the feature points across adjacent non-key frames to obtain the accumulated change of the camera pose.
  • for a key frame, on the one hand, feature points are extracted from the key frame and their descriptors are computed; the camera relative pose is determined by the feature point matching method according to matched feature points between adjacent key frames; and the accumulated pose changes determined by the optical flow method and the camera relative pose determined by the feature point matching method are fused by Kalman filtering to obtain the camera pose corresponding to the current key frame.
  • on the other hand, feature lines are detected from the key frame and their descriptors are computed; matching is performed according to the matched feature lines between adjacent key frames, and the camera pose corresponding to the current key frame is solved.
  • from the Kalman-fused camera pose, the squared reprojection error of the feature points is determined.
  • the camera pose based on the feature lines and the IMU motion accumulation between two key frames measured by the inertial navigation unit are used as camera constraints to determine the squared reprojection error of the feature lines.
  • FIG. 7 shows a structural block diagram of a camera pose determination device provided in an embodiment of the present application, which may specifically include the following modules:
  • An image acquisition module 701, configured to acquire multiple image frames collected by the camera; multiple image frames include key frames and non-key frames;
  • a first pose determination module 702, configured to divide non-key frames into multiple image blocks, extract corner points satisfying the preset feature point condition from the multiple image blocks as feature points, and determine the first camera relative pose by the optical flow method according to matched feature points between adjacent non-key frames;
  • the second pose determination module 703 is used to extract feature points from key frames, and determine the relative pose of the second camera by feature point matching method according to feature points matched between adjacent key frames;
  • an optimized pose determination module 704, configured to determine the optimized camera pose corresponding to the current key frame according to the first camera relative poses between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames.
  • through the embodiments of the present application, multiple image frames collected by the camera can be acquired, including key frames and non-key frames; the non-key frames are divided into multiple image blocks, corner points satisfying the preset feature point condition are extracted from the multiple image blocks as feature points, and the first camera relative pose is determined by the optical flow method according to matched feature points between adjacent non-key frames; feature points are extracted from the key frames, and the second camera relative pose is determined by the feature point matching method according to matched feature points between adjacent key frames; and the optimized camera pose corresponding to the current key frame is determined according to the first camera relative poses between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames.
  • the embodiment of the present application integrates the respective advantages of the optical flow tracking algorithm and the feature point matching algorithm, achieving a balance between pose accuracy and processing efficiency; moreover, by determining uniformly distributed feature points for non-key frames and using the optical flow method to determine the first camera relative pose between adjacent non-key frames, the calculation accuracy of the first camera relative pose can be improved, as can the calculation efficiency and stability of the optical flow method.
  • the device for determining the camera pose may also include:
  • a feature line extraction module is used to extract a line segment satisfying a preset feature line condition from a key frame as a feature line;
  • the third pose determination module is used to determine the third camera pose between adjacent key frames according to the matching feature lines between adjacent key frames;
  • the optimized pose determination module includes:
  • an optimized pose determination sub-module, used to determine the optimized camera pose corresponding to the current key frame according to the first camera relative poses between multiple adjacent non-key frames, the second camera relative pose between adjacent key frames, and the third camera pose between adjacent key frames.
  • in some embodiments, the optimized pose determination sub-module may include:
  • an estimated pose determination unit, used to perform Kalman filter fusion according to the first camera relative poses between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames, to determine the estimated camera pose corresponding to the current key frame;
  • the first error determination unit is used to determine the feature point reprojection error for the current key frame according to the estimated camera pose corresponding to the current key frame;
  • the second error determination unit is used to determine the feature line reprojection error for the current key frame according to the third camera pose between adjacent key frames;
  • An optimized pose determining unit is configured to determine an optimized camera pose corresponding to the current key frame based on the feature point re-projection error and the feature line re-projection error for the current key frame.
  • the first pose determining module may include:
  • a filtering submodule is used to perform low-pass filtering processing on multiple image blocks
  • the response value calculation sub-module is used to calculate the corner response value for the pixels in each image block processed by the low-pass filter
  • the initial corner selection submodule is used to sort the corner response values corresponding to each pixel in each image block, and select a preset number of pixels as the initial corner according to the sorting result;
  • the degree of dispersion determination submodule is used to determine the degree of dispersion of the initial corner points in each image block
  • the screening ratio determination sub-module is used to set the corresponding screening ratio for each image block according to the degree of dispersion of the initial corner points in each image block and the corner point response value;
  • the candidate corner point determination submodule is used for screening candidate corner points from corresponding initial corner points according to the screening ratio of each image block;
  • a target corner screening sub-module, used to screen target corner points from the candidate corner points using the random sample consensus algorithm.
  • the sub-module for determining the degree of dispersion may include:
  • the clustering unit is used to cluster the initial corner points of the image block to obtain the cluster center;
  • the degree of dispersion determination unit is configured to determine the pixel distance from each initial corner point to the cluster center, and determine the degree of dispersion according to the pixel distance from each initial corner point to the cluster center.
  • in some embodiments, the degree of dispersion is the sum of the pixel distances of the initial corner points in the image block.
  • the screening ratio determination submodule may include:
  • a response value processing unit, used to compute the mean square error sum of the corner response values of the initial corner points of the image block;
  • an evaluation parameter calculation unit, used to calculate the evaluation parameter according to the pixel distance sum of the initial corner points in the image block and the mean square error sum of the corner response values;
  • the screening ratio determining unit is configured to determine the screening ratio of each image block according to the evaluation parameters of each image block.
  • the second pose determination module may include:
  • the window setting submodule is used to set a window with a preset pixel size centered on the feature point in the key frame;
  • the difference determination submodule is used to determine the absolute value of the difference between the gray value of the feature point and the gray value of other pixels in the window;
  • a matrix generation sub-module, used to generate an adjacency matrix according to the absolute gray differences between the feature point and the other pixels in the window;
  • the first descriptor determining submodule is used to generate a description vector as a descriptor of a feature point according to an adjacency matrix
  • the feature point matching submodule is used to determine the matching feature points between adjacent key frames according to the position information and descriptors of the feature points of adjacent key frames;
  • the second pose determination sub-module is used to determine the relative pose of the second camera between adjacent key frames according to the matching feature points between adjacent key frames.
  • the feature line extraction module may include:
  • the line segment pair selection sub-module is used to detect the line segment in the key frame, and two non-parallel line segments are used as the line segment pair;
  • a line segment pair screening sub-module, used to screen the line segment pairs and select the line segment pairs that meet the preset feature line condition as feature lines;
  • where the preset feature line condition includes: the length of each line segment is greater than or equal to the preset length threshold, the distance between the intersection point of the line segment pair and the line segment pair is less than or equal to the preset distance threshold, and the intersection point of the line segment pair is within the image.
  • the third pose determining module may include:
  • a feature line segment pair determination submodule is used to use two non-parallel feature lines in the key frame as a feature line segment pair;
  • a second descriptor determination sub-module, used to determine the descriptor of the feature line segment pair by taking the acute-angle bisector at the intersection point of the pair as the orientation of the descriptor and the pixel size of the pixel block centered on the intersection point as the scale of the descriptor;
  • the characteristic line matching submodule is used to determine the matching characteristic lines between adjacent key frames according to the position information and descriptors of the characteristic lines of adjacent key frames;
  • the third pose determination sub-module is configured to determine the pose of the third camera between adjacent key frames according to the matching feature lines between adjacent key frames.
  • the description of the device embodiment is relatively simple; for related parts, refer to the description of the method embodiment.
  • the embodiment of the present application also provides an electronic device, including a processor, a memory, and a computer program stored on the memory and capable of running on the processor.
  • when the computer program is executed by the processor, it realizes each process of the above camera pose determination method embodiment and can achieve the same technical effect; to avoid repetition, details are not repeated here.
  • the embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, each process of the above camera pose determination method embodiment can be realized with the same technical effect; to avoid repetition, details are not repeated here.
  • the embodiments of the present application may be provided as methods, devices, or computer program products. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • the embodiments of the present application are described with reference to flowcharts and/or block diagrams of the methods, terminal devices (systems), and computer program products according to the embodiments of the present application. It should be understood that each procedure and/or block in the flowcharts and/or block diagrams, and combinations of procedures and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions may be provided to a general-purpose computer, a special-purpose computer, an embedded processor, or the processor of other programmable data processing terminal equipment to produce a machine, such that the instructions executed by the computer or the processor of the other programmable data processing terminal equipment produce means for realizing the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.
  • these computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal equipment to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.


Abstract

Embodiments of the present application provide a camera pose determining method and apparatus. The method comprises: obtaining a plurality of image frames collected by a camera; dividing non-key frames into a plurality of image blocks, extracting, from the plurality of image blocks, corner points satisfying a preset feature point condition as feature points, and according to matched feature points between adjacent non-key frames, using an optical flow method to determine a first camera relative pose; extracting feature points from key frames, and according to matched feature points between adjacent key frames, using a feature point matching method to determine a second camera relative pose; and determining, according to the first camera relative pose between the plurality of adjacent non-key frames and the second camera relative pose between the adjacent key frames, an optimized camera pose corresponding to a current key frame. In the embodiments of the present application, pose precision can be improved, and the pose precision and the processing efficiency are balanced.

Description

一种相机位姿确定方法和装置A camera pose determination method and device
本申请要求在2022年1月06日提交中国专利局、申请号202210012434.8、发明名称为“一种相机位姿确定方法和装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims the priority of the Chinese patent application submitted to the China Patent Office on January 06, 2022, with application number 202210012434.8, and the title of the invention is "A Method and Device for Determining Camera Pose and Orientation", the entire content of which is incorporated by reference in this application middle.
技术领域technical field
本申请涉及图像处理技术领域,特别是涉及一种相机位姿确定方法和一种相机位姿确定装置。The present application relates to the technical field of image processing, in particular to a camera pose determining method and a camera pose determining device.
背景技术Background technique
视觉里程计基于视觉图像对里程进行估计,被广泛应用于移动设备在未知环境的定位实现。视觉里程计架构通常分为前端和后端两大部分,前端负责构建多视角几何约束,后端基于前端的重投影误差做非线性最小二乘优化。Visual odometry estimates mileage based on visual images, and is widely used in the positioning of mobile devices in unknown environments. The visual odometry architecture is usually divided into two parts: the front end and the back end. The front end is responsible for constructing multi-view geometric constraints, and the back end performs nonlinear least squares optimization based on the reprojection error of the front end.
常用的前端分为特征点匹配法和光流法两种视觉方法。特征点匹配法采用匹配的特征点构建多个图像帧之间的共视约束,其特点是精度高,但是由于涉及到特征点提取以及特征描述子计算和匹配,所以存在计算量大,容易无匹配的缺点。光流法不做描述子的计算和匹配,直接通过光流来跟踪特征点,大大的减少了计算开销,但也存在精度相对较低和跟踪容易跟丢的问题。本质上,无论特征点法还是光流法,都需要依赖与场景中的特征点的稳定性。所以在一些特征点比较稀疏的弱纹理场景下,两种前端约束方法或多或少都存在稳定性问题。Commonly used front-ends are divided into two visual methods: feature point matching method and optical flow method. The feature point matching method uses the matched feature points to construct the common-view constraint between multiple image frames. The disadvantages of matching. The optical flow method does not calculate and match descriptors, and directly tracks feature points through optical flow, which greatly reduces the computational overhead, but also has the problems of relatively low accuracy and easy tracking. Essentially, both the feature point method and the optical flow method need to rely on the stability of the feature points in the scene. Therefore, in some weakly textured scenes with sparse feature points, the two front-end constraint methods have more or less stability problems.
发明内容Contents of the invention
鉴于上述问题,提出了本申请实施例以便提供一种克服上述问题或者至少部分地解决上述问题的一种相机位姿确定方法和相应的一种相机位姿确定装置。In view of the above problems, the embodiments of the present application are proposed to provide a method for determining a camera pose and a corresponding device for determining a camera pose that overcome the above problems or at least partially solve the above problems.
为了解决上述问题,本申请实施例公开了一种相机位姿确定方法,包括:In order to solve the above problems, the embodiment of the present application discloses a camera pose determination method, including:
获取相机采集的多个图像帧;多个图像帧包括关键帧和非关键帧;Obtain multiple image frames collected by the camera; multiple image frames include key frames and non-key frames;
将非关键帧划分为多个图像块,从多个图像块中提取满足预设特征点条件的角点作为特征点,以及根据相邻的非关键帧之间匹配的特征点采用光流 法确定第一相机相对位姿;Divide the non-key frame into multiple image blocks, extract the corner points satisfying the preset feature point conditions from the multiple image blocks as feature points, and use the optical flow method to determine the matching feature points between adjacent non-key frames Relative pose of the first camera;
从关键帧中提取特征点,根据相邻关键帧之间匹配的特征点采用特征点匹配法确定第二相机相对位姿;Extract feature points from the key frames, and determine the relative pose of the second camera by feature point matching method according to the feature points matched between adjacent key frames;
根据多个相邻非关键帧之间的第一相机相对位姿和相邻关键帧之间的第二相机相对位姿,确定当前关键帧对应的优化相机位姿。According to the first camera relative pose between multiple adjacent non-key frames and the second camera relative pose between adjacent key frames, an optimized camera pose corresponding to the current key frame is determined.
In some embodiments, the method further includes:

extracting, from a key frame, line segments satisfying a preset feature line condition as feature lines; and

determining a third camera pose between adjacent key frames according to feature lines matched between the adjacent key frames.

Accordingly, determining the optimized camera pose corresponding to the current key frame according to the first camera relative poses between the plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames includes:

determining the optimized camera pose corresponding to the current key frame according to the first camera relative poses between the plurality of adjacent non-key frames, the second camera relative pose between adjacent key frames, and the third camera pose between adjacent key frames.

In some embodiments, determining the optimized camera pose corresponding to the current key frame according to the first camera relative poses between the plurality of adjacent non-key frames, the second camera relative pose between adjacent key frames, and the third camera pose between adjacent key frames includes:

performing Kalman filter fusion on the first camera relative poses between the plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames, to determine an estimated camera pose corresponding to the current key frame;

determining a feature point reprojection error for the current key frame according to the estimated camera pose corresponding to the current key frame;

determining a feature line reprojection error for the current key frame according to the third camera pose between adjacent key frames; and

determining the optimized camera pose corresponding to the current key frame based on the feature point reprojection error and the feature line reprojection error for the current key frame.
In some embodiments, extracting, from the plurality of image blocks, corner points satisfying the preset feature point condition as feature points includes:

performing low-pass filtering on the plurality of image blocks;

calculating a cornerness response value for each pixel in each low-pass-filtered image block;

sorting the cornerness response values of the pixels in each image block, and selecting a preset number of pixels as initial corner points according to the sorting result;

determining a degree of dispersion of the initial corner points in each image block;

setting a corresponding screening ratio for each image block according to the degree of dispersion and the cornerness response values of the initial corner points in the image block;

screening candidate corner points from the corresponding initial corner points according to the screening ratio of each image block; and

screening target corner points from the candidate corner points by using a random sample consensus algorithm.

In some embodiments, determining the degree of dispersion of the initial corner points in each image block includes:

clustering the initial corner points of the image block to obtain a cluster center; and

determining a pixel distance from each initial corner point to the cluster center, and determining the degree of dispersion according to the pixel distances from the initial corner points to the cluster center.

In some embodiments, the degree of dispersion is a sum of the pixel distances of the initial corner points in the image block, and setting the corresponding screening ratio for each image block according to the degree of dispersion and the cornerness response values of the initial corner points in the image block includes:

calculating a mean square error sum of the cornerness response values of the initial corner points of the image block;

calculating an evaluation parameter according to the pixel distance sum of the initial corner points in the image block and the mean square error sum of the cornerness response values; and

determining the screening ratio of each image block according to the evaluation parameter of the image block.
In some embodiments, extracting feature points from the key frames and determining the second camera relative pose according to feature points matched between adjacent key frames includes:

setting a window of a preset pixel size centered on a feature point in a key frame;

determining absolute values of the differences between the gray value of the feature point and the gray values of the other pixels in the window;

generating an adjacency matrix according to the absolute values of the gray differences between the feature point and the other pixels in the window;

generating a description vector from the adjacency matrix as a descriptor of the feature point;

determining feature points matched between adjacent key frames according to the position information and the descriptors of the feature points of the adjacent key frames; and

determining the second camera relative pose between the adjacent key frames according to the matched feature points.
In some embodiments, extracting, from a key frame, line segments satisfying the preset feature line condition as feature lines includes:

detecting line segments in the key frame, and taking two non-parallel line segments as a line segment pair; and

screening out, from the line segment pairs, those satisfying the preset feature line condition as feature lines, the preset feature line condition including: the segment length is greater than or equal to a preset length threshold, the distance from the intersection of the line segment pair to the line segment pair is less than or equal to a preset distance threshold, and the intersection of the line segment pair lies within the image.

In some embodiments, determining the third camera pose between adjacent key frames according to the feature lines matched between the adjacent key frames includes:

taking two non-parallel feature lines in a key frame as a feature line segment pair;

determining a descriptor of the feature lines of the feature line segment pair, with the bisector of the acute angle at the intersection of the feature line segment pair as the orientation component of the descriptor and the pixel size of the pixel block centered on the intersection as the scale component of the descriptor;

determining feature lines matched between adjacent key frames according to the position information and the descriptors of the feature lines of the adjacent key frames; and

determining the third camera pose between the adjacent key frames according to the matched feature lines.
An embodiment of the present application further discloses a camera pose determining apparatus, including:

an image acquisition module, configured to acquire a plurality of image frames collected by a camera, the plurality of image frames including key frames and non-key frames;

a first pose determining module, configured to divide a non-key frame into a plurality of image blocks, extract, from the plurality of image blocks, corner points satisfying a preset feature point condition as feature points, and determine a first camera relative pose by an optical flow method according to feature points matched between adjacent non-key frames;

a second pose determining module, configured to extract feature points from the key frames and determine a second camera relative pose by a feature point matching method according to feature points matched between adjacent key frames; and

an optimized pose determining module, configured to determine an optimized camera pose corresponding to a current key frame according to first camera relative poses between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames.
An embodiment of the present application further discloses an electronic device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the camera pose determining method described above.

An embodiment of the present application further discloses a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the steps of the camera pose determining method described above.

The embodiments of the present application include the following advantages.

According to the embodiments of the present application, a plurality of image frames collected by a camera, including key frames and non-key frames, can be acquired; a non-key frame is divided into a plurality of image blocks, corner points satisfying a preset feature point condition are extracted from the image blocks as feature points, and a first camera relative pose is determined by an optical flow method according to feature points matched between adjacent non-key frames; feature points are extracted from the key frames, and a second camera relative pose is determined by a feature point matching method according to feature points matched between adjacent key frames; and an optimized camera pose corresponding to the current key frame is determined according to the first camera relative poses between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames. The embodiments of the present application combine the respective advantages of the optical flow tracking algorithm and the feature point matching algorithm, reaching a balance between pose accuracy and processing efficiency; moreover, by determining uniformly distributed feature points for non-key frames and processing them with the optical flow method to determine the first camera relative pose between adjacent non-key frames, the calculation accuracy of the first camera relative pose is improved, as are the calculation efficiency and stability of the optical flow method.
Brief Description of the Drawings

FIG. 1 is a flowchart of the steps of a camera pose determining method provided by an embodiment of the present application;

FIG. 2 is a flowchart of Kalman filtering combining the feature point matching method and the optical flow method in an embodiment of the present application;

FIG. 3 is a flowchart of the steps of another camera pose determining method provided by an embodiment of the present application;

FIG. 4 is a schematic diagram of uniformly screening feature points from image blocks in an embodiment of the present application;

FIG. 5 is a schematic diagram of determining a feature line reprojection error;

FIG. 6 is a flowchart of a visual odometry optimization method based on camera pose optimization in an embodiment of the present application;

FIG. 7 is a structural block diagram of a camera pose determining apparatus provided by an embodiment of the present application.

Detailed Description

To make the above objects, features and advantages of the present application more apparent and comprehensible, the present application is described in further detail below with reference to the accompanying drawings and specific embodiments.
Visual inertial odometry (VIO) combines the images collected by a camera with the motion information (including acceleration and angular velocity) measured by an inertial measurement unit (IMU), so that the two complement each other. The IMU can accurately measure motion over short time spans; when the front-end tracking quality of the camera between adjacent frames is poor, the IMU measurements can recover the inter-frame motion and provide inter-frame constraints, keeping the system running, and fusing vision with the IMU also makes the pose estimation more accurate.

For the visual part, the core idea of the embodiments of the present application is to combine the respective advantages of the optical flow tracking algorithm and the feature point matching algorithm, so as to reach a balance between pose accuracy and processing efficiency; in addition, uniformly distributed feature points are determined for non-key frames and processed with the optical flow method to determine the first camera relative pose between adjacent non-key frames, which improves the calculation accuracy of the first camera relative pose as well as the calculation efficiency and stability of the optical flow method.

Referring to FIG. 1, a flowchart of the steps of a camera pose determining method provided by an embodiment of the present application is shown. The method may specifically include the following steps.

Step 101: acquire a plurality of image frames collected by a camera, the plurality of image frames including key frames and non-key frames.
In practical application scenarios, a camera may be mounted on a mobile device such as a vehicle or an aircraft, and an image sequence of the surrounding environment is collected by the camera. The image sequence may include a plurality of image frames, which may be key frames and non-key frames; a key frame is a representative image frame. In some embodiments, the key frame selection rules may be: 1. if fewer than 20 feature points are tracked in the current image frame, the new image frame is taken as a key frame; 2. if the average parallax between the current image frame and the N-th and (N+1)-th most recent key frames (the value of N can be set as needed, e.g. N = 10) is greater than a certain threshold (e.g. 0.02), the current frame is taken as a key frame. The specific key frame selection manner is not limited in the embodiments of the present application.
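As an illustrative sketch only (not part of the original disclosure), the two keyframe rules above can be expressed as a simple predicate. The parameter names and default values are assumptions chosen to mirror the example values in the text:

```python
def is_keyframe(num_tracked, avg_parallax, min_tracked=20, parallax_thresh=0.02):
    """Keyframe test following the two selection rules described above.

    num_tracked:  number of feature points tracked in the current frame.
    avg_parallax: average parallax between the current frame and the N-th /
                  (N+1)-th most recent key frames (e.g. N = 10).
    """
    if num_tracked < min_tracked:       # rule 1: tracking is getting weak
        return True
    if avg_parallax > parallax_thresh:  # rule 2: the camera has moved enough
        return True
    return False
```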
FIG. 2 shows a flowchart of Kalman filtering combining the feature point matching method and the optical flow method in an embodiment of the present application. An optimized feature point matching algorithm estimates the camera pose on key frame images; for non-key frames, the camera pose is determined with the optical flow method and accumulated; and the camera pose obtained by the feature point matching algorithm and the camera pose obtained by the optical flow method are fused by Kalman filtering to obtain the optimized camera pose.

Step 102: divide a non-key frame into a plurality of image blocks, extract, from the plurality of image blocks, corner points satisfying a preset feature point condition as feature points, and determine a first camera relative pose by an optical flow method according to feature points matched between adjacent non-key frames.

The motion of objects in a real scene not only moves the corresponding points in the image, but also changes the brightness pattern of the corresponding pixels. Optical flow is the velocity with which the pixel brightness pattern moves across the image, that is, the pixel brightness change caused by the motion of the illuminated parts of objects projected onto the image plane. The optical flow method can determine the relative camera pose between adjacent image frames based on the optical flow constraints of pixels between those frames.

The optical flow method uses corner points for tracking; a corner point is a point in the image where the gray value changes sharply. Corner detection methods include the Harris corner detection algorithm, the ORB corner detection algorithm, the FAST detection algorithm, and the like.

In the embodiments of the present application, feature points in non-key frames are tracked with the optical flow method. Optical flow tracking tends to suffer from tracked points being concentrated in a small image region. To address this, the embodiments of the present application adopt an improved corner detection method to uniformly screen and extract feature points in an image frame, and then apply a random sample consensus (RANSAC) algorithm to the uniformized corner points to further reduce possible pairing errors.

In the embodiments of the present application, a non-key frame may be divided into a plurality of image blocks of equal size, and a preset number of feature points are extracted from each image block, which ensures that the feature points are uniformly distributed over the entire image. For example, the image frame may be divided into 9 image blocks of the same size, and corner points satisfying the preset feature point condition are extracted from each image block separately as feature points. The preset feature point condition is used to uniformly screen corner points from the image blocks as feature points.

By processing the uniformly distributed feature points matched between adjacent non-key frames with the optical flow method to determine the first camera relative pose between adjacent non-key frames, the calculation accuracy of the first camera relative pose is improved, as are the calculation efficiency and stability of the optical flow method.
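As a minimal sketch of this front end, assuming a calibrated pinhole camera with intrinsic matrix K and using standard OpenCV calls (the grid-based detector described later would supply prev_pts), the tracking and relative pose step might look as follows; this is an illustration, not the patented implementation:

```python
import cv2
import numpy as np

def relative_pose_optical_flow(prev_gray, cur_gray, prev_pts, K):
    """Track corners with pyramidal Lucas-Kanade optical flow, then
    estimate the relative camera pose from the tracked matches.

    prev_pts: float32 array of shape (N, 1, 2) with corner locations.
    """
    cur_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, cur_gray, prev_pts, None, winSize=(21, 21), maxLevel=3)
    good_prev = prev_pts[status.ravel() == 1]
    good_cur = cur_pts[status.ravel() == 1]
    # Essential-matrix RANSAC rejects the remaining outlier tracks.
    E, _ = cv2.findEssentialMat(good_prev, good_cur, K,
                                method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, good_prev, good_cur, K)
    return R, t  # first camera relative pose (rotation, unit translation)
```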
Step 103: extract feature points from the key frames, and determine a second camera relative pose by a feature point matching method according to feature points matched between adjacent key frames.

The feature point matching method requires extracting feature points and computing descriptors, and the matched feature points are determined from the descriptors. Since matching is time-consuming, the feature point matching method is applied only to key frames. Based on the feature point matching method, the feature points in two key frames are used to determine the second camera relative pose between those two key frames.

Step 104: determine an optimized camera pose corresponding to the current key frame according to the first camera relative poses between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames.

The optimized camera pose corresponding to the current key frame can be determined from the first camera relative poses between the multiple adjacent non-key frames lying between two key frames, together with the second camera relative pose between those adjacent key frames.

According to the embodiments of the present application, a plurality of image frames collected by a camera, including key frames and non-key frames, can be acquired; a non-key frame is divided into a plurality of image blocks, corner points satisfying a preset feature point condition are extracted from the image blocks as feature points, and a first camera relative pose is determined by an optical flow method according to feature points matched between adjacent non-key frames; feature points are extracted from the key frames, and a second camera relative pose is determined by a feature point matching method according to feature points matched between adjacent key frames; and an optimized camera pose corresponding to the current key frame is determined according to the first camera relative poses between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames. The embodiments of the present application combine the respective advantages of the optical flow tracking algorithm and the feature point matching algorithm, reaching a balance between pose accuracy and processing efficiency; moreover, by determining uniformly distributed feature points for non-key frames and processing them with the optical flow method to determine the first camera relative pose between adjacent non-key frames, the calculation accuracy of the first camera relative pose is improved, as are the calculation efficiency and stability of the optical flow method.
Referring to FIG. 3, a flowchart of the steps of another camera pose determining method provided by an embodiment of the present application is shown. The method may specifically include the following steps.

Step 301: acquire a plurality of image frames collected by a camera, the plurality of image frames including key frames and non-key frames.

Step 302: divide a non-key frame into a plurality of image blocks, extract, from the plurality of image blocks, corner points satisfying a preset feature point condition as feature points, and determine a first camera relative pose by an optical flow method according to feature points matched between adjacent non-key frames.

In the embodiments of the present application, an improved Harris corner detection algorithm may be used to divide a non-key frame into a plurality of image blocks and to extract, from the image blocks, corner points satisfying the preset feature point condition as feature points.

In the embodiments of the present application, the step of extracting, from the plurality of image blocks, corner points satisfying the preset feature point condition as feature points may include the following sub-steps.

Sub-step S11: perform low-pass filtering on the plurality of image blocks.

In the traditional Harris corner detection algorithm, the image is first smoothed with a Gaussian kernel. However, Gaussian smoothing strongly attenuates the high-frequency content (object edges): edges become blurred and the image histogram is compressed, so that corner points are lost during non-maximum suppression.

To address this, the improved Harris corner detection algorithm of the present application first performs low-pass filtering on the image blocks. In some embodiments, a cubic B-spline function, which has low-pass characteristics, may be used instead of a Gaussian function for smoothing.
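A common discrete form of the cubic B-spline low-pass filter is the separable binomial kernel [1, 4, 6, 4, 1]/16. The following sketch (an illustration, not the original implementation) applies it to an image block:

```python
import numpy as np
from scipy.ndimage import convolve1d

# Discrete cubic B-spline kernel: a separable low-pass filter that
# attenuates edges less aggressively than a Gaussian of similar width.
BSPLINE3 = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def bspline_smooth(block):
    """Apply the separable cubic B-spline low-pass filter to an image block."""
    tmp = convolve1d(block.astype(np.float32), BSPLINE3, axis=0, mode='reflect')
    return convolve1d(tmp, BSPLINE3, axis=1, mode='reflect')
```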
Sub-step S12: calculate a cornerness response value for each pixel in each low-pass-filtered image block.

Suppose that the grayscale change produced by a small shift (u, v) of the image window is E(u, v) = (u, v) M (u, v)^T.

This form of regional change is similar to a local autocorrelation function; M is the matrix representation of the autocorrelation function, and λ1 and λ2 are the two eigenvalues of M, which represent the first-order curvatures of the autocorrelation function.

At pixel (x, y), the cornerness response function is C(x, y) = λ1·λ2 − α(λ1 + λ2)^2, where α is an empirical value, generally set to 0.04 to 0.06.
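Using the identities det(M) = λ1·λ2 and trace(M) = λ1 + λ2, the response can be computed without an explicit eigendecomposition. A hedged sketch, reusing the bspline_smooth helper assumed above to aggregate the structure tensor:

```python
import cv2
import numpy as np

def corner_response(block, alpha=0.05, ksize=3):
    """Cornerness C(x, y) = det(M) - alpha * trace(M)^2, equivalent to
    lambda1*lambda2 - alpha*(lambda1 + lambda2)^2."""
    Ix = cv2.Sobel(block, cv2.CV_32F, 1, 0, ksize=ksize)  # image gradients
    Iy = cv2.Sobel(block, cv2.CV_32F, 0, 1, ksize=ksize)
    Ixx = bspline_smooth(Ix * Ix)  # smoothed structure tensor entries
    Iyy = bspline_smooth(Iy * Iy)
    Ixy = bspline_smooth(Ix * Iy)
    det = Ixx * Iyy - Ixy * Ixy    # lambda1 * lambda2
    trace = Ixx + Iyy              # lambda1 + lambda2
    return det - alpha * trace * trace
```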
Sub-step S13: sort the cornerness response values of the pixels in each image block, and select a preset number of pixels as initial corner points according to the sorting result.

For example, the N cornerness response values in each image block may be sorted, and the top B = k·N corner points with relatively large values selected as the detected initial corner points, where k takes a value in (0, 1). The value of k may differ between image blocks and should be chosen so that every image block retains a substantial number of initial corner points.

Sub-step S14: determine the degree of dispersion of the initial corner points in each image block.

The degree of dispersion describes how scattered the initial corner points are within an image block. In some embodiments, sub-step S14 may further include:

Sub-step S141: cluster the initial corner points of the image block to obtain a cluster center.

Sub-step S142: determine the pixel distance from each initial corner point to the cluster center, and determine the degree of dispersion according to these pixel distances.

In some examples, the sum of the pixel distances from the initial corner points to the cluster center may be taken as the degree of dispersion of the initial corner points in the image block. In other examples, the average of these pixel distances may be taken as the degree of dispersion.
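For illustration, the sum-of-distances variant can be sketched as follows; the single centroid stands in for the cluster center (a one-cluster special case of the clustering in sub-step S141):

```python
import numpy as np

def dispersion(corners):
    """Return the sum of pixel distances from the initial corner points to
    their cluster center (here: the centroid), plus the per-corner distances."""
    corners = np.asarray(corners, dtype=np.float32)  # shape (B, 2)
    center = corners.mean(axis=0)
    d = np.linalg.norm(corners - center, axis=1)
    return float(d.sum()), d
```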
Sub-step S15: set a corresponding screening ratio for each image block according to the degree of dispersion and the cornerness response values of the initial corner points in the image block.

The screening ratio is the proportion of initial corner points that are screened and retained. To ensure that every image block has corner points retained and used, and that more corner points are retained where they are more dispersed and have higher response values, a corresponding screening ratio can be set for each image block according to the degree of dispersion and the cornerness response values of its initial corner points.

In some embodiments, sub-step S15 may further include:

Sub-step S151: calculate a mean square error sum of the cornerness response values of the initial corner points of the image block.

Sub-step S152: calculate an evaluation parameter according to the pixel distance sum of the initial corner points in the image block and the mean square error sum of the cornerness response values.
In some embodiments, the evaluation parameter may take the form

$W_\Omega = \sum_{i \in \Omega} d_i + \sum_{i \in \Omega} \left( C_i - \bar{C}_\Omega \right)^2$

where Ω denotes the image block; i is the index of an initial corner point; C_i is the cornerness response value of initial corner point i; \bar{C}_\Omega denotes the mean of the initial corner response values within the block; and d_i is the pixel distance from initial corner point i to the cluster center.
Sub-step S153: determine the screening ratio of each image block according to its evaluation parameter.

In some embodiments, the image blocks may be ranked according to their evaluation parameters, and the screening ratio of each image block determined from the ranking, with higher-ranked blocks receiving larger screening ratios.

For example, a screening ratio η_W ∈ (0, 1) is determined from the ranking based on the evaluation parameter W, and finally η_W·B candidate corner points are extracted from the image block.
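A hedged sketch of sub-steps S151 to S153, building on the dispersion helper above; the evaluation parameter follows the additive form assumed earlier, and the linear rank-to-ratio mapping is an illustrative assumption:

```python
import numpy as np

def screening_ratios(blocks_corners, blocks_responses, eta_min=0.2, eta_max=0.9):
    """Compute an evaluation parameter W per block (dispersion sum plus
    response mean-square-error sum) and map the W ranking to a screening
    ratio eta in (eta_min, eta_max)."""
    W = []
    for corners, resp in zip(blocks_corners, blocks_responses):
        dist_sum, _ = dispersion(corners)                 # sum of d_i
        mse_sum = float(np.sum((resp - np.mean(resp)) ** 2))
        W.append(dist_sum + mse_sum)
    order = np.argsort(W)[::-1]                # higher W ranks first
    ratios = np.empty(len(W))
    ratios[order] = np.linspace(eta_max, eta_min, num=len(W))
    return ratios                              # eta per block
```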
FIG. 4 is a schematic diagram of uniformly screening feature points from image blocks in an embodiment of the present application. For image blocks in which the initial corner points are concentrated, more initial corner points are deleted; for image blocks in which the initial corner points are dispersed, more initial corner points are retained; and after screening, every image block retains a substantial number of corner points.

Sub-step S16: screen candidate corner points from the corresponding initial corner points according to the screening ratio of each image block.

Sub-step S17: screen target corner points from the candidate corner points by using a random sample consensus algorithm.

The random sample consensus algorithm can be used to remove outliers from the candidate corner points, and the retained candidate corner points are taken as the target corner points.

In the random sample consensus algorithm, given a set P of N data points, it is assumed that most of the points in the set can be generated by one model, and that the model parameters can be fitted from at least n points (n < N); points that do not fit the model are outliers.

In some embodiments, the model may be fitted by the following iteration: a. randomly select n data points from P; b. fit a model M to these n data points; c. for each remaining data point in P, compute its distance to the model M; a point whose distance exceeds a threshold is regarded as an outlier, otherwise as an inlier, and the number m of inliers for the model is recorded; d. after k iterations, select the model M with the largest m as the fitting result.
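The iteration a to d above is the generic RANSAC loop; a minimal sketch (model-agnostic, with fit and point_error supplied by the caller) might be:

```python
import numpy as np

def ransac(points, fit, point_error, n_min, thresh, k=100, rng=None):
    """Generic RANSAC following steps a-d: sample n_min points, fit a model,
    count inliers, and keep the model with the largest inlier count m.

    points: (N, d) array; fit(sample) -> model;
    point_error(model, p) -> scalar residual for one point.
    """
    rng = rng if rng is not None else np.random.default_rng()
    best_model, best_m = None, -1
    for _ in range(k):                                              # step d
        sample = points[rng.choice(len(points), n_min, replace=False)]  # a
        model = fit(sample)                                         # step b
        errs = np.array([point_error(model, p) for p in points])
        m = int(np.sum(errs < thresh))                              # step c
        if m > best_m:
            best_model, best_m = model, m
    return best_model, best_m
```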
Step 303: extract feature points from the key frames, and determine a second camera relative pose by a feature point matching method according to feature points matched between adjacent key frames.

In the embodiments of the present application, step 303 may include the following sub-steps.

Sub-step S21: set a window of a preset pixel size centered on a feature point in a key frame.

Sub-step S22: determine the absolute values of the differences between the gray value of the feature point and the gray values of the other pixels in the window.

Sub-step S23: generate an adjacency matrix according to the absolute values of the gray differences between the feature point and the other pixels in the window.

Sub-step S24: generate a description vector from the adjacency matrix as a descriptor of the feature point.

Sub-step S25: determine feature points matched between adjacent key frames according to the position information and the descriptors of the feature points of the adjacent key frames.
In some embodiments, to reduce the probability of mismatches during feature point matching, a window of 3×3 pixels may be set with the feature point at its center, and the absolute differences between the gray value of the window center and the gray values of its 8 neighboring pixels are computed. Here I_p denotes the gray value of the center point, i.e. the feature point, and I_{x_i} denotes the gray value of the i-th neighbor:

$I_i = |I_p - I_{x_i}|, \quad i = 1, 2, \ldots, 8$

An adjacency matrix F is generated from these differences, for example arranged around the (zero) center entry as

$F = \begin{pmatrix} I_1 & I_2 & I_3 \\ I_4 & 0 & I_5 \\ I_6 & I_7 & I_8 \end{pmatrix}$

The feature point p is represented by a description vector H,

$H(p) = \frac{\left( \lambda_1(p), \lambda_2(p), \lambda_3(p) \right)}{\|\lambda_p\|_2}$

where λ_i(p) denotes an eigenvalue obtained by decomposing the adjacency matrix F, and ‖λ_p‖_2 denotes the 2-norm of the eigenvalue vector. In the same way, the description vector G of the candidate pairing point q of feature point p can be obtained.

The similarity is set as

$v = \frac{H \cdot G}{|H| |G|}$

where |H| and |G| are the norms of the two vectors; a threshold t is set, and when v > t the feature point pair is retained, otherwise it is discarded.
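A sketch of this descriptor and the similarity test, under the assumptions flagged above (the 3x3 arrangement of F, and taking real parts of the eigenvalues, since F is generally not symmetric):

```python
import numpy as np

def point_descriptor(img, x, y):
    """Descriptor of the feature point at (x, y): eigenvalues of the 3x3
    gray-difference adjacency matrix, normalized by their 2-norm.
    Assumes 1 <= x < w-1 and 1 <= y < h-1."""
    patch = img[y - 1:y + 2, x - 1:x + 2].astype(np.float32)
    F = np.abs(patch - patch[1, 1])   # I_i = |I_p - I_xi|; center entry is 0
    lam = np.linalg.eigvals(F).real   # real parts, as a simplification
    return lam / (np.linalg.norm(lam) + 1e-12)

def keep_match(H, G, t=0.9):
    """Cosine-similarity test: retain the descriptor pair when v > t."""
    v = float(np.dot(H, G) / (np.linalg.norm(H) * np.linalg.norm(G) + 1e-12))
    return v > t
```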
Sub-step S26: determine the second camera relative pose between the adjacent key frames according to the matched feature points.
Step 304: perform Kalman filter fusion on the first camera relative poses between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames, to determine an estimated camera pose corresponding to the current key frame.

The Kalman filter algorithm fuses the optical flow tracking result with the optimized feature point matching result. Kalman filtering is performed at each key frame, so the high-accuracy feature point matching result between two key frames can be used to correct the accumulated error of the non-key-frame optical flow tracking between those two key frames. Adopting the idea of Kalman filtering and combining the respective advantages of the optical flow tracking algorithm and the feature point matching algorithm achieves a balance between accuracy and efficiency.
Kalman filtering is divided into two steps, prediction and update.

In the prediction step, the relative pose accumulated by the optical flow tracking method between two key frames serves as the estimate, while the camera relative pose obtained by optimized feature point matching serves as the observation:

$x_k = A_k x_{k-1} + \varepsilon_k$

$\hat{x}_k^- = A_k \hat{x}_{k-1}$

$\hat{P}_k^- = A_k \hat{P}_{k-1} A_k^T + R$

where A_k is the state transition matrix, representing the camera relative pose transformation accumulated by the optical flow method from the previous key frame to the current key frame; ε_k ~ N(0, R) is the Gaussian noise of the motion equation; \hat{x}_{k-1} and \hat{P}_{k-1} are the corrected state estimate and covariance corresponding to the previous key frame; R is the noise covariance, set to a constant; and \hat{x}_k^- and \hat{P}_k^- are the predicted state estimate and covariance of the current key frame. The observation equation is

$z_k = C x_k + \delta_k$

where z_k is the camera pose obtained by the feature point method at the current moment, C is set to the identity matrix, and δ_k ~ N(0, Q) is the observation noise. Since the state estimation equation (the pose estimated by the optical flow method) has a larger error than the feature point method, Q is generally smaller than R.

In the update step, the Kalman gain K is computed first, and then the state estimate and covariance of the current key frame are corrected to obtain the fused estimated camera pose \hat{x}_k and covariance \hat{P}_k:

$K = \hat{P}_k^- C^T \left( C \hat{P}_k^- C^T + Q \right)^{-1}$

$\hat{x}_k = \hat{x}_k^- + K \left( z_k - C \hat{x}_k^- \right)$

$\hat{P}_k = (I - K C) \hat{P}_k^-$
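A compact sketch of one predict/update cycle, treating the pose as a plain vector state with C = I for clarity (in practice poses compose on SE(3), so a real implementation would work with a pose parameterization rather than a linear state):

```python
import numpy as np

def kf_fuse(x_prev, P_prev, A_k, z_k, R, Q):
    """One Kalman cycle at a keyframe.

    x_prev, P_prev: corrected state/covariance at the previous keyframe.
    A_k: pose change accumulated by optical flow since that keyframe.
    z_k: pose observed by feature point matching at the current keyframe.
    """
    # Predict with the optical-flow motion model.
    x_pred = A_k @ x_prev
    P_pred = A_k @ P_prev @ A_k.T + R
    # Update with the feature point observation (C = I).
    K = P_pred @ np.linalg.inv(P_pred + Q)
    x_new = x_pred + K @ (z_k - x_pred)
    P_new = (np.eye(P_pred.shape[0]) - K) @ P_pred
    return x_new, P_new
```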
Step 305: determine a feature point reprojection error for the current key frame according to the estimated camera pose corresponding to the current key frame.

Based on the position information of a feature point P1 in the previous key frame and the estimated camera pose, the position information of the corresponding projected point P1' at the moment of the current key frame can be calculated; the position information of the feature point P2 matched with P1 is determined from the current key frame; and the feature point reprojection error can then be calculated from the positions of P1' and P2. In practice, the feature point reprojection error may be computed over multiple feature points matched between adjacent key frames.
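For one matched point, the error reduces to projecting the triangulated 3D point with the estimated pose and measuring the pixel offset; a hedged sketch assuming a pinhole camera with intrinsics K:

```python
import numpy as np

def point_reproj_error(P_w, R, t, K, obs_uv):
    """Pixel distance between the projection of 3D point P_w under the
    estimated pose (R, t) and the matched observation obs_uv (the P2
    position in the current keyframe)."""
    p_cam = R @ P_w + t              # world -> camera coordinates
    uv_h = K @ (p_cam / p_cam[2])    # pinhole projection
    return float(np.linalg.norm(uv_h[:2] - obs_uv))
```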
Step 306: extract, from the key frame, line segments satisfying a preset feature line condition as feature lines.

In weakly textured regions above a city and in scenes with drastic lighting changes, a camera-based visual odometry that relies on point feature constraints alone is prone to localization failure and unstable output. When the scene is a top-down view of an urban area, the buildings on the ground have strong structural characteristics: the line features formed by the geometric boundaries of buildings can provide directional information that feature points cannot. Introducing line features therefore increases the robustness of the system in urban scenes; by fusing feature points with optical flow and adding line feature constraints, the problems of localization failure and unstable output can be well resolved.

In the embodiments of the present application, step 306 may include the following sub-steps.

Sub-step S31: detect line segments in the key frame, and take two non-parallel line segments as a line segment pair.

Sub-step S32: screen out, from the line segment pairs, those satisfying the preset feature line condition as feature lines. The preset feature line condition includes: the segment length is greater than or equal to a preset length threshold, the distance from the intersection of the line segment pair to the line segment pair is less than or equal to a preset distance threshold, and the intersection of the line segment pair lies within the image.

In some embodiments, non-parallel line segments (or their extensions) in an image necessarily intersect, and the number of elements in such a set of segment pairs is usually very large. To better extract feature lines, the following screening is performed: segment pairs whose intersection-to-pair distance is greater than or equal to the preset distance threshold are removed, and pairs whose intersection-to-pair distance is less than or equal to the preset distance threshold are retained; pairs whose intersection lies outside the image are removed, and pairs whose intersection lies within the image are retained; pairs containing a segment shorter than the preset length threshold are removed, and pairs whose segment lengths are greater than or equal to the preset length threshold are retained. The segments remaining after this screening are the valid feature line segments.
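A sketch of this screening for a single segment pair; 'distance from the intersection to the pair' is interpreted here as the distance to the nearest segment endpoint, which is an assumption of this illustration:

```python
import numpy as np

def intersection(seg1, seg2):
    """Intersection of the infinite lines through two segments, computed
    via the cross product of their homogeneous line coordinates."""
    def line(seg):
        (x1, y1), (x2, y2) = seg
        return np.cross([x1, y1, 1.0], [x2, y2, 1.0])
    p = np.cross(line(seg1), line(seg2))
    if abs(p[2]) < 1e-9:             # (near-)parallel: no usable intersection
        return None
    return p[:2] / p[2]

def is_feature_pair(seg1, seg2, img_w, img_h, min_len=20.0, max_dist=10.0):
    """Apply the three screening rules to one pair of line segments."""
    seg_len = lambda s: float(np.linalg.norm(np.subtract(s[1], s[0])))
    end_dist = lambda p, s: min(np.linalg.norm(p - s[0]),
                                np.linalg.norm(p - s[1]))
    if seg_len(seg1) < min_len or seg_len(seg2) < min_len:
        return False                 # rule: segments long enough
    p = intersection(seg1, seg2)
    if p is None or not (0 <= p[0] < img_w and 0 <= p[1] < img_h):
        return False                 # rule: intersection inside the image
    return max(end_dist(p, seg1), end_dist(p, seg2)) <= max_dist
```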
Step 307: determine a third camera pose between adjacent key frames according to feature lines matched between the adjacent key frames.

In the embodiments of the present application, step 307 may include the following sub-steps.

Sub-step S41: take two non-parallel feature lines in the key frame as a feature line segment pair.

Sub-step S42: determine a descriptor of the feature lines of the feature line segment pair, with the bisector of the acute angle at the intersection of the pair as the orientation component of the descriptor and the pixel size of the pixel block centered on the intersection as the scale component of the descriptor.

In the embodiments of the present application, line segment features can be described based on the intersection points of feature line segment pairs. A multi-scale rotated BRIEF descriptor can be computed at the intersection; in some embodiments, the bisector of the acute angle at the intersection of the feature line segment pair is used as the orientation component of the descriptor, and the pixel size of the pixel block centered on the intersection is used as the scale component, thereby determining the descriptor of the feature lines of the pair.

Sub-step S43: determine feature lines matched between adjacent key frames according to the position information and the descriptors of the feature lines of the adjacent key frames.

Sub-step S44: determine the third camera pose between the adjacent key frames according to the matched feature lines.
FIG. 5 is a schematic diagram of determining the feature line reprojection error. Suppose that a line segment l seen in the image frame at time i is matched to an observed line segment L in the image at time j. Projecting the segment l from time i into the image at time j through the camera pose (rotation matrix R and translation matrix T) yields a segment ab. The reprojection error of the line segment is expressed by the distances from the two endpoints a and b of the projected segment to the observed line L (Ax + By + C = 0):

$e = d(a, L) + d(b, L)$

$d(a, L) = \frac{|A u_a + B v_a + C|}{\sqrt{A^2 + B^2}}$

where e is the reprojection error of the line segment, d(a, L) is the distance from endpoint a = (u_a, v_a) of the projected segment to the observed line L, and d(b, L) is defined analogously. By optimizing the reprojection errors of the line segments, the camera pose (R, T) can be optimized.
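A direct transcription of the two formulas, for an observed line L = (A, B, C) with Ax + By + C = 0 and projected endpoints a and b:

```python
import numpy as np

def line_reproj_error(a, b, L):
    """e = d(a, L) + d(b, L), the line segment reprojection error."""
    A, B, C = L
    denom = np.hypot(A, B)
    d = lambda p: abs(A * p[0] + B * p[1] + C) / denom
    return d(a) + d(b)
```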
Step 308: determine a feature line reprojection error for the current key frame according to the third camera pose between adjacent key frames.

Step 309: determine the optimized camera pose corresponding to the current key frame based on the feature point reprojection error and the feature line reprojection error for the current key frame.

Nonlinear least squares optimization can be performed on the feature point reprojection error and the feature line reprojection error, so as to determine the optimized camera pose corresponding to the current key frame.

In the embodiments of the present application, a plurality of image frames collected by a camera, including key frames and non-key frames, can be acquired; a non-key frame is divided into a plurality of image blocks, corner points satisfying the preset feature point condition are extracted from the image blocks as feature points, and a first camera relative pose is determined by the optical flow method according to feature points matched between adjacent non-key frames; feature points are extracted from the key frames, and a second camera relative pose is determined by the feature point matching method according to feature points matched between adjacent key frames; Kalman filter fusion is performed on the first camera relative poses between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames to determine the estimated camera pose corresponding to the current key frame; a feature point reprojection error for the current key frame is determined according to that estimated camera pose; line segments satisfying the preset feature line condition are extracted from the key frame as feature lines; a third camera pose between adjacent key frames is determined according to feature lines matched between the adjacent key frames; a feature line reprojection error for the current key frame is determined according to the third camera pose; and the optimized camera pose corresponding to the key frame is determined based on the feature point reprojection error and the feature line reprojection error for the key frame. By processing the uniformly distributed feature points matched between adjacent non-key frames with the optical flow method to determine the first camera relative pose, the calculation accuracy of the first camera relative pose is improved, as are the calculation efficiency and stability of the optical flow method. Adopting the idea of Kalman filtering combines the respective advantages of the optical flow tracking algorithm and the feature point matching algorithm, reaching a balance between accuracy and efficiency. Introducing line features increases the robustness of the system in urban scenes, and fusing feature points with optical flow while adding line feature constraints improves the stability of pose estimation.

To enable those skilled in the art to better understand the embodiments of the present application, the embodiments are illustrated below with an example. FIG. 6 shows a flowchart of a visual odometry optimization method based on camera pose optimization in an embodiment of the present application.

In this example, a binocular camera is mounted on a vehicle to collect images, and motion information is collected by an inertial navigation unit. The camera and the inertial navigation unit may be rigidly connected.

For the inertial navigation part, motion information (including acceleration and angular velocity) is measured by the inertial navigation unit, and a pre-integration method merges the multiple motion measurements between image frames into a single observation output, yielding the accumulated camera motion; a pre-integration error is determined from the accumulated camera motion; finally, the pre-integration error is sent to a back-end server for optimization.

For the visual part, initialization is first performed to obtain image frames, and each image frame is judged to be a key frame or a non-key frame. If it is a non-key frame, corner points can be detected with the improved corner detection method of the embodiments of the present application and used as feature points, and the feature points in adjacent non-key frames are tracked by the optical flow method to obtain the accumulated camera pose change. For a key frame, on the one hand, feature points are extracted from the key frame and their descriptors are computed; the camera relative pose is determined by the feature point matching method according to feature points matched between adjacent key frames; and Kalman filtering is performed on the pose change accumulated by the optical flow method and the camera relative pose determined by the feature point matching method, to obtain the camera pose corresponding to the current key frame. On the other hand, feature lines are detected in the key frame and their descriptors are computed; matching is performed according to the feature lines matched between adjacent key frames; and the camera pose corresponding to the current key frame is solved.

The camera pose solved from the feature points, together with the IMU motion accumulated between the two key frames as detected by the inertial navigation unit, is used as a camera constraint to determine the feature point reprojection error. The camera pose solved from the feature lines, together with the IMU motion accumulated between the two key frames, is used as a camera constraint to determine the feature line reprojection error. The back-end server performs optimization according to the feature point reprojection error, the feature line reprojection error and the pre-integration error, finally obtaining the optimized camera pose corresponding to the current key frame.

It should be noted that the method embodiments are described as a series of action combinations for simplicity of description, but those skilled in the art should understand that the embodiments of the present application are not limited by the described order of actions, because according to the embodiments of the present application some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the actions involved in some of the embodiments described in the specification are not necessarily required by the embodiments of the present application.
参照图7,示出了本申请实施例提供的一种相机位姿确定装置的结构框图,具体可以包括如下模块:Referring to FIG. 7 , it shows a structural block diagram of a camera pose determination device provided in an embodiment of the present application, which may specifically include the following modules:
图像获取模块701,用于获取相机采集的多个图像帧;多个图像帧包括关键帧和非关键帧;An image acquisition module 701, configured to acquire multiple image frames collected by the camera; multiple image frames include key frames and non-key frames;
第一位姿确定模块702,用于将非关键帧划分为多个图像块,从多个图像块中提取满足预设特征点条件的角点作为特征点,以及根据相邻的非关键帧之间匹配的特征点采用光流法确定第一相机相对位姿;The first pose determination module 702 is configured to divide non-key frames into multiple image blocks, extract corner points satisfying preset feature point conditions from multiple image blocks as feature points, and The matching feature points are determined by the optical flow method to determine the relative pose of the first camera;
第二位姿确定模块703,用于从关键帧中提取特征点,根据相邻关键帧之间匹配的特征点采用特征点匹配法确定第二相机相对位姿;The second pose determination module 703 is used to extract feature points from key frames, and determine the relative pose of the second camera by feature point matching method according to feature points matched between adjacent key frames;
优化位姿确定模块704,用于根据多个相邻非关键帧之间的第一相机相对位姿和相邻关键帧之间的第二相机相对位姿,确定当前关键帧对应的优化相机位姿。An optimized pose determination module 704, configured to determine an optimized camera position corresponding to the current key frame according to the first camera relative pose between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames posture.
In the embodiments of the present application, a plurality of image frames collected by a camera, including key frames and non-key frames, can be acquired; a non-key frame is divided into a plurality of image blocks, corner points satisfying a preset feature point condition are extracted from the plurality of image blocks as feature points, and a first camera relative pose is determined by the optical flow method according to the feature points matched between adjacent non-key frames; feature points are extracted from a key frame, and a second camera relative pose is determined by the feature point matching method according to the feature points matched between adjacent key frames; the optimized camera pose corresponding to the current key frame is then determined according to the first camera relative poses between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames. The embodiments of the present application combine the respective advantages of the optical flow tracking algorithm and the feature point matching algorithm, achieving a balance between pose accuracy and processing efficiency. Moreover, by determining uniformly distributed feature points for non-key frames and processing them with the optical flow method to determine the first camera relative pose between adjacent non-key frames, the calculation accuracy of the first camera relative pose can be improved, enhancing the calculation efficiency and stability of the optical flow method.
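As one possible realization of the optical flow tracking between adjacent non-key frames, a sketch using OpenCV's pyramidal Lucas-Kanade tracker is given below; the window size and pyramid depth are illustrative choices, and `prev_pts` is assumed to be the float32 (N, 1, 2) array of corner points extracted from the image blocks.

```python
import cv2
import numpy as np

def track_features(prev_gray, cur_gray, prev_pts):
    """Track feature points from the previous frame into the current one
    with pyramidal Lucas-Kanade optical flow; the surviving pairs can then
    be fed to an essential-matrix or PnP solver for the relative pose."""
    cur_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, cur_gray, prev_pts, None,
        winSize=(21, 21), maxLevel=3)
    good = status.ravel() == 1       # keep only successfully tracked points
    return prev_pts[good], cur_pts[good]
```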
In the embodiments of the present application, the camera pose determining apparatus may further include:
a feature line extraction module, configured to extract line segments satisfying a preset feature line condition from a key frame as feature lines;
a third pose determining module, configured to determine a third camera pose between adjacent key frames according to the feature lines matched between adjacent key frames;
and the optimized pose determining module includes:
an optimized pose determining submodule, configured to determine the optimized camera pose corresponding to the current key frame according to the first camera relative poses between a plurality of adjacent non-key frames, the second camera relative pose between adjacent key frames, and the third camera pose between adjacent key frames.
In the embodiments of the present application, the optimized pose determining submodule may include:
an estimated pose determining unit, configured to perform Kalman filter fusion according to the first camera relative poses between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames, to determine an estimated camera pose corresponding to the current key frame;
a first error determining unit, configured to determine a feature point reprojection error for the current key frame according to the estimated camera pose corresponding to the current key frame;
a second error determining unit, configured to determine a feature line reprojection error for the current key frame according to the third camera pose between adjacent key frames;
an optimized pose determining unit, configured to determine the optimized camera pose corresponding to the current key frame based on the feature point reprojection error and the feature line reprojection error for the current key frame.
In the embodiments of the present application, the first pose determining module may include the following submodules (an illustrative code sketch of this block-wise corner screening follows the list):
a filtering submodule, configured to perform low-pass filtering on the plurality of image blocks;
a response value calculation submodule, configured to calculate a corner response value for the pixels in each low-pass-filtered image block;
an initial corner selection submodule, configured to sort the corner response values corresponding to the pixels in each image block, and select a preset number of pixels as initial corner points according to the sorting result;
a dispersion degree determining submodule, configured to determine the dispersion degree of the initial corner points in each image block;
a screening ratio determining submodule, configured to set a corresponding screening ratio for each image block according to the dispersion degree and the corner response values of the initial corner points in the image block;
a candidate corner determining submodule, configured to screen candidate corner points from the corresponding initial corner points according to the screening ratio of each image block;
a target corner screening submodule, configured to screen target corner points from the candidate corner points using a random sample consensus (RANSAC) algorithm.
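A compact sketch of the per-block part of this pipeline is given below. The Gaussian low-pass filter and the Harris operator stand in for the patent's improved corner response; both are assumptions for illustration only.

```python
import cv2
import numpy as np

def initial_corners_per_block(block, n_initial=20):
    """Low-pass filter one image block, compute a corner response for
    every pixel, and keep the top-N pixels as initial corner points."""
    smoothed = cv2.GaussianBlur(block, (5, 5), 1.0)          # low-pass filtering
    response = cv2.cornerHarris(np.float32(smoothed), 2, 3, 0.04)
    flat = response.ravel()
    top = np.argsort(flat)[::-1][:n_initial]                 # sort by response
    ys, xs = np.unravel_index(top, response.shape)
    corners = np.stack([xs, ys], axis=1)                     # (x, y) coordinates
    return corners, flat[top]
```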
In the embodiments of the present application, the dispersion degree determining submodule may include:
a clustering unit, configured to cluster the initial corner points of an image block to obtain a cluster center;
a dispersion degree determining unit, configured to determine the pixel distance from each initial corner point to the cluster center, and determine the dispersion degree according to these pixel distances.
In the embodiments of the present application, the dispersion degree is the sum of the pixel distances of the initial corner points in an image block, and the screening ratio determining submodule may include (an illustrative sketch follows this list):
a response value processing unit, configured to compute the mean square error sum of the corner response values of the initial corner points of the image block;
an evaluation parameter calculation unit, configured to calculate an evaluation parameter according to the pixel distance sum of the initial corner points in the image block and the mean square error sum of the corner response values;
a screening ratio determining unit, configured to determine the screening ratio of each image block according to its evaluation parameter.
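The dispersion degree and the evaluation parameter can be sketched as follows. The patent only states that the evaluation parameter is computed from the pixel distance sum and the mean square error sum of the responses; the concrete combination formula and clipping used here are invented for illustration.

```python
import numpy as np

def dispersion_degree(corners):
    """Sum of pixel distances from each initial corner to the cluster
    center (here, the mean of the corners as a single cluster center)."""
    center = corners.mean(axis=0)
    return np.linalg.norm(corners - center, axis=1).sum()

def screening_ratio(corners, responses, base_ratio=0.5):
    """Illustrative rule: blocks with well-dispersed, strong corners keep
    a larger share of their initial corners as candidates."""
    d = dispersion_degree(corners)
    mse_sum = ((responses - responses.mean()) ** 2).sum()    # MSE sum of responses
    evaluation = d * responses.mean() / (1.0 + mse_sum)      # assumed formula
    return float(np.clip(base_ratio * (1.0 + np.tanh(evaluation)), 0.1, 1.0))
```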
In the embodiments of the present application, the second pose determining module may include (a sketch of the window-based descriptor follows the list):
a window setting submodule, configured to set a window of a preset pixel size centered on a feature point in a key frame;
a difference determining submodule, configured to determine the absolute value of the difference between the gray value of the feature point and the gray values of the other pixels in the window;
a matrix generation submodule, configured to generate an adjacency matrix according to the absolute gray differences between the feature point and the other pixels in the window;
a first descriptor determining submodule, configured to generate a description vector as the descriptor of the feature point according to the adjacency matrix;
a feature point matching submodule, configured to determine the feature points matched between adjacent key frames according to the position information and descriptors of the feature points of the adjacent key frames;
a second pose determining submodule, configured to determine the second camera relative pose between adjacent key frames according to the feature points matched between the adjacent key frames.
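A minimal sketch of this gray-difference descriptor is given below; flattening and normalizing the difference matrix into the description vector is an assumed detail, and the feature point is assumed to lie at least `half` pixels from the image border.

```python
import numpy as np

def gray_diff_descriptor(gray, pt, half=4):
    """Descriptor built from the absolute gray differences between a
    feature point and the other pixels in a (2*half+1)^2 window."""
    x, y = pt
    window = gray[y - half:y + half + 1, x - half:x + half + 1].astype(np.int32)
    adjacency = np.abs(window - window[half, half])   # gray-difference matrix
    vec = adjacency.ravel().astype(np.float32)        # description vector
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec
```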
In the embodiments of the present application, the feature line extraction module may include (a sketch of the line pair screening follows the list):
a line segment pair selection submodule, configured to detect line segments in a key frame and take two non-parallel line segments as a line segment pair;
a line segment pair screening submodule, configured to screen out, from the line segment pairs, those satisfying a preset feature line condition as feature lines; the preset feature line condition includes: the line segment length is greater than or equal to a preset length threshold, the distance from the intersection point of the line segment pair to the line segments is less than or equal to a preset distance threshold, and the intersection point of the line segment pair lies within the image.
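The screening condition can be sketched as follows, with segments given as endpoint pairs (e.g., from an LSD detector). Interpreting "the distance of the intersection relative to the line segment pair" as the distance from the intersection to the nearest segment endpoint is an assumption, and the thresholds are illustrative.

```python
import numpy as np

def _cross2(a, b):
    return float(a[0] * b[1] - a[1] * b[0])

def line_intersection(p1, p2, q1, q2):
    """Intersection of the infinite lines through the two segments."""
    d1, d2 = p2 - p1, q2 - q1
    denom = _cross2(d1, d2)
    if abs(denom) < 1e-9:                 # parallel pairs are rejected earlier
        return None
    s = _cross2(q1 - p1, d2) / denom
    return p1 + s * d1

def is_feature_line_pair(p1, p2, q1, q2, w, h, min_len=30.0, max_dist=20.0):
    """Check the preset feature line condition for a non-parallel pair."""
    if min(np.linalg.norm(p2 - p1), np.linalg.norm(q2 - q1)) < min_len:
        return False                      # length threshold
    x = line_intersection(p1, p2, q1, q2)
    if x is None or not (0 <= x[0] < w and 0 <= x[1] < h):
        return False                      # intersection must lie inside the image
    nearest = min(np.linalg.norm(x - e) for e in (p1, p2, q1, q2))
    return nearest <= max_dist            # intersection close to the segments
```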
In the embodiments of the present application, the third pose determining module may include (a sketch of the line descriptor direction follows the list):
a feature line segment pair determining submodule, configured to take two non-parallel feature lines in a key frame as a feature line segment pair;
a second descriptor determining submodule, configured to determine the descriptor of the feature lines of the feature line segment pair by taking the acute-angle bisector at the intersection point of the pair as the direction component of the descriptor, and the pixel size of the pixel block centered at the intersection point as the scale component of the descriptor;
a feature line matching submodule, configured to determine the feature lines matched between adjacent key frames according to the position information and descriptors of the feature lines of the adjacent key frames;
a third pose determining submodule, configured to determine the third camera pose between adjacent key frames according to the feature lines matched between the adjacent key frames.
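The direction component of this descriptor, the acute-angle bisector at the intersection, can be computed as below; the scale component (the pixel block around the intersection) is omitted from this sketch.

```python
import numpy as np

def acute_bisector(d1, d2):
    """Unit bisector of the acute angle between two line directions."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    if np.dot(d1, d2) < 0:      # flip one direction so the angle is acute
        d2 = -d2
    b = d1 + d2
    return b / np.linalg.norm(b)
```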
As for the apparatus embodiments, since they are basically similar to the method embodiments, the description is relatively brief; for relevant details, reference may be made to the description of the method embodiments.
An embodiment of the present application further provides an electronic device, including a processor, a memory, and a computer program stored on the memory and executable on the processor. When the computer program is executed by the processor, the processes of the above camera pose determining method embodiments are implemented and the same technical effects can be achieved; to avoid repetition, details are not repeated here.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, the processes of the above camera pose determining method embodiments are implemented and the same technical effects can be achieved; to avoid repetition, details are not repeated here.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts of the embodiments, reference may be made to one another.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The embodiments of the present application are described with reference to the flowcharts and/or block diagrams of the method, terminal device (system) and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing terminal device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal device produce an apparatus for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing terminal device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or another programmable data processing terminal device, so that a series of operation steps are performed on the computer or other programmable terminal device to produce computer-implemented processing, and the instructions executed on the computer or other programmable terminal device thus provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
Although some embodiments of the present application have been described, those skilled in the art may make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including these embodiments and all changes and modifications falling within the scope of the embodiments of the present application.
Finally, it should also be noted that, in this document, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the existence of other identical elements in the process, method, article or terminal device including that element.
The camera pose determining method and camera pose determining apparatus provided by the present application have been introduced in detail above. Specific examples are used herein to illustrate the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method of the present application and its core idea. Meanwhile, for those of ordinary skill in the art, changes may be made to the specific implementations and the scope of application according to the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (12)

  1. A camera pose determining method, characterized by comprising:
    acquiring a plurality of image frames collected by a camera, the plurality of image frames including key frames and non-key frames;
    dividing a non-key frame into a plurality of image blocks, extracting corner points satisfying a preset feature point condition from the plurality of image blocks as feature points, and determining a first camera relative pose by an optical flow method according to the feature points matched between adjacent non-key frames;
    extracting feature points from a key frame, and determining a second camera relative pose by a feature point matching method according to the feature points matched between adjacent key frames;
    determining an optimized camera pose corresponding to the current key frame according to the first camera relative poses between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames.
  2. The method according to claim 1, characterized in that the method further comprises:
    extracting line segments satisfying a preset feature line condition from the key frame as feature lines;
    determining a third camera pose between adjacent key frames according to the feature lines matched between adjacent key frames;
    wherein the determining an optimized camera pose corresponding to the current key frame according to the first camera relative poses between the plurality of adjacent non-key frames and the second camera relative pose between the adjacent key frames comprises:
    determining the optimized camera pose corresponding to the current key frame according to the first camera relative poses between the plurality of adjacent non-key frames, the second camera relative pose between the adjacent key frames, and the third camera pose between the adjacent key frames.
  3. The method according to claim 2, characterized in that the determining the optimized camera pose corresponding to the current key frame according to the first camera relative poses between the plurality of adjacent non-key frames, the second camera relative pose between the adjacent key frames, and the third camera pose between the adjacent key frames comprises:
    performing Kalman filter fusion according to the first camera relative poses between the plurality of adjacent non-key frames and the second camera relative pose between the adjacent key frames, to determine an estimated camera pose corresponding to the current key frame;
    determining a feature point reprojection error for the current key frame according to the estimated camera pose corresponding to the current key frame;
    determining a feature line reprojection error for the current key frame according to the third camera pose between the adjacent key frames;
    determining the optimized camera pose corresponding to the current key frame based on the feature point reprojection error and the feature line reprojection error for the current key frame.
  4. The method according to claim 1, characterized in that the extracting corner points satisfying a preset feature point condition from the plurality of image blocks as feature points comprises:
    performing low-pass filtering on the plurality of image blocks;
    calculating a corner response value for the pixels in each low-pass-filtered image block;
    sorting the corner response values corresponding to the pixels in each image block, and selecting a preset number of pixels as initial corner points according to the sorting result;
    determining the dispersion degree of the initial corner points in each image block;
    setting a corresponding screening ratio for each image block according to the dispersion degree and the corner response values of the initial corner points in the image block;
    screening candidate corner points from the corresponding initial corner points according to the screening ratio of each image block;
    screening target corner points from the candidate corner points by a random sample consensus algorithm.
  5. The method according to claim 4, characterized in that the determining the dispersion degree of the initial corner points in each image block comprises:
    clustering the initial corner points of the image block to obtain a cluster center;
    determining the pixel distance from each initial corner point to the cluster center, and determining the dispersion degree according to the pixel distance from each initial corner point to the cluster center.
  6. The method according to claim 5, characterized in that the dispersion degree is the sum of the pixel distances of the initial corner points in the image block, and the setting a corresponding screening ratio for each image block according to the dispersion degree and the corner response values of the initial corner points in the image block comprises:
    computing the mean square error sum of the corner response values of the initial corner points of the image block;
    calculating an evaluation parameter according to the pixel distance sum of the initial corner points in the image block and the mean square error sum of the corner response values;
    determining the screening ratio of each image block according to the evaluation parameter of the image block.
  7. The method according to claim 1, characterized in that the extracting feature points from the key frame and determining a second camera relative pose according to the feature points matched between adjacent key frames comprises:
    setting a window of a preset pixel size centered on a feature point in the key frame;
    determining the absolute value of the difference between the gray value of the feature point and the gray values of the other pixels in the window;
    generating an adjacency matrix according to the absolute gray differences between the feature point and the other pixels in the window;
    generating a description vector as the descriptor of the feature point according to the adjacency matrix;
    determining the feature points matched between adjacent key frames according to the position information and descriptors of the feature points of the adjacent key frames;
    determining the second camera relative pose between the adjacent key frames according to the feature points matched between the adjacent key frames.
  8. The method according to claim 2, characterized in that the extracting line segments satisfying a preset feature line condition from the key frame as feature lines comprises:
    detecting line segments in the key frame, and taking two non-parallel line segments as a line segment pair;
    screening out, from the line segment pairs, those satisfying the preset feature line condition as feature lines, the preset feature line condition including: the line segment length is greater than or equal to a preset length threshold, the distance from the intersection point of the line segment pair to the line segments is less than or equal to a preset distance threshold, and the intersection point of the line segment pair lies within the image.
  9. The method according to claim 8, characterized in that the determining a third camera pose between adjacent key frames according to the feature lines matched between adjacent key frames comprises:
    taking two non-parallel feature lines in the key frame as a feature line segment pair;
    determining the descriptor of the feature lines of the feature line segment pair by taking the acute-angle bisector at the intersection point of the feature line segment pair as the direction component of the descriptor, and the pixel size of the pixel block centered at the intersection point as the scale component of the descriptor;
    determining the feature lines matched between adjacent key frames according to the position information and descriptors of the feature lines of the adjacent key frames;
    determining the third camera pose between the adjacent key frames according to the feature lines matched between the adjacent key frames.
  10. A camera pose determining apparatus, characterized by comprising:
    an image acquisition module, configured to acquire a plurality of image frames collected by a camera, the plurality of image frames including key frames and non-key frames;
    a first pose determining module, configured to divide a non-key frame into a plurality of image blocks, extract corner points satisfying a preset feature point condition from the plurality of image blocks as feature points, and determine a first camera relative pose by an optical flow method according to the feature points matched between adjacent non-key frames;
    a second pose determining module, configured to extract feature points from a key frame, and determine a second camera relative pose by a feature point matching method according to the feature points matched between adjacent key frames;
    an optimized pose determining module, configured to determine an optimized camera pose corresponding to the current key frame according to the first camera relative poses between a plurality of adjacent non-key frames and the second camera relative pose between adjacent key frames.
  11. An electronic device, characterized by comprising: a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the camera pose determining method according to any one of claims 1 to 9.
  12. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the camera pose determining method according to any one of claims 1 to 9.
PCT/CN2022/132927 2022-01-06 2022-11-18 Camera pose determining method and apparatus WO2023130842A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210012434.8A CN114399532A (en) 2022-01-06 2022-01-06 Camera position and posture determining method and device
CN202210012434.8 2022-01-06

Publications (1)

Publication Number Publication Date
WO2023130842A1

Family

ID=81228722

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/132927 WO2023130842A1 (en) 2022-01-06 2022-11-18 Camera pose determining method and apparatus

Country Status (2)

Country Link
CN (1) CN114399532A (en)
WO (1) WO2023130842A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114399532A (en) * 2022-01-06 2022-04-26 广东汇天航空航天科技有限公司 Camera position and posture determining method and device

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170278231A1 (en) * 2016-03-25 2017-09-28 Samsung Electronics Co., Ltd. Device for and method of determining a pose of a camera
CN109558879A (en) * 2017-09-22 2019-04-02 华为技术有限公司 A kind of vision SLAM method and apparatus based on dotted line feature
CN108537845A (en) * 2018-04-27 2018-09-14 腾讯科技(深圳)有限公司 Pose determines method, apparatus and storage medium
CN108648215A (en) * 2018-06-22 2018-10-12 南京邮电大学 SLAM motion blur posture tracking algorithms based on IMU
CN112097742A (en) * 2019-06-17 2020-12-18 北京地平线机器人技术研发有限公司 Pose determination method and device
CN112734797A (en) * 2019-10-29 2021-04-30 浙江商汤科技开发有限公司 Image feature tracking method and device and electronic equipment
CN113112542A (en) * 2021-03-25 2021-07-13 北京达佳互联信息技术有限公司 Visual positioning method and device, electronic equipment and storage medium
CN113744315A (en) * 2021-09-07 2021-12-03 北京航空航天大学 Semi-direct vision odometer based on binocular vision
CN114399532A (en) * 2022-01-06 2022-04-26 广东汇天航空航天科技有限公司 Camera position and posture determining method and device

Also Published As

Publication number Publication date
CN114399532A (en) 2022-04-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22918294

Country of ref document: EP

Kind code of ref document: A1