CN113012298B - Curved MARK three-dimensional registration augmented reality method based on region detection - Google Patents

Curved MARK three-dimensional registration augmented reality method based on region detection

Info

Publication number
CN113012298B
CN113012298B CN202011563089.4A CN202011563089A
Authority
CN
China
Prior art keywords
mark
point
pose
camera
natural texture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011563089.4A
Other languages
Chinese (zh)
Other versions
CN113012298A (en
Inventor
张明敏
陈忠庆
潘志庚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202011563089.4A priority Critical patent/CN113012298B/en
Publication of CN113012298A publication Critical patent/CN113012298A/en
Application granted granted Critical
Publication of CN113012298B publication Critical patent/CN113012298B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/005General purpose rendering architectures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75Determining position or orientation of objects or cameras using feature-based methods involving models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30204Marker
    • G06T2207/30208Marker matrix

Abstract

The invention discloses a curved MARK three-dimensional registration augmented reality method based on region detection. The curved MARK can be partially occluded without affecting the final effect. The method overcomes the problems that a traditional planar MARK cannot be bent, which destroys the visual consistency of cylindrical objects in an augmented reality scene, and that natural texture MARKs suffer from low robustness and poor real-time performance.

Description

Curved MARK three-dimensional registration augmented reality method based on region detection
Technical Field
The invention belongs to the intersection of computer vision and computer graphics, and particularly relates to a curved MARK three-dimensional registration augmented reality method based on region detection.
Background
With the continuing development and maturation of the internet and the iterative advance of multimedia technology, augmented reality (AR) has become increasingly common in daily life and learning. Augmented reality is a highly practical technology combining computer graphics and computer vision: it can overlay virtual objects, video, text and other information onto a real scene so that users obtain more information from the scene and understand it more deeply and clearly.
Augmented reality has wide applications in daily life, such as teaching demonstrations, tour navigation, virtual shopping and workshop guidance. In teaching, augmented reality brings students safer and more interesting experiments and improves their interest and practical ability. Many experiments are neglected in teaching because they are dangerous or hard to observe; this has promoted the application of augmented reality in teaching, as students can use the interaction between superimposed virtual objects and real objects to complete dangerous experiments and observe more detailed experimental effects, which greatly improves both their hands-on ability and their theoretical understanding. For the tourism industry, augmented reality also lets users obtain more direct and vivid explanations on mobile terminals, enhancing the interest and interactivity of navigation.
An augmented reality system mainly involves technologies such as three-dimensional registration, user interaction and virtual-real fusion, among which three-dimensional registration plays a decisive role in the development and popularization of augmented reality systems; its main function is to estimate the relative pose of the camera in the scene so that virtual objects can be superimposed on the real scene. At present, three-dimensional registration still falls short of user expectations in terms of real-time performance, robustness, stability and visual quality, so deep exploration of three-dimensional registration has profound significance for the development of augmented reality, and its research and development has become a hot topic in the field.
In an augmented reality system, to produce the visual effect of virtual-real fusion, the registration alignment of the virtual and real environments must first be ensured. The most common approach is to let the virtual and real environments share the same spatial coordinate system, so that virtual objects can be rendered in the scene and virtual-real interaction achieved. Augmented reality systems generally use cameras as the main sensors, and the positions at which virtual objects should be rendered are obtained by estimating the relative camera pose in real time through three-dimensional registration.
The most commonly used three-dimensional registration technique today is vision-based registration, with planar MARKs being the most common. However, attaching a planar MARK to a curved surface such as a cylinder damages the appearance of the MARK and greatly reduces the user's immersion, so three-dimensional registration based on a curved MARK is of real significance for the development of augmented reality. MARKs fall mainly into artificial MARKs and MARKs based on natural texture; artificial MARKs such as Hamming codes and two-dimensional codes cannot tolerate occlusion and cannot yield a correct pose after being bent, so a MARK based on natural texture becomes the only viable choice for realizing a curved MARK.
Disclosure of Invention
The invention aims to apply a curved MARK three-dimensional registration technique to the field of augmented reality, and provides a curved MARK three-dimensional registration augmented reality method based on region detection. The region where the MARK is located is obtained through a neural network model and three-dimensional registration is carried out on the curved MARK; at the same time the MARK may be partially occluded without harming the aesthetics or the real-time performance of the augmented reality.
Based on a region detection technique, the method obtains the region where the curved MARK is located in the scene, and builds the three-dimensional model formed by bending the MARK onto the cylinder according to the radius of the cylindrical object and the coordinates of the planar MARK feature points. The coordinates of the curved MARK feature points are acquired in the scene, and the relative pose between the camera and the MARK is recovered through a PnP algorithm, so that the virtual object can be rendered into the scene.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a curved MARK three-dimensional registration augmented reality method based on region detection comprises the following steps:
step (1), calibrating a camera:
acquiring internal parameters and distortion parameters of the RGB monocular camera by using the Zhang Zhengyou camera calibration method;
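By way of illustration only (not part of the original disclosure), the following Python sketch shows how step (1) could be carried out with OpenCV's implementation of the Zhang Zhengyou method; the checkerboard geometry, square size and image paths are assumptions.

import glob
import cv2
import numpy as np

# Assumed checkerboard: 9 x 6 inner corners, 25 mm squares; the image folder is illustrative.
pattern = (9, 6)
square_mm = 25.0
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square_mm

obj_points, img_points, img_size = [], [], None
for path in glob.glob("calib/*.jpg"):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    img_size = gray.shape[::-1]
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# K holds fx, fy, cx, cy; dist holds the distortion parameters used later when undistorting.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, img_size, None, None)
print("intrinsic matrix:\n", K, "\ndistortion:", dist.ravel())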
step (2), constructing a data set:
more than 300 pictures of the natural texture MARK to be identified are taken at different angles, different distances, different illumination, and under partially occluded and unoccluded conditions; 80% of the pictures are used as the training set and the remaining 20% as the verification set. The border (bounding box) and class (classes) of the natural texture MARK in each picture are calibrated with labelImg software to generate a corresponding xml format file: during calibration a rectangular frame is drawn around the area where the natural texture MARK is located, and the class corresponding to the natural texture MARK is then labelled;
step (3), the Yolov5 neural network model is used as the natural texture MARK target detection model; it is trained on the training set constructed in step (2), its accuracy is verified on the verification set, and the trained model is extracted. The trained natural texture MARK target detection model can then recognize the bounding box of the MARK in a scene and identify its specific class;
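As a hedged sketch of how the detector of steps (2)-(3) might be trained: the snippet below assumes the labelImg xml annotations have already been converted to YOLO txt labels, that training is launched from inside a Yolov5 repository checkout, and that there is a single class named 'marker'; the paths, class list and hyper-parameters are illustrative rather than values taken from the patent.

import subprocess
from pathlib import Path

# Assumed YOLO-format dataset layout (images + txt labels converted from the labelImg xml files).
data_yaml = """\
train: datasets/marker/images/train
val: datasets/marker/images/val
nc: 1
names: ['marker']
"""
Path("marker_data.yaml").write_text(data_yaml)

# Launched from inside a Yolov5 checkout; image size, batch size and epoch count are illustrative.
subprocess.run(
    ["python", "train.py", "--img", "640", "--batch", "16", "--epochs", "300",
     "--data", "marker_data.yaml", "--weights", "yolov5s.pt"],
    check=True,
)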
step (4), printing the MARK and pasting it onto a cylindrical object, while measuring the radius r of the cylinder and the width marker_w and height marker_h of the natural texture MARK picture; extracting the MARK picture feature points with the Fast algorithm and calculating the three-dimensional coordinates of each feature point relative to the MARK centre point. For a feature point with coordinates (x, y) in the MARK picture, the included angle θ between the feature point and the line to the cylinder centre, with pixel2mm denoting the pixel-to-millimetre conversion scale, is solved from equation (1):

    θ = ((x − marker_w / 2) · pixel2mm) / r    (1)
the corresponding three-dimensional coordinates are (the coordinate units here are all millimetres):

    3d_x = r · sin θ    (2)
    3d_y = (y − marker_h / 2) · pixel2mm    (3)
    3d_z = r · (1 − cos θ)    (4)
storing the three-dimensional coordinates of all the feature points in a dictionary, wherein the keys of the dictionary are the coordinates (x, y) of a MARK picture, and the values are the coordinates obtained by the formulas (2), (3) and (4);
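A minimal Python sketch of building the step (4) dictionary, under the geometric reading of formulas (1)-(4) given above (the flat MARK unrolled onto the cylinder by arc length); the Fast threshold, the rounding of the keys and the exact axis convention are assumptions.

import math
import cv2

def build_marker_dictionary(marker_gray, r_mm, pixel2mm):
    """Map each Fast feature point (x, y) of the flat MARK picture to a 3D point
    (in millimetres) on the cylinder of radius r_mm, relative to the MARK centre."""
    marker_h, marker_w = marker_gray.shape[:2]
    fast = cv2.FastFeatureDetector_create(threshold=20)   # threshold t is an assumption
    keypoints = fast.detect(marker_gray, None)

    coords_3d = {}
    for kp in keypoints:
        x, y = kp.pt
        # formula (1): arc length from the MARK centre column divided by the cylinder radius
        theta = (x - marker_w / 2.0) * pixel2mm / r_mm
        coords_3d[(round(x), round(y))] = (
            r_mm * math.sin(theta),              # formula (2): lateral offset
            (y - marker_h / 2.0) * pixel2mm,     # formula (3): along the cylinder axis
            r_mm * (1.0 - math.cos(theta)),      # formula (4): depth towards the cylinder axis
        )
    return coords_3d

# Usage with assumed measurements (radius 35 mm, 0.2 mm per pixel):
# marker = cv2.imread("marker.png", cv2.IMREAD_GRAYSCALE)
# marker_dict = build_marker_dictionary(marker, r_mm=35.0, pixel2mm=0.2)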
step (5), for a scene picture obtained by the camera, the natural texture MARK in the scene picture is extracted with the natural texture MARK target detection model to generate a region of interest (ROI region); all feature points in the ROI region are extracted with the Fast algorithm and their descriptors are computed with the ORB algorithm; on the basis of the Hamming distance between descriptors, the feature points extracted from the ROI region are matched with the original MARK feature points using RANSAC and the K nearest neighbour classification algorithm (KNN), obtaining the 30 best feature point matching pairs; the three-dimensional coordinates of the MARK picture feature points are obtained from the dictionary of step (4), and the relative pose of the camera and the curved MARK is estimated with the PnP algorithm, implemented as follows:
a point of the world coordinate system X_w = (x_w, y_w, z_w, 1) and its projection coordinates on the image plane X_i = (x_i, y_i, 1) are related by the following formula, where fx, fy, cx and cy are the camera internal parameters calibrated with the Zhang Zhengyou method, r_ij denotes a rotation entry and t_i a translation entry:

        [x_i]     [fx  0  cx]   [r11 r12 r13 t1]   [x_w]
    λ · [y_i]  =  [ 0  fy cy] · [r21 r22 r23 t2] · [y_w]    (5)
        [ 1 ]     [ 0   0  1]   [r31 r32 r33 t3]   [z_w]
                                                   [ 1 ]
the formula is simplified as follows:

    λ · X_i = K · M · X_w    (6)
wherein λ denotes a scale factor, the matrix K is the camera internal parameter matrix, and the matrix M is the model-view matrix. From the 30 feature point matching pairs, 4 matching pairs are selected at random: 3 of the pairs are used to compute 4 groups of different solutions, the remaining pair is substituted into the formula, and the solution with the smallest reprojection error is taken as the final solution; this process is optimized with a random sample consensus (RANSAC) algorithm;
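The following sketch illustrates step (5) with OpenCV primitives, assuming the reference MARK keypoints/descriptors and the dictionary of step (4) are already available; cv2.solvePnPRansac stands in for the 3-plus-1 point selection with RANSAC described above, and the detector and matcher parameters are assumptions.

import cv2
import numpy as np

def estimate_pose(roi_gray, roi_offset, ref_kps, ref_des, coords_3d, K, dist):
    """Match ORB descriptors inside the detected ROI against the reference MARK
    and recover the relative camera pose with PnP + RANSAC."""
    fast = cv2.FastFeatureDetector_create()
    orb = cv2.ORB_create()
    kps = fast.detect(roi_gray, None)
    kps, des = orb.compute(roi_gray, kps)
    if des is None:
        return None

    # KNN matching on the Hamming distance, kept to the 30 best matches after a ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    pairs = matcher.knnMatch(des, ref_des, k=2)
    good = [m[0] for m in pairs if len(m) == 2 and m[0].distance < 0.8 * m[1].distance]
    good = sorted(good, key=lambda m: m.distance)[:30]

    obj_pts, img_pts = [], []
    for m in good:
        rx, ry = ref_kps[m.trainIdx].pt
        key = (round(rx), round(ry))            # same rounding as when the dictionary was built
        if key in coords_3d:
            obj_pts.append(coords_3d[key])
            x, y = kps[m.queryIdx].pt
            img_pts.append((x + roi_offset[0], y + roi_offset[1]))   # back to full-frame pixels
    if len(obj_pts) < 4:
        return None

    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.float32(obj_pts), np.float32(img_pts), K, dist)
    return (rvec, tvec) if ok else None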
step (6), while the MARK moves, the motion state of the feature points is tracked with an optical flow method and the number of feature points that moved between the two frames is judged; when this number is at most ten percent of the total number of feature points, the pose of the marker is considered unchanged relative to the previous frame, and when it exceeds ten percent, the pose of the marker is considered to have changed relative to the previous frame and step (5) is followed again to obtain the current pose of the natural texture MARK for three-dimensional registration, implemented as follows:
let I and J be the grayscale images of the previous frame and the current frame; for any point A in the image with coordinate vector (x, y)^T, its gray value is

    I(A) = I(x, y),   J(A) = J(x, y)    (7)

For a point u = [u_x, u_y]^T on the previous frame I, the purpose of feature point tracking is to find its position v = u + d = [u_x + d_x, u_y + d_y]^T in the current frame image J, where d = [d_x, d_y]^T is the image velocity at point A, i.e. the optical flow at A. Because of the aperture problem, similarity is defined in a two-dimensional neighbourhood sense: let ω_x and ω_y be two integers; the residual function minimized with respect to the velocity vector d is defined as follows:
    ε(d) = ε(d_x, d_y) = Σ_{x = u_x − ω_x}^{u_x + ω_x} Σ_{y = u_y − ω_y}^{u_y + ω_y} ( I(x, y) − J(x + d_x, y + d_y) )²    (8)

This defines the similarity over an image neighbourhood of size (2ω_x + 1) × (2ω_y + 1); solving for d gives the corresponding position of point u in image J. ω_x and ω_y take the value 2, 3, 4, 5, 6 or 7.
The position of each feature point computed in the current frame is compared with its position in the previous frame to decide whether the feature points of the two adjacent camera frames moved, and the number of moved feature points is counted. If this number is at most ten percent of the total number of feature points, the object is considered not to have moved relative to the previous frame and the previous pose is acquired directly; if it exceeds ten percent, the object is considered to have moved relative to the previous frame and the relative pose of the camera with respect to the object is recalculated;
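A sketch of the step (6) decision rule, assuming the previous-frame feature points are tracked with OpenCV's pyramidal Lucas-Kanade optical flow; the 15 x 15 window corresponds to ω = 7 in the residual above, while the 1-pixel motion tolerance is an assumption.

import cv2
import numpy as np

def marker_moved(prev_gray, cur_gray, prev_pts, move_tol_px=1.0):
    """Track the previous-frame feature points (float32 array of shape (N, 1, 2)) with
    pyramidal LK optical flow and report whether more than 10% of them moved."""
    cur_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, cur_gray, prev_pts, None, winSize=(15, 15), maxLevel=3)
    ok = status.ravel() == 1
    if not np.any(ok):
        return True, None                       # track lost: re-estimate the pose
    shifts = np.linalg.norm(cur_pts[ok] - prev_pts[ok], axis=-1)
    moved = int(np.count_nonzero(shifts > move_tol_px))
    return moved > 0.1 * len(prev_pts), cur_pts

# If marker_moved(...) returns False, the previous pose is reused; otherwise step (5) is repeated.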
step (7), in the process of estimating the relative camera pose in step (5), the 6D pose of the MARK is predicted and corrected with Kalman filtering.
First, the displacement of the camera with respect to the natural texture MARK is defined as (t_x, t_y, t_z) and the rotation angles as (ψ, θ, φ). The first derivatives of the coordinates are (t_x′, t_y′, t_z′) and the second derivatives are (t_x″, t_y″, t_z″), where the first derivative represents the speed at which the natural texture MARK moves and the second derivative the acceleration at which it moves; the first derivatives of the rotation angles are (ψ′, θ′, φ′) and the second derivatives are (ψ″, θ″, φ″), where the first derivative represents the speed at which the MARK rotates and the second derivative the acceleration at which the natural texture MARK rotates. The Kalman filter used for prediction and correction keeps the following state:
    Kalman = (t_x, t_y, t_z, t_x′, t_y′, t_z′, t_x″, t_y″, t_z″, ψ, θ, φ, ψ′, θ′, φ′, ψ″, θ″, φ″)    (9)
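A sketch of the 18-state constant-acceleration Kalman filter implied by formula (9), built with cv2.KalmanFilter; the state ordering follows (9), while the frame interval and the noise covariances are assumptions.

import cv2
import numpy as np

def make_pose_kalman(dt=1.0 / 30.0):
    """18 states: translation, its velocity and acceleration, then the rotation angles,
    their velocity and acceleration; 6 measurements: (tx, ty, tz, psi, theta, phi)."""
    kf = cv2.KalmanFilter(18, 6)
    F = np.eye(18, dtype=np.float32)
    for base in (0, 9):                          # translation block, rotation block
        for i in range(3):
            p, v, a = base + i, base + 3 + i, base + 6 + i
            F[p, v] = dt                         # position <- velocity
            F[p, a] = 0.5 * dt * dt              # position <- acceleration
            F[v, a] = dt                         # velocity <- acceleration
    kf.transitionMatrix = F
    H = np.zeros((6, 18), dtype=np.float32)
    H[0:3, 0:3] = np.eye(3)                      # measured translation
    H[3:6, 9:12] = np.eye(3)                     # measured rotation angles
    kf.measurementMatrix = H
    kf.processNoiseCov = np.eye(18, dtype=np.float32) * 1e-4
    kf.measurementNoiseCov = np.eye(6, dtype=np.float32) * 1e-2
    return kf

# Per frame: predicted = kf.predict(); after PnP succeeds,
# kf.correct(np.float32([tx, ty, tz, psi, theta, phi]).reshape(6, 1)).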
and (8) eliminating the frame with the wrong pose estimation by using a sliding window.
Whether the currently estimated camera pose is correct is judged from the camera pose coordinates of the first two and the last two frames preceding the current frame, eliminating frames whose camera pose is wrongly estimated because the natural texture MARK is blurred while moving. Let the displacement in the 6D pose estimate of the current frame be (x_t, y_t, z_t) (quantities with different meanings are denoted by different symbols); compute the average camera displacement (x′, y′, z′) of the first two frames and (x″, y″, z″) of the last two frames. If the current displacement satisfies:

    x″ − d_t < x_t < x′ + d_t   or   x′ − d_t < x_t < x″ + d_t
    y″ − d_t < y_t < y′ + d_t   or   y′ − d_t < y_t < y″ + d_t    (10)
    z″ − d_t < z_t < z′ + d_t   or   z′ − d_t < z_t < z″ + d_t
the current pose is regarded as a valid pose; otherwise the current frame is regarded as a blurred frame and the last valid pose continues to be used, where d_t is the translation threshold, d_t = 3.
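A sketch of the step (8) sliding-window check: a window of the last four accepted translations is kept, and the current translation must lie between the averages of the older and newer pairs, widened by d_t = 3 on each side; the bootstrap behaviour while the window is still filling is an assumption.

from collections import deque
import numpy as np

class PoseWindow:
    """Sliding window over the last four valid camera translations (step (8))."""

    def __init__(self, d_t=3.0):
        self.d_t = d_t
        self.window = deque(maxlen=4)
        self.last_valid = None

    def filter(self, t_cur):
        t_cur = np.asarray(t_cur, dtype=np.float64)
        if len(self.window) < 4:                           # not enough history yet: accept
            self.window.append(t_cur)
            self.last_valid = t_cur
            return t_cur
        older = np.mean(list(self.window)[:2], axis=0)     # (x', y', z')
        newer = np.mean(list(self.window)[2:], axis=0)     # (x'', y'', z'')
        lo = np.minimum(older, newer) - self.d_t
        hi = np.maximum(older, newer) + self.d_t
        if np.all((t_cur > lo) & (t_cur < hi)):            # the interval test of formula (10)
            self.window.append(t_cur)
            self.last_valid = t_cur
            return t_cur
        return self.last_valid                             # blurred frame: reuse the last valid pose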
step (9), after the relative pose between the camera and the natural texture MARK has been obtained through steps (5), (6), (7) and (8), the virtual object to be three-dimensionally registered is translated and rotated accordingly, and the virtual object is rendered into the scene through OpenGL and OpenCV to achieve the augmented reality effect.
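For step (9), the PnP result has to be turned into a model-view matrix for OpenGL; the helper below is only a sketch of one common way of doing this: the OpenCV-to-OpenGL axis flip and the column-major flattening are the usual conventions, and the function name is illustrative.

import cv2
import numpy as np

def pose_to_modelview(rvec, tvec):
    """Convert an OpenCV (rvec, tvec) pose into a column-major 4x4 OpenGL model-view matrix."""
    R, _ = cv2.Rodrigues(rvec)                 # 3x3 rotation from the Rodrigues vector
    view = np.eye(4, dtype=np.float64)
    view[:3, :3] = R
    view[:3, 3] = tvec.ravel()
    # OpenCV's camera looks along +z with y pointing down; OpenGL looks along -z with y up.
    cv_to_gl = np.diag([1.0, -1.0, -1.0, 1.0])
    modelview = cv_to_gl @ view
    return modelview.T.ravel()                 # column-major order, e.g. for glLoadMatrixd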
The invention has the beneficial effects that:
the method comprises the steps of attaching a two-dimensional natural texture MARK to a cylinder to form a curved MARK, processing the curved MARK through a neural network model to obtain the region where the curved MARK is located in the current scene, calculating the relative pose between a camera and an object through feature point matching, and rendering a virtual object into an augmented reality scene. Partial occlusion can be done for the curved MARK without affecting the final effect. The method solves the problems that the traditional plane MARK can not be bent, so that the consistency of the cylindrical object to the augmented reality scene is damaged, the robustness of the natural texture MARK is low, the real-time performance is low and the like.
Drawings
FIG. 1 is a picture of a natural texture MARK according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating feature points detected in a MARK picture according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the calculation of three-dimensional coordinates of feature points on a MARK according to an embodiment of the present invention;
FIG. 4 is a comparison diagram of feature points between two adjacent frames according to an embodiment of the present invention;
FIG. 5 is an effect diagram of a virtual object rendered to an assigned pose in a scene according to an embodiment of the present invention;
FIG. 6 is a flowchart of a method according to an embodiment of the present invention.
Detailed Description
The method of the present invention is further described below with reference to the accompanying drawings.
The experimental environment is a monocular RGB video camera (640 x 480) and a cylindrical object; a natural texture MARK picture is printed and attached to the cylindrical object, and the part of the cylinder carrying the MARK always faces the monocular camera during the experiment.
As shown in fig. 1, a picture with sharp corners and many irregular natural texture features is used as the MARK; symmetric pictures are avoided, and the selected picture should contain a large number of features with obvious differences between them.
As shown in fig. 2, all feature points in the MARK picture are calculated by using Fast algorithm, and the specific steps are as follows:
step (a), select a pixel Q in the MARK picture; to judge whether this pixel is a feature point, first denote its brightness value as I_q;
step (b), take the pixel point Q as the centre and construct a Bresenham circle with a radius of 3; the circle contains 16 pixels;
step (c), on this circle of 16 pixels, if the pixel values of 9 contiguous pixel points are all greater than I_q + t or all less than I_q − t, the pixel point Q is considered a feature point, where t is a set threshold;
Step (d), in order to improve the judgment efficiency of the angular points to eliminate pixels of non-angular points in the image, checking corresponding pixels according to four positions of 1, 9, 5 and 13, and when a pixel point Q is an angular point, at least 3 pixel values of the pixel points of the four positions are all larger than Iq+ t is greater than or less than IqAnd (c) if the pixel values of the pixel points at the four positions do not meet the condition, judging and screening all the pixel points which are not the angular points if the pixel values of the pixel points at the four positions are not the angular points, and judging and screening the rest pixel points to obtain the final angular points by performing the operation judgment in the step (c).
As in fig. 3, the three-dimensional coordinates of each feature point relative to the MARK centre point are calculated. After the MARK is attached to the cylindrical object, a three-dimensional model is obtained. Once the radius r of the cylinder is known, for a feature point with coordinates (x, y) in the MARK picture, the included angle θ between the feature point and the line to the cylinder centre is computed from the pixel-to-millimetre conversion scale pixel2mm as in formula (1):

    θ = ((x − marker_w / 2) · pixel2mm) / r
the corresponding three-dimensional coordinates are (the coordinate units here are all millimetres, as in formulas (2)-(4)):

    3d_x = r · sin θ
    3d_y = (y − marker_h / 2) · pixel2mm
    3d_z = r · (1 − cos θ)
The three-dimensional coordinates of all feature points are stored in a dictionary whose keys are the MARK picture coordinates (x, y) and whose values are the obtained three-dimensional coordinates (3d_x, 3d_y, 3d_z).
Training pictures are taken at different angles, distances and illumination levels, and under partially occluded and unoccluded conditions; 300 pictures are taken in total, 240 of which are used as the training set and the rest as the verification set. The border (bounding box) and class (classes) of each picture are calibrated with labelImg to generate an xml file, and the pictures together with the annotated xml files are placed in the corresponding paths of the Yolov5 model code (step (3)).
The natural texture MARK in the scene is detected with the Yolov5 target detection model, which extracts in real time the region where the MARK is located together with its confidence. If the confidence is less than 20 the region is not considered to contain the MARK; if it is greater than or equal to 20, the bounding box of the MARK in the scene is obtained, and a masking operation is applied to the image of this region with OPENCV so that the RGB values of all pixels outside the MARK region become (0, 0, 0).
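A sketch of this detection-and-masking step, assuming a Yolov5 model trained as in step (3) is loaded through torch.hub; the weight path is illustrative, and the confidence cut-off of 0.20 is an assumed reading of the threshold of 20 mentioned above.

import cv2
import numpy as np
import torch

# Custom-trained detector; the weight path is illustrative.
model = torch.hub.load("ultralytics/yolov5", "custom", path="runs/train/exp/weights/best.pt")

def extract_marker_roi(frame_bgr, conf_thresh=0.20):
    """Run the MARK detector, keep the best box above the confidence threshold and
    black out every pixel outside it, as described for the masking operation."""
    det = model(frame_bgr[:, :, ::-1]).xyxy[0].cpu().numpy()   # rows: x1, y1, x2, y2, conf, class
    det = det[det[:, 4] >= conf_thresh]
    if len(det) == 0:
        return None, None
    x1, y1, x2, y2 = det[det[:, 4].argmax(), :4].astype(int)
    masked = np.zeros_like(frame_bgr)                          # pixels outside the ROI become (0, 0, 0)
    masked[y1:y2, x1:x2] = frame_bgr[y1:y2, x1:x2]
    return masked, (x1, y1, x2, y2)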
As shown in fig. 4, all feature points in the scene are tracked by the optical flow method, implemented as follows:
let I and J be the grayscale images of the previous frame and the current frame; for any point A in the image with coordinate vector (x, y)^T, its gray value is

    I(A) = I(x, y),   J(A) = J(x, y)

For a point u = [u_x, u_y]^T on the previous frame I, the purpose of feature point tracking is to find its position v = u + d = [u_x + d_x, u_y + d_y]^T in the current frame image J, where d = [d_x, d_y]^T is the image velocity at point A, i.e. the optical flow at A. Because of the aperture problem, similarity is defined in a two-dimensional neighbourhood sense: let ω_x and ω_y be two integers; the residual function minimized with respect to the velocity vector d is defined as follows:
    ε(d) = ε(d_x, d_y) = Σ_{x = u_x − ω_x}^{u_x + ω_x} Σ_{y = u_y − ω_y}^{u_y + ω_y} ( I(x, y) − J(x + d_x, y + d_y) )²

This defines the similarity over an image neighbourhood of size (2ω_x + 1) × (2ω_y + 1); solving for d gives the corresponding position of point u in image J. Typical values for ω_x and ω_y are 2, 3, 4, 5, 6 and 7.
The position of each feature point computed in the current frame is compared with its position in the previous frame to decide whether the feature points of the two adjacent camera frames moved, and the number of moved feature points is counted. If this number is at most ten percent of the total number of feature points, the object is considered not to have moved relative to the previous frame and the previous pose is acquired directly; if it exceeds ten percent, the object is considered to have moved relative to the previous frame and the relative pose of the camera with respect to the object is recalculated;
and calculating the obtained descriptor of the feature point by using an ORB algorithm, and specifically comprising the following steps of:
step (e), take the key point O as the centre of a circle and draw a circle with a radius of r pixels;
step (f), take N point pairs inside the circle, where N is 512;
step (g), define the operation M, where I_A denotes the gray value of A and I_B the gray value of B:

    M(A, B) = 1 if I_A > I_B, and 0 otherwise
step (h), apply the operation of step (g) to the selected point pairs to obtain a descriptor made up of 0s and 1s.
The ORB implementation in OPENCV uses an image pyramid to address the fact that, although the descriptor is insensitive to illumination, it has no scale consistency. For rotation consistency, the main direction of each feature point is computed with the grayscale centroid method: the grayscale centroid is computed within the circular region of radius r around the feature point, and the direction vector from the centre to the centroid is taken as the main direction.
The feature point descriptors of the natural texture MARK in the scene are matched against the original MARK descriptors by similarity. The Hamming distance is used to measure the similarity between two descriptors: let d_k be the Hamming distance between the rBRIEF descriptors of feature points A and B, D_A the descriptor of feature point A, D_B the descriptor of feature point B, and i the bit index within the descriptor:

    d_k = Σ_i ( D_A(i) ⊕ D_B(i) )
After the feature point pairs matching the natural texture MARK in the scene to the reference MARK are obtained, outliers are rejected with a ratio test: for a feature point p of the natural texture MARK in the scene, let d1 and d2 be the distances to its two closest feature points in the reference image; when d1/d2 > ratio (ratio is preferably 0.8), p is regarded as an outlier and rejected. A random sample consensus (RANSAC) algorithm is then applied to the valid feature points (inliers) that passed the ratio test to further eliminate possible outliers. During matching, cross validation (i.e. feature points p and q must be each other's best match) and the nearest neighbour algorithm further screen out wrongly matched pairs, and finally the camera pose is computed with the PnP algorithm. Fig. 5 shows the augmented reality effect of rendering the virtual object into the scene: the virtual object completely covers the cup carrying the natural texture MARK, so that the cup in the scene is replaced.
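A sketch of the outlier-rejection chain just described: Hamming-distance KNN matching, the ratio test with ratio 0.8, and cross validation (mutual best match); the matcher construction is illustrative, and further RANSAC filtering is assumed to happen inside the subsequent PnP step.

import cv2

def filter_matches(des_scene, des_ref, ratio=0.8):
    """Ratio test plus cross validation on Hamming-distance ORB matches; returns the 30 best."""
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)
    fwd = bf.knnMatch(des_scene, des_ref, k=2)          # scene -> reference
    bwd = bf.knnMatch(des_ref, des_scene, k=2)          # reference -> scene
    best_back = {m[0].queryIdx: m[0].trainIdx for m in bwd if m}
    good = []
    for m in fwd:
        if len(m) < 2:
            continue
        a, b = m
        if a.distance >= ratio * b.distance:            # ratio test: drop ambiguous matches
            continue
        if best_back.get(a.trainIdx) == a.queryIdx:     # cross validation: mutual best match
            good.append(a)
    return sorted(good, key=lambda x: x.distance)[:30]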
As shown in fig. 6, which is a flow chart of the practical application of the method of the present invention, the steps are as follows:
step (1), acquiring the natural texture MARK region (ROI) in the scene by using Yolov5;
step (2), comparing the feature points with those of the previous frame by the optical flow method; if the object is judged not to have moved, directly acquiring the camera pose of the previous frame and executing step (4), otherwise executing step (3);
step (3), extracting feature points of the ROI region with the Fast algorithm and matching them with the MARK picture feature points; the 30 best-matching feature points are found with the K nearest neighbour algorithm (KNN), their three-dimensional coordinates are recovered from the dictionary, and the MARK pose is calculated with the PnP (Perspective-n-Point) and RANSAC (random sample consensus) algorithms; the MARK pose is compared with the average pose of the sliding window to judge whether the current pose is valid; if it is valid, the sliding window is updated and step (4) is executed, otherwise the camera pose of the previous frame is used;
step (4), rendering the virtual object at the acquired pose through OPENGL and OPENCV to perform augmented reality.

Claims (6)

1. A curved MARK three-dimensional registration augmented reality method based on region detection is characterized by comprising the following steps:
step (1), calibrating a camera:
acquiring internal parameters and distortion parameters of the RGB monocular camera by using the Zhang Zhengyou camera calibration method;
step (2), constructing a data set:
respectively taking more than 300 pictures of the natural texture MARK to be identified at different angles, different distances, different illumination, and under partially occluded and unoccluded conditions, wherein 80% of the pictures are used as a training set and the remaining 20% as a verification set; calibrating the border (bounding box) and class (classes) of the natural texture MARK in each picture with labelImg software to generate a corresponding xml format file, framing the area where the natural texture MARK is located with a rectangular frame during calibration, and then labelling the class corresponding to the natural texture MARK;
step (3), the Yolov5 neural network model is used as the natural texture MARK target detection model; it is trained on the training set constructed in step (2), its accuracy is verified on the verification set, and the trained model is extracted; the trained natural texture MARK target detection model can then recognize the bounding box of the MARK in a scene and identify its specific class;
step (4), printing the MARK, pasting the MARK on a cylindrical object, measuring the radius r of the cylinder and the width marker_w and height marker_h of the natural texture MARK picture, extracting the MARK picture feature points with the Fast algorithm, calculating the three-dimensional coordinates of each feature point relative to the MARK centre point, and, for a feature point with coordinates (x, y) in the MARK picture, calculating the included angle θ between the feature point and the line joining it to the cylinder centre; with pixel2mm denoting the pixel-to-millimetre conversion scale, θ is solved from equation (1):

    θ = ((x − marker_w / 2) · pixel2mm) / r    (1)
the corresponding three-dimensional coordinates are, in units of mm:

    3d_x = r · sin θ    (2)
    3d_y = (y − marker_h / 2) · pixel2mm    (3)
    3d_z = r · (1 − cos θ)    (4)
storing the three-dimensional coordinates of all the feature points in a dictionary, wherein the keys of the dictionary are the coordinates (x, y) of a MARK picture, and the values are the coordinates obtained by the formulas (2), (3) and (4);
step (5), for a scene picture obtained by the camera, the natural texture MARK in the scene picture is extracted with the natural texture MARK target detection model to generate a region of interest (ROI region); all feature points in the ROI region are extracted with the Fast algorithm and their descriptors are computed with the ORB algorithm; on the basis of the Hamming distance between descriptors, the feature points extracted from the ROI region are matched with the original MARK feature points using RANSAC and the K nearest neighbour classification algorithm (KNN), obtaining the 30 best feature point matching pairs; the three-dimensional coordinates of the MARK picture feature points are obtained from the dictionary of step (4), and the relative pose of the camera and the curved MARK is estimated with the PnP algorithm, implemented as follows:
a point of the world coordinate system X_w = (x_w, y_w, z_w, 1) and its projection coordinates on the image plane X_i = (x_i, y_i, 1) are related by the following formula, where fx, fy, cx and cy are the camera internal parameters calibrated with the Zhang Zhengyou method, r_ij denotes a rotation entry and t_i a translation entry:

        [x_i]     [fx  0  cx]   [r11 r12 r13 t1]   [x_w]
    λ · [y_i]  =  [ 0  fy cy] · [r21 r22 r23 t2] · [y_w]    (5)
        [ 1 ]     [ 0   0  1]   [r31 r32 r33 t3]   [z_w]
                                                   [ 1 ]
the formula is simplified as follows:

    λ · X_i = K · M · X_w    (6)
wherein λ denotes a scale factor and the matrix K is the camera internal parameter matrix; from the 30 feature point matching pairs, 4 matching pairs are selected at random, 3 of the pairs are used to compute 4 groups of different solutions, the remaining pair is substituted into the formula, the solution with the smallest reprojection error is taken as the final solution, and this process is optimized with a random sample consensus (RANSAC) algorithm;
step (6), while the MARK moves, the motion state of the feature points is tracked with an optical flow method and the number of feature points that moved between the two frames is judged; when this number is at most ten percent of the total number of feature points, the pose of the marker is considered unchanged relative to the previous frame, and when it exceeds ten percent, the pose of the marker is considered to have changed relative to the previous frame, and the pose of the current natural texture MARK obtained following step (5) is used for three-dimensional registration;
step (7), in the process of estimating the relative pose of the camera in the step (5), predicting and correcting the 6D pose of the MARK by using Kalman filtering;
first, the displacement of the camera with respect to the natural texture MARK is defined as (t_x, t_y, t_z) and the rotation angles as (ψ, θ, φ); the first derivatives of the coordinates are (t_x′, t_y′, t_z′) and the second derivatives are (t_x″, t_y″, t_z″), where the first derivative represents the speed at which the natural texture MARK moves and the second derivative the acceleration at which it moves; the first derivatives of the rotation angles are (ψ′, θ′, φ′) and the second derivatives are (ψ″, θ″, φ″), where the first derivative represents the speed at which the MARK rotates and the second derivative the acceleration at which the natural texture MARK rotates; the Kalman filter used for estimation and correction keeps the following state:
    Kalman = (t_x, t_y, t_z, t_x′, t_y′, t_z′, t_x″, t_y″, t_z″, ψ, θ, φ, ψ′, θ′, φ′, ψ″, θ″, φ″)    (7)
step (8), eliminating the frame with the wrong pose estimation by using a sliding window;
whether the currently estimated camera pose is correct is judged from the camera pose coordinates of the first two and the last two frames preceding the current frame, eliminating frames whose camera pose is wrongly estimated because the natural texture MARK is blurred while moving; let the displacement in the 6D pose estimate of the current frame be (x_t, y_t, z_t), and compute the average camera displacement (x′, y′, z′) of the first two frames and (x″, y″, z″) of the last two frames; if the current displacement satisfies:

    x″ − d_t < x_t < x′ + d_t   or   x′ − d_t < x_t < x″ + d_t
    y″ − d_t < y_t < y′ + d_t   or   y′ − d_t < y_t < y″ + d_t    (8)
    z″ − d_t < z_t < z′ + d_t   or   z′ − d_t < z_t < z″ + d_t
the current pose is regarded as a valid pose; otherwise the current frame is regarded as a blurred frame and the last valid pose continues to be used, where d_t is the translation threshold, d_t = 3;
step (9), after the relative pose between the camera and the natural texture MARK has been obtained through steps (5), (6), (7) and (8), the virtual object to be three-dimensionally registered is translated and rotated accordingly, and the virtual object is rendered into the scene through OpenGL and OpenCV to achieve the augmented reality effect.
2. The curved MARK three-dimensional registration augmented reality method based on region detection according to claim 1, wherein a picture with sharp corners and many irregular natural texture features is used as the MARK, symmetric pictures are not selected, and the selected picture contains a large number of features with obvious differences between them.
3. The curved MARK three-dimensional registration augmented reality method based on region detection according to claim 1, wherein all feature points in the MARK picture are calculated with the Fast algorithm, comprising the following steps:
step (a), select a pixel Q in the MARK picture; to judge whether this pixel is a feature point, first set its brightness value as I_q;
step (b), take the pixel point Q as the centre and construct a Bresenham circle with a radius of 3; the circle contains 16 pixels;
step (c), on this circle of 16 pixels, if the pixel values of 9 contiguous pixel points are all greater than I_q + t or all less than I_q − t, the pixel point Q is considered a feature point, where t is a set threshold;
step (d), to reject non-corner pixels in the image more efficiently, the pixels at the four positions 1, 9, 5 and 13 are checked first; when the pixel point Q is a corner, at least 3 of these four pixels must all be greater than I_q + t or all less than I_q − t, and pixels whose four positions do not satisfy this condition are discarded as non-corners; the remaining pixel points are then judged with the operation of step (c) and screened to obtain the final corners.
4. The method for the three-dimensional registration of the curved MARK augmented reality based on the region detection as claimed in claim 1 or 3, wherein the descriptor of the feature point calculated by the ORB algorithm is obtained by the following steps:
step (e), take the key point O as the centre of a circle and draw a circle with a radius of r pixels;
step (f), take N point pairs inside the circle, where N is 512;
step (g), define the operation M, where I_A denotes the gray value of A and I_B the gray value of B:

    M(A, B) = 1 if I_A > I_B, and 0 otherwise
step (h), apply the operation of step (g) to the selected point pairs to obtain a descriptor made up of 0s and 1s.
5. The method for the curved MARK three-dimensional registration augmented reality based on the region detection as claimed in claim 4, wherein the step (6) is implemented as follows:
let I and J be the grayscale images of the previous frame and the current frame; for any point A in the image with coordinate vector (x, y)^T, its gray value is

    I(A) = I(x, y),   J(A) = J(x, y)

For a point u = [u_x, u_y]^T on the previous frame I, the purpose of feature point tracking is to find its position v = u + d = [u_x + d_x, u_y + d_y]^T in the current frame image J, where d = [d_x, d_y]^T is the image velocity at point A, i.e. the optical flow at A; because of the aperture problem, similarity is defined in a two-dimensional neighbourhood sense: let ω_x and ω_y be two integers; the residual function minimized with respect to the velocity vector d is defined as follows:
    ε(d) = ε(d_x, d_y) = Σ_{x = u_x − ω_x}^{u_x + ω_x} Σ_{y = u_y − ω_y}^{u_y + ω_y} ( I(x, y) − J(x + d_x, y + d_y) )²

This defines the similarity over an image neighbourhood of size (2ω_x + 1) × (2ω_y + 1); solving for d gives the corresponding position of point u in image J;
the position of each feature point computed in the current frame is compared with its position in the previous frame to decide whether the feature points of the two adjacent camera frames moved, and the number of moved feature points is counted; if this number is at most ten percent of the total number of feature points, the object is considered not to have moved relative to the previous frame and the previous pose is acquired directly; if it exceeds ten percent, the object is considered to have moved relative to the previous frame and the relative pose of the camera with respect to the object is recalculated.
6. The curved MARK three-dimensional registration augmented reality method based on region detection according to claim 5, wherein ω_x and ω_y take the value 2, 3, 4, 5, 6 or 7.
CN202011563089.4A 2020-12-25 2020-12-25 Curved MARK three-dimensional registration augmented reality method based on region detection Active CN113012298B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011563089.4A CN113012298B (en) 2020-12-25 2020-12-25 Curved MARK three-dimensional registration augmented reality method based on region detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011563089.4A CN113012298B (en) 2020-12-25 2020-12-25 Curved MARK three-dimensional registration augmented reality method based on region detection

Publications (2)

Publication Number Publication Date
CN113012298A CN113012298A (en) 2021-06-22
CN113012298B true CN113012298B (en) 2022-04-08

Family

ID=76383738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011563089.4A Active CN113012298B (en) 2020-12-25 2020-12-25 Curved MARK three-dimensional registration augmented reality method based on region detection

Country Status (1)

Country Link
CN (1) CN113012298B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110288657A (en) * 2019-05-23 2019-09-27 华中师范大学 A kind of augmented reality three-dimensional registration method based on Kinect

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8175617B2 (en) * 2009-10-28 2012-05-08 Digimarc Corporation Sensor-based mobile search, related methods and systems
US8121618B2 (en) * 2009-10-28 2012-02-21 Digimarc Corporation Intuitive computing methods and systems
CN102142055A (en) * 2011-04-07 2011-08-03 上海大学 True three-dimensional design method based on augmented reality interactive technology
US10430985B2 (en) * 2014-03-14 2019-10-01 Magic Leap, Inc. Augmented reality systems and methods utilizing reflections

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110288657A (en) * 2019-05-23 2019-09-27 华中师范大学 A kind of augmented reality three-dimensional registration method based on Kinect

Also Published As

Publication number Publication date
CN113012298A (en) 2021-06-22

Similar Documents

Publication Publication Date Title
US10528847B2 (en) Method of providing image feature descriptors
CN110288657B (en) Augmented reality three-dimensional registration method based on Kinect
CN109685913B (en) Augmented reality implementation method based on computer vision positioning
CN107292965A (en) A kind of mutual occlusion processing method based on depth image data stream
JP6011102B2 (en) Object posture estimation method
JP7178396B2 (en) Method and computer system for generating data for estimating 3D pose of object included in input image
Barandiaran et al. Real-time optical markerless tracking for augmented reality applications
CN111401266B (en) Method, equipment, computer equipment and readable storage medium for positioning picture corner points
EP2622576A1 (en) Method and apparatus for solving position and orientation from correlated point features in images
CN106952312B (en) Non-identification augmented reality registration method based on line feature description
CN110956661A (en) Method for calculating dynamic pose of visible light and infrared camera based on bidirectional homography matrix
CN111709980A (en) Multi-scale image registration method and device based on deep learning
CN111797688A (en) Visual SLAM method based on optical flow and semantic segmentation
CN111998862B (en) BNN-based dense binocular SLAM method
CN115393519A (en) Three-dimensional reconstruction method based on infrared and visible light fusion image
CN115830135A (en) Image processing method and device and electronic equipment
CN113436251B (en) Pose estimation system and method based on improved YOLO6D algorithm
CN113240656B (en) Visual positioning method and related device and equipment
CN108447092B (en) Method and device for visually positioning marker
CN111179271B (en) Object angle information labeling method based on retrieval matching and electronic equipment
JP6016242B2 (en) Viewpoint estimation apparatus and classifier learning method thereof
CN113012298B (en) Curved MARK three-dimensional registration augmented reality method based on region detection
CN113723432B (en) Intelligent identification and positioning tracking method and system based on deep learning
CN114972451A (en) Rotation-invariant SuperGlue matching-based remote sensing image registration method
CN115147344A (en) Three-dimensional detection and tracking method for parts in augmented reality assisted automobile maintenance

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant