WO2020228453A1 - Pose tracking method, pose tracking device, and electronic device - Google Patents

Pose tracking method, pose tracking device, and electronic device

Info

Publication number
WO2020228453A1
Authority
WO
WIPO (PCT)
Prior art keywords
pose
image
current frame
calculated
tracking
Prior art date
Application number
PCT/CN2020/083893
Other languages
English (en)
French (fr)
Inventor
章霖超
王进
Original Assignee
虹软科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 虹软科技股份有限公司 filed Critical 虹软科技股份有限公司
Priority to KR1020217041032A priority Critical patent/KR20220008334A/ko
Priority to US17/610,449 priority patent/US11922658B2/en
Publication of WO2020228453A1 publication Critical patent/WO2020228453A1/zh


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/003Navigation within 3D models or images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/292Multi-camera tracking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30244Camera pose

Definitions

  • This application relates to computer vision processing technology, and in particular to a pose tracking method, a pose tracking device and electronic equipment.
  • Three-dimensional reconstruction refers to the construction of a digital three-dimensional model of the real object by obtaining the geometric shape and material of the real object to truly restore the shape of the object.
  • the input can be real-time images, video streams, and three-dimensional point clouds captured by various types of cameras, or it can be captured images, videos, and three-dimensional point clouds.
  • Three-dimensional reconstruction is widely used in computer-aided geometric design, computer animation, computer vision, medical images, virtual reality, augmented reality, digital media and other fields.
  • Camera tracking is the core and key algorithm module in 3D reconstruction, which is used to estimate the pose of the camera at any time during the shooting process, including the three-dimensional position and orientation in space. Accurate camera tracking results are the prerequisite for successful 3D reconstruction.
  • Existing real-time camera tracking methods are not robust enough, place high requirements on the quality of the input data, and impose many restrictions on the user's shooting technique, which is unfriendly to ordinary users.
  • the embodiments of the present application provide a pose tracking method, a pose tracking device, and an electronic device to at least solve the problem of poor pose tracking robustness in the prior art and easy tracking loss.
  • a pose tracking method is provided, which includes the following steps: acquiring continuous multi-frame images of the scanned object and the initial pose of the image capture unit; taking the initial pose as the initial value and, based on the previous frame image and the current frame image in the continuous multi-frame images, using the first algorithm to obtain the first calculated pose of the current frame; taking the first calculated pose as the initial value and, based on the current frame image and the current frame reconstruction model, using the second algorithm to obtain the second calculated pose of the current frame; updating the initial pose of the image capture unit according to the second calculated pose, and repeating the above steps to achieve pose tracking of the image capture unit.
  • the initial pose of the image capturing unit is set to an identity matrix or randomly set to an arbitrary value.
  • the continuous multiple frames of images are continuous RGB-D images.
  • taking the initial pose as the initial value and, based on the previous frame image and the current frame image in the continuous multi-frame images, using the first algorithm to obtain the first calculated pose of the current frame includes: taking the initial pose as the initial value, using the first algorithm to perform pixel-by-pixel color alignment between the previous frame image and the current frame image to obtain the relative coordinate transformation between the previous frame image and the current frame image, thereby obtaining the first calculated pose of the current frame.
  • using the second algorithm to obtain the second calculated pose of the current frame includes: taking the first calculated pose as the initial value, using the second algorithm to align the current frame image with the current frame reconstruction model to obtain the relative coordinate transformation between the previous frame image and the current frame image, thereby obtaining the second calculated pose of the current frame.
  • the first algorithm is computed on low-resolution images, and the second algorithm is computed on high-resolution images.
  • the initial pose of the image capturing unit is acquired through an inertial navigation module.
  • the state quantity of the inertial navigation module is updated according to the second calculated pose, thereby updating the initial pose of the image capturing unit.
  • the inertial navigation module uses a multi-sensor fusion method to obtain the initial pose of the image capturing unit.
  • the inertial navigation module is a state estimation system based on extended Kalman filtering.
  • the method further includes: verifying the second calculated pose, and when the verification passes, update the current frame reconstruction model using the second calculated pose and the current frame image.
  • verifying the second calculated pose includes: obtaining a comparison image in the current frame reconstruction model, and comparing the comparison image with the current frame image, to achieve the second calculation Verification of pose.
  • the method further includes: when the verification is passed, selecting a key frame from the image frames that have passed the verification; and constructing a bag of words database based on the selected key frame.
  • the method further includes: when the verification fails, using a relocation method to restore the pose tracking of the image capturing unit.
  • when the verification fails, the current frame image is marked as a tracking failure, and when the number of consecutively failed frames exceeds a second threshold, it indicates that pose tracking of the image capture unit has been lost, and the relocation method is used to resume tracking of the image capture unit.
  • the relocation method includes: calculating the bag-of-words vector of the current frame image when pose tracking is lost; selecting candidate key frames according to the constructed bag-of-words database and the bag-of-words vector of the current frame image; using a third algorithm to obtain the third calculated pose of the current frame according to the relative pose between the candidate key frame and the current frame image; and updating the initial pose of the image capture unit according to the third calculated pose to restore pose tracking of the image capture unit.
  • the method further includes: initializing the inertial navigation module after restoring the pose tracking of the image capturing unit.
  • a pose tracking device includes: an image capturing unit configured to acquire consecutive multiple frames of images of a scanned object; an initial pose determining unit configured to determine The initial pose of the image capture unit; the first pose acquisition unit is configured to use the initial pose as an initial value, based on the previous frame image and the current frame image in the continuous multi-frame image, using the first algorithm Obtain the first calculated pose of the current frame; a second pose acquisition unit configured to use the first calculated pose as an initial value, and use a second algorithm to obtain the current frame based on the current frame image and the current frame reconstruction model The second calculated pose; the pose update unit is configured to update the initial pose of the image capture unit according to the second calculated pose, so as to realize the pose tracking of the image capture unit.
  • the initial pose determining unit is further configured to set the initial pose of the image capturing unit to an identity matrix or randomly set it to an arbitrary value.
  • the continuous multiple frames of images are continuous RGB-D images.
  • the first pose acquiring unit is configured to use the initial pose as an initial value, and use the first algorithm to perform pixel-by-pixel color alignment on the previous frame image and the current frame image, The relative coordinate transformation between the previous frame image and the current frame image is obtained, so as to obtain the first calculated pose of the current frame.
  • the second pose acquisition unit is configured to use the first calculated pose as an initial value, and use the second algorithm to align the current frame image with the current frame reconstruction model to obtain the The relative coordinate transformation between the previous frame image and the current frame image is used to obtain the second calculated pose of the current frame.
  • the first algorithm is computed on low-resolution images, and the second algorithm is computed on high-resolution images.
  • the initial pose determination unit is an inertial navigation module.
  • the pose updating unit is further configured to update the state quantity of the inertial navigation module according to the second calculated pose, thereby updating the initial pose of the image capturing unit.
  • the inertial navigation module is configured to obtain the initial pose of the image capturing unit by using a multi-sensor fusion method.
  • the inertial navigation module is a state estimation system based on extended Kalman filtering.
  • the pose tracking device further includes: a pose verification unit configured to verify the second calculated pose, and when the verification is passed, use the second calculated pose and the current frame image Updating the current frame reconstruction model.
  • the pose verification unit is further configured to obtain a comparison image in the current frame reconstruction model, and compare the comparison image with the current frame image, so as to realize the calculation of the second calculated pose verification.
  • the pose verification unit is further configured to select key frames from the image frames that have passed the verification when the verification is passed; and construct a bag of words database based on the selected key frames.
  • the pose verification unit is further configured to use a relocation method to restore the pose tracking of the image capturing unit when the verification fails.
  • the pose verification unit is further configured to mark the current frame image as a tracking failure when the verification fails, and when the number of consecutively failed frames exceeds a second threshold, it indicates that pose tracking of the image capture unit has been lost, and the relocation method is used to restore tracking of the image capture unit.
  • the relocation method includes: calculating the bag-of-words vector of the current frame image when pose tracking is lost; selecting candidate key frames according to the constructed bag-of-words database and the bag-of-words vector of the current frame image; using a third algorithm to obtain the third calculated pose of the current frame according to the relative pose between the candidate key frame and the current frame image; and updating the initial pose of the image capture unit according to the third calculated pose to restore pose tracking of the image capture unit.
  • the initial pose determination unit is further configured to initialize the inertial navigation module after resuming the pose tracking of the image capturing unit.
  • an electronic device including: a processor; and a memory configured to store executable instructions of the processor; wherein the processor is configured to execute The executable instruction is used to execute the pose tracking method described in any one of the above.
  • a storage medium includes a stored program, wherein, when the program runs, the device where the storage medium is located is controlled to execute the pose tracking method described in any one of the foregoing.
  • Fig. 1 is a flowchart of an optional pose tracking method according to one of the embodiments of the present application
  • Fig. 2 is a flowchart of an optional pose tracking method based on an inertial navigation module according to one of the embodiments of the present application;
  • Figure 3a and Figure 3b are respectively the three-dimensional model and trajectory error graph generated by the KinectFusion algorithm
  • 4a and 4b are respectively a three-dimensional model and a trajectory error diagram generated by using the pose tracking method based on the inertial navigation module provided by an embodiment of the present application;
  • FIG. 5 is a flowchart of an optional pose tracking method including relocation according to one of the embodiments of the present application
  • Fig. 6 is a structural block diagram of an optional pose tracking device according to one of the embodiments of the present application.
  • Fig. 7 is a structural block diagram of an optional electronic device according to one of the embodiments of the present application.
  • the embodiments of this application can be applied in an on-device mode, that is, to the cameras of various mobile devices (smartphone cameras, digital cameras, SLR cameras, depth cameras, tablet cameras, laptop cameras, game console cameras, etc.); they can also be applied in a cloud-plus-device mode, that is, to computer systems/servers, which can operate together with many other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments and/or configurations suitable for use with computer systems/servers include, but are not limited to: personal computer systems, handheld or laptop devices, microprocessor-based systems, programmable consumer electronics, minicomputer systems, mainframe computer systems, and distributed cloud computing technology environments including any of the above systems, etc.
  • the computer system/server may be described in the general context of computer system executable instructions (such as program modules, etc.) executed by the computer system.
  • program modules can include routines, programs, components, logic, and data structures, etc., which perform specific tasks or implement specific abstract data types.
  • the computer system/server can be implemented in a distributed cloud computing environment, and tasks are performed by remote processing equipment linked through a communication network.
  • program modules may be located on a storage medium of a local or remote computing system including a storage device.
  • a pose tracking method is provided.
  • FIG. 1 it is a flowchart of an optional pose tracking method according to an embodiment of the present application. As shown in Figure 1, the method includes the following steps:
  • S12 Use the initial pose as the initial value, and use the first algorithm to obtain the first calculated pose of the current frame based on the previous frame image and the current frame image in the continuous multi-frame images;
  • Step S10: Acquire continuous multi-frame images of the scanned object and the initial pose of the image capture unit.
  • the continuous multi-frame images of the scanned object can be obtained by using an image capture unit, which can be a stand-alone camera or electronic equipment with an integrated camera, such as a mobile phone.
  • the types of camera include infrared structured light cameras, time-of-flight (ToF) cameras, RGB cameras, Mono cameras, etc.; the initial pose of the image capture unit can be set to an identity matrix or randomly set to an arbitrary value.
  • the pose of the image capture unit includes the three-dimensional position and orientation of the image capture unit, with 6 degrees of freedom.
  • the continuous multi-frame image can be a continuous RGB-D image.
  • the RGB-D image is an image pair composed of a depth image and a color image.
  • the depth image and the color image are usually acquired by different image capture units. It can be assumed that the color image and the depth image of each frame are synchronized in time. For color and depth cameras with a fixed relative position, data alignment can easily be achieved through extrinsic parameter calibration, and frame synchronization between the color image and the depth image of each frame can be achieved through the timestamps of image acquisition.
  • Step S12 Use the initial pose as the initial value, and use the first algorithm to obtain the first calculated pose of the current frame based on the previous frame image and the current frame image in the continuous multi-frame images;
  • the first algorithm is a three-dimensional point cloud alignment algorithm based on pixel-by-pixel color alignment, for example, a dense visual odometry (Dense Visual Odometry, DVO) algorithm.
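  • The following is a minimal, illustrative sketch of such a pixel-by-pixel color (photometric) alignment step in the spirit of DVO, not the patent's actual implementation: it assumes pinhole intrinsics K, a depth map for the previous frame, precomputed intensity gradients of the current frame (e.g. gy, gx = np.gradient(I_cur)), and an initial relative pose, and refines the pose by Gauss-Newton. All function and variable names are assumptions made for illustration.

```python
import numpy as np

def skew(v):
    return np.array([[0, -v[2], v[1]],
                     [v[2], 0, -v[0]],
                     [-v[1], v[0], 0]])

def se3_exp(xi):
    """Twist (tx, ty, tz, wx, wy, wz) -> 4x4 rigid transform (Rodrigues rotation,
    small-motion approximation for the translation part)."""
    rho, phi = xi[:3], xi[3:]
    theta = np.linalg.norm(phi)
    if theta < 1e-10:
        R = np.eye(3) + skew(phi)
    else:
        a = phi / theta
        R = (np.cos(theta) * np.eye(3) + (1 - np.cos(theta)) * np.outer(a, a)
             + np.sin(theta) * skew(a))
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, rho
    return T

def dvo_step(I_prev, D_prev, I_cur, gx_cur, gy_cur, K, T_init, n_iters=10):
    """Gauss-Newton photometric alignment: back-project previous-frame pixels with
    the depth map, warp them into the current frame with the pose estimate, and
    minimize the intensity residuals.  Returns the refined relative pose."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    h, w = I_prev.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    T = T_init.copy()
    for _ in range(n_iters):
        z = D_prev
        X, Y = (u - cx) / fx * z, (v - cy) / fy * z
        P = np.stack([X, Y, z, np.ones_like(z)], axis=-1)
        Q = P @ T.T                                   # previous-frame points in the current frame
        x, y, zc = Q[..., 0], Q[..., 1], Q[..., 2]
        zs = np.maximum(zc, 1e-6)
        uc, vc = fx * x / zs + cx, fy * y / zs + cy
        inside = (z > 0) & (zc > 0) & (uc >= 0) & (uc < w - 1) & (vc >= 0) & (vc < h - 1)
        ui, vi = uc[inside].astype(int), vc[inside].astype(int)   # nearest-neighbour lookup
        r = I_cur[vi, ui] - I_prev[v[inside], u[inside]]          # photometric residuals
        gx, gy = gx_cur[vi, ui], gy_cur[vi, ui]
        xi_, yi_, zi_ = x[inside], y[inside], zc[inside]
        # Jacobian of the projected pixel w.r.t. the 6-DoF twist (translation, rotation).
        Ju = np.stack([fx / zi_, np.zeros_like(zi_), -fx * xi_ / zi_**2,
                       -fx * xi_ * yi_ / zi_**2, fx * (1 + xi_**2 / zi_**2), -fx * yi_ / zi_], axis=1)
        Jv = np.stack([np.zeros_like(zi_), fy / zi_, -fy * yi_ / zi_**2,
                       -fy * (1 + yi_**2 / zi_**2), fy * xi_ * yi_ / zi_**2, fy * xi_ / zi_], axis=1)
        J = gx[:, None] * Ju + gy[:, None] * Jv
        delta = np.linalg.solve(J.T @ J + 1e-6 * np.eye(6), -J.T @ r)
        T = se3_exp(delta) @ T
    return T
```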
  • Step S14 Using the first calculated pose as the initial value, based on the current frame image and the current frame reconstruction model, the second algorithm is used to obtain the second calculated pose of the current frame.
  • the second algorithm is an iterative three-dimensional point cloud alignment algorithm, for example, an improved Iterative Closest Point (ICP) algorithm in the KinectFusion algorithm.
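  • As a rough, hedged sketch (not the KinectFusion code itself), the improved-ICP idea of projective data association with point-to-plane residuals can be written as follows; V_cur is the vertex map of the current depth frame, V_model and N_model are the vertex and normal maps ray-cast from the reconstruction model (invalid entries assumed zero), and all names and thresholds are illustrative assumptions.

```python
import numpy as np

def so3_exp(phi):
    """Rotation vector -> rotation matrix (Rodrigues formula)."""
    theta = np.linalg.norm(phi)
    K = np.array([[0, -phi[2], phi[1]],
                  [phi[2], 0, -phi[0]],
                  [-phi[1], phi[0], 0]])
    if theta < 1e-10:
        return np.eye(3) + K
    return np.eye(3) + np.sin(theta) / theta * K + (1 - np.cos(theta)) / theta**2 * K @ K

def icp_step(V_cur, V_model, N_model, K_intr, T_init, n_iters=10, dist_thresh=0.05):
    """Point-to-plane ICP with projective association: transform current-frame
    vertices by the pose estimate, look up corresponding model vertices/normals by
    projection instead of an exact nearest-neighbour search, and solve the
    linearized 6-DoF update."""
    fx, fy, cx, cy = K_intr[0, 0], K_intr[1, 1], K_intr[0, 2], K_intr[1, 2]
    h, w, _ = V_model.shape
    T = T_init.copy()
    for _ in range(n_iters):
        Vc = V_cur.reshape(-1, 3)
        P = Vc @ T[:3, :3].T + T[:3, 3]               # current vertices in the model frame
        zs = np.maximum(P[:, 2], 1e-6)
        u = np.round(fx * P[:, 0] / zs + cx).astype(int)
        v = np.round(fy * P[:, 1] / zs + cy).astype(int)
        ok = (Vc[:, 2] > 0) & (P[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        p, q, n = P[ok], V_model[v[ok], u[ok]], N_model[v[ok], u[ok]]
        keep = (np.abs(q).sum(axis=1) > 0) & (np.linalg.norm(p - q, axis=1) < dist_thresh)
        p, q, n = p[keep], q[keep], n[keep]
        r = np.sum((p - q) * n, axis=1)               # point-to-plane residuals
        J = np.hstack([n, np.cross(p, n)])            # d r / d [translation, rotation]
        delta = np.linalg.solve(J.T @ J + 1e-6 * np.eye(6), -J.T @ r)
        dT = np.eye(4)
        dT[:3, :3], dT[:3, 3] = so3_exp(delta[3:]), delta[:3]
        T = dT @ T
    return T
```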
  • steps S12 and S14 can both be summarized as solving a nonlinear least squares problem with the 6-degree-of-freedom transformation parameters as the optimization variables.
  • In step S16, the initial pose of the image capture unit in step S10 is updated according to the second calculated pose, and steps S12 and S14 are performed again; in this way, pose tracking of the image capture unit can be achieved.
  • With the pose tracking method implemented through steps S10 to S16, thanks to the use of two different algorithms, not only can an accurate pose estimate of the image capture unit be obtained, but also, because step S12 involves only the previous frame image and the current frame image and is independent of the current frame reconstruction model, a roughly accurate pose can still be provided when the scanned object deviates from the field of view, which improves the robustness of pose tracking of the image capture unit.
  • Optionally, the RGB-D images can be built into 3-4 level image pyramids; the first algorithm (for example, the DVO algorithm) is computed on the low-resolution levels, and the second algorithm (for example, the ICP algorithm) is computed on the high-resolution levels, as in the sketch below.
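  • A minimal sketch of such a coarse-to-fine pyramid, assuming simple 2x2 block averaging (a real system may use Gaussian downsampling and handle depth discontinuities separately); names are illustrative.

```python
import numpy as np

def build_pyramid(img, levels=4):
    """Build a 3-4 level image pyramid by 2x2 block averaging.
    pyr[0] is the full-resolution image and pyr[-1] the coarsest level; the coarse
    levels would feed the first (photometric) algorithm and the finest level the
    second (ICP) algorithm."""
    pyr = [img.astype(np.float32)]
    for _ in range(levels - 1):
        prev = pyr[-1]
        h, w = (prev.shape[0] // 2) * 2, (prev.shape[1] // 2) * 2
        pyr.append(prev[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return pyr
```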
  • step S15 may be further included: verifying the second calculated pose.
  • the comparison image can be obtained in the current frame reconstruction model, and the comparison image can be compared with the current frame image (for example, the current frame depth image) to achieve the verification of the second calculated pose.
  • obtaining the comparison image from the current frame reconstruction model may be done by using a ray casting method to render a depth map from the current frame reconstruction model as the comparison image.
  • step S16 may also include using the second calculated pose and the current frame image to update the current frame reconstruction model. Otherwise, it means that the pose tracking of the image capture unit fails, and the reconstruction model is not updated.
  • the pose tracking method provided by the embodiment has the characteristics of high accuracy and fast speed, but it is not suitable for the situation where the image capturing unit moves quickly. When the image capture unit moves too fast, the image content of adjacent frames is too different, and there is a problem of motion blur, which may cause the pose tracking of the image capture unit to fail.
  • a pose tracking method based on an inertial navigation module is also provided to further improve the robustness of image capture unit tracking.
  • FIG. 2 it is a flowchart of an optional method for tracking a pose based on an inertial navigation module according to an embodiment of the present application. As shown in Figure 2, the method includes the following steps:
  • S20 Acquire continuous multiple frames of images of the scanned object, and obtain the initial pose of the image capturing unit through the inertial navigation module;
  • S22 Use the initial pose as the initial value, and use the first algorithm to obtain the first calculated pose of the current frame based on the previous frame image and the current frame image in the continuous multiple frames of images;
  • the pose tracking method based on the inertial navigation module provided in this embodiment can also significantly improve the robustness of pose tracking of the image capture unit under vigorous motion.
  • the inertial navigation module can be used to obtain a basically accurate initial pose of the image capture unit, which accelerates the convergence of the optimization and improves computational performance.
  • steps S22 and S24 are basically the same as the steps S12 and S14 of the first embodiment, and will not be repeated here. Steps S20 and S26 will be described in detail below.
  • Step S20 Acquire consecutive multiple frames of images of the scanned object, and obtain the initial pose of the image capturing unit through the inertial navigation module;
  • the continuous multi-frame images of the scanned object can be obtained by using an image capture unit, which can be a stand-alone camera or electronic equipment with an integrated camera, such as a mobile phone.
  • the types of camera include infrared structured light cameras, time-of-flight (ToF) cameras, RGB cameras, Mono cameras, etc.
  • the pose of the image capture unit includes the spatial three-dimensional position and orientation of the image capture unit, with 6 degrees of freedom.
  • the continuous multi-frame image can be a continuous RGB-D image.
  • the RGB-D image is an image pair composed of a depth image and a color image.
  • the depth image and the color image are usually acquired by different image capture units.
  • the inertial navigation module is a state estimation system based on Extended Kalman Filter (EKF).
  • the inertial navigation module can take the data of the common inertial measurement unit (IMU) of the mobile platform as input, and obtain the initial pose of the image capturing unit through the dynamic integration method.
  • An inertial sensor is a sensor that measures the state of motion through inertial force.
  • Commonly used inertial sensors include an accelerometer, which provides linear acceleration data, and a gyroscope, which provides angular velocity data.
  • Since the pose of the image capture unit computed directly from the raw measurements has a large error, the inertial sensor readings are used as the measured values, a multi-sensor fusion method is used to complete the Kalman filter prediction by solving the dynamic equations, and the predicted pose is used as the initial pose of the image capture unit (a prediction sketch is given below).
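  • The sketch below illustrates one possible form of this prediction step, assuming a simple IMU integration with gravity compensation and a heavily simplified covariance propagation; the state layout, gravity constant, and all function names are assumptions, not taken from the patent.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])

def so3_exp(phi):
    theta = np.linalg.norm(phi)
    K = np.array([[0, -phi[2], phi[1]],
                  [phi[2], 0, -phi[0]],
                  [-phi[1], phi[0], 0]])
    if theta < 1e-10:
        return np.eye(3) + K
    return np.eye(3) + np.sin(theta) / theta * K + (1 - np.cos(theta)) / theta**2 * K @ K

def imu_predict(state, P, accel, gyro, dt, Q):
    """Kalman-filter prediction: integrate one IMU sample to propagate position,
    velocity and orientation, and inflate the covariance.
    state = {'p': position, 'v': velocity, 'R': 3x3 orientation, 'ba': accel bias, 'bg': gyro bias}."""
    w = gyro - state['bg']                   # bias-corrected angular rate
    a = accel - state['ba']                  # bias-corrected specific force
    a_world = state['R'] @ a + GRAVITY       # acceleration in the world frame
    new = dict(state)
    new['p'] = state['p'] + state['v'] * dt + 0.5 * a_world * dt**2
    new['v'] = state['v'] + a_world * dt
    new['R'] = state['R'] @ so3_exp(w * dt)
    P = P + Q * dt                           # simplified covariance propagation
    return new, P

def predicted_camera_pose(state, T_ic):
    """Predicted camera pose used as the initial pose of the image capture unit;
    T_ic is the (assumed known) camera pose expressed in the IMU frame."""
    T_wi = np.eye(4)
    T_wi[:3, :3], T_wi[:3, 3] = state['R'], state['p']
    return T_wi @ T_ic
```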
  • In step S26, the state quantities of the inertial navigation module are updated according to the second calculated pose. The state quantities may include the position, velocity, and orientation of the inertial navigation module, as well as the biases of the inertial sensors (for example, the accelerometer and gyroscope); a generic update sketch follows.
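  • A generic EKF measurement-update sketch is shown below, with the second calculated pose flattened into a measurement vector z; a practical system would use an error-state formulation for the orientation, so this is only a schematic illustration with assumed names.

```python
import numpy as np

def ekf_update(x, P, z, H, R_meas):
    """Standard EKF correction: the visual pose measurement z corrects the
    vectorized inertial state x (position, velocity, orientation, biases) and
    its covariance P."""
    y = z - H @ x                            # innovation
    S = H @ P @ H.T + R_meas                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(P.shape[0]) - K @ H) @ P
    return x_new, P_new
```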
  • step S24 it may further include step S25: verifying the second calculated pose.
  • the comparison image can be acquired in the current frame reconstruction model, and the comparison image can be compared with the current frame image (for example, the current frame depth image), so as to realize the verification of the second calculated pose.
  • obtaining the comparison image from the current frame reconstruction model may be done by using a ray casting method to render a depth map from the current frame reconstruction model as the comparison image.
  • After obtaining the comparison image, it is compared with the current frame depth image: a robust kernel function is used to compute a weighted mean square error, which is then compared with a first threshold to verify the second calculated pose (see the sketch below).
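  • A minimal sketch of this verification step, assuming a Huber robust kernel and illustrative threshold values (the patent does not specify the kernel or the thresholds):

```python
import numpy as np

def verify_pose(depth_raycast, depth_cur, first_threshold=0.02, huber_delta=0.05):
    """Compare the depth map ray-cast from the current reconstruction model with the
    current frame depth image via a robust-kernel-weighted mean square error; the
    second calculated pose passes verification if the error is below the first threshold."""
    valid = (depth_raycast > 0) & (depth_cur > 0)
    r = depth_cur[valid] - depth_raycast[valid]
    w = np.minimum(1.0, huber_delta / np.maximum(np.abs(r), 1e-12))  # Huber weights
    weighted_mse = float(np.sum(w * r**2) / np.sum(w))
    return weighted_mse < first_threshold
```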
  • step S26 may also include using the second calculated pose and the current frame image to update the current frame reconstruction model. Otherwise, it means that the image capture unit has failed to track, and the reconstruction model and the state of the inertial navigation module are not updated.
  • the three-dimensional model of the scanned object can be reconstructed.
  • KinectFusion is a real-time 3D reconstruction method based on infrared structured light input.
  • the real-time camera tracking method used by KinectFusion is an improved Iterative Closest Point (ICP) algorithm.
  • ICP Iterative Closest Point
  • KinectFusion uses the projection method to determine the corresponding point, instead of calculating the accurate nearest neighbor, and the calculation speed is significantly improved.
  • KinectFusion always aligns the input depth map with the current reconstruction model, which significantly reduces the cumulative error caused by aligning adjacent frames.
  • KinectFusion is prone to tracking failure in two situations. First, when the hand-held image capture unit moves violently: KinectFusion uses the projection method to determine corresponding points, which is only suitable for slow camera motion, and fast, erratic camera motion easily leads to tracking loss. Tracking loss means that the camera pose cannot be estimated, or that the estimated camera pose differs greatly from the actual one. Second, when the scanned object deviates from the field of view: once the reconstructed model has essentially moved out of the camera's field of view, the KinectFusion method will inevitably lose tracking, which easily happens when ordinary users are shooting.
  • In contrast, the pose tracking method based on the inertial navigation module will almost never lose tracking because the image capture unit moves too fast, mainly because the inertial navigation module uses high-frequency inertial sensor data and can therefore provide a more accurate initial pose. For the case where the scanned object deviates from the field of view, because the first algorithm used to obtain the first calculated pose of the current frame involves only the previous frame image and the current frame image, it is independent of the reconstruction model of the current frame.
  • the three-dimensional model generated by the pose tracking method based on the inertial navigation module can basically restore the shape of the object (as shown in Figure 4a), and in the trajectory error graph the difference between the estimated pose of the image capture unit and the true pose in each frame (shown as the shaded area in Figure 4b) is also very small.
  • Although the above pose tracking method based on the inertial navigation module significantly improves the robustness of pose tracking of the image capture unit under conditions such as vigorous motion, the problem of losing pose tracking is still unavoidable when the user blocks the camera or the scene changes significantly.
  • Therefore, a pose tracking method including relocation is provided, which can quickly restore pose tracking of the image capture unit in the case of tracking loss and re-estimate the pose of the image capture unit, so as to further improve the robustness of image capture unit tracking and improve the user experience.
  • FIG. 5 it is a flowchart of an optional pose tracking method including relocation according to an embodiment of the present application. As shown in Figure 5, the method includes the following steps:
  • S50 Acquire continuous multiple frames of images of the scanned object, and obtain the initial pose of the image capturing unit through the inertial navigation module;
  • S52 Use the initial pose as the initial value, and use the first algorithm to obtain the first calculated pose of the current frame based on the previous frame image and the current frame image in the continuous multi-frame images;
  • S55: Verify the second calculated pose. When the verification passes, use the second calculated pose and the current frame image to update the current frame reconstruction model; when the verification fails, use the relocation method to restore tracking of the image capture unit;
  • In step S55, the second calculated pose is verified, and when the verification passes, the second calculated pose and the current frame image are used to update the current frame reconstruction model; the verification method can adopt the methods described in Embodiments 1 and 2.
  • the verification fails it means that the pose tracking has failed, the reconstruction model and the state of the inertial navigation module are not updated, and the relocation method is used to restore the pose tracking of the image capture unit.
  • When the verification fails, the current frame image can be marked as a tracking failure; when the number of consecutively failed frames exceeds a second threshold, it indicates that pose tracking of the image capture unit has been lost, and the relocation method is used to restore pose tracking of the image capture unit.
  • the relocation method may include feature point and bag-of-words (BoW) matching, so as to quickly restore the pose of the image capture unit when pose tracking is lost.
  • The bag-of-words model is a method of describing image features through image feature point descriptors.
  • step S55 may also include:
  • Step S550 When the verification is passed, a key frame is selected from the image frames that have passed the verification.
  • According to the second calculated pose, at certain angle and distance intervals, the clearest frame among the adjacent verified image frames can be selected as the key frame.
  • The sharpness of an image can be evaluated by smoothing the image with a low-pass filter to obtain a blurred image and then measuring its blur degree: the closer the original image is to its blurred version, the more blurred the original image itself is (see the sketch below).
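  • A small sketch of this blur measure, assuming SciPy is available for the low-pass (Gaussian) filter; the sigma value and scoring are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpness_score(gray):
    """Low-pass filter the image and measure how far it moves from the original:
    the smaller the difference, the more blurred the original image already is."""
    g = gray.astype(np.float32)
    blurred = gaussian_filter(g, sigma=2.0)
    return float(np.mean(np.abs(g - blurred)))

def pick_keyframe(frames):
    """Among adjacent verified frames (grayscale arrays), select the sharpest one."""
    return max(frames, key=sharpness_score)
```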
  • Step S551 Build a bag of words database based on the selected key frames.
  • image feature points can be extracted based on selected key frames, feature point descriptors and bag-of-words vectors can be calculated, and a bag-of-words database can be constructed.
  • the aforementioned bag-of-words database can be constructed in an offline manner, and an offline bag-of-words database is trained from a set of image samples in a manner similar to text retrieval.
  • the image features are analogous to words, and the entire bag-of-words database is analogous to a dictionary. Through the dictionary, for any feature, the corresponding word can be found in the dictionary.
  • The descriptors are iteratively clustered into a tree structure.
  • The entire tree structure constitutes a dictionary, and the leaf nodes (also called word nodes) of the tree constitute words.
  • For each word, the frequency with which it appears in all the training images is also recorded; the higher the frequency of occurrence, the lower the discriminative power of that word for describing the image features it represents (a simplified sketch follows).
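  • The sketch below shows the basic idea with a flat vocabulary and TF-IDF weighting (a real system would use the hierarchical vocabulary tree described above for speed); all names are illustrative assumptions.

```python
import numpy as np

def compute_idf(word_image_counts, num_training_images):
    """The more training images a word appears in, the lower its weight."""
    return np.log(num_training_images / (1.0 + word_image_counts))

def bow_vector(descriptors, vocabulary, idf):
    """Quantize feature descriptors to their nearest visual word and build an
    L1-normalized, TF-IDF-weighted bag-of-words vector."""
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=-1)
    words = d2.argmin(axis=1)
    tf = np.bincount(words, minlength=len(vocabulary)).astype(np.float32)
    vec = tf * idf
    s = vec.sum()
    return vec / s if s > 0 else vec
```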
  • step S55 when the verification fails, using the relocation method to restore the tracking of the image capturing unit includes:
  • Step S552 Calculate the bag-of-words vector of the current frame image when the pose tracking is lost;
  • Step S553: Select candidate key frames according to the constructed bag-of-words database and the bag-of-words vector of the current frame image;
  • Step S554: Calculate the relative pose between the candidate key frame and the current frame image, so as to obtain the third calculated pose of the current frame;
  • Step S555 Update the initial pose of the image capture unit according to the third calculated pose, so as to restore the pose tracking of the image capture unit.
  • Specifically, the similarity between the bag-of-words vectors of all key frames in the constructed bag-of-words database and the bag-of-words vector of the current frame image can be calculated, and the key frames whose similarity exceeds a third threshold are taken as candidate key frames.
  • Further, the minimum threshold on the number of shared word nodes is set to 0.8 times the maximum number; key frames whose number of shared word nodes is less than this minimum threshold are filtered out, and the rest are used as candidate key frames after the second screening. Then each candidate key frame after the second screening is combined with key frames with similar positions into a candidate key frame group, the sum of the similarity scores between the candidate key frame group and the current frame image is computed through the bag-of-words vectors, and the key frames whose total score is higher than a fourth threshold (for example, 0.75 times the highest total score) are used as the final candidate key frames (a selection sketch is given below).
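  • A hedged sketch of this two-stage candidate selection (shared-word filtering, then group scoring); the data structures and the similarity measure are illustrative assumptions.

```python
import numpy as np

def bow_similarity(a, b):
    """Similarity between two L1-normalized bag-of-words vectors (1 = identical)."""
    return 1.0 - 0.5 * float(np.abs(a - b).sum())

def select_candidates(cur_vec, cur_words, keyframes, group_of, fourth_ratio=0.75):
    """keyframes[i] = {'vec': BoW vector, 'words': set of word ids};
    group_of[i] = indices of key frames with similar positions to key frame i."""
    shared = [len(cur_words & kf['words']) for kf in keyframes]
    min_shared = 0.8 * max(shared)                     # keep frames sharing enough word nodes
    kept = [i for i, s in enumerate(shared) if s >= min_shared]
    scores = {i: sum(bow_similarity(cur_vec, keyframes[j]['vec']) for j in group_of[i])
              for i in kept}                           # score each candidate together with its group
    best = max(scores.values())
    return [i for i, s in scores.items() if s >= fourth_ratio * best]
```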
  • In step S554 of the embodiment of the present application, for each candidate key frame, the descriptors that match the current frame image are screened, and then the random sample consensus (RANSAC) algorithm is used to filter out mismatched pairs.
  • The third algorithm (for example, the PnP algorithm) is used to calculate the relative pose between the candidate key frame and the current frame image through the sparse feature point matching pairs between them, so as to obtain the third calculated pose of the current frame (see the sketch below).
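  • As an illustrative sketch only (using OpenCV's solvePnPRansac, which is an assumption about tooling rather than the patent's implementation), the relative pose from sparse matches could be recovered like this:

```python
import cv2
import numpy as np

def relocalize_pose(pts3d_keyframe, pts2d_current, K_intr, min_inliers=10):
    """Estimate the current-frame pose from 3D points of the candidate key frame
    and their 2D matches in the current image; RANSAC rejects mismatched pairs."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(pts3d_keyframe, dtype=np.float32),
        np.asarray(pts2d_current, dtype=np.float32),
        K_intr, None, reprojectionError=3.0, iterationsCount=100)
    if not ok or inliers is None or len(inliers) < min_inliers:
        return None                                   # relocation failed for this candidate
    R, _ = cv2.Rodrigues(rvec)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, tvec.ravel()
    return T                                          # third calculated pose
```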
  • The pose recovery is then successful, and the initial pose of the image capture unit in step S50 is updated according to the third calculated pose; in this way, relocation of the image capture unit restores pose tracking of the image capture unit.
  • Since the inertial navigation module cannot get feedback for a long time while pose tracking of the image capture unit is lost, its system state is likely to have deviated far from the true value. For this reason, when the image capture unit is successfully relocated and pose tracking of the image capture unit is restored, the inertial navigation module needs to be reinitialized, including setting the extrinsic parameters, biases, and covariance matrix of the inertial navigation module to default values and inverting the pose of the current frame image to obtain the initial pose of the inertial navigation module (a short sketch follows).
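  • A short sketch of this reinitialization, with the state layout and default values being assumptions consistent with the earlier inertial-module sketch:

```python
import numpy as np

def reinit_inertial_module(T_camera, T_ic, default_cov_diag):
    """Reset biases and the covariance matrix to defaults and derive the inertial
    state by inverting the pose chain from the relocalized camera pose."""
    T_imu = T_camera @ np.linalg.inv(T_ic)            # camera pose -> IMU pose
    state = {'p': T_imu[:3, 3], 'v': np.zeros(3), 'R': T_imu[:3, :3],
             'ba': np.zeros(3), 'bg': np.zeros(3)}    # biases back to default (zero)
    P = np.diag(default_cov_diag)                     # variance matrix back to default values
    return state, P
```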
  • With the above pose tracking method, in addition to accurate estimation of the pose of the image capture unit, a roughly accurate pose can also be provided when the scanned object deviates from the field of view. Based on the idea of multi-sensor fusion, by integrating the inputs of the inertial navigation module and the image capture unit, more robust pose tracking of the image capture device is achieved, especially under vigorous motion.
  • the relocation method can provide effective scene recovery and quickly restore the pose tracking of the image capture device.
  • the above-mentioned pose tracking method is particularly suitable for mobile platforms. On the one hand, it makes full use of a variety of common sensor devices on mobile platforms; on the other hand, its computational cost is relatively small and meets the real-time computing performance requirements of mobile platforms.
  • a pose tracking device is also provided.
  • FIG. 6, is a structural block diagram of an optional pose tracking device according to one of the embodiments of the present application. As shown in Fig. 6, the pose tracking device 6 includes:
  • the image capturing unit 60 is configured to obtain continuous multiple frames of images of the scanned object
  • the image capture unit may be a stand-alone camera or electronic equipment with an integrated camera, such as a mobile phone.
  • the types of camera include infrared structured light cameras, time-of-flight (ToF) cameras, RGB cameras, Mono cameras, etc.; the continuous multi-frame images can be continuous RGB-D images, where an RGB-D image is an image pair composed of a depth image and a color image.
  • the depth image and the color image are usually acquired by different image capture units, and it can be assumed that the color image and the depth image of each frame are synchronized in time. For color and depth cameras with a fixed relative position, data alignment can easily be achieved through extrinsic parameter calibration, and frame synchronization between the color image and the depth image of each frame can be achieved through the timestamps of image acquisition.
  • the initial pose determining unit 62 is configured to determine the initial pose of the image capturing unit
  • the initial pose determining unit is further configured to set the initial pose of the image capturing unit to the identity matrix or randomly set to any value.
  • the pose of the image capture unit includes the spatial three-dimensional position and orientation of the image capture unit, including 6 degrees of freedom.
  • the first pose obtaining unit 64 is configured to use the initial pose as an initial value, and use the first algorithm to obtain the first calculated pose of the current frame based on the previous frame image and the current frame image in the continuous multi-frame images;
  • the first algorithm is a three-dimensional point cloud alignment algorithm based on pixel-by-pixel color alignment, for example, a dense visual odometry (Dense Visual Odometry, DVO) algorithm.
  • the second pose acquiring unit 66 is configured to use the first calculated pose as an initial value, and use a second algorithm to acquire the second calculated pose of the current frame based on the current frame image and the current frame reconstruction model;
  • the second algorithm is an iterative three-dimensional point cloud alignment algorithm, for example, an improved Iterative Closest Point (ICP) algorithm in the KinectFusion algorithm.
  • the pose update unit 68 is configured to update the initial pose of the image capture unit according to the second calculated pose, so as to realize the pose tracking of the image capture unit.
  • With this pose tracking device, not only can accurate pose estimation of the image capture unit be achieved, but also, because the first pose acquisition unit involves only the previous frame image and the current frame image and is independent of the current frame reconstruction model, a roughly accurate pose can be provided even when the scanned object deviates from the field of view, improving the robustness of the image capture unit's pose tracking.
  • Optionally, the RGB-D images can be built into 3-4 level image pyramids; the first algorithm (for example, the DVO algorithm) is computed on the low-resolution levels, and the second algorithm (for example, the ICP algorithm) is computed on the high-resolution levels.
  • Although the above pose tracking device provided by the embodiment of the present application has the characteristics of high accuracy and fast speed, it is not suitable for the case where the image capture unit moves quickly.
  • the image capture unit moves too fast, the image content of adjacent frames is too different, and there is a problem of motion blur, which may cause the pose tracking of the image capture unit to fail.
  • the initial pose determination unit 62 may be an inertial navigation module.
  • the inertial navigation module is a state estimation system based on Extended Kalman Filter (EKF).
  • the inertial navigation module can take the data of the common inertial measurement unit (IMU) of the mobile platform as input, and obtain the initial pose of the image capturing unit through the dynamic integration method.
  • An inertial sensor is a sensor that measures the state of motion through inertial force. Commonly used inertial sensors include an accelerometer (Accelerator) that obtains linear acceleration data and a gyroscope (Gyroscope) that obtains angular velocity data.
  • the pose error of the image capture unit calculated directly based on the original measurement value is very large. Therefore, the inertial sensor reading is taken as the measured value, and the multi-sensor fusion method is used to complete the Kalman filter prediction by solving the dynamic equation, and the predicted image capture unit pose is used as the initial pose of the image capture unit.
  • the measurement results and uncertainties of multiple sensors can be comprehensively considered, thereby obtaining more accurate state estimation results.
  • the pose update unit 68 is further configured to update the state of the inertial navigation module according to the second calculated pose, thereby updating the initial position of the image capturing unit. Posture.
  • With the inertial navigation module used as the initial pose determination unit, the pose tracking device can significantly improve the robustness of the image capture unit's pose tracking under vigorous motion.
  • the inertial navigation module can be used to obtain a basically accurate initial pose of the image capture unit, which accelerates the convergence of the optimization and improves computational performance.
  • the pose tracking device further includes: a pose verification unit 67 configured to verify the second calculated pose.
  • the comparison image can be acquired in the current frame reconstruction model, and the comparison image can be compared with the current frame image (for example, the current frame depth image), so as to realize the verification of the second calculated pose.
  • obtaining the comparison image from the current frame reconstruction model may be done by using a ray casting method to render a depth map from the current frame reconstruction model as the comparison image. After obtaining the comparison image, it is compared with the current frame depth image: a robust kernel function is used to compute a weighted mean square error, which is then compared with a first threshold to verify the second calculated pose.
  • the three-dimensional model of the scanned object can be reconstructed.
  • Although the pose tracking device based on the inertial navigation module significantly improves the robustness of image capture unit tracking under conditions such as vigorous motion, the problem of tracking loss is still unavoidable when the user blocks the camera or the scene changes significantly.
  • the verification unit is also configured to select key frames from the verified image frames when the verification passes; to build a bag-of-words database based on the selected key frames; and to use the relocation method to restore tracking of the image capture unit when the verification fails. For example, when the verification fails, the current frame image can be marked as a tracking failure; when the number of consecutively failed frames exceeds a second threshold, it indicates that pose tracking of the image capture unit has been lost, and the relocation method is used to restore tracking of the image capture unit.
  • the relocation method may include feature point and bag-of-words (BoW) matching, so as to quickly restore the pose of the image capture unit when pose tracking is lost.
  • The bag-of-words model is a method of describing image features through image feature point descriptors.
  • Optionally, selecting a key frame from the verified image frames includes: when the verification passes, according to the second calculated pose, at certain angle and distance intervals, selecting the clearest frame among the adjacent verified image frames as the key frame.
  • The sharpness of an image can be evaluated by smoothing the image with a low-pass filter to obtain a blurred image and then measuring its blur degree: the closer the original image is to its blurred version, the more blurred the original image itself is.
  • constructing the bag-of-words database includes: extracting image feature points based on the selected key frames, calculating feature point descriptors and bag-of-words vectors, and constructing Bag of words database.
  • the aforementioned bag-of-words database can be constructed in an offline manner, and an offline bag-of-words database is trained from a set of image samples in a manner similar to text retrieval.
  • the image features are analogous to words, and the entire bag-of-words database is analogous to a dictionary. Through the dictionary, for any feature, the corresponding word can be found in the dictionary.
  • The descriptors are iteratively clustered into a tree structure.
  • The entire tree structure constitutes a dictionary, and the leaf nodes (also called word nodes) of the tree constitute words.
  • For each word, the frequency with which it appears in all the training images is also recorded; the higher the frequency of occurrence, the lower the discriminative power of that word for describing the image features it represents.
  • using the relocation method to restore tracking of the image capture unit includes: calculating the bag-of-words vector of the current frame image when pose tracking is lost; selecting candidate key frames according to the constructed bag-of-words database and the bag-of-words vector of the current frame image; calculating the relative pose between the candidate key frame and the current frame image to obtain the third calculated pose of the current frame; and updating the initial pose of the image capture unit according to the third calculated pose to restore pose tracking of the image capture unit.
  • Since the inertial navigation module cannot get feedback from the visual positioning module for a long time while tracking of the image capture unit is lost, its system state is likely to have deviated far from the true value. Therefore, when relocation succeeds and pose tracking of the image capture unit is restored, the inertial navigation module needs to be reinitialized, including setting the extrinsic parameters, biases, and covariance matrix of the inertial navigation module to default values and inverting the current pose of the image capture unit to obtain the initial pose of the inertial navigation module.
  • an electronic device is also provided.
  • FIG. 7 it is a structural block diagram of an optional electronic device according to an embodiment of the present application.
  • the electronic device 7 includes: a processor 70; and a memory 72 configured to store executable instructions of the processor 70; wherein, the processor 70 is configured to execute the executable instructions Perform the pose tracking method described in any one of Embodiment 1 to Embodiment 3.
  • According to another aspect of the embodiments of the present application, a storage medium is also provided. The storage medium includes a stored program, and when the program runs, the device on which the storage medium is located is controlled to execute the pose tracking method described in any one of Embodiment 1 to Embodiment 3.
  • the disclosed technical content can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the units may be a logical function division, and there may be other divisions in actual implementation, for example, multiple units or components may be combined or may be Integrate into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of units or modules, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
  • the technical solution of this application essentially or the part that contributes to the existing technology or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium , Including several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the method described in each embodiment of the present application.
  • the aforementioned storage media include: USB flash drives, read-only memory (ROM), random access memory (RAM), removable hard disks, magnetic disks, optical disks, and other media that can store program code.
  • By acquiring continuous multi-frame images of the scanned object and the initial pose of the image capture unit; taking the initial pose as the initial value and, based on the previous frame image and the current frame image in the continuous multi-frame images, using the first algorithm to obtain the first calculated pose of the current frame; taking the first calculated pose as the initial value and, based on the current frame image and the current frame reconstruction model, using the second algorithm to obtain the second calculated pose of the current frame; and updating the initial pose of the image capture unit according to the second calculated pose and repeating the above steps, pose tracking of the image capture unit is achieved. Not only can the pose of the image capture unit be accurately estimated, but a roughly accurate pose can also be provided when the scanned object deviates from the field of view, which improves the robustness of the image capture unit's pose tracking and thus solves the problem in the prior art of poor pose tracking robustness and frequent tracking loss.
  • the relocation method can provide effective scene recovery and quickly restore the pose tracking of the image capture device.
  • the above-mentioned pose tracking method is particularly suitable for mobile platforms. On the one hand, it makes full use of a variety of common sensor devices on mobile platforms; on the other hand, its computational cost is relatively small and meets the real-time computing performance requirements of mobile platforms.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Computer Graphics (AREA)
  • Multimedia (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

A pose tracking method, a pose tracking device, and an electronic device. The method includes: acquiring continuous multi-frame images of a scanned object and an initial pose of an image capture unit (S10); taking the initial pose as an initial value and, based on the previous frame image and the current frame image in the continuous multi-frame images, using a first algorithm to obtain a first calculated pose of the current frame (S12); taking the first calculated pose as an initial value and, based on the current frame image and a current frame reconstruction model, using a second algorithm to obtain a second calculated pose of the current frame (S14); and updating the initial pose of the image capture unit according to the second calculated pose, and repeating the above steps to achieve pose tracking of the image capture unit (S16). Not only can the pose of the image capture unit be accurately estimated, but a roughly accurate pose can also be provided when the scanned object deviates from the field of view, improving the robustness of pose tracking of the image capture unit.

Description

Pose Tracking Method, Pose Tracking Device, and Electronic Device
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on May 14, 2019, with priority number 201910396914.7 and the invention title "Pose tracking method, pose tracking device and electronic device", the entire contents of which are incorporated into this application by reference.
Technical Field
This application relates to computer vision processing technology, and in particular to a pose tracking method, a pose tracking device, and an electronic device.
Background
Three-dimensional reconstruction refers to constructing a digital three-dimensional model of a real object by acquiring the geometric shape and material of the real object, so as to faithfully restore the object's appearance. The input may be images, video streams, and three-dimensional point clouds captured in real time by various types of cameras, or previously captured images, videos, and three-dimensional point clouds. Three-dimensional reconstruction is very widely used in computer-aided geometric design, computer animation, computer vision, medical imaging, virtual reality, augmented reality, digital media, and other fields.
Camera tracking is the core and key algorithm module in three-dimensional reconstruction. It is used to estimate the camera pose at any moment during shooting, including the three-dimensional position and orientation in space. Accurate camera tracking results are a prerequisite for successful three-dimensional reconstruction. Existing real-time camera tracking methods are not robust enough, place high requirements on the quality of the input data, and impose many restrictions on the user's shooting technique, which is unfriendly to ordinary users.
Summary of the Invention
The embodiments of the present application provide a pose tracking method, a pose tracking device, and an electronic device, so as to at least solve the problems in the prior art of poor pose tracking robustness and frequent tracking loss.
According to one aspect of an embodiment of the present application, a pose tracking method is provided, which includes the following steps: acquiring continuous multi-frame images of a scanned object and an initial pose of an image capture unit; taking the initial pose as an initial value and, based on the previous frame image and the current frame image in the continuous multi-frame images, using a first algorithm to obtain a first calculated pose of the current frame; taking the first calculated pose as an initial value and, based on the current frame image and a current frame reconstruction model, using a second algorithm to obtain a second calculated pose of the current frame; and updating the initial pose of the image capture unit according to the second calculated pose and repeating the above steps to achieve pose tracking of the image capture unit.
Optionally, the initial pose of the image capture unit is set to an identity matrix or randomly set to an arbitrary value.
Optionally, the continuous multi-frame images are continuous RGB-D images.
Optionally, taking the initial pose as the initial value and, based on the previous frame image and the current frame image in the continuous multi-frame images, using the first algorithm to obtain the first calculated pose of the current frame includes: taking the initial pose as the initial value, and using the first algorithm to perform pixel-by-pixel color alignment between the previous frame image and the current frame image to obtain the relative coordinate transformation between the previous frame image and the current frame image, thereby obtaining the first calculated pose of the current frame.
Optionally, taking the first calculated pose as the initial value and, based on the current frame image and the current frame reconstruction model, using the second algorithm to obtain the second calculated pose of the current frame includes: taking the first calculated pose as the initial value, and using the second algorithm to align the current frame image with the current frame reconstruction model to obtain the relative coordinate transformation between the previous frame image and the current frame image, thereby obtaining the second calculated pose of the current frame.
Optionally, the first algorithm is computed on low-resolution images, and the second algorithm is computed on high-resolution images.
Optionally, the initial pose of the image capture unit is acquired through an inertial navigation module.
Optionally, the state quantities of the inertial navigation module are updated according to the second calculated pose, thereby updating the initial pose of the image capture unit.
Optionally, the inertial navigation module acquires the initial pose of the image capture unit using a multi-sensor fusion method.
Optionally, the inertial navigation module is a state estimation system based on an extended Kalman filter.
Optionally, the method further includes: verifying the second calculated pose, and when the verification passes, updating the current frame reconstruction model using the second calculated pose and the current frame image.
Optionally, verifying the second calculated pose includes: obtaining a comparison image from the current frame reconstruction model and comparing the comparison image with the current frame image, so as to verify the second calculated pose.
Optionally, the method further includes: when the verification passes, selecting key frames from the image frames that have passed verification, and constructing a bag-of-words database based on the selected key frames.
Optionally, the method further includes: when the verification fails, using a relocation method to restore pose tracking of the image capture unit.
Optionally, when the verification fails, the current frame image is marked as a tracking failure, and when the number of consecutive frames with failed tracking exceeds a second threshold, it indicates that pose tracking of the image capture unit has been lost, and the relocation method is used to restore tracking of the image capture unit.
Optionally, the relocation method includes: when pose tracking is lost, calculating the bag-of-words vector of the current frame image; selecting candidate key frames according to the constructed bag-of-words database and the bag-of-words vector of the current frame image; using a third algorithm to obtain a third calculated pose of the current frame according to the relative pose between the candidate key frame and the current frame image; and updating the initial pose of the image capture unit according to the third calculated pose, so as to restore pose tracking of the image capture unit.
Optionally, the method further includes: initializing the inertial navigation module after pose tracking of the image capture unit is restored.
根据本申请其中一实施例的另一个方面,提供了一种位姿跟踪装置,该装置包括:图像捕获单元,配置为获取扫描对象的连续多帧图像;初始位姿确定单元,配置为确定所述图像捕获单元的初始位姿;第一位姿获取单元,配置为以所述初始位姿作为初值,基于所述连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿;第二位姿获取单元,配置为以所述第一计算位姿为初值,基于所述当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿;位姿更新单元,配置为根据所述第二计算位姿更新所述图像捕获单元的初始位姿,以实现对所述图像捕获单元的位姿跟踪。
可选地,所述初始位姿确定单元还配置为将所述图像捕获单元的初始位姿设置为单位矩阵或者随机设定为任意值。
可选地,所述连续多帧图像为连续的RGB-D图像。
可选地,所述第一位姿获取单元配置为以所述初始位姿作为初值,使用所述第一算法对所述前一帧图像和所述当前帧图像进行逐像素的颜色对齐,得到所述前一帧图像和所述当前帧图像之间的相对坐标变换,从而获取所述当前帧的第一计算位姿。
可选地,所述第二位姿获取单元配置为以所述第一计算位姿为初值,使用所述第二算法将所述当前帧图像与所述当前帧重建模型对齐,得到所述前一帧图像和所述当前帧图像之间的相对坐标变换,从而获取当前帧的第二计算位姿。
可选地,所述第一算法在低分辨率图像上计算,所述第二算法在高分辨率图像上计算。
可选地,所述初始位姿确定单元为惯性导航模块。
可选地,所述位姿更新单元还配置为根据所述第二计算位姿更新惯性导航模块的状态量,从而更新所述图像捕获单元的初始位姿。
可选地,所述惯性导航模块配置为使用多传感器融合的方法获取所述图像捕获单元的初始位姿。
可选地,所述惯性导航模块为基于扩展卡尔曼滤波的状态估计系统。
可选地,所述位姿跟踪装置还包括:位姿验证单元,配置为对所述第二计算位姿进行验证,在验证通过时,使用所述第二计算位姿、所述当前帧图像更新所述当前帧重建模型。
可选地,所述位姿验证单元还配置为通过在所述当前帧重建模型中获取对比图像,将所述对比图像与所述当前帧图像进行比较,实现对所述第二计算位姿的验证。
可选地,所述位姿验证单元还配置为在验证通过时,从验证通过的图像帧中,选取关键帧;基于选取的所述关键帧,构建词袋数据库。
可选地,所述位姿验证单元还配置为在验证不通过时,使用重定位方法恢复对所述图像捕获单元的位姿跟踪。
可选地,所述位姿验证单元还配置为在验证不通过时,将所述当前帧图像标记为跟踪失败,当连续跟踪失败的帧数超过第二阈值时,表明对所述图像捕获单元的位姿跟踪丢失,使用所述重定位方法恢复对所述图像捕获单元的跟踪。
可选地,所述重定位方法包括:在位姿跟踪丢失时,计算所述当前帧图像的词袋向量;根据所构建的词袋数据库和所述当前帧图像的词袋向量,选取候选关键帧;根据所述候选关键帧和所述当前帧图像之间的相对位姿,使用第三算法获取当前帧的第三计算位姿;根据所述第三计算位姿更新所述图像捕获单元的初始位姿,以恢复对所述图像捕获单元的位姿跟踪。
可选地,所述初始位姿确定单元还配置为在恢复对所述图像捕获单元的位姿跟踪后,将所述惯性导航模块初始化。
根据本申请其中一实施例的另一个方面，提供了一种电子设备，包括：处理器；以及存储器，配置为存储所述处理器的可执行指令；其中，所述处理器配置为经由执行所述可执行指令来执行上述任意一项所述的位姿跟踪方法。
根据本申请其中一实施例的另一个方面,提供了一种存储介质,该存储介质包括存储的程序,其中,在所述程序运行时控制所述存储介质所在设备执行上述任意一项所述的位姿跟踪方法。
在本申请其中一实施例中,通过获取扫描对象的连续多帧图像和图像捕获单元的初始位姿;以所述初始位姿作为初值,基于所述连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿;以所述第一计算位姿为初值,基于所述当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿;根据所述第二计算位姿更新所述图像捕获单元的初始位姿,并重复上述步骤以实现对所述图像捕获单元的位姿跟踪。不仅可以对图像捕获单元的位姿实现准确估计,并且在扫描对象偏离视野范围时也能提供大致准确的位姿,提高对图像捕获单元位姿跟踪的鲁棒性。进而解决现有技术中位姿跟踪鲁棒性较差,容易出现跟踪丢失的问题。
附图说明
此处所说明的附图用来提供对本申请的进一步理解,构成本申请的一部分,本申请的示意性实施例及其说明配置为解释本申请,并不构成对本申请的不当限定。在附图中:
图1根据本申请其中一实施例的一种可选的位姿跟踪方法的流程图;
图2是根据本申请其中一实施例的一种可选的基于惯性导航模块的位姿跟踪方法的流程图;
图3a和图3b分别是使用KinectFusion算法生成的三维模型和轨迹误差图;
图4a和图4b分别是使用本申请实施例提供的基于惯性导航模块的位姿跟踪方法生成的三维模型和轨迹误差图;
图5是根据本申请其中一实施例的一种可选的包含重定位的位姿跟踪方法的流程图;
图6是根据本申请其中一实施例的一种可选的位姿跟踪装置的结构框图;
图7是根据本申请其中一实施例的一种可选的电子设备的结构框图。
具体实施方式
为了使本技术领域的人员更好地理解本申请方案,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分的实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都应当属于本申请保护的范围。
需要说明的是,本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里 描述的本申请的实施例能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
本申请实施例可以应用于端的模式,即应用于各种可移动设备的摄像头(智能手机相机、数码相机、单反相机、深度相机、Pad相机、手提电脑相机、游戏机相机等);也可以应用于云加端的模式,即应用于计算机系统/服务器,其可与众多其它通用或者专用计算系统环境或配置一起操作。适于与计算机系统/服务器一起使用的众所周知的计算系统、环境和/或配置的例子包括但不限于:个人计算机系统、手持或膝上设备、基于微处理器的系统、可编程消费电子产品、小型计算机系统、大型计算机系统和包括上述任何系统的分布式云计算技术环境,等等。
计算机系统/服务器可以在由计算机系统执行的计算机系统可执行指令(诸如程序模块等)的一般语境下描述。通常,程序模块可以包括例程、程序、组件、逻辑以及数据结构等等,它们执行特定的任务或者实现特定的抽象数据类型。计算机系统/服务器可以在分布式云计算环境中实施,由通过通信网络链接的远程处理设备执行任务。在分布式云计算环境中,程序模块可以位于包括存储设备的本地或者远程计算系统存储介质上。
下面通过详细的实施例来说明本申请。
【实施例一】
根据本申请的一个方面,提供了一种位姿跟踪方法。参考图1,是根据本申请其中一实施例的一种可选的位姿跟踪方法的流程图。如图1所示,该方法包括以下步骤:
S10:获取扫描对象的连续多帧图像和图像捕获单元的初始位姿;
S12:以初始位姿作为初值,基于连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿;
S14:以第一计算位姿为初值,基于连续多帧图像中的当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿;
S16:根据第二计算位姿更新图像捕获单元的初始位姿,并重复上述步骤以实现对图像捕获单元的位姿跟踪。
在本申请其中一实施例中,通过上述步骤,不仅可以对图像捕获单元的位姿实现准确估计,并且在扫描对象偏离视野范围时也能提供大致准确的位姿,提高对图像捕获单元位姿跟踪的鲁棒性。
下面对上述各步骤进行详细说明。
步骤S10,获取扫描对象的连续多帧图像和图像捕获单元的初始位姿;
可选的,在本申请其中一实施例中,扫描对象的连续多帧图像可以使用图像捕获单元获得,图像捕获单元可以为独立的摄像头或集成有摄像头的相机、手机等电子设备,摄像头的类型包括红外结构光摄像头、飞行时间摄像头(Time-of-flight,ToF)、RGB摄像头、Mono摄像头等;图像捕获单元的初始位姿可以设置为单位矩阵或者随机设定为任意值。其中,图 像捕获单元位姿包括图像捕获单元的空间三维位置和朝向,含6个自由度。连续多帧图像可以是连续的RGB-D图像,RGB-D图像是由一帧深度图(Depth image)和一帧彩色图组成的图像对,深度图和彩色图通常由不同的图像捕获单元分别获取,并且可以假设每帧的彩色图与深度图在时间上是同步的,对于相对位置固定的彩色和深度摄像头,很容易通过外参标定的方式实现数据对齐,通过图像获取的时间戳实现每帧的彩色图与深度图的帧同步。
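作为示意，下面给出一个按时间戳对彩色帧与深度帧做帧同步的简化Python片段（其中的函数名、数据结构与时间差阈值均为说明性假设，并非对本申请实施例的限定）：

```python
# 示意代码：按时间戳把彩色帧与深度帧配成RGB-D图像对（假设两路时间戳均已按升序排列，单位为秒）
def pair_rgbd_frames(color_stamps, depth_stamps, max_gap=0.015):
    """返回 (彩色帧索引, 深度帧索引) 的配对列表；max_gap 为允许的最大时间差（示例值15ms）。"""
    if not color_stamps or not depth_stamps:
        return []
    pairs, j = [], 0
    for i, tc in enumerate(color_stamps):
        # 将深度帧指针推进到与当前彩色帧时间最接近的位置
        while j + 1 < len(depth_stamps) and abs(depth_stamps[j + 1] - tc) <= abs(depth_stamps[j] - tc):
            j += 1
        if abs(depth_stamps[j] - tc) <= max_gap:
            pairs.append((i, j))
    return pairs
```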
步骤S12:以初始位姿作为初值,基于连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿;
可选的，在本申请其中一实施例中，第一算法是一种基于逐像素颜色对齐的三维点云对齐算法，例如，稠密视觉里程计(Dense Visual Odometry,DVO)算法。以初始位姿作为初值，使用第一算法对前一帧图像和当前帧图像进行逐像素的颜色对齐，可以得到前一帧图像和当前帧图像之间的相对坐标变换，从而获取当前帧的第一计算位姿。
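作为对第一算法思路的示意，下面给出一个计算前一帧图像与当前帧图像之间逐像素光度残差的简化Python片段（仅为在上述描述下的示意性写法，省略了图像金字塔、双线性插值、鲁棒加权与完整的非线性最小二乘迭代，函数名与参数均为示例性假设）：

```python
import numpy as np

def photometric_residuals(gray_prev, depth_prev, gray_cur, K, T):
    """在候选相对位姿T(4x4，前一帧到当前帧)下，计算逐像素光度(灰度)残差。
    gray_prev/gray_cur: HxW 灰度图(float)；depth_prev: HxW 深度图(米)；K: 3x3 相机内参。"""
    H, W = gray_prev.shape
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    v, u = np.mgrid[0:H, 0:W]
    z = depth_prev
    valid = z > 0
    # 反投影到前一帧相机坐标系，并变换到当前帧坐标系
    x = (u - cx) / fx * z
    y = (v - cy) / fy * z
    pts = np.stack([x, y, z, np.ones_like(z)], axis=-1)        # HxWx4 齐次坐标
    pts_cur = pts @ T.T
    zc = pts_cur[..., 2]
    zc_safe = np.where(zc > 0, zc, 1.0)
    uc = fx * pts_cur[..., 0] / zc_safe + cx                   # 投影到当前帧像素坐标
    vc = fy * pts_cur[..., 1] / zc_safe + cy
    inside = valid & (zc > 0) & (uc >= 0) & (uc < W - 1) & (vc >= 0) & (vc < H - 1)
    res = np.zeros_like(gray_prev, dtype=np.float64)
    # 最近邻采样当前帧灰度（实际实现多采用双线性插值）
    res[inside] = gray_cur[vc[inside].astype(int), uc[inside].astype(int)] - gray_prev[inside]
    return res, inside
```

第一算法可在该类残差上，以初始位姿为初值进行高斯牛顿等迭代优化（即一个以6自由度变换参数为优化目标的非线性最小二乘问题），从而得到前一帧图像与当前帧图像之间的相对坐标变换。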
步骤S14:以第一计算位姿为初值,基于当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿。
可选的,在本申请其中一实施例中,第二算法是一种迭代的三维点云对齐算法,例如,KinectFusion算法中改良的迭代最近邻(Iterative Closest Point,ICP)算法。以第一计算位姿为初值,使用第二算法将当前帧图像(例如,RGB-D图像中的深度图像)与当前帧重建模型对齐,可以得到前一帧图像和当前帧图像之间的相对坐标变换,从而获取当前帧的 第二计算位姿。
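作为对第二算法思路的示意，下面给出点到面ICP一次线性化求解的简化Python片段（假设当前帧深度图反投影出的点与重建模型上的对应点、法向已通过投影法等方式获得；函数名与变量均为示例性假设）：

```python
import numpy as np

def point_to_plane_icp_step(src_pts, dst_pts, dst_normals):
    """点到面ICP的一次线性化最小二乘求解。
    src_pts: Nx3 当前帧深度图反投影得到的三维点；
    dst_pts/dst_normals: Nx3 重建模型上的对应点及其法向量；
    返回6维增量 [rx, ry, rz, tx, ty, tz]，可据此更新位姿后再迭代直至收敛。"""
    c = np.cross(src_pts, dst_normals)                       # 残差对旋转小量的雅可比
    A = np.hstack([c, dst_normals])                          # Nx6 线性化系数矩阵
    b = np.sum(dst_normals * (dst_pts - src_pts), axis=1)    # 点到面残差
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```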
上述步骤S12和步骤S14均可以归结为求解一个以6自由度变换参数为优化目标的非线性最小二乘问题。
S16:根据第二计算位姿更新图像捕获单元的初始位姿,并重复上述步骤以实现对图像捕获单元的位姿跟踪。
可选的,在本申请其中一实施例中,根据第二计算位姿更新步骤S10中图像捕获单元的初始位姿,并进行步骤S12和S14,通过不断重复上述步骤,可以实现对图像捕获单元的位姿跟踪。
通过步骤S10至S16所实现的位姿跟踪方法,由于采用了两种不同的算法,不仅能实现准确的图像捕获单元的位姿估计,而且由于在步骤S12中仅涉及前一帧图像与当前帧图像,而与当前帧的重建模型无关,因此,在扫描对象偏离视野范围时也能提供大致准确的位姿,提高对图像捕获单元位姿跟踪的鲁棒性。
可选的,在本申请其中一实施例中,为了提高计算效率,可以将RGB-D图像建立3-4层图像金字塔,第一算法(例如DVO算法)在低分辨率图像上计算,第二算法(例如ICP算法)在高分辨率图像上计算,以此降低整体算法的复杂度。
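下面是一个构建图像金字塔的简化Python片段（层数取3-4层、使用OpenCV的pyrDown仅为示例性写法）：

```python
import cv2

def build_pyramid(img, levels=4):
    """构建图像金字塔：第0层为原分辨率，每层分辨率减半。"""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

# 示意用法：第一算法(如DVO)可在低分辨率层 pyr[-1] 上计算，第二算法(如ICP)在高分辨率层 pyr[0] 上计算
```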
由于通过优化方法求解非线性最小二乘问题,可能会存在优化结果不正确的情况。因此,可选的,在本申请其中一实施例中,在步骤S14之后,还可以包括步骤S15:对第二计算位姿进行验证。具体地,可以通过在当 前帧重建模型中获取对比图像,将对比图像与当前帧图像(例如,当前帧深度图像)进行比较,实现对第二计算位姿的验证。其中,在当前帧重建模型中获取对比图像可以是使用光线投射的方法从当前帧重建模型中渲染出一张深度图作为对比图像。在获取对比图像后,将其与当前帧深度图像比较,使用鲁棒核函数计算加权均方差,然后与第一阈值进行比较,实现对第二计算位姿的验证。若验证通过,说明对图像捕获单元的位姿跟踪成功,步骤S16还可以包括使用第二计算位姿、当前帧图像更新当前帧重建模型。否则,说明对图像捕获单元的位姿跟踪失败,不对重建模型进行更新。
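作为对上述验证方式的示意，下面给出一个用鲁棒核(此处以Huber型权重为例)计算加权均方差并与第一阈值比较的简化Python片段（各阈值数值均为示例性假设，按米计）：

```python
import numpy as np

def verify_pose(rendered_depth, measured_depth, huber_delta=0.02, first_threshold=0.01):
    """rendered_depth: 用光线投射从当前帧重建模型渲染出的深度图(对比图像)；
    measured_depth: 当前帧深度图。加权均方差小于第一阈值时认为验证通过。"""
    valid = (rendered_depth > 0) & (measured_depth > 0)
    r = rendered_depth[valid] - measured_depth[valid]
    if r.size == 0:
        return False
    w = np.where(np.abs(r) <= huber_delta, 1.0, huber_delta / np.abs(r))  # Huber型权重
    weighted_mse = float(np.sum(w * r * r) / np.sum(w))
    return weighted_mse < first_threshold
```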
由于仅涉及前一帧图像与当前帧图像,而与当前帧的重建模型无关,以及在低分辨率图像上实施第一算法,在高分辨率图像上实施第二算法,因此,上述根据本申请实施例提供的位姿跟踪方法具有精度高和速度快的特点,但是不适用于图像捕获单元快速运动的情况。当图像捕获单元运动过快时,相邻帧图像内容相差过大,存在运动模糊的问题,可能会导致图像捕获单元的位姿跟踪失败。
【实施例二】
根据本申请其中一实施例的另一方面,还提供了一种基于惯性导航模块的位姿跟踪方法,以进一步提高图像捕获单元跟踪的鲁棒性。参考图2,是根据本申请其中一实施例的一种可选的基于惯性导航模块的位姿跟踪方法的流程图。如图2所示,该方法包括以下步骤:
S20：获取扫描对象的连续多帧图像，并通过惯性导航模块获取图像捕获单元的初始位姿；
S22:以初始位姿作为初值,基于连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿;
S24:以第一计算位姿为初值,基于当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿;
S26:根据第二计算位姿更新惯性导航模块的状态量,从而更新图像捕获单元的初始位姿,并重复上述步骤以实现对图像捕获单元的位姿跟踪。
在本申请其中一实施例中,通过上述步骤,除了可以对图像捕获单元位姿实现准确估计,在扫描对象偏离视野范围时也能提供大致准确的图像捕获单元位姿,并且由于惯性导航模块与图像无关,不会受到运动模糊的影响,因此,本实施例提供的基于惯性导航模块的位姿跟踪方法,还可以显著提高剧烈运动情况下对图像捕获单元位姿跟踪的鲁棒性。此外,由于惯性导航模块的计算量很小,并且相较于将图像捕获单元的初始位姿设置为单位矩阵或者随机设定为任意值的方法,利用惯性导航模块可以获取基本准确的图像捕获单元的初始位姿,能够加速优化收敛的速度,提高计算性能。
上述步骤S22和S24基本与实施例一的步骤S12和S14相同,在此不再赘述。下面对步骤S20和步骤S26进行详细说明。
步骤S20:获取扫描对象的连续多帧图像,并通过惯性导航模块获取图像捕获单元的初始位姿;
可选的,在本申请其中一实施例中,扫描对象的连续多帧图像可以使用图像捕获单元获得,图像捕获单元可以为独立的摄像头或集成有摄像头的相机、手机等电子设备,摄像头的类型包括红外结构光摄像头、飞行时间摄像头(Time-of-flight,ToF)、RGB摄像头、Mono摄像头等。其中,图像捕获单元位姿包括图像捕获单元的空间三维位置和朝向,含6个自由度。连续多帧图像可以是连续的RGB-D图像,RGB-D图像是由一帧深度图(Depth image)和一帧彩色图组成的图像对,深度图和彩色图通常由不同的图像捕获单元分别获取。
可选的,在本申请其中一实施例中,惯性导航模块是一个基于扩展卡尔曼滤波(Extended Kalman Filter,EKF)的状态估计系统。惯性导航模块可以以移动平台常见的惯性传感器(Inertial Measurement Unit,IMU)的数据作为输入,通过动力学积分方法获得图像捕获单元的初始位姿。惯性传感器是一种通过惯性力测量运动状态的传感器,常用的惯性传感器包括获取线性加速度数据的加速度计(Accelerator)和获取角速度数据的陀螺仪(Gyroscope)。考虑到移动平台常见的惯性传感器存在很大的噪声以及连续变化的偏置量,直接基于原始测量值进行计算得到的位姿误差非常大。因此,将惯性传感器读数作为测量值,使用多传感器融合的方法,通过求解动力学方程完成卡尔曼滤波的预测,预测的位姿作为图像捕获单元的初始位姿。通过使用多传感器融合的方法,可以综合考虑多种传感器的测量结果和不确定性,从而得到更准确的位姿估计结果。
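作为对惯性导航模块预测步骤的示意，下面给出一个用惯性传感器读数做动力学积分、预测图像捕获单元位姿的简化Python片段（省略了协方差传播与多传感器融合的更新步骤，变量与重力常数等均为示例性假设）：

```python
import numpy as np

def imu_predict(p, v, R, gyro, accel, bg, ba, dt, g=np.array([0.0, 0.0, -9.81])):
    """动力学积分的一步预测。
    p, v: 世界系下的位置与速度(3维)；R: 3x3 姿态旋转矩阵(机体系到世界系)；
    gyro/accel: 陀螺仪与加速度计读数；bg/ba: 对应偏置；dt: 采样间隔(秒)。
    完整的扩展卡尔曼滤波还需同步传播状态协方差，此处从略。"""
    w = gyro - bg                              # 去偏置后的角速度
    a = accel - ba                             # 去偏置后的比力(机体系)
    theta = np.linalg.norm(w) * dt
    if theta > 1e-9:                           # 罗德里格斯公式计算增量旋转
        k = w / np.linalg.norm(w)
        S = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
        dR = np.eye(3) + np.sin(theta) * S + (1 - np.cos(theta)) * (S @ S)
    else:
        dR = np.eye(3)
    R_new = R @ dR
    a_world = R @ a + g                        # 世界系加速度(含重力补偿)
    v_new = v + a_world * dt
    p_new = p + v * dt + 0.5 * a_world * dt * dt
    return p_new, v_new, R_new
```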
可选的,在步骤S26中根据第二计算位姿更新惯性导航模块的状态量, 状态量可以包括惯性导航模块(例如,加速度计和陀螺仪)的位置、速度、朝向,以及惯性导航模块的偏置等。
与实施例一类似,在步骤S24之后,还可以包括步骤S25:对第二计算位姿进行验证。具体地,可以通过在当前帧重建模型中获取对比图像,将对比图像与当前帧图像(例如,当前帧深度图像)进行比较,实现对第二计算位姿的验证。其中,在当前帧重建模型中获取对比图像可以是使用光线投射的方法从当前帧重建模型中渲染出一张深度图作为对比图像。在获取对比图像后,将其与当前帧深度图像比较,使用鲁棒核函数计算加权均方差,然后与第一阈值进行比较,实现对第二计算位姿的验证。若验证通过,说明位姿跟踪成功,步骤S26还可以包括使用第二计算位姿、当前帧图像更新当前帧重建模型。否则,说明图像捕获单元跟踪失败,不对重建模型和惯性导航模块的状态量进行更新。
由此,通过使用所述第二计算位姿、所述当前帧图像更新所述当前帧重建模型,可以实现重建扫描对象的三维模型。
参考图3a和图3b，分别是使用KinectFusion算法生成的三维模型和轨迹误差图。KinectFusion是一种基于红外结构光输入的实时三维重建方法。KinectFusion所采用的实时相机跟踪方法是一种改良的迭代最近邻(Iterative Closest Point,ICP)算法。与原始的ICP算法相比，KinectFusion采用投影法确定对应点，替代计算准确最近邻的步骤，计算速度显著提升；另一方面，KinectFusion始终将输入深度图与当前重建模型进行对齐，明显减少了相邻帧对齐所产生的累积误差。但是，KinectFusion算法在两种情况下很容易发生跟踪失败：一是手持图像捕获单元剧烈运动的情况，KinectFusion使用投影法确定对应点的方法仅适用于相机运动较慢的情况，快速、随意的相机运动很容易导致跟踪丢失，跟踪丢失是指无法估计相机位姿，或者估计的相机位姿与实际相差非常大；二是扫描对象偏离视野范围的情况，当重建模型基本移出相机视野范围时，KinectFusion方法必定会跟踪丢失，这种情况在普通用户拍摄时很容易出现。从图3a和图3b中可以看出，使用KinectFusion算法生成的三维模型失真度较大，每帧轨迹误差图中图像捕获单元的估计位姿与真实位姿的差值(如图3b的阴影区域所示)也较大。
参考图4a和图4b,分别是使用本申请实施例提供的基于惯性导航模块的位姿跟踪方法生成的三维模型和轨迹误差图。可以看出,对于手持图像捕获单元剧烈运动的情况,基于惯性导航模块的位姿跟踪方法几乎不会因为图像捕获单元运动过快造成跟踪丢失的情况,这主要是由于惯性导航模块利用了高频的惯性传感器数据,能够提供较为准确的初始位姿。对于扫描对象偏离视野范围的情况,由于使用第一算法获取当前帧的第一计算位姿时,仅涉及前一帧图像与当前帧图像,而与当前帧的重建模型无关,因此,在扫描对象偏离视野范围时也能提供大致准确的图像捕获单元位姿,提高对图像捕获单元位姿跟踪的鲁棒性。由此,使用基于惯性导航模块的位姿跟踪方法生成的三维模型基本能真实还原物体外形(如图4a所示),每帧轨迹误差图中图像捕获单元的估计位姿与真实位姿的差值(如图4b的阴影区域所示)也非常小。
上述基于惯性导航模块的位姿跟踪方法,虽然对剧烈运动等情形下的图像捕获单元位姿跟踪的鲁棒性有显著的改善,但是在用户遮挡摄像头,或者场景发生明显变化等情形下,位姿跟踪丢失的问题仍然是不可避免的。
【实施例三】
根据本申请其中一实施例的又一方面,还提供了一种包含重定位的位姿跟踪方法,能够在跟踪丢失的情况下快速恢复对图像捕获单元的位姿跟踪,重新估计图像捕获单元的位姿,以进一步提高图像捕获单元跟踪的鲁棒性,改善用户体验。参考图5,是根据本申请其中一实施例的一种可选的包含重定位的位姿跟踪方法的流程图。如图5所示,该方法包括以下步骤:
S50:获取扫描对象的连续多帧图像,并通过惯性导航模块获取图像捕获单元的初始位姿;
S52:以初始位姿作为初值,基于连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿;
S54:以第一计算位姿为初值,基于当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿;
S55:对第二计算位姿进行验证,在验证通过时,使用所述第二计算位姿、所述当前帧图像更新所述当前帧重建模型;在验证不通过时,使用重定位方法恢复对图像捕获单元的跟踪;
S56：根据第二计算位姿更新惯性导航模块的状态量，从而更新所述图像捕获单元的初始位姿，并重复上述步骤以实现对所述图像捕获单元的位姿跟踪。
上述步骤S50、S52、S54和S56基本与实施例二的步骤S20、S22、S24和S26相同,在此不再赘述。下面对步骤S55进行详细说明。在步骤S55中,对第二计算位姿进行验证,在验证通过时,使用所述第二计算位姿、所述当前帧图像更新所述当前帧重建模型;其中,验证方法可以采用实施例一和二中所描述的方法。在验证不通过时,说明位姿跟踪失败,不对重建模型和惯性导航模块的状态量进行更新,使用重定位方法恢复对图像捕获单元的位姿跟踪。例如,可以在验证不通过时,将当前帧图像标记为跟踪失败,当连续跟踪失败的帧数超过第二阈值时,表明对图像捕获单元的位姿跟踪丢失,使用重定位方法恢复对图像捕获单元的位姿跟踪。
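作为示意，下面给出一个统计连续跟踪失败帧数、判定位姿跟踪是否丢失的简化Python片段（第二阈值取5帧仅为示例性假设）：

```python
class TrackingMonitor:
    """对每帧的验证结果计数：连续失败帧数超过第二阈值时，判定位姿跟踪丢失，需触发重定位。"""
    def __init__(self, second_threshold=5):
        self.second_threshold = second_threshold
        self.consecutive_failures = 0

    def update(self, verified):
        self.consecutive_failures = 0 if verified else self.consecutive_failures + 1
        return self.consecutive_failures > self.second_threshold   # True 表示跟踪丢失
```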
可选的，在本申请其中一实施例中，重定位方法可以包含特征点和词袋(Bag of Words,BoW)匹配，以实现在位姿跟踪丢失的情况下快速恢复对图像捕获单元的位姿跟踪。其中，词袋是一种通过图片特征点描述子来描述图像特征的方法。
具体地,步骤S55还可以包括:
步骤S550:在验证通过时,从验证通过的图像帧中,选取关键帧。
可选的，在本申请其中一实施例中，在验证通过时，可以根据第二计算位姿，每隔一定角度和距离，从验证通过的相邻若干图像帧里挑选最清晰的一帧作为关键帧。图像的清晰度可以按如下方式评估：使用低通滤波器对图像进行光顺处理，得到一幅模糊图像；通过比较原图像和模糊图像的差别，即可得到原图像的模糊度，原图像与模糊图像越接近，说明原图像本身越模糊。
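作为对上述关键帧选取方式的示意，下面给出一个通过低通滤波比较图像清晰度、在候选帧中挑选最清晰一帧的简化Python片段（滤波核大小为示例值，cv2为OpenCV库）：

```python
import cv2
import numpy as np

def sharpness(gray):
    """用高斯低通滤波得到模糊图像；原图与模糊图像差别越大，说明原图越清晰。"""
    blurred = cv2.GaussianBlur(gray, (9, 9), 0)
    return float(np.mean(np.abs(gray.astype(np.float32) - blurred.astype(np.float32))))

def pick_keyframe(candidate_grays):
    """从验证通过的相邻若干帧(灰度图列表)中挑选最清晰的一帧作为关键帧，返回其索引。"""
    return int(np.argmax([sharpness(g) for g in candidate_grays]))
```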
步骤S551:基于选取的关键帧,构建词袋数据库。
可选的,在本申请其中一实施例中,可以基于选取的关键帧,提取图像特征点,计算特征点描述子和词袋向量,并构建词袋数据库中。其中,上述词袋数据库可以通过离线的方式构建,采用一种类似文本检索的方式,从一组图像样本集中训练一个离线词袋数据库。将图像特征类比于单词,整个词袋数据库类比于词典。通过词典,对于任意一个特征,都能在词典中找到与之对应的单词。首先计算图像样本集中每个样本的图像特征以及描述子,然后根据Kmeans++算法将图像特征聚类成单词,并划分为K类子空间,将划分的子空间继续利用Kmeans++算法做聚类,按照上述循环将描述子聚类成树型结构,整个树型结构构成词典,树的叶子节点(也称为词节点)构成单词。同时在词典建立过程中,还为每个单词记录了该单词在所有的训练图像中出现的频率,出现的频率越高,表示这个单词的区分度越小,以描述该单词所表示的图像特征的区分度。
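作为对词袋数据库构建思路的示意，下面给出一个用Kmeans++将特征描述子递归聚类为树型词典的简化Python片段（分支数k、树深度均为示例值，此处借助sklearn实现Kmeans++，并非对实际实现方式的限定）：

```python
import numpy as np
from sklearn.cluster import KMeans

class VocabNode:
    def __init__(self, center):
        self.center = center          # 该节点的聚类中心
        self.children = []            # 子节点
        self.word_id = None           # 叶子节点(词节点)对应的单词编号

def build_vocab_tree(descriptors, k=10, depth=4):
    """把 NxD 的描述子矩阵递归聚类成树型结构，叶子节点构成单词。"""
    node = VocabNode(descriptors.mean(axis=0))
    if depth == 0 or len(descriptors) <= k:
        return node
    km = KMeans(n_clusters=k, init="k-means++", n_init=3).fit(descriptors)
    for c in range(k):
        sub = descriptors[km.labels_ == c]
        if len(sub) > 0:
            node.children.append(build_vocab_tree(sub, k, depth - 1))
    return node

def assign_word_ids(root):
    """遍历词典树，为每个叶子节点分配单词编号，返回单词总数(后续可在其上统计词频)。"""
    count, stack = 0, [root]
    while stack:
        n = stack.pop()
        if not n.children:
            n.word_id = count
            count += 1
        stack.extend(n.children)
    return count
```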
此外,步骤S55中:在验证不通过时,使用重定位方法恢复对图像捕获单元的跟踪包括:
步骤S552:在位姿跟踪丢失时,计算当前帧图像的词袋向量;
步骤S553:根据所构建的词袋数据库和当前帧图像的词袋向量,选取候选关键帧;
步骤S554：计算所述候选关键帧和所述当前帧图像之间的相对位姿，获取当前帧的第三计算位姿；
步骤S555:根据所述第三计算位姿更新所述图像捕获单元的初始位姿,以恢复对所述图像捕获单元的位姿跟踪。
可选的,在本申请实施例的上述步骤S553中,可以计算所构建的词袋数据库中所有关键帧的词袋向量与当前帧图像的词袋向量的相似性,将相似性超过第三阈值的关键帧作为候选关键帧。具体地,在计算相似性时,可以首先在所构建的词袋数据库筛选出和当前帧图像共享了词节点的所有关键帧,作为第一次筛选后的候选关键帧,同时计算出与候选关键帧共享词节点最大的数量。然后将共享词节点的最小阈值设为最大数量的0.8倍,筛除共享词节点的数量小于最小阈值的关键帧,剩下的作为第二次筛选后的候选关键帧。接着将每个第二次筛选后的候选关键帧和其位置相近的关键帧组合成候选关键帧组,通过词袋向量计算候选关键帧组和当前帧图像的相似度得分的总和,筛选出那些总得分高于第四阈值(例如,最高总得分的0.75倍)的关键帧,作为最终筛选出的候选关键帧。
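作为对上述候选关键帧筛选流程的示意，下面给出一个按共享词节点数量与相似度得分做两级筛选的简化Python片段（0.8与0.75两个比例阈值取自上文描述，数据结构为示例性假设）：

```python
def select_candidate_keyframes(shared_words, group_scores, word_ratio=0.8, score_ratio=0.75):
    """shared_words[k]: 第一次筛选后，关键帧k与当前帧共享的词节点数量；
    group_scores[k]: 关键帧k及其相近关键帧组与当前帧的词袋相似度得分总和。
    返回最终筛选出的候选关键帧列表。"""
    if not shared_words:
        return []
    max_shared = max(shared_words.values())
    # 第二次筛选：共享词节点数不低于最大数量的0.8倍
    kept = [k for k, n in shared_words.items() if n >= word_ratio * max_shared]
    if not kept:
        return []
    max_score = max(group_scores.get(k, 0.0) for k in kept)
    # 最终筛选：组得分不低于最高总得分的0.75倍
    return [k for k in kept if group_scores.get(k, 0.0) >= score_ratio * max_score]
```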
可选的,在本申请实施例的上述步骤S554中,对每个候选关键帧,筛选与当前帧图像匹配的描述子,然后通过随机抽样一致算法(Random sample consensus,RANSAC)筛除误匹配对。在当前帧图像帧深度已知的情况下,通过候选关键帧与当前帧之间的稀疏特征点匹配对,使用第三算法(例如,PnP算法)计算所述候选关键帧和所述当前帧图像之间的相对位姿,获取当前帧的第三计算位姿。在筛选出的特征点匹配对数大于第五阈值时,位姿恢复成功,根据所述第三计算位姿更新步骤S50中所述图像 捕获单元的初始位姿,由此,可以实现图像捕获单元的重定位,恢复对所述图像捕获单元的位姿跟踪。
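作为对重定位中求取第三计算位姿的示意，下面给出一个基于OpenCV的RANSAC+PnP简化Python片段（匹配对数量阈值即第五阈值取20仅为示例性假设）：

```python
import cv2
import numpy as np

def relocalize_pnp(kf_points_3d, cur_points_2d, K, fifth_threshold=20):
    """kf_points_3d: Nx3 候选关键帧中匹配特征点的三维坐标(深度已知时可反投影得到)；
    cur_points_2d: Nx2 当前帧中对应特征点的像素坐标；K: 3x3 相机内参。
    内点数量达到第五阈值时返回4x4的第三计算位姿，否则返回None。"""
    if len(kf_points_3d) < 6:
        return None
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(kf_points_3d, dtype=np.float64),
        np.asarray(cur_points_2d, dtype=np.float64),
        np.asarray(K, dtype=np.float64), None)
    if not ok or inliers is None or len(inliers) < fifth_threshold:
        return None
    R, _ = cv2.Rodrigues(rvec)                 # 旋转向量转旋转矩阵
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, tvec.ravel()
    return T
```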
由于在对图像捕获单元位姿跟踪丢失的情况下,惯性导航模块长时间得不到反馈,因而其系统状态很可能远远偏离真实值。为此,当图像捕获单元重定位成功并恢复对图像捕获单元的位姿跟踪时,需要将惯性导航模块重新初始化,包括将惯性导航模块的外参、偏置和方差矩阵设为默认值,使用当前帧图像的位姿反求出惯性导航模块的初始位姿等。
为了测试重定位方法的有效性,针对三种不同情况进行了实验。一、手持图像捕获单元远离被扫描物体,然后沿任意路线运动几秒至几分钟之后返回原拍摄场景;二、在拍摄过程中将图像捕获单元的镜头完全遮挡,过一段时间后再放开;三、在拍摄过程中将图像捕获单元的镜头完全遮挡,保持遮挡状态按照任意路线移动图像捕获单元一段时间,然后大致返回原先的拍摄点。在三种情况下,使用包含重定位的位姿跟踪方法能在不到1秒的时间内恢复对图像捕获单元的位姿跟踪,完全满足应用需求。
依据本申请实施例所提供的上述位姿跟踪方法,除了可以对图像捕获单元的位姿实现准确估计,在扫描对象偏离视野范围时也能提供大致准确的位姿,并且基于多传感器融合的思想,通过综合惯性导航模块和图像捕获单元的输入,对图像捕获装置实现更鲁棒的位姿跟踪,尤其是在剧烈运动情况下位姿跟踪的鲁棒性。同时,在相机跟踪丢失的情况下,能够通过重定位方法提供有效的场景重拾,快速地恢复对图像捕获装置的位姿跟踪。此外,上述位姿跟踪方法特别适用于移动平台,一方面,它充分利用了移 动平台常见的多种传感器设备;另一方面,它的计算代价比较小,满足移动平台的实时计算性能要求。
依据本申请实施例所提供的上述位姿跟踪方法,除了可以用于实现扫描对象的三维重建,也适用于增强现实应用。
【实施例四】
根据本申请其中一实施例的另一方面,还提供了一种位姿跟踪装置。参考图6,是根据本申请其中一实施例的一种可选的位姿跟踪装置的结构框图。如图6所示,位姿跟踪装置6包括:
图像捕获单元60,配置为获取扫描对象的连续多帧图像;
可选的,在本申请其中一实施例中,图像捕获单元可以为独立的摄像头或集成有摄像头的相机、手机等电子设备,摄像头的类型包括红外结构光摄像头、飞行时间摄像头(Time-of-flight,ToF)、RGB摄像头、Mono摄像头等;连续多帧图像可以是连续的RGB-D图像,RGB-D图像是由一帧深度图(Depth image)和一帧彩色图组成的图像对,深度图和彩色图通常由不同的图像捕获单元分别获取,并且可以假设每帧的彩色图与深度图在时间上是同步的,对于相对位置固定的彩色和深度摄像头,很容易通过外参标定的方式实现数据对齐,通过图像获取的时间戳实现每帧的彩色图与深度图的帧同步。
初始位姿确定单元62,配置为确定图像捕获单元的初始位姿;
可选的,在本申请其中一实施例中,初始位姿确定单元还配置为将图 像捕获单元的初始位姿设置为单位矩阵或者随机设定为任意值。其中,图像捕获单元的位姿包括图像捕获单元的空间三维位置和朝向,含6个自由度。
第一位姿获取单元64,配置为以初始位姿作为初值,基于连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿;
可选的，在本申请其中一实施例中，第一算法是一种基于逐像素颜色对齐的三维点云对齐算法，例如，稠密视觉里程计(Dense Visual Odometry,DVO)算法。以初始位姿作为初值，使用第一算法对前一帧图像和当前帧图像进行逐像素的颜色对齐，可以得到前一帧图像和当前帧图像之间的相对坐标变换，从而获取当前帧的第一计算位姿。
第二位姿获取单元66,配置为以第一计算位姿为初值,基于当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿;
可选的,在本申请其中一实施例中,第二算法是一种迭代的三维点云对齐算法,例如,KinectFusion算法中改良的迭代最近邻(Iterative Closest Point,ICP)算法。以第一计算位姿为初值,使用第二算法将当前帧图像(例如,RGB-D图像中的深度图像)与当前帧重建模型对齐,可以得到前一帧图像和当前帧图像之间的相对坐标变换,从而获取当前帧的第二计算位姿。
位姿更新单元68,配置为根据第二计算位姿更新图像捕获单元的初始位姿,以实现对图像捕获单元的位姿跟踪。
根据本申请实施例所提供的位姿估计装置,不仅能实现准确的图像捕获单元的位姿估计,而且由于第一位姿获取单元仅涉及前一帧图像与当前帧图像,而与当前帧的重建模型无关,因此,在扫描对象偏离视野范围时也能提供大致准确的位姿,提高对图像捕获单元位姿跟踪的鲁棒性。
可选的,在本申请其中一实施例中,为了提高计算效率,可以将RGB-D图像建立3-4层图像金字塔,第一算法(例如DVO算法)在低分辨率图像上计算,第二算法(例如ICP算法)在高分辨率图像上计算,以此降低整体算法的复杂度。
虽然上述根据本申请实施例提供的位姿跟踪装置具有精度高和速度快的特点,但是不适用于图像捕获单元快速运动的情况。当图像捕获单元运动过快时,相邻帧图像内容相差过大,存在运动模糊的问题,可能会导致图像捕获单元的位姿跟踪失败。
为了进一步提高对图像捕获单元位姿跟踪的鲁棒性,初始位姿确定单元62可以为惯性导航模块。可选的,在本申请其中一实施例中,惯性导航模块是一个基于扩展卡尔曼滤波(Extended Kalman Filter,EKF)的状态估计系统。惯性导航模块可以以移动平台常见的惯性传感器(Inertial Measurement Unit,IMU)的数据作为输入,通过动力学积分方法获得图像捕获单元的初始位姿。惯性传感器是一种通过惯性力测量运动状态的传感器,常用的惯性传感器包括获取线性加速度数据的加速度计(Accelerator)和获取角速度数据的陀螺仪(Gyroscope)。考虑到移动平台常见的惯性传感器存在很大的噪声以及连续变化的偏置量,直接基于原始测量值进行计 算得到的图像捕获单元位姿误差非常大。因此,将惯性传感器读数作为测量值,使用多传感器融合的方法,通过求解动力学方程完成卡尔曼滤波的预测,预测的图像捕获单元位姿作为图像捕获单元的初始位姿。通过使用多传感器融合的方法,可以综合考虑多种传感器的测量结果和不确定性,从而得到更准确的状态估计结果。
在初始位姿确定单元62为惯性导航模块的情况下,所述位姿更新单元68还配置为根据所述第二计算位姿更新惯性导航模块的状态量,从而更新所述图像捕获单元的初始位姿。
以惯性导航模块作为初始位姿确定单元的位姿跟踪装置,可以显著提高剧烈运动情况下对图像捕获单元位姿跟踪的鲁棒性。此外,由于惯性导航模块的计算量很小,并且相较于将图像捕获单元的初始位姿设置为单位矩阵或者随机设定为任意值的方法,利用惯性导航模块可以获取基本准确的图像捕获单元的初始位姿,能够加速优化收敛的速度,提高计算性能。
根据本申请实施例提供的位姿跟踪装置还包括:位姿验证单元67,配置为对第二计算位姿进行验证。具体地,可以通过在当前帧重建模型中获取对比图像,将对比图像与当前帧图像(例如,当前帧深度图像)进行比较,实现对第二计算位姿的验证。其中,在当前帧重建模型中获取对比图像可以是使用光线投射的方法从当前帧重建模型中渲染出一张深度图作为对比图像。在获取对比图像后,将其与当前帧深度图像比较,使用鲁棒核函数计算加权均方差,然后与第一阈值进行比较,实现对第二计算位姿的验证。在验证通过时,说明位姿跟踪成功,使用所述第二计算位姿、所 述当前帧图像更新所述当前帧重建模型。否则,说明图像捕获单元跟踪失败,不对重建模型和惯性导航模块的状态量进行更新。
由此,通过使用第二计算位姿、当前帧图像更新当前帧重建模型,可以实现重建扫描对象的三维模型。
虽然上述基于惯性导航模块的位姿跟踪装置,对剧烈运动等情形下的图像捕获单元跟踪的鲁棒性有显著的改善,但是在用户遮挡摄像头,或者场景发生明显变化等情形下,位姿跟踪丢失的问题仍然是不可避免的。
为了能够在跟踪丢失的情况下快速恢复对图像捕获单元的位姿跟踪,重新估计图像捕获单元的位姿,以进一步提高对图像捕获单元位姿跟踪的鲁棒性,改善用户体验,上述位姿验证单元还配置为在验证通过时,从验证通过的图像帧中,选取关键帧;基于选取的所述关键帧,构建词袋数据库;以及在验证不通过时,使用重定位方法恢复对图像捕获单元的跟踪。例如,可以在验证不通过时,将当前帧图像标记为跟踪失败,当连续跟踪失败的帧数超过第二阈值时,表明对所述图像捕获单元的位姿跟踪丢失,使用图像捕获单元重定位方法恢复图像捕获单元跟踪。
可选的，在本申请其中一实施例中，重定位方法可以包含特征点和词袋(Bag of Words,BoW)匹配，以实现在位姿跟踪丢失的情况下快速恢复对图像捕获单元的位姿跟踪。其中，词袋是一种通过图片特征点描述子来描述图像特征的方法。
可选的,在本申请其中一实施例中,在验证通过时,从验证通过的图像帧中,选取关键帧包括:在验证通过时,可以根据第二计算位姿,每隔 一定角度和距离,从验证通过的相邻若干图像帧里挑选最清晰的一帧作为关键帧。图像的清晰度可以使用低通滤波器对图像进行光顺处理,得到一幅模糊图像。通过比较原图像和模糊图像的差别,即可得到模糊图像的模糊度。原图像与模糊图像越接近,说明原图像本身越模糊。
可选的,在本申请其中一实施例中,基于选取的所述关键帧,构建词袋数据库包括:基于选取的关键帧,提取图像特征点,计算特征点描述子和词袋向量,并构建词袋数据库中。其中,上述词袋数据库可以通过离线的方式构建,采用一种类似文本检索的方式,从一组图像样本集中训练一个离线词袋数据库。将图像特征类比于单词,整个词袋数据库类比于词典。通过词典,对于任意一个特征,都能在词典中找到与之对应的单词。首先计算图像样本集中每个样本的图像特征以及描述子,然后根据Kmeans++算法将图像特征聚类成单词,并划分为K类子空间,将划分的子空间继续利用Kmeans++算法做聚类,按照上述循环将描述子聚类成树型结构,整个树型结构构成词典,树的叶子节点(也称为词节点)构成单词。同时在词典建立过程中,还为每个单词记录了该单词在所有的训练图像中出现的频率,出现的频率越高,表示这个单词的区分度越小,以描述该单词所表示的图像特征的区分度。
可选的,在本申请其中一实施例中,在验证不通过时,使用重定位方法恢复对图像捕获单元的跟踪包括:在位姿跟踪丢失时,计算当前帧图像的词袋向量;根据所构建的词袋数据库和当前帧图像的词袋向量,选取候选关键帧;计算所述候选关键帧和所述当前帧图像之间的相对位姿,获取 当前帧的第三计算位姿;根据所述第三计算位姿更新所述图像捕获单元的初始位姿,以恢复对所述图像捕获单元的位姿跟踪。
由于在对图像捕获单元跟踪丢失的情况下,惯性导航模块长时间得不到视觉定位模块的反馈,因而其系统状态很可能远远偏离真实值。为此,当重定位成功并恢复对图像捕获单元的位姿跟踪时,需要将惯性导航模块重新初始化,包括将惯性导航模块的外参、偏置和方差矩阵设为默认值,使用当前图像捕获单元位姿反求出惯性导航模块的初始位姿等。
【实施例五】
根据本申请其中一实施例的另一方面,还提供了一种电子设备。参考图7,是根据本申请其中一实施例的一种可选的电子设备的结构框图。如图7所示,电子设备7包括:处理器70;以及存储器72,配置为存储所述处理器70的可执行指令;其中,所述处理器70被配置为经由执行所述可执行指令来执行实施例一至实施例三中任意一项所述的位姿跟踪方法。
【实施例六】
根据本申请其中一实施例的另一方面,还提供了一种存储介质,其中,所述存储介质包括存储的程序,其中,在所述程序运行时控制所述存储介质所在设备执行实施例一至实施例三中任意一项所述的位姿跟踪方法。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
在本申请的上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
在本申请所提供的几个实施例中,应该理解到,所揭露的技术内容,可通过其它的方式实现。其中,以上所描述的装置实施例仅仅是示意性的,例如所述单元的划分,可以为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,单元或模块的间接耦合或通信连接,可以是电性或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可为个人计算机、服务器或者网络设备等)执行本申请各个实施例所述方法的全 部或部分步骤。而前述的存储介质包括:U盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、移动硬盘、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述仅是本申请的优选实施方式,应当指出,对于本技术领域的普通技术人员来说,在不脱离本申请原理的前提下,还可以做出若干改进和润饰,这些改进和润饰也应视为本申请的保护范围。
工业实用性
通过获取扫描对象的连续多帧图像和图像捕获单元的初始位姿;以所述初始位姿作为初值,基于所述连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿;以所述第一计算位姿为初值,基于所述当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿;根据所述第二计算位姿更新所述图像捕获单元的初始位姿,并重复上述步骤以实现对所述图像捕获单元的位姿跟踪。不仅可以对图像捕获单元的位姿实现准确估计,并且在扫描对象偏离视野范围时也能提供大致准确的位姿,提高对图像捕获单元位姿跟踪的鲁棒性。进而解决现有技术中位姿跟踪鲁棒性较差,容易出现跟踪丢失的问题。
同时,在相机跟踪丢失的情况下,能够通过重定位方法提供有效的场景重拾,快速地恢复对图像捕获装置的位姿跟踪。此外,上述位姿跟踪方法特别适用于移动平台,一方面,它充分利用了移动平台常见的多种传感器设备;另一方面,它的计算代价比较小,满足移动平台的实时计算性能要求。

Claims (36)

  1. 一种位姿跟踪方法,所述方法包括以下步骤:
    获取扫描对象的连续多帧图像和图像捕获单元的初始位姿;
    以所述初始位姿作为初值,基于所述连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿;
    以所述第一计算位姿为初值,基于所述当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿;
    根据所述第二计算位姿更新所述图像捕获单元的初始位姿,并重复上述步骤以实现对所述图像捕获单元的位姿跟踪。
  2. 根据权利要求1所述的方法,其中,所述图像捕获单元的初始位姿设置为单位矩阵或者随机设定为任意值。
  3. 根据权利要求1所述的方法,其中,所述连续多帧图像为连续的RGB-D图像。
  4. 根据权利要求1所述的方法,其中,以所述初始位姿作为初值,基于所述连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿包括:以所述初始位姿作为初值,使用所述第一算法对所述前一帧图像和所述当前帧图像进行逐像素的颜色对齐,得到所述前一帧图像和所述当前帧图像之间的相对坐标变换,从而获取所述当前帧的第一计算位姿。
  5. 根据权利要求1所述的方法,其中,以所述第一计算位姿为初值,基于所述当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿包括:以所述第一计算位姿为初值,使用所述第二算法对所述当前帧图像与所述当前帧重建模型进行三维点云对齐,得到所述前一帧图像和所述当前帧图像之间的相对坐标变换,从而获取当前帧的第二计算位姿。
  6. 根据权利要求1所述的方法,其中,所述第一算法在低分辨率图像上计算,所述第二算法在高分辨率图像上计算。
  7. 根据权利要求1所述的方法,其中,通过惯性导航模块获取所述图像捕获单元的初始位姿。
  8. 根据权利要求7所述的方法,其中,根据所述第二计算位姿更新惯性导航模块的状态量,从而更新所述图像捕获单元的初始位姿。
  9. 根据权利要求7所述的方法,其中,通过所述惯性导航模块使用多传感器融合的方法获取所述图像捕获单元的初始位姿。
  10. 根据权利要求7所述的方法,其中,所述惯性导航模块为基于扩展卡尔曼滤波的状态估计系统。
  11. 根据权利要求1或7所述的方法,其中,还包括:对所述第二计算位姿进行验证,在验证通过时,使用所述第二计算位姿、所述当前帧图像更新所述当前帧重建模型。
  12. 根据权利要求11所述的方法，其中，对所述第二计算位姿进行验证包括：通过在所述当前帧重建模型中获取对比图像，将所述对比图像与所述当前帧图像进行比较，实现对所述第二计算位姿的验证。
  13. 根据权利要求11所述的方法,其中,还包括:
    在验证通过时,从验证通过的图像帧中,选取关键帧;
    基于选取的所述关键帧,构建词袋数据库。
  14. 根据权利要求13所述的方法,其中,还包括:在验证不通过时,判断对所述图像捕获单元的位姿跟踪是否丢失,并使用重定位方法恢复对所述图像捕获单元的位姿跟踪。
  15. 根据权利要求14所述的方法,其中,判断对所述图像捕获单元的位姿跟踪是否丢失包括:将所述当前帧图像标记为跟踪失败,当连续跟踪失败的帧数超过第二阈值时,表明对所述图像捕获单元的位姿跟踪丢失,使用所述重定位方法恢复对所述图像捕获单元的位姿跟踪。
  16. 根据权利要求14所述的方法,其中,所述重定位方法包括:
    在位姿跟踪丢失时,计算所述当前帧图像的词袋向量;
    根据所构建的词袋数据库和所述当前帧图像的词袋向量,选取候选关键帧;
    计算所述候选关键帧和所述当前帧图像之间的相对位姿,获取当前帧的第三计算位姿;
    根据所述第三计算位姿更新所述图像捕获单元的初始位姿，以恢复对所述图像捕获单元的位姿跟踪。
  17. 根据权利要求14所述的方法,还包括:在恢复对所述图像捕获单元的位姿跟踪后,将惯性导航模块初始化。
  18. 一种位姿跟踪装置,所述装置包括:
    图像捕获单元,配置为获取扫描对象的连续多帧图像;
    初始位姿确定单元,配置为确定所述图像捕获单元的初始位姿;
    第一位姿获取单元,配置为以所述初始位姿作为初值,基于所述连续多帧图像中的前一帧图像和当前帧图像,使用第一算法获取当前帧的第一计算位姿;
    第二位姿获取单元,配置为以所述第一计算位姿为初值,基于所述当前帧图像和当前帧重建模型,使用第二算法获取当前帧的第二计算位姿;
    位姿更新单元,配置为根据所述第二计算位姿更新所述图像捕获单元的初始位姿,以实现对所述图像捕获单元的位姿跟踪。
  19. 根据权利要求18所述的装置,其中,所述初始位姿确定单元还配置为将所述图像捕获单元的初始位姿设置为单位矩阵或者随机设定为任意值。
  20. 根据权利要求18所述的装置,其中,所述连续多帧图像为连续的RGB-D图像。
  21. 根据权利要求18所述的装置,其中,所述第一位姿获取单元配置为以所述初始位姿作为初值,使用所述第一算法对所述前一帧图像和所述当前帧图像进行逐像素的颜色对齐,得到所述前一帧图像和所述当前帧图像之间的相对坐标变换,从而获取所述当前帧的第一计算位姿。
  22. 根据权利要求18所述的装置,其中,所述第二位姿获取单元配置为以所述第一计算位姿为初值,使用所述第二算法对所述当前帧图像与所述当前帧重建模型进行三维点云对齐,得到所述前一帧图像和所述当前帧图像之间的相对坐标变换,从而获取当前帧的第二计算位姿。
  23. 根据权利要求18所述的装置,其中,所述第一算法在低分辨率图像上计算,所述第二算法在高分辨率图像上计算。
  24. 根据权利要求18所述的装置,其中,所述初始位姿确定单元为惯性导航模块。
  25. 根据权利要求24所述的装置,其中,所述位姿更新单元还配置为根据所述第二计算位姿更新惯性导航模块的状态量,从而更新所述图像捕获单元的初始位姿。
  26. 根据权利要求24所述的装置,其中,所述惯性导航模块配置为使用多传感器融合的方法获取所述图像捕获单元的初始位姿。
  27. 根据权利要求24所述的装置,其中,所述惯性导航模块为基于扩展卡尔曼滤波的状态估计系统。
  28. 根据权利要求18或24所述的装置,其中,所述位姿跟踪装置还包括:
    位姿验证单元,配置为对所述第二计算位姿进行验证,在验证通过时,
    使用所述第二计算位姿、所述当前帧图像更新所述当前帧重建模型。
  29. 根据权利要求28所述的装置,其中,所述位姿验证单元还配置为通过在所述当前帧重建模型中获取对比图像,将所述对比图像与所述当前帧图像进行比较,实现对所述第二计算位姿的验证。
  30. 根据权利要求28所述的装置,其中,所述位姿验证单元还配置为在验证通过时,从验证通过的图像帧中,选取关键帧;基于选取的所述关键帧,构建词袋数据库。
  31. 根据权利要求30所述的装置,其中,所述位姿验证单元还配置为在验证不通过时,判断对所述图像捕获单元的位姿跟踪是否丢失,并使用重定位方法恢复对所述图像捕获单元的位姿跟踪。
  32. 根据权利要求31所述的装置,其中,所述位姿验证单元还配置为将所述当前帧图像标记为跟踪失败,当连续跟踪失败的帧数超过第二阈值时,表明对所述图像捕获单元的位姿跟踪丢失,使用所述重定位方法恢复对所述图像捕获单元的跟踪。
  33. 根据权利要求31所述的装置,其中,所述重定位方法包括:
    在位姿跟踪丢失时,计算所述当前帧图像的词袋向量;
    根据所构建的词袋数据库和所述当前帧图像的词袋向量,选取候选关键帧;
    计算所述候选关键帧和所述当前帧图像之间的相对位姿，获取当前帧的第三计算位姿；
    根据所述第三计算位姿更新所述图像捕获单元的初始位姿,以恢复对所述图像捕获单元的位姿跟踪。
  34. 根据权利要求31所述的装置,所述初始位姿确定单元还配置为在恢复对所述图像捕获单元的位姿跟踪后,将惯性导航模块初始化。
  35. 一种电子设备,包括:
    处理器;以及
    存储器,配置为存储所述处理器的可执行指令;
    其中,所述处理器配置为经由执行所述可执行指令来执行权利要求1至17中任意一项所述的位姿跟踪方法。
  36. 一种存储介质,其中,所述存储介质包括存储的程序,其中,在所述程序运行时控制所述存储介质所在设备执行权利要求1至17中任意一项所述的位姿跟踪方法。
PCT/CN2020/083893 2019-05-14 2020-04-09 位姿跟踪方法、位姿跟踪装置及电子设备 WO2020228453A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020217041032A KR20220008334A (ko) 2019-05-14 2020-04-09 포즈 추적 방법, 포즈 추적 장치 및 전자 기기
US17/610,449 US11922658B2 (en) 2019-05-14 2020-04-09 Pose tracking method, pose tracking device and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910396914.7A CN111951325B (zh) 2019-05-14 2019-05-14 位姿跟踪方法、位姿跟踪装置及电子设备
CN201910396914.7 2019-05-14

Publications (1)

Publication Number Publication Date
WO2020228453A1 true WO2020228453A1 (zh) 2020-11-19

Family

ID=73290130

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/083893 WO2020228453A1 (zh) 2019-05-14 2020-04-09 位姿跟踪方法、位姿跟踪装置及电子设备

Country Status (4)

Country Link
US (1) US11922658B2 (zh)
KR (1) KR20220008334A (zh)
CN (1) CN111951325B (zh)
WO (1) WO2020228453A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112948411A (zh) * 2021-04-15 2021-06-11 深圳市慧鲤科技有限公司 位姿数据的处理方法及接口、装置、系统、设备和介质

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112837424B (zh) * 2021-02-04 2024-02-06 脸萌有限公司 图像处理方法、装置、设备和计算机可读存储介质
CN112884814B (zh) * 2021-03-15 2023-01-06 南通大学 一种抗遮挡的动作跟踪方法、装置及存储介质
CN113256718B (zh) * 2021-05-27 2023-04-07 浙江商汤科技开发有限公司 定位方法和装置、设备及存储介质
CN113256711B (zh) * 2021-05-27 2024-03-12 南京航空航天大学 一种单目相机的位姿估计方法及系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160327395A1 (en) * 2014-07-11 2016-11-10 Regents Of The University Of Minnesota Inverse sliding-window filters for vision-aided inertial navigation systems
CN106780601A (zh) * 2016-12-01 2017-05-31 北京未动科技有限公司 一种空间位置追踪方法、装置及智能设备
CN107833270A (zh) * 2017-09-28 2018-03-23 浙江大学 基于深度相机的实时物体三维重建方法
CN109410316A (zh) * 2018-09-21 2019-03-01 深圳前海达闼云端智能科技有限公司 物体的三维重建的方法、跟踪方法、相关装置及存储介质

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10284794B1 (en) * 2015-01-07 2019-05-07 Car360 Inc. Three-dimensional stabilized 360-degree composite image capture
US10802147B2 (en) * 2016-05-18 2020-10-13 Google Llc System and method for concurrent odometry and mapping
CN110675426B (zh) * 2018-07-02 2022-11-22 百度在线网络技术(北京)有限公司 人体跟踪方法、装置、设备及存储介质

Also Published As

Publication number Publication date
US20220222849A1 (en) 2022-07-14
CN111951325A (zh) 2020-11-17
CN111951325B (zh) 2024-01-12
KR20220008334A (ko) 2022-01-20
US11922658B2 (en) 2024-03-05

Legal Events

  • 121: the EPO has been informed by WIPO that EP was designated in this application (ref document number: 20806626; country: EP; kind code: A1)
  • NENP: non-entry into the national phase (ref country code: DE)
  • ENP: entry into the national phase (ref document number: 20217041032; country: KR; kind code: A)
  • 122: PCT application non-entry in European phase (ref document number: 20806626; country: EP; kind code: A1)