WO2019205851A1 - Pose determination method and apparatus, smart device, and storage medium - Google Patents


Info

Publication number
WO2019205851A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature point
matrix
parameter
marker
Prior art date
Application number
PCT/CN2019/079342
Other languages
English (en)
French (fr)
Inventor
林祥凯
乔亮
朱峰明
左宇
杨泽宇
凌永根
暴林超
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority to EP19793479.7A (publication EP3786896A4)
Publication of WO2019205851A1
Priority to US16/913,144 (publication US11222440B2)
Priority to US17/543,515 (publication US11798190B2)


Classifications

    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06V 20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G06V 40/107 Static hand or arm
    • G06V 40/11 Hand-related biometrics; Hand pose recognition
    • G06V 40/12 Fingerprints or palmprints
    • G06T 2207/10016 Video; Image sequence
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G06T 2207/20021 Dividing image into blocks, subimages or windows
    • G06T 2207/30204 Marker
    • G06T 2207/30244 Camera pose

Definitions

  • The embodiments of the present application relate to the field of computer technologies, and in particular, to a pose determination method, apparatus, smart device, and storage medium.
  • AR (Augmented Reality) technology superimposes virtual elements, such as a virtual image, video, or three-dimensional model, onto images of a real scene, so that the virtual scene is displayed together with the actual scene. It is currently a research focus in the field of computer vision.
  • The related art proposes a method for determining the camera's position and posture by tracking feature points in a marker image: a template image is defined in advance and feature points are extracted from it; as the position or posture of the camera changes, the extracted feature points are tracked, and each time the camera captures a new image, the feature points of the template image are identified in the current image, so that the positions of the feature points in the current image can be compared with their positions in the template image to obtain the pose parameters of the current image relative to the template image, such as a rotation parameter and a displacement parameter, which indicate the position and posture of the camera when capturing the current image.
  • The inventors have found that the above related art has at least the following problem: when the position or posture of the camera changes too much and the current image no longer contains the feature points, the feature points cannot be tracked and the position and posture of the camera cannot be determined.
  • the embodiment of the present application provides a method, a device, a smart device, and a storage medium for determining a pose, which can solve the problems of the related art.
  • the technical solution is as follows:
  • In a first aspect, a pose determination method is provided, comprising: acquiring a pose parameter of a first image captured by a camera relative to a marker image by tracking a first feature point, the first feature point being obtained by extracting feature points from the marker image; extracting a second feature point from the first image when the first image does not satisfy a feature point tracking condition, the second feature point being different from the first feature point; and acquiring a pose parameter of a second image captured by the camera relative to the marker image by tracking the first feature point and the second feature point, and determining the pose of the camera according to the pose parameter, the second image being an image captured by the camera after the first image.
  • In a second aspect, a pose determination apparatus is provided, comprising:
  • a first acquiring module configured to acquire a pose parameter of the first image captured by the camera with respect to the mark image by tracking the first feature point, where the first feature point is obtained by extracting a feature point from the mark image;
  • a feature point processing module configured to extract a second feature point from the first image when the first image does not satisfy the feature point tracking condition, the second feature point being different from the first feature point;
  • a second acquiring module configured to acquire a pose parameter of the second image captured by the camera relative to the marker image by tracking the first feature point and the second feature point, and to determine the pose of the camera according to the pose parameter, the second image being an image captured by the camera after the first image.
  • In a third aspect, a smart device is provided, comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by the processor to implement the operations of the pose determination method described in the first aspect.
  • In a fourth aspect, a computer readable storage medium is provided, in which at least one instruction, at least one program, a code set, or an instruction set is stored, and the instruction, the program, the code set, or the instruction set is loaded and executed by the processor to implement the operations of the pose determination method described in the first aspect.
  • In the method, apparatus, smart device, and storage medium provided by the embodiments of the present application, while the first feature point is tracked and the pose parameters of the images captured by the camera relative to the marker image are acquired, when the first image does not satisfy the feature point tracking condition, a second feature point is extracted from the first image, and the pose parameters of the images subsequently captured by the camera relative to the marker image are acquired by tracking both the first feature point and the second feature point, from which the position and posture of the camera are determined.
  • By extracting new feature points when tracking is about to fail, the situation in which feature points can no longer be tracked because the position or posture of the camera has changed too much is avoided, the robustness is enhanced, and the tracking accuracy of the camera is improved.
  • In addition, obtaining the pose parameters by decomposing the homography matrix avoids complicated tracking algorithms and makes the result more stable and smooth, without jitter, which is especially suitable for AR scenes.
  • FIG. 1 is a schematic diagram of display of a scene interface provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of display of another scene interface provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a pose determination method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an image provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of distribution of feature points provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of a pose determination method provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an operation flow provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a pose determining apparatus according to an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • The embodiment of the present application provides a pose determination method, which is applied to scenarios in which a smart device tracks the position and posture of its camera, especially AR scenarios: when the smart device uses AR technology for display, such as displaying an AR game, an AR video, and the like, the camera's position and posture need to be tracked.
  • the smart device is configured with a camera and a display unit, the camera is used to capture an image of a real scene, and the display unit is configured to display a scene interface composed of a real scene and a virtual scene.
  • The smart device can track the change of the position and posture of the camera as the camera moves, capture images of the real scene, and sequentially display the captured images according to the change of the camera's position and posture, thereby simulating the effect of displaying a three-dimensional interface.
  • virtual elements such as virtual images, virtual videos, or virtual three-dimensional models, may be added to the displayed image.
  • The virtual elements may be displayed in different orientations according to changes in the position and posture of the camera, thereby simulating the effect of displaying three-dimensional virtual elements.
  • the image of the real scene is displayed in combination with the virtual elements to form a scene interface, thereby simulating the effect that the real scene and the virtual element are in the same three-dimensional space.
  • For example, the smart device adds a virtual character to a captured image that includes a table and a teacup.
  • As the camera moves, the captured image changes and the displayed orientation of the virtual character changes accordingly, simulating the effect that the virtual character remains still relative to the table and the teacup while the camera captures the table, the teacup, and the virtual character from changing positions and postures, thereby presenting a realistic stereoscopic picture to the user.
  • FIG. 3 is a flowchart of a method for determining a pose according to an embodiment of the present application.
  • The pose determination method is executed by a smart device, which may be a terminal such as a mobile phone or a tablet computer equipped with a camera, or AR equipment equipped with a camera such as AR glasses or an AR helmet. Referring to FIG. 3, the method includes:
  • the smart device acquires an image captured by the camera, and sets the captured image as a marker image.
  • The pose parameters of the camera are determined by tracking the feature points of the marker image, and the pose parameters are used to determine the position and pose of the smart device.
  • In a possible implementation, the smart device captures an image through the camera, acquires the image currently captured by the camera, and sets the image as the marker image, thereby initializing the marker image; the smart device then continues to capture other images.
  • the pose parameters of each image can be obtained by tracking the feature points of the marked image.
  • the camera can shoot according to a preset period, and an image is taken every other preset period, and the preset period can be 0.1 seconds or 0.01 seconds.
  • When the camera captures an image, feature points may be extracted from the image to determine whether the number of extracted feature points reaches a preset number. When the number of feature points extracted from the image reaches the preset number, the image is set as the marker image; when it does not, the next image captured by the camera is acquired, until an image is found from which the number of extracted feature points reaches the preset number, and that image is set as the marker image.
  • The feature extraction algorithm used for extracting feature points may be the FAST (Features from Accelerated Segment Test) detection algorithm, the Shi-Tomasi corner detection algorithm, the Harris corner detection algorithm, the SIFT (Scale-Invariant Feature Transform) algorithm, or the like, and the preset number can be determined according to the requirements for tracking accuracy.
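  • The marker-image initialization described above can be sketched as follows; this is a minimal illustration using OpenCV's FAST detector, and the function name `try_set_marker_image` and the threshold value are assumptions for illustration, not values from the patent.

```python
import cv2

PRESET_FEATURE_COUNT = 50  # illustrative threshold; the patent leaves this to the accuracy requirements

def try_set_marker_image(gray_image):
    """Return the detected keypoints if the frame qualifies as a marker image, else None."""
    detector = cv2.FastFeatureDetector_create()   # FAST, one of the detectors named above
    keypoints = detector.detect(gray_image)
    if len(keypoints) >= PRESET_FEATURE_COUNT:
        return keypoints                           # enough feature points: use this frame as the marker image
    return None                                    # too few feature points: wait for the next captured frame
```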
  • In a possible implementation, the marker image is first divided into a plurality of grid regions of the same size, feature points are extracted from the marker image, and the weight of each extracted feature point is obtained. In each divided grid region, the feature point with the highest weight is extracted as a first feature point, and the other feature points with lower weights are not considered, until a first feature point has been extracted from every grid region of the marker image or until the number of first feature points extracted from the marker image reaches the preset number.
  • the size of each grid area may be determined according to the tracking accuracy requirement and the number of first feature points that are required to be extracted.
  • The weight of a feature point is used to represent the gradient magnitude of the feature point: the greater the weight, the larger the gradient and the easier it is to track, so tracking feature points with larger weights improves tracking accuracy. For example, for each feature point, the gradient of the feature point is obtained and used directly as the weight of the feature point, or the gradient is adjusted according to a preset coefficient to obtain the weight of the feature point, so that the weight of the feature point is proportional to its gradient.
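  • The per-grid selection of the highest-weight feature point can be sketched as below; this is a minimal sketch in which the `(x, y, weight)` tuples, grid dimensions, and function name are illustrative assumptions, with the weight standing in for the gradient magnitude described above.

```python
import numpy as np

def select_features_per_grid(keypoints, image_shape, grid_rows, grid_cols):
    """Keep only the highest-weight feature point in each grid cell.

    `keypoints` is a list of (x, y, weight) tuples; the weight plays the role of the
    gradient magnitude (or the gradient scaled by a preset coefficient)."""
    h, w = image_shape[:2]
    cell_h, cell_w = h / grid_rows, w / grid_cols
    best = {}                                    # (row, col) -> (x, y, weight)
    for x, y, weight in keypoints:
        cell = (int(y // cell_h), int(x // cell_w))
        if cell not in best or weight > best[cell][2]:
            best[cell] = (x, y, weight)
    return list(best.values())
```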
  • the rotation parameter and displacement parameter of the marker image, the initial feature point depth, and the initial homography matrix are set.
  • the initial feature point depth s can be set to 1
  • the rotation parameter matrix is set to an identity matrix
  • the initial translation matrix is set to [0, 0, s]
  • the initial homography matrix is set to an identity matrix.
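  • The initial values listed above can be transcribed directly; the variable names below are illustrative.

```python
import numpy as np

s = 1.0                                  # initial feature point depth
R_first = np.eye(3)                      # rotation parameter matrix of the marker image: identity
T_first = np.array([[0.0], [0.0], [s]])  # initial translation matrix [0, 0, s]
H_marker = np.eye(3)                     # initial homography matrix: identity
```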
  • the first feature point is extracted from the marker image, and the first feature point extracted from the marker image is taken as the target feature point to be tracked.
  • the smart device captures at least one image through the camera, and by tracking the first feature point in the at least one image, a pose parameter of each image relative to the marker image is obtained.
  • The embodiment of the present application takes the current tracking of the first feature points as an example. The first feature points may include feature points extracted from the marker image, and may also include feature points extracted from images captured by the camera after the marker image; the way these feature points are extracted is similar to the way the second feature points are extracted from the first image in the following steps and is not described here.
  • In a possible implementation, for every two adjacent images captured by the camera, optical flow is performed using the first feature points extracted from the previous image, so as to find the first feature points matched between the previous image and the next image, and the optical flow information of the matched first feature points is obtained. The optical flow information represents the motion of the matched first feature points between the two adjacent images, and the pose parameters of the second of the two adjacent images relative to the first can be determined according to this optical flow information.
  • the algorithm used for optical flow can be Lucas-Kanade optical flow algorithm or other algorithms.
  • Alternatively, descriptors or direct methods can be used to match feature points, so as to find the first feature points that match between the previous image and the next image.
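  • The optical-flow matching between two adjacent images can be sketched with OpenCV's pyramidal Lucas-Kanade implementation; the function name and array layout below are illustrative, and only points successfully tracked in both images are returned as matches.

```python
import cv2
import numpy as np

def track_with_optical_flow(prev_gray, next_gray, prev_pts):
    """Track feature points from the previous frame into the next frame with Lucas-Kanade optical flow.

    `prev_pts` is an (N, 1, 2) float32 array of feature point coordinates in the previous image;
    the function returns the matched point pairs, i.e. points found in both images."""
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, np.float32(prev_pts), None)
    ok = status.ravel() == 1
    return np.float32(prev_pts)[ok], next_pts[ok]
```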
  • The pose parameters of the first image relative to the marker image may include a displacement parameter and a rotation parameter: the displacement parameter indicates the distance between the camera's position when capturing the first image and its position when capturing the marker image, and the rotation parameter indicates the angular difference between the camera's rotation angle when capturing the first image and its rotation angle when capturing the marker image.
  • the pose parameter can be represented by a form of a rotational displacement matrix composed of a rotation parameter matrix and a displacement parameter matrix, wherein the rotation parameter matrix includes a rotation parameter, and the displacement parameter matrix includes a displacement parameter.
  • For example, the camera sequentially captures image 1, image 2, and image 3, and acquires the pose parameters (R1, T1) of image 1 relative to the marker image, the pose parameters (R2, T2) of image 2 relative to image 1, and the pose parameters (R3, T3) of image 3 relative to image 2; according to these pose parameters, iteration can be performed to determine the pose parameters (R3', T3') of image 3 relative to the marker image.
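  • The iterative composition can be sketched as below. The composition formula is not spelled out in this extract, so the sketch assumes the common convention x_i = R · x_ref + T; the function name is illustrative.

```python
import numpy as np

def compose_pose(R_prev, T_prev, R_step, T_step):
    """Compose the pose of image i relative to the marker image from the pose of image i-1
    relative to the marker image (R_prev, T_prev) and the pose of image i relative to
    image i-1 (R_step, T_step), assuming the convention x_i = R * x_ref + T."""
    R_total = R_step @ R_prev
    T_total = R_step @ T_prev + T_step
    return R_total, T_total

# With the images from the text, (R1, T1), (R2, T2), (R3, T3):
# R3p, T3p = compose_pose(*compose_pose(R1, T1, R2, T2), R3, T3)   # pose of image 3 relative to the marker
```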
  • the pose parameter can be obtained through the homography matrix, that is, the step 302 can include the following steps 3021-3022:
  • The homography matrix represents the conversion relationship between feature points in the first image and the corresponding feature points in the marker image, and therefore satisfies x_c = H_ca · x_a, where x_c represents the homogeneous coordinates corresponding to the two-dimensional coordinates of a feature point in image c, x_a represents the homogeneous coordinates corresponding to the two-dimensional coordinates of the corresponding feature point in image a, and H_ca represents the homography matrix of image c relative to image a.
  • The homography matrix is a 3×3 matrix, which can be expressed as H = [[h11, h12, h13], [h21, h22, h23], [h31, h32, h33]].
  • In a possible implementation, a plurality of first feature points may be tracked, so that the homogeneous coordinates corresponding to the two-dimensional coordinates of the plurality of first feature points in two adjacent images captured by the camera are obtained, and the homography matrix between the two images can be calculated from the acquired coordinates according to the above relationship.
  • The homography matrix includes 9 elements; one of them is set to 1, leaving 8 unknowns, so in order to obtain a unique solution for the homography matrix, the homogeneous coordinates corresponding to the two-dimensional coordinates of at least 4 feature points must be acquired in the two adjacent images.
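  • A sketch of this estimation step using OpenCV is shown below. The patent only requires solving the 8 unknowns from at least 4 correspondences; the use of `cv2.findHomography` with RANSAC and the threshold value are implementation choices for robustness, not requirements from the patent.

```python
import cv2
import numpy as np

def homography_between_frames(pts_prev, pts_next):
    """Estimate the 3x3 homography mapping points of the previous image to the next image.

    `pts_prev` and `pts_next` are (N, 2) arrays of matched feature point coordinates, N >= 4."""
    H, inlier_mask = cv2.findHomography(np.float32(pts_prev), np.float32(pts_next),
                                        cv2.RANSAC, 3.0)
    return H, inlier_mask
```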
  • In a possible implementation, from the next image after the marker image up to the first image, the homography matrix of each image relative to its previous image is acquired by tracking the first feature points, and these homography matrices are iteratively processed to obtain the homography matrix of the first image relative to the marker image.
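  • The iterative processing amounts to chaining the per-frame homographies by matrix multiplication; a minimal sketch follows, with illustrative names and a normalization step that keeps the bottom-right element at 1 as in the 8-unknown parameterization above.

```python
import numpy as np

def accumulate_homography(H_marker_to_prev, H_prev_to_curr):
    """Chain per-frame homographies: if H_marker_to_prev maps marker-image points into the
    previous image and H_prev_to_curr maps previous-image points into the current image,
    their product maps marker-image points into the current image."""
    H = H_prev_to_curr @ H_marker_to_prev
    return H / H[2, 2]   # keep the bottom-right element normalized to 1
```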
  • the step 3022 includes:
  • In a possible implementation, the image coordinate system of the marker image is translated by one unit in the negative direction of the z-axis to form a first coordinate system, and the homography matrix is decomposed according to preset constraints that the rotational displacement matrix should satisfy, to obtain a rotational displacement matrix of the first image relative to the marker image in the first coordinate system.
  • The rotational displacement matrix includes a rotation parameter matrix and a displacement parameter matrix of the first image relative to the marker image in the first coordinate system; the elements of the rotation parameter matrix are the rotation parameters of the first image relative to the marker image in the first coordinate system, and the elements of the displacement parameter matrix are the displacement parameters of the first image relative to the marker image in the first coordinate system.
  • The preset constraint is that the column vectors of the rotation parameter matrix in the rotational displacement matrix are unit orthogonal vectors, and that the cross product of the first column and the second column of the rotation parameter matrix is equal to the third column.
  • the feature points in the first image and the corresponding feature points in the mark image also have the following conversion relationship:
  • Rcm represents a rotation parameter matrix of the first image relative to the marker image in the first coordinate system
  • Tcm represents a displacement parameter matrix of the first image relative to the marker image in the first coordinate system
  • g represents a normalization factor
  • P represents the camera's perspective projection parameters
  • The remaining matrices in the relationship are used to align the non-homogeneous terms and to convert the image coordinate system of the marker image into the first coordinate system.
  • In the above formula, the homography matrix is known and P is known. The normalization factor g can be calculated from the condition that the column vectors of the rotation parameter matrix are unit vectors, the first column and the second column of the rotation parameter matrix are then obtained, and the cross product of the first column and the second column gives the third column, so the rotation parameter matrix Rcm can be calculated; the displacement parameter matrix Tcm can be calculated from the normalization factor g and the third column of the homography matrix.
  • In addition, the position of the marker image in the camera coordinate system can be calculated. Since the marker image must be located in front of the camera, its coordinate along the camera's optical axis has a fixed sign, and the sign of the displacement parameter matrix Tcm can be determined according to this constraint.
  • Rca represents a rotation parameter matrix of the first image with respect to the marker image
  • Tca represents a displacement parameter matrix of the first image with respect to the marker image
  • the rotation parameter and the displacement parameter of the first image relative to the marker image can be determined according to the rotation displacement matrix.
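  • The decomposition step can be sketched with the classical planar-homography decomposition below, assuming H maps marker-plane coordinates into the current image and K is the camera's perspective projection (intrinsic) matrix; the patent's exact formulation additionally involves the translated first coordinate system and conversion matrices that are not reproduced in this extract, so the function name, the orthonormalization step, and the sign convention are assumptions for illustration.

```python
import numpy as np

def decompose_planar_homography(H, K):
    """Decompose a plane-induced homography into rotation R and translation t.

    After removing the intrinsics, the first two columns of K^-1 H are, up to a normalization
    factor g, the first two columns of the rotation matrix, the third column is the translation,
    and the third rotation column is the cross product of the first two (the constraints above)."""
    A = np.linalg.inv(K) @ H
    g = (np.linalg.norm(A[:, 0]) + np.linalg.norm(A[:, 1])) / 2.0   # normalization factor
    A = A / g
    r1, r2, t = A[:, 0], A[:, 1], A[:, 2]
    r3 = np.cross(r1, r2)
    R = np.column_stack((r1, r2, r3))
    # Re-orthonormalize so that R is a proper rotation matrix.
    U, _S, Vt = np.linalg.svd(R)
    R = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt
    if t[2] < 0:   # the marker must lie in front of the camera: enforce positive depth (assumed convention)
        t = -t
        R = np.column_stack((-R[:, 0], -R[:, 1], R[:, 2]))
    return R, t.reshape(3, 1)
```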
  • For example, a plurality of images captured by the camera are shown in FIG. 4, and the tracking process includes the following steps:
  • the camera captures the marked image a.
  • the camera takes a plurality of images and tracks the first feature point of the marker image a until the image c is captured.
  • The feature points of the marker image a are translated by one unit in the negative direction of the z-axis to form a coordinate system m, and the homography matrix of image c relative to image a is decomposed to obtain the rotational displacement matrix of image c relative to the coordinate system m.
  • the pose parameter of the first image may also be acquired according to the pose parameter of the first image relative to the marker image and the pose parameter of the marker image.
  • s represents the depth of the first image
  • R_final represents the rotation parameter matrix of the first image
  • T_final represents the displacement parameter matrix of the first image
  • Rca represents the rotation parameter matrix of the first image relative to the marker image
  • Tca represents the displacement parameter matrix of the first image relative to the marker image
  • R_first represents the rotation parameter matrix of the marker image
  • T_first represents the displacement parameter matrix of the marker image.
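  • The exact combination formula is not reproduced in this extract; the sketch below uses the usual rigid composition and applies the depth s to the relative translation, which matches the variables listed above but is an assumption about the patent's convention.

```python
import numpy as np

def camera_pose_from_marker(R_ca, T_ca, R_first, T_first, s=1.0):
    """Combine the pose of the first image relative to the marker image (R_ca, T_ca) with the
    pose of the marker image itself (R_first, T_first) to obtain the pose of the first image."""
    R_final = R_ca @ R_first                       # assumed composition of rotations
    T_final = R_ca @ T_first + s * np.asarray(T_ca)  # assumed role of the depth s on the relative translation
    return R_final, T_final
```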
  • As the camera moves, the number of first feature points included in the captured images may gradually decrease, so that some of the first feature points in the previous image have no matching first feature point in the next image; in this case, when the first feature points of the two adjacent images are matched, the unmatched first feature points are excluded.
  • In addition, the result of the homography matrix calculation and the optical flow matching result can be compared to exclude unreasonable first feature points. That is, for each first feature point, the homography matrix between the first image and the image from which the first feature point was extracted can be calculated from the homography matrix, relative to the marker image, of the image from which the first feature point was extracted and the homography matrix of the first image relative to the marker image; if the motion implied by this homography matrix is inconsistent with the optical flow matching result, the first feature point is excluded.
  • After the camera captures the first image, it is also determined whether the first image satisfies the feature point tracking condition.
  • The feature point tracking condition may be that the number of tracked feature points reaches a preset number: when the number of feature points tracked in an image reaches the preset number, it is determined that the image satisfies the feature point tracking condition; otherwise, it is determined that the image does not satisfy the feature point tracking condition.
  • In a possible implementation, the number of first feature points tracked in the first image is acquired; when the number reaches the preset number, it is determined that the first image satisfies the feature point tracking condition, and when the number does not reach the preset number, it is determined that the first image does not satisfy the feature point tracking condition.
  • Then, a second feature point different from the first feature points is extracted from the first image, and both the first feature points tracked in the first image and the newly extracted second feature points are used as target feature points to be tracked, and tracking continues, thereby increasing the number of feature points.
  • the feature extraction algorithm used in extracting the feature points may be a FAST detection algorithm, a Shi-Tomasi corner detection algorithm, a Harris Corner Detection algorithm, a SIFT algorithm, etc., and the preset quantity may be determined according to the requirement for tracking accuracy.
  • the number of feature points can be increased to ensure the smooth progress of the tracking process, avoiding the number of feature points becoming less and causing tracking failure, and improving tracking accuracy.
  • In addition, as the camera moves, the first feature points may become concentrated in the same area with an overly dense distribution, so that the information they provide is insufficient, or they may become too scattered, so that the information they provide is not accurate enough. In this case the first feature points are no longer representative of the current image, and their pose parameters will not accurately reflect the pose parameters of the current image, which causes a large calculation error.
  • As shown in FIG. 5, the left image shows the first feature points in the initial marker image, and the right image is the first image. The marker image appears enlarged in the first image, causing the first feature points to be too scattered in the first image to describe it accurately; if the pose parameters of the first image were obtained from these overly scattered first feature points, the pose parameters would not be accurate enough.
  • When the second feature points are extracted from the first image, the first image is first divided into a plurality of grid regions of the same size, feature points are extracted from the first image, the weight of each extracted feature point is obtained, and in each grid region that does not contain a first feature point, the feature point with the highest weight is extracted as a second feature point.
  • In another possible implementation, the first image is first divided into a plurality of grid regions of the same size, the first feature points tracked in the first image and their weights are obtained, and according to the obtained weights, the first feature point with the highest weight in each divided grid region is retained while first feature points with lower weights are no longer used, so that the first feature points with lower weights among the densely distributed first feature points are removed.
  • the size of each grid area can be determined according to the tracking accuracy requirement and the number of feature points required to be extracted.
  • The weight of a feature point is used to represent the gradient magnitude of the feature point: the greater the weight, the larger the gradient and the easier it is to track, so tracking feature points with larger weights improves tracking accuracy. For example, for each feature point, the gradient of the feature point is obtained and used directly as the weight of the feature point, or the gradient is adjusted according to a preset coefficient to obtain the weight of the feature point, so that the weight of the feature point is proportional to its gradient.
  • In addition, when the second feature points are extracted, the homography matrix of the first image is recorded, so that subsequently, based on the homography matrix, relative to the marker image, of the image from which the second feature points were extracted and the optical flow matching results, it can be detected whether the motion of a second feature point is unreasonable, and whether the second feature point is to be deleted can be determined accordingly.
  • the first feature point and the second feature point are continuously tracked in the image captured by the camera.
  • The pose parameters of the second image relative to the marker image may include at least one of a displacement parameter and a rotation parameter: the displacement parameter indicates the distance between the camera's position when capturing the second image and its position when capturing the marker image, and the rotation parameter indicates the angular difference between the camera's rotation angle when capturing the second image and its rotation angle when capturing the marker image.
  • the pose parameter can be represented by a form of a rotational displacement matrix composed of a rotation parameter matrix and a displacement parameter matrix, wherein the rotation parameter matrix includes a rotation parameter, and the displacement parameter matrix includes a displacement parameter.
  • the pose parameter can be obtained by the homography matrix, that is, the step 304 can include the following steps 3041-3042:
  • In a possible implementation, from the next image after the marker image up to the second image, the homography matrix of each image relative to its previous image is acquired by tracking the first feature points and the second feature points in each image, and these homography matrices are iteratively processed to obtain the homography matrix of the second image relative to the marker image.
  • the homography matrix is decomposed to obtain a rotational displacement matrix of the second image relative to the mark image, and the pose of the second image relative to the mark image is obtained from the rotational displacement matrix. parameter.
  • the step 3042 includes:
  • The image coordinate system of the second image is translated by one unit in the negative direction of the z-axis to form a second coordinate system, and the homography matrix is decomposed according to the preset constraints that the rotational displacement matrix should satisfy, to obtain the rotational displacement matrix of the second image relative to the marker image in the second coordinate system.
  • the rotation parameter and the displacement parameter of the second image relative to the marker image can be determined according to the rotation displacement matrix.
  • the pose parameter of the second image may be acquired according to the pose parameter of the second image relative to the marker image and the pose parameter of the marker image, and the specific process is similar to the process of acquiring the pose parameter of the first image. , will not repeat them here.
  • the obtained pose parameter can be smoothed and outputted by using a filter to avoid jitter.
  • The filter can be a Kalman filter or another filter.
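  • A minimal smoothing sketch is shown below; the patent only states that a Kalman filter or another filter may be used, so the constant-value scalar filter, its class name, and the noise values are illustrative assumptions. Each pose parameter (for example, each translation component) would be smoothed by its own filter instance.

```python
class ScalarKalman:
    """Minimal constant-value Kalman filter used to smooth one pose parameter over time."""
    def __init__(self, process_noise=1e-4, measurement_noise=1e-2):
        self.q = process_noise
        self.r = measurement_noise
        self.x = None       # filtered estimate
        self.p = 1.0        # estimate covariance

    def update(self, measurement):
        if self.x is None:
            self.x = measurement
            return self.x
        self.p += self.q                      # predict
        k = self.p / (self.p + self.r)        # Kalman gain
        self.x += k * (measurement - self.x)  # correct with the new measurement
        self.p *= (1.0 - k)
        return self.x
```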
  • the embodiment of the present application is only described by taking one marker image as an example.
  • During the tracking process, not only can feature points be added, but the marker image may also be replaced. If the current image does not satisfy the feature point tracking condition, the previous image of the current image is used as the replacement marker image, and tracking continues based on the replacement marker image. Replacing the marker image likewise avoids tracking failure caused by excessive changes in the position or posture of the camera.
  • In the method provided by the embodiment of the present application, while the first feature points are tracked and the pose parameters of the images captured by the camera relative to the marker image are acquired, when the first image does not satisfy the feature point tracking condition, second feature points are extracted from the first image, and the pose parameters of the images captured by the camera relative to the marker image are then acquired by tracking both the first feature points and the second feature points, from which the position and posture of the camera are determined. This avoids the situation in which feature points can no longer be tracked because the position or posture of the camera has changed too much, enhances robustness, and improves the tracking accuracy of the camera.
  • In addition, the method provided by the embodiment of the present application is lightweight and simple, with no complicated back-end optimization, so the calculation speed is fast and real-time tracking can even be achieved. Compared with traditional SLAM (Simultaneous Localization and Mapping) algorithms, the method provided by the embodiment of the present application is more robust and can achieve very high calculation precision.
  • In addition, obtaining the pose parameters by decomposing the homography matrix avoids complicated tracking algorithms and makes the result more stable and smooth, without jitter, which is especially suitable for AR scenes.
  • The pose parameters may include a displacement parameter and a rotation parameter: the displacement parameter indicates the displacement of the camera, from which the change of the camera's position in three-dimensional space can be determined, and the rotation parameter indicates the change of the camera's rotation angle.
  • FIG. 6 is a flowchart of a method for determining a pose according to an embodiment of the present application.
  • The pose determination method is executed by a smart device, which may be a terminal such as a mobile phone or a tablet computer equipped with a camera, or AR equipment equipped with a camera such as AR glasses or an AR helmet. Referring to FIG. 6, the method includes:
  • the timestamp corresponding to each rotation parameter refers to a timestamp when the rotation parameter is obtained.
  • The interpolation algorithm may use the Slerp (Spherical Linear Interpolation) algorithm or other algorithms.
  • the rotation parameter curve is obtained by interpolating according to the plurality of rotation parameters and the corresponding time stamp, and the rotation parameter curve can represent a variation rule of the rotation parameter of the camera with the shooting time.
  • the rotation parameter curve is obtained by interpolation, and the data alignment can be performed according to the rotation parameter curve, thereby obtaining the rotation parameter corresponding to the image, and the posture of the camera is determined according to the rotation parameter.
  • The smart device is equipped with a gyroscope, an accelerometer, and a geomagnetic sensor; through the gyroscope and the geomagnetic sensor, a unique rotation parameter in the earth coordinate system can be obtained.
  • The earth coordinate system has the following characteristics:
  • the X-axis is defined using the vector product, which is tangent to the ground at the current position of the smart device and points to the east.
  • the Y-axis is tangent to the ground at the current position of the smart device and points to the north pole of the earth's magnetic field.
  • the Z axis points to the sky and is perpendicular to the ground.
  • The rotation parameters obtained in this earth coordinate system can be considered error-free and do not depend on the parameters of the IMU, which avoids the IMU calibration problem and makes the method compatible with various types of devices.
  • the smart device provides an interface for obtaining rotation parameters: a rotation-vector interface, which can call the rotation-vector interface according to the sampling frequency of the IMU to obtain the rotation parameter.
  • In a possible implementation, the smart device stores multiple rotation parameters and the corresponding timestamps in an IMU queue, and interpolates the data in the IMU queue to obtain the rotation parameter curve. Alternatively, considering that the above data may contain noise, in order to ensure the accuracy of the data, the angle difference between a newly obtained rotation parameter and the previous rotation parameter may be calculated; if the angle difference is greater than a preset threshold, the rotation parameter is considered a noise item and is deleted. Noise items can be removed by this detection, and only the rotation parameters that pass the detection and their corresponding timestamps are stored in the IMU queue.
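  • The IMU queue, noise rejection, and Slerp-based alignment can be sketched as follows; this assumes rotations are stored as quaternions and uses SciPy's Slerp implementation, and the class name and the angle threshold are illustrative assumptions rather than values from the patent.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

class ImuRotationQueue:
    """Stores IMU rotation readings (quaternions) with timestamps, drops noisy samples,
    and answers interpolation queries with spherical linear interpolation (Slerp)."""
    def __init__(self, max_angle_jump_rad=0.5):
        self.times, self.quats = [], []
        self.max_jump = max_angle_jump_rad      # preset threshold; value here is illustrative

    def push(self, timestamp, quat_xyzw):
        r_new = Rotation.from_quat(quat_xyzw)
        if self.quats:
            r_prev = Rotation.from_quat(self.quats[-1])
            if (r_prev.inv() * r_new).magnitude() > self.max_jump:
                return                          # angle jump too large: treat as noise and discard
        self.times.append(timestamp)
        self.quats.append(quat_xyzw)

    def rotation_at(self, timestamp):
        """Rotation parameter aligned to an image timestamp via the interpolated curve."""
        slerp = Slerp(np.array(self.times), Rotation.from_quat(np.array(self.quats)))
        return slerp([timestamp])[0]
```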
  • The method provided by the embodiment of the present application interpolates the multiple rotation parameters measured by the IMU and the corresponding timestamps to obtain a rotation parameter curve, so that data alignment can be performed according to the rotation parameter curve: the rotation parameter corresponding to the timestamp of a captured image is obtained from the curve, and the posture of the camera when capturing that image is determined accordingly.
  • FIG. 7 The operation flow of the embodiment of the present application can be as shown in FIG. 7. Referring to FIG. 7, the functions of the smart device are divided into multiple modules, and the operation flow is as follows:
  • the data measured by the IMU is read by the module 701, including the rotation parameter and the corresponding timestamp, and the module 702 detects whether the data is reasonable. If not, the data is discarded. If it is reasonable, the data is stored in the IMU queue through the module 703. in.
  • the captured image is read by the module 704, and the module 705 determines whether the marker image has been set currently. If the marker image is not set, a marker image is initialized by the module 706 with the currently captured image, and if the marker image has been set, the connection to the marker image is established directly by the module 707, tracking the feature points of the marker image.
  • the module 708 combines the data in the IMU queue and the data obtained by tracking the feature points, obtains the displacement parameter and the rotation parameter, and calculates a rotation displacement matrix of the current image relative to the current marker image.
  • The obtained data results are smoothed and output through modules 711 and 712; a Kalman filter or another filter can be used for smoothing.
  • The embodiment of the present application provides a camera attitude tracking algorithm that regards the motion of the camera as a process of tracking the feature points of the marker image, and new feature points are added during tracking to maintain the connection to the marker image.
  • The IMU is used to obtain the rotation parameters of the camera relative to the initial scene, an image of the real scene is used as the marker image, and the displacement parameters of the camera relative to the marker image are obtained by tracking and matching, so that the change of the camera's position and attitude relative to the initial scene is obtained. This realizes a stable, fast, and robust camera attitude tracking system in real natural scenes that does not depend on a predefined marker image; the calculation speed is increased, the robustness of the system is enhanced, and the camera positioning accuracy is very high.
  • complicated IMU and image fusion algorithms are avoided, and the sensitivity to parameters is also reduced.
  • the method provided by the embodiment of the present application can run smoothly on the mobile end and does not require accurate calibration.
  • The embodiment of the present application corresponds to the scenario in which a human eye observes three-dimensional space: the influence of the rotation parameters is large, and the displacement in the plane is assumed to be small.
  • In an AR scenario, the user usually interacts with virtual elements on a planar scene, such as a coffee-table surface, so the camera can be considered to move on a plane and the rotation parameters have the greater influence; the embodiment of the present application is therefore very suitable for AR scenarios.
  • In addition, the embodiment of the present application does not need to switch the marker image frequently; instead, tracking failure is avoided by adding feature points in real time, which avoids the error caused by switching the marker image and makes the output smoother and more precise.
  • FIG. 8 is a schematic structural diagram of a pose determining apparatus according to an embodiment of the present application. Referring to Figure 8, the device is applied to a smart device, the device comprising:
  • a first obtaining module 801 configured to perform the step of acquiring a pose parameter of the first image relative to the mark image by tracking the first feature point in the foregoing embodiment
  • a feature point processing module 802 configured to perform the step of extracting a second feature point from the first image when the first image does not satisfy the feature point tracking condition in the foregoing embodiment
  • the second obtaining module 803 is configured to perform the step of acquiring the pose parameter of the second image relative to the marker image by tracking the first feature point and the second feature point in the foregoing embodiment, and determining the pose according to the pose parameter.
  • the device further includes:
  • a zoning module configured to perform the step of dividing the mark image into a plurality of grid regions of the same size in the above embodiment
  • a weight obtaining module configured to perform the step of acquiring weights of each feature point extracted from the mark image in the above embodiment
  • an extracting module configured to perform the step of extracting a feature point with the highest weight in each of the divided grid regions in the foregoing embodiment.
  • the device further includes:
  • a quantity acquisition module configured to perform the step of acquiring the number of first feature points tracked in the first image in the foregoing embodiment
  • a determining module configured to perform the step of determining that the first image does not satisfy the feature point tracking condition when the number does not reach the preset number in the foregoing embodiment.
  • The feature point processing module 802 is configured to perform the steps, in the foregoing embodiments, of dividing the first image into a plurality of grid regions of the same size, acquiring the weight of each feature point extracted from the first image, and extracting feature points in the grid regions that do not contain the first feature points.
  • the first obtaining module 801 is configured to perform a method for obtaining a homography matrix of the first image relative to the mark image in the foregoing embodiment, and performing decomposition to obtain a rotational displacement matrix of the first image relative to the mark image, from the rotational displacement The step of obtaining a pose parameter of the first image relative to the marker image in the matrix.
  • the first obtaining module 801 is configured to perform iterative processing on the homography matrix of each image with respect to the previous image in the foregoing embodiment, to obtain a homography matrix of the first image relative to the mark image. .
  • the first obtaining module 801 is configured to perform the step of decomposing the homography matrix in the foregoing embodiment to obtain a rotational displacement matrix of the first image relative to the marker image in the first coordinate system, and to the first The image is converted relative to the rotational displacement matrix of the marker image in the first coordinate system to obtain a rotational displacement matrix of the first image relative to the marker image.
  • the device further includes:
  • An initialization module for performing the step of setting the captured image as a marker image in the above embodiment.
  • the pose parameter includes a displacement parameter
  • the device further includes:
  • An interpolation processing module configured to acquire a plurality of rotation parameters of the camera and a corresponding time stamp by using the inertial measurement unit IMU, and perform interpolation according to the plurality of rotation parameters and the corresponding time stamp to obtain a rotation parameter curve;
  • the rotation parameter acquisition module is configured to acquire a rotation parameter corresponding to a time stamp of the first image in the rotation parameter curve as a rotation parameter of the first image.
  • the pose determining device provided by the above embodiment is only illustrated by the division of the above functional modules. In actual applications, the function assignment can be completed by different functional modules as needed. That is, the internal structure of the smart device is divided into different functional modules to complete all or part of the functions described above.
  • the posture determining device and the posture determining method embodiment provided in the above embodiments are in the same concept, and the specific implementation process is described in detail in the method embodiment, and details are not described herein again.
  • FIG. 9 is a structural block diagram of a terminal 900 according to an exemplary embodiment of the present application.
  • the terminal 900 is configured to perform the steps performed by the smart device in the foregoing method embodiment.
  • The terminal 900 can be a portable mobile terminal, such as a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop or a desktop computer, or an AR device such as AR glasses or an AR helmet.
  • Terminal 900 may also be referred to as a user device, a portable terminal, a laptop terminal, a desktop terminal, and the like.
  • the terminal includes a processor 901 and a memory 902.
  • the memory 902 stores at least one instruction, at least one program, code set or instruction set, and the instruction, program, code set or instruction set is loaded and executed by the processor 901 to implement the above implementation. The operation performed by the smart device in the example.
  • Processor 901 can include one or more processing cores, such as a 4-core processor, a 5-core processor, and the like.
  • The processor 901 can be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array).
  • the processor 901 may also include a main processor and a coprocessor.
  • The main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state.
  • In some embodiments, the processor 901 can be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display.
  • the processor 901 may further include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.
  • Memory 902 can include one or more computer readable storage media, which can be non-transitory. Memory 902 can also include high speed random access memory, as well as non-volatile memory, such as one or more disk storage devices, flash storage devices. In some embodiments, the non-transitory computer readable storage medium in memory 902 is configured to store at least one instruction for being used by processor 901 to implement the pose provided by the method embodiments of the present application. Determine the method.
  • the terminal 900 can also optionally include: a peripheral device interface 903 and at least one peripheral device.
  • the processor 901, the memory 902, and the peripheral device interface 903 can be connected by a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 903 via a bus, signal line or circuit board.
  • the peripheral device includes at least one of a radio frequency circuit 904, a touch display screen 905, a camera 906, an audio circuit 907, a positioning component 908, and a power source 909.
  • the peripheral device interface 903 can be used to connect at least one peripheral device associated with an I/O (Input/Output) to the processor 901 and the memory 902.
  • In some embodiments, the processor 901, the memory 902, and the peripheral device interface 903 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 901, the memory 902, and the peripheral device interface 903 can be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the radio frequency circuit 904 is configured to receive and transmit an RF (Radio Frequency) signal, also called an electromagnetic signal. Radio frequency circuit 904 communicates with the communication network and other communication devices via electromagnetic signals. The radio frequency circuit 904 converts the electrical signal into an electromagnetic signal for transmission, or converts the received electromagnetic signal into an electrical signal.
  • the radio frequency circuit 904 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like. Radio frequency circuit 904 can communicate with other terminals via at least one wireless communication protocol.
  • The wireless communication protocol includes, but is not limited to, a metropolitan area network, mobile communication networks of various generations (2G, 3G, 4G, and 5G), a wireless local area network, and/or a WiFi (Wireless Fidelity) network.
  • the radio frequency circuit 904 may also include NFC (Near Field Communication) related circuits, which is not limited in this application.
  • the display screen 905 is used to display a UI (User Interface).
  • the UI can include graphics, text, icons, video, and any combination thereof.
  • display 905 is a touch display
  • display 905 also has the ability to capture touch signals over the surface or surface of display 905.
  • the touch signal can be input to the processor 901 as a control signal for processing.
  • the display screen 905 can also be used to provide virtual buttons and/or virtual keyboards, also referred to as soft buttons and/or soft keyboards.
  • In some embodiments, there may be one display screen 905, disposed on the front panel of the terminal 900; in other embodiments, there may be at least two display screens 905, respectively disposed on different surfaces of the terminal 900 or in a folded design; in still other embodiments, the display screen 905 may be a flexible display screen disposed on a curved surface or a folded surface of the terminal 900. The display screen 905 can even be set to a non-rectangular irregular shape, that is, a shaped screen.
  • the display screen 905 can be prepared by using an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
  • Camera component 906 is used to capture images or video.
  • camera assembly 906 includes a front camera and a rear camera.
  • the front camera is disposed on the front panel of the terminal 900
  • the rear camera is disposed on the back of the terminal.
  • In some embodiments, there are at least two rear cameras, each being one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to realize a background blur function through fusion of the main camera and the depth-of-field camera, panoramic shooting and VR (Virtual Reality) shooting through fusion of the main camera and the wide-angle camera, or other fusion shooting functions.
  • camera assembly 906 can also include a flash.
  • the flash can be a monochrome temperature flash or a two-color temperature flash.
  • the two-color temperature flash is a combination of a warm flash and a cool flash that can be used for light compensation at different color temperatures.
  • the audio circuit 907 can include a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals for input to the processor 901 for processing, or to the radio frequency circuit 904 for voice communication.
  • the microphones may be multiple, and are respectively disposed at different parts of the terminal 900.
  • the microphone can also be an array microphone or an omnidirectional acquisition microphone.
  • the speaker is then used to convert electrical signals from processor 901 or radio frequency circuit 904 into sound waves.
  • the speaker can be a conventional film speaker or a piezoelectric ceramic speaker.
  • audio circuit 907 can also include a headphone jack.
  • the location component 908 is used to locate the current geographic location of the terminal 900 to implement navigation or LBS (Location Based Service).
  • The positioning component 908 can be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
  • Power source 909 is used to power various components in terminal 900.
  • the power source 909 can be an alternating current, a direct current, a disposable battery, or a rechargeable battery.
  • the rechargeable battery can support wired charging or wireless charging.
  • the rechargeable battery can also be used to support fast charging technology.
  • terminal 900 also includes one or more sensors 910.
  • the one or more sensors 910 include, but are not limited to, an acceleration sensor 911, a gyro sensor 912, a pressure sensor 913, a fingerprint sensor 914, an optical sensor 915, and a proximity sensor 916.
  • the acceleration sensor 911 can detect the magnitude of the acceleration on the three coordinate axes of the coordinate system established by the terminal 900.
  • the acceleration sensor 911 can be used to detect components of gravity acceleration on three coordinate axes.
  • the processor 901 can control the touch display screen 905 to display the user interface in a landscape view or a portrait view according to the gravity acceleration signal collected by the acceleration sensor 911.
  • the acceleration sensor 911 can also be used for the acquisition of game or user motion data.
  • the gyro sensor 912 can detect the body direction and rotation angle of the terminal 900, and the gyro sensor 912 can cooperate with the acceleration sensor 911 to collect the user's 3D actions on the terminal 900.
  • the processor 901 can implement functions such as motion sensing (such as changing the UI according to the user's tilting operation), image stabilization at the time of shooting, game control, and inertial navigation, based on the data collected by the gyro sensor 912.
  • the pressure sensor 913 may be disposed at a side border of the terminal 900 and/or a lower layer of the touch display screen 905.
  • when the pressure sensor 913 is disposed on the side frame of the terminal 900, a holding signal of the user on the terminal 900 can be detected, and the processor 901 performs left/right hand recognition or a shortcut operation according to the holding signal collected by the pressure sensor 913.
  • when the pressure sensor 913 is disposed at the lower layer of the touch display screen 905, the processor 901 controls an operable control on the UI according to the user's pressure operation on the touch display screen 905.
  • the operability control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 914 is configured to collect the user's fingerprint, and the processor 901 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 914, or the fingerprint sensor 914 identifies the user's identity according to the collected fingerprint. Upon identifying that the user's identity is a trusted identity, the processor 901 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like.
  • the fingerprint sensor 914 can be disposed on the front, back, or side of the terminal 900. When the physical button or vendor logo is provided on the terminal 900, the fingerprint sensor 914 can be integrated with a physical button or a vendor logo.
  • Optical sensor 915 is used to collect ambient light intensity.
  • the processor 901 can control the display brightness of the touch display screen 905 according to the ambient light intensity collected by the optical sensor 915. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 905 is raised; when the ambient light intensity is low, the display brightness of the touch display screen 905 is lowered.
  • the processor 901 can also dynamically adjust the shooting parameters of the camera assembly 906 according to the ambient light intensity collected by the optical sensor 915.
  • Proximity sensor 916, also referred to as a distance sensor, is typically disposed on the front panel of terminal 900. Proximity sensor 916 is used to collect the distance between the user and the front of terminal 900. In one embodiment, when the proximity sensor 916 detects that the distance between the user and the front of the terminal 900 gradually decreases, the processor 901 controls the touch display screen 905 to switch from the screen-on state to the screen-off state; when the proximity sensor 916 detects that the distance between the user and the front of the terminal 900 gradually increases, the processor 901 controls the touch display screen 905 to switch from the screen-off state to the screen-on state.
  • FIG. 9 does not constitute a limitation to the terminal 900, which may include more or fewer components than those illustrated, combine some components, or adopt a different component arrangement.
  • the embodiment of the present application further provides a pose determining apparatus, the pose determining apparatus including a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by the processor to implement the operations in the pose determining method of the above embodiments.
  • the embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by the processor to implement the operations in the pose determining method of the above embodiments.
  • a person skilled in the art may understand that all or part of the steps of implementing the above embodiments may be completed by hardware, or may be completed by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium.
  • the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

一种位姿确定方法、装置、智能设备及存储介质，属于计算机技术领域。方法包括：在未设置标记图像的情况下，智能设备获取相机拍摄的图像，将拍摄的图像设置为标记图像(301)，通过追踪第一特征点，获取相机拍摄的第一图像相对于标记图像的位姿参数(302)；当第一图像不满足特征点追踪条件时，从第一图像中提取第二特征点(303)；通过追踪第一特征点和第二特征点，获取相机拍摄的第二图像相对于标记图像的位姿参数，并根据位姿参数确定相机的位姿(304)。通过在图像不满足特征点追踪条件时采用提取新特征点的方式，避免了由于相机的位置或姿态的变化过多而导致无法追踪到特征点的情况，增强了鲁棒性，提高了相机的追踪精度。

Description

位姿确定方法、装置、智能设备及存储介质
本申请要求于2018年4月27日提交、申请号为201810391549.6、发明名称为“位姿确定方法、装置及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及计算机技术领域,特别涉及一种位姿确定方法、装置、智能设备及存储介质。
背景技术
AR(Augmented Reality,增强现实)技术是一种实时地追踪相机的位置和姿态,结合虚拟的图像、视频或者三维模型进行显示的技术,能够将虚拟场景与实际场景结合显示,是目前计算机视觉领域的重要研究方向之一。AR技术中最重要的问题在于如何准确确定相机的位置和姿态。
相关技术提出了一种通过追踪模板(marker)图像中的特征点来确定相机位置和姿态的方法,先预先定义好一个模板图像,提取模板图像中的特征点,随着相机的位置或姿态的变化对提取的特征点进行追踪,每当相机当前拍摄到一个图像时,在当前图像中识别模板图像的特征点,从而能够将特征点在当前图像中的位置和姿态与该特征点在模板图像中的位置和姿态进行对比,得到特征点的位姿参数,进而得到当前图像相对于模板图像的位姿参数,如旋转参数和位移参数,该位姿参数可以表示相机拍摄当前图像时的位置和姿态。
在实现本申请实施例的过程中,发明人发现上述相关技术至少存在以下问题:当相机的位置或姿态的变化过多而导致当前图像中不存在特征点时,无法追踪到特征点,也就无法确定相机的位置和姿态。
发明内容
本申请实施例提供了一种位姿确定方法、装置、智能设备及存储介质,可以解决相关技术的问题。所述技术方案如下:
第一方面,提供了一种位姿确定方法,所述方法包括:
通过追踪第一特征点,获取相机拍摄的第一图像相对于标记图像的位姿参数,所述第一特征点通过从所述标记图像中提取特征点得到;
当所述第一图像不满足特征点追踪条件时,从所述第一图像中提取第二特征点,所述第二特征点与所述第一特征点不同;
通过追踪所述第一特征点和所述第二特征点,获取所述相机拍摄的第二图像相对于所述标记图像的位姿参数,并根据所述位姿参数确定所述相机的位姿,所述第二图像为所述相机在所述第一图像之后拍摄的图像。
第二方面,提供了一种位姿确定装置,所述装置包括:
第一获取模块,用于通过追踪第一特征点,获取相机拍摄的第一图像相对于标记图像的位姿参数,所述第一特征点通过从所述标记图像中提取特征点得到;
特征点处理模块,用于当所述第一图像不满足特征点追踪条件时,从所述第一图像中提取第二特征点,所述第二特征点与所述第一特征点不同;
第二获取模块,用于通过追踪所述第一特征点和所述第二特征点,获取所述相机拍摄的第二图像相对于所述标记图像的位姿参数,并根据所述位姿参数确定所述相机的位姿,所述第二图像为所述相机在所述第一图像之后拍摄的图像。
第三方面,提供了一种智能设备,所述智能设备包括:处理器和存储器,所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
通过追踪第一特征点,获取相机拍摄的第一图像相对于标记图像的位姿参数,所述第一特征点通过从所述标记图像中提取特征点得到;
当所述第一图像不满足特征点追踪条件时,从所述第一图像中提取第二特征点,所述第二特征点与所述第一特征点不同;
通过追踪所述第一特征点和所述第二特征点,获取所述相机拍摄的第二图像相对于所述标记图像的位姿参数,并根据所述位姿参数确定所述相机的位姿,所述第二图像为所述相机在所述第一图像之后拍摄的图像。
第四方面，提供了一种计算机可读存储介质，所述计算机可读存储介质中存储有至少一条指令、至少一段程序、代码集或指令集，所述指令、所述程序、所述代码集或所述指令集由处理器加载并执行以实现如第一方面所述的位姿确定方法中所具有的操作。
本申请实施例提供的技术方案带来的有益效果至少包括:
本申请实施例提供的方法、装置、智能设备及存储介质，通过在追踪第一特征点，获取相机拍摄的图像相对于标记图像的位姿参数的过程中，当第一图像不满足特征点追踪条件时，从第一图像中提取第二特征点，通过追踪第一特征点和第二特征点，获取相机拍摄的图像相对于标记图像的位姿参数，从而确定相机的位置和姿态，通过在图像不满足特征点追踪条件时采用提取新特征点的方式，避免了由于相机的位置或姿态的变化过多而导致无法追踪到特征点的情况，增强了鲁棒性，提高了相机的追踪精度。
另外,无需预先给定标记图像,只需拍摄当前的场景得到一个图像,设置为初始标记图像,即可实现标记图像的初始化,摆脱了必须预先给定标记图像的限制,扩展了应用范围。
另外,采用栅格区域来筛选特征点,可以保证一个栅格区域中仅包括一个特征点,不会出现多个特征点集中在同一个区域的情况,保证了特征点之间的空间分散性,从而提高了追踪精确度。
另外,采用分解单应性矩阵的方式来获取位姿参数,避免了复杂的追踪算法,使结果更加的稳定平滑,不会出现抖动,尤其适用于AR场景。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是本申请实施例提供的场景界面的显示示意图;
图2是本申请实施例提供的另一场景界面的显示示意图;
图3是本申请实施例提供的一种位姿确定方法的流程图;
图4是本申请实施例提供的一种图像示意图;
图5是本申请实施例提供的一种特征点的分布示意图;
图6是本申请实施例提供的一种位姿确定方法的流程图;
图7是本申请实施例提供的一种操作流程的示意图;
图8是本申请实施例提供的一种位姿确定装置的结构示意图;
图9是本申请实施例提供的一种终端的结构示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请实施例提供了一种位姿确定方法,应用于智能设备追踪相机的位置和姿态的场景下,尤其是在AR场景下,智能设备采用AR技术进行显示时,如显示AR游戏、AR视频等,需要追踪相机的位置和姿态。
其中,智能设备配置有相机和显示单元,相机用于拍摄现实场景的图像,显示单元用于显示由现实场景与虚拟场景结合构成的场景界面。智能设备随着相机的运动可以追踪相机的位置和姿态的变化,还可以拍摄现实场景的图像,按照相机的位置和姿态的变化依次显示当前拍摄到的多个图像,从而模拟出显示三维界面的效果。并且,在显示的图像中可以添加虚拟元素,如虚拟图像、虚拟视频或者虚拟三维模型等,随着相机的运动,可以按照相机的位置和姿态的变化,以不同的方位显示虚拟元素,从而模拟出显示三维虚拟元素的效果。现实场景的图像与虚拟元素结合显示,构成了场景界面,从而模拟出现实场景与虚拟元素同处于同一个三维空间的效果。
例如,参见图1和参见图2,智能设备在拍摄到的包含桌子和茶杯的图像中添加了一个虚拟人物形象,随着相机的运动,拍摄到的图像发生变化,虚拟人物形象的拍摄方位也发生变化,模拟出了虚拟人物形象在图像中相对于桌子和茶杯静止不动,而相机随着位置和姿态的变化同时拍摄桌子、茶杯和虚拟人物形象的效果,为用户呈现了一幅真实立体的画面。
图3是本申请实施例提供的一种位姿确定方法的流程图，该位姿确定方法的执行主体为智能设备，该智能设备可以为配置有相机的手机、平板电脑等终端或者为配置有相机的AR眼镜、AR头盔等AR设备，参见图3，该方法包括：
301、在未设置标记图像的情况下,智能设备获取相机拍摄的图像,将拍摄的图像设置为标记图像。
本申请实施例中,为了追踪相机的位置和姿态的变化,需要以标记图像作为基准,在相机拍摄至少一个图像的过程中,通过追踪标记图像的特征点来确定相机的位姿参数,该位姿参数用于确定智能设备的位置和姿态。
为此,在未设置标记图像的情况下,智能设备可以通过相机拍摄图像,获取相机当前拍摄的图像,将该图像设置为标记图像,从而实现标记图像的初始化,后续智能设备继续拍摄其他图像的过程中,即可通过追踪标记图像的特征点来获取每个图像的位姿参数。
其中,相机可以按照预设周期进行拍摄,每隔一个预设周期拍摄一个图像,该预设周期可以为0.1秒或者0.01秒等。
在一种可能实现方式中,为了防止标记图像中特征点数量较少而导致追踪失败,当获取到拍摄的图像后,可以先从该图像中提取特征点,判断提取到的特征点数量是否达到预设数量,当从该图像中提取到的特征点数量达到预设数量时,再将该图像设置为标记图像,而当从该图像中提取到的特征点数量未达到预设数量时,可以不将该图像设置为标记图像,而是获取相机拍摄的下一个图像,直至提取的特征点数量达到预设数量的图像时,将该提取的特征点数量达到预设数量的图像设置为标记图像。
其中,提取特征点时采用的特征提取算法可以为FAST(Features from Accelerated Segment Test,加速段测试特征点)检测算法、Shi-Tomasi(史托马西)角点检测算法、Harris Corner Detection(Harris角点检测)算法、SIFT(Scale-Invariant Feature Transform,尺度不变特征转换)算法等,预设数量可以根据对追踪精确度的需求确定。
在另一种可能实现方式中，考虑到不仅要提取足够数量的特征点，而且为了避免提取的特征点集中在同一个区域而导致提供的信息不足，还要提取具有空间分散性的特征点。为此，从标记图像中提取第一特征点时，先将标记图像划分为多个尺寸相同的栅格区域，从标记图像中提取特征点，获取提取的每个特征点的权重。在划分出的每个栅格区域中提取一个权重最高的特征点，作为第一特征点，其他的权重较低的特征点将不再考虑，直至标记图像中的所有栅格区域均提取了第一特征点或者直至标记图像中提取的第一特征点数量达到预设数量时为止。
其中,每个栅格区域的尺寸可以根据追踪精度需求以及要求提取的第一特征点的数量确定,特征点的权重用于表示特征点的梯度大小,特征点的权重越大表示梯度越大,越容易追踪,因此采用权重较大的特征点进行追踪会提高追踪精确度。例如,对于每个特征点来说,获取该特征点的梯度,将该梯度直接作为该特征点的权重,或者按照预设系数对该梯度进行调整,得到该特征点的权重,以使该特征点的权重与该特征点的梯度呈正比关系。
采用上述栅格区域来筛选特征点,可以保证一个栅格区域中仅包括一个特征点,不会出现多个特征点集中在同一个区域的情况,保证了特征点之间的空间分散性。
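下面给出一个按栅格区域筛选特征点的示意性 Python 草图（仅供理解，非本申请实施例的正式实现；其中使用的 OpenCV 调用、栅格边长 cell、预设数量 max_total 等均为示例假设，特征点权重以梯度幅值近似）：

```python
import cv2
import numpy as np

def select_grid_features(gray, cell=40, max_total=100):
    """将图像划分为尺寸相同的栅格区域，每个栅格只保留权重最高的一个特征点（示意实现）。
    cell 为栅格边长、max_total 为预设数量，均为示例假设；权重以梯度幅值近似。"""
    # 先提取候选特征点，这里用 Shi-Tomasi，也可换成 FAST/Harris/SIFT
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=2000, qualityLevel=0.01, minDistance=3)
    if corners is None:
        return np.empty((0, 2), np.float32)
    pts = corners.reshape(-1, 2)

    # 用 Sobel 梯度幅值作为特征点权重：梯度越大越容易追踪
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag = cv2.magnitude(gx, gy)

    best = {}                                  # 栅格索引 -> (权重, 特征点)
    for x, y in pts:
        key = (int(x) // cell, int(y) // cell)
        w = float(mag[int(y), int(x)])
        if key not in best or w > best[key][0]:
            best[key] = (w, (x, y))

    # 每个栅格取一个最高权重的特征点，再按权重截断到预设数量
    kept = sorted(best.values(), key=lambda t: -t[0])[:max_total]
    return np.float32([p for _, p in kept])
```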
初始化标记图像成功后,设置该标记图像的旋转参数和位移参数、初始特征点深度和初始单应性矩阵。例如可以将初始特征点深度s设置为1,将旋转参数矩阵设置为单位矩阵,将初始平移矩阵设置为[0,0,s],将初始单应性矩阵设置为单位矩阵。并且,为了保证算法的统一,需要保证初始特征点深度与初始相机姿态的深度相同。
302、通过追踪第一特征点,获取相机拍摄的第一图像相对于标记图像的位姿参数。
从标记图像中提取第一特征点,将从标记图像中提取的第一特征点作为要追踪的目标特征点。随着相机的位置或姿态的变化,智能设备通过相机拍摄至少一个图像,并且通过在该至少一个图像中追踪第一特征点,得到每个图像相对于标记图像的位姿参数。
本申请实施例以当前追踪第一特征点为例,该第一特征点可以包括从标记图像中提取的第一特征点,或者,既可以包括从标记图像中提取的第一特征点,也可以包括相机在标记图像之后拍摄的图像中提取的第一特征点,具体提取方式与下述步骤中从第一图像中提取第二特征点的方式类似,在此暂不做赘述。
追踪第一特征点时，对于相机拍摄的相邻两个图像，使用从上一图像中提取的该第一特征点进行光流，从而找到上一图像与下一图像之间匹配的第一特征点，得到匹配的第一特征点的光流信息，该光流信息用于表示匹配的第一特征点在该相邻两个图像中的运动信息，则根据匹配的第一特征点的光流信息可以确定相邻两个图像中第二个图像相对于第一个图像的位姿参数。进行光流时采用的算法可以为Lucas-Kanade(卢卡斯-卡纳德)光流算法或者其他算法，除光流外，也可以采用描述子或者直接法对特征点进行匹配，找到上一图像与下一图像之间匹配的第一特征点。
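作为理解，下面用 OpenCV 的金字塔 Lucas-Kanade 光流给出一个在相邻两帧之间匹配第一特征点的示意性草图（窗口大小、金字塔层数等参数为示例假设）：

```python
import cv2
import numpy as np

def track_features(prev_gray, cur_gray, prev_pts):
    """用金字塔 Lucas-Kanade 光流在相邻两帧之间匹配特征点（示意实现）。
    prev_pts 为上一帧中的特征点坐标，float32 的 Nx2 数组；返回匹配成功的点对。"""
    cur_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, cur_gray, prev_pts.reshape(-1, 1, 2), None,
        winSize=(21, 21), maxLevel=3)
    ok = status.reshape(-1) == 1         # 追踪失败的特征点被剔除
    return prev_pts[ok], cur_pts.reshape(-1, 2)[ok]
```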
那么,对于相机在标记图像之后拍摄的第一图像来说,根据从标记图像至该第一图像中的每个图像相对于上一个图像的位姿参数,可以进行迭代,从而确定该第一图像相对于标记图像的位姿参数。其中,该第一图像相对于标记图像的位姿参数可以包括位移参数和旋转参数,该位移参数用于表示相机拍摄该第一图像时的位置与拍摄标记图像时的位置之间的距离,该旋转参数用于表示相机拍摄该第一图像时的旋转角度与拍摄标记图像时的旋转角度之间的角度差。且,该位姿参数可以采用旋转位移矩阵的形式来表示,该旋转位移矩阵由旋转参数矩阵和位移参数矩阵组成,旋转参数矩阵中包括旋转参数,位移参数矩阵中包括位移参数。
例如,从标记图像开始,相机依次拍摄到图像1、图像2、图像3,并获取到了图像1相对于标记图像的位姿参数(R1,T1)、图像2相对于图像1的位姿参数(R2,T2)以及图像3相对于图像2的位姿参数(R3,T3),则根据这些位姿参数可以进行迭代,确定图像3相对于标记图像的位姿参数(R3’,T3’)为:
Figure PCTCN2019079342-appb-000001
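上述公式在本文本中以图片占位符呈现，未能还原；下面按常见的 x_cur = R·x_prev + T 约定给出位姿迭代（级联）的示意性草图，该约定与函数写法均为示例假设，仅用于说明迭代的含义，未必与原公式逐项一致：

```python
import numpy as np

def compose_relative_poses(poses):
    """poses = [(R1, T1), (R2, T2), (R3, T3), ...]，依次为每个图像相对于上一个图像的位姿参数。
    按 x_cur = R · x_prev + T 的约定迭代，返回最后一个图像相对于标记图像的 (R, T)。"""
    R_acc, T_acc = np.eye(3), np.zeros(3)
    for R, T in poses:
        R_acc = R @ R_acc
        T_acc = R @ T_acc + T
    return R_acc, T_acc
```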
在一种可能实现方式中,可以通过单应性矩阵来获取位姿参数,也即是该步骤302可以包括以下步骤3021-3022:
3021、通过追踪第一特征点,获取第一图像相对于标记图像的单应性矩阵。
其中,该单应性矩阵是表示第一图像中的特征点与标记图像中的相应特征点之间的转换关系的矩阵,因此具有以下关系:
x_c = H_ca * x_a
其中，x_c 表示图像c中特征点的二维坐标对应的齐次坐标，x_a 表示图像a中相应特征点的二维坐标对应的齐次坐标，H_ca 表示图像c相对于图像a的单应性矩阵。
由于特征点的二维坐标对应的齐次坐标均为3*1的向量,因此单应性矩阵为3*3的矩阵,可以表示为
H_ca = [ h11  h12  h13 ; h21  h22  h23 ; h31  h32  h33 ]（3×3矩阵，h11…h33为其9个元素）
因此，在相机拍摄过程中，可以追踪多个第一特征点，从而获取到多个第一特征点分别在相机拍摄的两个相邻图像中的二维坐标对应的齐次坐标，根据获取到的坐标利用上述关系即可计算出两个图像之间的单应性矩阵。其中，单应性矩阵中包括9个元素，将其中一个设置为1后具有8个未知量，因此为了取得单应性矩阵的唯一解，至少获取4个特征点在这两个相邻图像中的二维坐标对应的齐次坐标。
针对该标记图像和该第一图像,可以通过在从标记图像的下一个图像至第一图像的每个图像中追踪第一特征点,获取每个图像相对于上一个图像的单应性矩阵,对每个图像相对于上一个图像的单应性矩阵进行迭代处理,得到第一图像相对于标记图像的单应性矩阵。
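下面给出一个估计相邻帧单应性矩阵并迭代相乘得到第一图像相对于标记图像的单应性矩阵的示意性草图（findHomography 的 RANSAC 阈值等为示例假设）：

```python
import cv2
import numpy as np

def homography_to_marker(frame_pairs):
    """frame_pairs：从标记图像的下一个图像起，每个元素为相邻两帧之间匹配好的特征点
    (pts_prev, pts_cur)，每对至少4个点。逐帧估计单应性矩阵并迭代相乘（示意实现）。"""
    H_total = np.eye(3)
    for pts_prev, pts_cur in frame_pairs:
        H, _mask = cv2.findHomography(pts_prev, pts_cur, cv2.RANSAC, 3.0)
        if H is None:                   # 匹配点过少或退化时估计失败，跳过（简化处理）
            continue
        H_total = H @ H_total           # x_cur ≈ H_total · x_marker
    return H_total / H_total[2, 2]      # 归一化，使右下角元素为1
```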
3022、根据旋转位移矩阵应满足的预设约束条件,对单应性矩阵进行分解得到第一图像相对于标记图像的旋转位移矩阵,从旋转位移矩阵中获取第一图像相对于标记图像的位姿参数。
在一种可能实现方式中,该步骤3022包括:
(1)将标记图像的图像坐标系向z轴的负方向平移一个单位形成第一坐标系,根据旋转位移矩阵应满足的预设约束条件对单应性矩阵进行分解,得到第一图像相对于第一坐标系中的标记图像的旋转位移矩阵。
其中,该旋转位移矩阵包括第一图像相对于第一坐标系中的标记图像的旋转参数矩阵和位移参数矩阵,且旋转参数矩阵中的元素即为第一图像相对于第一坐标系中的标记图像的旋转参数,位移参数矩阵中的元素即为第一图像相对于第一坐标系中的标记图像的位移参数。该预设约束条件为旋转位移矩阵中旋转参数矩阵的列向量为单位矩阵,且旋转参数矩阵的第一列与第二列的乘积等于第三列。
在一种可能实现方式中,第一图像中的特征点与标记图像中的相应特征点之间还具有如下转换关系:
Figure PCTCN2019079342-appb-000003
其中,
Figure PCTCN2019079342-appb-000004
Rcm表示第一图像相对于第一坐标系中的标记图像的旋转参数矩阵,Tcm表示第一图像相对于第一坐标系中的标记图像的位移参数矩阵,g表示归一化因子,P表示相机的透视投影参数;
Figure PCTCN2019079342-appb-000005
用于对齐非齐次项,
Figure PCTCN2019079342-appb-000006
用于将标记图像的图像坐标系转换为第一坐标系。
因此可以确定
Figure PCTCN2019079342-appb-000007
并且由于第一坐标系中特征点的z轴坐标均为0,因此旋转位移矩阵中的第三列为0,将第三列删去可以确定:
Figure PCTCN2019079342-appb-000008
上述公式中单应性矩阵已知,P已知,而根据旋转参数矩阵的列向量为单位矩阵的条件可以计算出归一化因子g,进而求出旋转参数矩阵的第一列和第二列,将第一列和第二列叉乘后求出第三列,从而计算出旋转参数矩阵Rcm,根据归一化因子g和单应性矩阵的第三列可以计算出位移参数矩阵Tcm。
另外,针对位移参数矩阵Tcm的正负,可以计算标记图像在相机中的位置,由于标记图像一定是位于相机的前方,因此标记图像的位移参数与标记图像在相机坐标系中的纵坐标的乘积小于0,根据此约束条件可以确定位移参数矩阵Tcm的正负。
(2)根据第一坐标系与标记图像的图像坐标系之间的转换关系,对第一图像相对于第一坐标系中的标记图像的旋转位移矩阵进行转换,得到第一图像相对于标记图像的旋转位移矩阵。
即采用如下公式进行转换,得到第一图像相对于标记图像的旋转位移矩阵:
Figure PCTCN2019079342-appb-000009
其中,Rca表示第一图像相对于标记图像的旋转参数矩阵,Tca表示第一图像相对于标记图像的位移参数矩阵。
计算出旋转位移矩阵后,即可根据该旋转位移矩阵确定第一图像相对于标记图像的旋转参数和位移参数。
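下面按步骤3022描述的约束（旋转参数矩阵列向量为单位向量、第一列与第二列叉乘得到第三列、由归一化因子和单应性矩阵第三列得到位移参数）给出一个分解单应性矩阵的示意性草图；其中 P 取相机的透视投影（内参）矩阵、位移正负的判定方式均为示例假设：

```python
import numpy as np

def decompose_homography(H, P):
    """按步骤3022描述的约束分解单应性矩阵，得到 Rcm 与 Tcm（示意实现，约定为示例假设）。
    H：第一图像相对于标记图像的单应性矩阵(3x3)；P：相机的透视投影(内参)矩阵(3x3)。"""
    M = np.linalg.inv(P) @ H
    # 旋转参数矩阵的列向量应为单位向量，据此用前两列的范数估计归一化因子 g
    g = (np.linalg.norm(M[:, 0]) + np.linalg.norm(M[:, 1])) / 2.0
    r1, r2 = M[:, 0] / g, M[:, 1] / g
    Tcm = M[:, 2] / g
    # 标记图像必须位于相机前方；这里以 z 轴朝前的相机系约定判断正负（该判定为示例假设）
    if Tcm[2] < 0:
        r1, r2, Tcm = -r1, -r2, -Tcm
    r3 = np.cross(r1, r2)               # 第一列与第二列叉乘得到第三列
    Rcm = np.column_stack([r1, r2, r3])
    return Rcm, Tcm
```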
举例来说,相机拍摄的多个图像如图4所示,追踪过程包括以下步骤:
1、相机拍摄得到标记图像a。
2、相机拍摄多个图像,追踪标记图像a的第一特征点,直至拍摄到图像c。
3、将标记图像a的特征点向z轴的负方向平移一个单位,形成坐标系m,对图像c相对于图像a的单应性矩阵进行分解,得到图像c相对于坐标系m中标记图像a的旋转平移矩阵[Rcm/Tcm]。
4、根据坐标系m与标记图像a的图像坐标系之间的转换关系,对图像c相对于坐标系m中的标记图像a的旋转位移矩阵进行转换,得到图像c相对于标记图像a的旋转位移矩阵[Rca/Tca]。
在步骤302之后,还可以根据第一图像相对于标记图像的位姿参数以及标记图像的位姿参数,获取第一图像的位姿参数。
基于上述步骤3021-3022,在一种可能实现方式中,在计算出第一图像相对于标记图像的旋转位移矩阵后,根据第一图像相对于标记图像的旋转位移矩阵,以及标记图像的旋转位移矩阵,采用以下公式,获取第一图像的旋转位移矩阵:
Figure PCTCN2019079342-appb-000010
s表示第一图像的深度;R_final表示第一图像的旋转参数矩阵,T_final表示第一图像的位移参数矩阵;Rca表示第一图像相对于标记图像的旋转参数矩阵,Tca表示第一图像相对于标记图像的位移参数矩阵;R_first表示标记图像的旋转参数矩阵,T_first表示标记图像的位移参数矩阵。
303、当第一图像不满足特征点追踪条件时,从第一图像中提取第二特征点,第二特征点与第一特征点不同。
在追踪特征点的过程中,随着相机的位置和姿态的变化,拍摄的图像中包含的第一特征点的数量可能会逐渐减少,导致上一图像中的某些第一特征点在下一图像中不存在匹配的第一特征点,此时对相邻两个图像包括的第一特征点进行匹配时,会排除掉一部分不匹配的第一特征点。
除此之外，还可以根据单应性矩阵计算的结果和光流匹配结果进行检测，排除不合理的第一特征点。也即是，针对每个第一特征点，根据提取第一特征点时的图像相对于标记图像的单应性矩阵和第一图像相对于标记图像的单应性矩阵，可以计算出该第一图像与提取第一特征点时的图像之间的单应性矩阵。根据该第一特征点在从上述两个图像之间的任两个相邻图像之间的光流信息进行迭代，得到该第一特征点在上述两个图像之间的光流信息，将单应性矩阵与光流信息进行对比，如果单应性矩阵与光流信息之间相差较大，表示该第一特征点的运动情况不符合应有的旋转平移关系，误差过大，因此为了避免对后续追踪过程的影响，删除该第一特征点。
当第一特征点的数量过少时,很可能会导致追踪失败。因此,相机拍摄第一图像后,还要判断第一图像是否满足特征点追踪条件。
在一种可能实现方式中,特征点追踪条件可以为追踪到的特征点的数量达到预设数量,则当某一图像中追踪到的特征点的数量达到预设数量时,确定该图像满足特征点追踪条件,否则,确定该图像不满足特征点追踪条件。
相应地,针对第一图像,获取第一图像中追踪到的第一特征点的数量,当该数量达到预设数量时,确定第一图像满足特征点追踪条件。当该数量未达到预设数量时,确定第一图像不满足特征点追踪条件。
当确定第一图像不满足特征点追踪条件时,从第一图像提取与第一特征点不同的第二特征点,将第一图像中追踪到的第一特征点以及新提取的第二特征点均作为要追踪的目标特征点,继续进行追踪,从而增加了特征点的数量。
在一种可能实现方式中,为了防止第一图像中提取的特征点数量较少而导致追踪失败,从第一图像中提取第二特征点,判断提取到的第二特征点的数量与第一图像中追踪到的第一特征点的数量之和是否达到预设数量,当从第一图像中提取到的第二特征点数量与第一图像中追踪到的第一特征点的数量之和达到预设数量时,提取特征点完成。
其中,提取特征点时采用的特征提取算法可以为FAST检测算法、Shi-Tomasi角点检测算法、Harris Corner Detection算法、SIFT算法等,预设数量可以根据对追踪精确度的需求确定。
通过补充新的特征点,可以增加特征点的数量,保证追踪过程的顺利进行,避免了特征点数量变少而导致追踪失败,提高了追踪精确度。
另外,随着相机的位置或姿态发生变化,第一图像中即使能够追踪到第一特征点,这些第一特征点也可能集中在同一区域,分布过于密集,导致提供的信息不足,或者分布过于分散,导致提供的信息不够准确,此时的第一特征点不具有当前图像的代表性,第一特征点的位姿参数将无法准确体现当前图像的位姿参数,会造成计算误差较大。例如,参见图5,左图为初始的标记图像中的第一特征点,右图为第一图像。随着相机的运动,标记图像经过放大以后成为第一图像,导致第一特征点在第一图像中过于分散,不能准确描述第一图像。若根据过于分散的第一特征点获取第一图像的位姿参数,会导致位姿参数不够准确。
考虑到不仅要提取足够数量的特征点,而且为了避免提取的特征点集中在同一个区域,还要提取具有空间分散性的特征点。为此,从第一图像中提取第二特征点时,先将第一图像划分为多个尺寸相同的栅格区域,从第一图像中提取特征点,获取提取的每个特征点的权重。在划分出的不包括第一特征点的每个栅格区域中提取一个权重最高的特征点,作为第二特征点,其他的权重较低的特征点将不再考虑,直至第一图像中的所有栅格区域均提取了特征点(第一特征点或第二特征点),或者直至第一图像中提取的第二特征点的数量和第一图像中追踪到的第一特征点的数量之和达到预设数量时为止。
在另一种可能实现方式中,从第一图像中提取第二特征点时,先将第一图像划分为多个尺寸相同的栅格区域,获取第一图像中追踪到的第一特征点之前提取时的权重,根据获取的权重,在划分出的每个栅格区域提取一个权重最高的第一特征点,而不再提取权重较低的第一特征点,从而将集中分布的多个第一特征点中权重较低的第一特征点去除掉。之后从第一图像中提取特征点,获取提取的每个特征点的权重,在不包括第一特征点的每个栅格区域中提取一个权重最高的特征点,作为第二特征点,其他的权重较低的特征点将不再考虑,直至第一图像中的所有栅格区域均提取了特征点(第一特征点或第二特征点),或者直至第一图像中提取的第二特征点的数量和第一图像中剩余的第一特征点的数量之和达到预设数量时为止。
其中,每个栅格区域的尺寸可以根据追踪精度需求以及要求提取的特征点数量确定,特征点的权重用于表示特征点的梯度大小,特征点的权重越大表示梯度越大,越容易追踪,因此采用权重较大的特征点进行追踪可以提高追踪精确度。例如,对于每个特征点来说,获取该特征点的梯度,将该梯度直接作为该特征点的权重,或者按照预设系数对该梯度进行调整,得到该特征点的权重,以使该特征点的权重与该特征点的梯度呈正比关系。
采用上述栅格区域来筛选特征点,可以保证一个栅格区域中仅包括一个特征点,不会出现多个特征点集中在同一个区域的情况,保证了特征点之间的空间分散性。
另外,针对第一图像中新提取的第二特征点会记录下第一图像的单应性矩阵,以便后续根据提取第二特征点时的图像相对于标记图像的单应性矩阵以及光流匹配结果,检测第二特征点的运动情况是否不合理,从而确定是否要删除第二特征点。
304、通过追踪第一特征点和第二特征点,获取相机拍摄的第二图像相对于标记图像的位姿参数,并根据位姿参数确定相机的位姿,第二图像为相机在第一图像之后拍摄的图像。
增加了第二特征点之后,继续在相机拍摄的图像中追踪第一特征点和第二特征点。
以第二图像为例,根据从标记图像至该第二图像中的每个图像相对于上一个图像的位姿参数,可以进行迭代,从而确定该第二图像相对于标记图像的位姿参数,根据该第二图像相对于标记图像的位姿参数确定相机在拍摄该第二图像时的位姿与拍摄该标记图像时的位姿的变化情况,从而确定相机的位姿。其中,该第二图像相对于标记图像的位姿参数可以包括位移参数和旋转参数中的至少一项,该位移参数用于表示相机拍摄该第二图像时的位置与拍摄标记图像时的位置之间的距离,该旋转参数用于表示相机拍摄该第二图像时的旋转角度与拍摄标记图像时的旋转角度之间的角度差。且,该位姿参数可以采用旋转位移矩阵的形式来表示,该旋转位移矩阵由旋转参数矩阵和位移参数矩阵组成,旋转参数矩阵中包括旋转参数,位移参数矩阵中包括位移参数。
在一种可能实现方式中,可以通过单应性矩阵来获取位姿参数,也即是该步骤304可以包括以下步骤3041-3042:
3041、通过追踪第一特征点和第二特征点,获取第二图像相对于标记图像的单应性矩阵。
针对该标记图像和该第二图像,可以通过在从标记图像的下一个图像至第二图像的每个图像中追踪第一特征点和第二特征点,获取每个图像相对于上一个图像的单应性矩阵;对每个图像相对于上一个图像的单应性矩阵进行迭代处理,得到第二图像相对于标记图像的单应性矩阵。
3042、根据旋转位移矩阵应满足的预设约束条件,对单应性矩阵进行分解得到第二图像相对于标记图像的旋转位移矩阵,从旋转位移矩阵中获取第二图像相对于标记图像的位姿参数。
在一种可能实现方式中,该步骤3042包括:
(1)将第二图像的图像坐标系向z轴的负方向平移一个单位形成第二坐标系,根据旋转位移矩阵应满足的预设约束条件对单应性矩阵进行分解,得到第二坐标系中第二图像相对于标记图像的旋转位移矩阵。
(2)根据第二坐标系与第二图像的图像坐标系之间的转换关系,对第二坐 标系中第二图像相对于标记图像的旋转位移矩阵进行转换,得到第二图像相对于标记图像的旋转位移矩阵。
步骤3041-3042的具体过程与上述步骤3021-3022的过程类似,在此不再赘述。
计算出旋转位移矩阵后,即可根据该旋转位移矩阵确定第二图像相对于标记图像的旋转参数和位移参数。
在步骤304之后,还可以根据第二图像相对于标记图像的位姿参数以及标记图像的位姿参数,获取第二图像的位姿参数,具体过程与获取第一图像的位姿参数的过程类似,在此不再赘述。并且,获取到第二图像的位姿参数后,可以采用滤波器对获取到的位姿参数进行平滑后再输出,避免结果出现抖动。该滤波器可以为kalman(卡尔曼)滤波器或者其他滤波器。
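平滑可以采用 kalman 滤波器或者其他滤波器；下面给出一个对位移参数逐分量做一维卡尔曼平滑的最小示意（常量模型，过程噪声与观测噪声的取值为示例假设，并非本申请实施例的正式实现）：

```python
import numpy as np

class Scalar1DKalman:
    """对单个位姿分量做最简单的一维卡尔曼平滑（常量模型），仅作示意；
    过程噪声 q 与观测噪声 r 的取值为示例假设，可按实际抖动情况调节。"""
    def __init__(self, q=1e-4, r=1e-2):
        self.x, self.p, self.q, self.r = None, 1.0, q, r

    def update(self, z):
        if self.x is None:                     # 第一次观测直接初始化
            self.x = z
            return self.x
        self.p += self.q                       # 预测
        k = self.p / (self.p + self.r)         # 卡尔曼增益
        self.x += k * (z - self.x)             # 校正
        self.p *= (1.0 - k)
        return self.x

# 用法示例：对位移参数的三个分量分别平滑后再输出
filters = [Scalar1DKalman() for _ in range(3)]
def smooth_translation(T):
    return np.array([f.update(t) for f, t in zip(filters, T)])
```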
需要说明的是,本申请实施例仅是以一个标记图像为例进行说明,在另一实施例中,在追踪过程中不仅可以增加特征点,还可以更换标记图像。如当前图像不满足特征点追踪条件时,将当前图像的上一个图像作为更换后的标记图像,基于更换后的标记图像继续进行追踪。通过更换标记图像的方式也可以避免由于相机的位置或姿态变化过多而导致追踪失败。
本申请实施例提供的方法，通过在追踪第一特征点，获取相机拍摄的图像相对于标记图像的位姿参数的过程中，当第一图像不满足特征点追踪条件时，从第一图像中提取第二特征点，通过追踪第一特征点和第二特征点，获取相机拍摄的图像相对于标记图像的位姿参数，从而确定相机的位置和姿态，避免了由于相机的位置或姿态的变化过多而导致无法追踪到特征点的情况，增强了鲁棒性，提高了相机的追踪精度。本申请实施例提供的方法，轻量简单，没有复杂的后端优化，因此计算速度很快，甚至可以做到实时追踪。相对于传统的slam(simultaneous localization and mapping，即时定位与地图构建)算法，本申请实施例提供的方法鲁棒性更强，可以达到非常高的计算精度。
另外,无需预先给定标记图像,只需拍摄当前的场景得到一个图像,设置为初始标记图像,即可实现标记图像的初始化,摆脱了必须预先给定标记图像的限制,扩展了应用范围。
另外,采用栅格区域来筛选特征点,可以保证一个栅格区域中仅包括一个特征点,不会出现多个特征点集中在同一个区域的情况,保证了特征点之间的空间分散性,从而提高了追踪精确度。
另外,采用分解单应性矩阵的方式来获取位姿参数,避免了复杂的追踪算法,使结果更加的稳定平滑,不会出现抖动,尤其适用于AR场景。
本申请实施例中,位姿参数可以包括位移参数和旋转参数,位移参数用于表示相机的位移情况,可以确定相机在三维空间内位置的变化,而旋转参数用于表示相机的旋转角度的变化,可以确定相机在三维空间内姿态的变化。通过执行上述步骤可以获取到相机的位移参数和旋转参数。或者,通过执行上述步骤可以获取到相机的位移参数而不获取旋转参数,相机的旋转参数的获取过程详见下述实施例。
图6是本申请实施例提供的一种位姿确定方法的流程图,该位姿确定方法的执行主体为智能设备,该智能设备可以为配置有相机的手机、平板电脑等终端或者为配置有相机的AR眼镜、AR头盔等AR设备,参见图6,该方法包括:
601、通过IMU(Inertial Measurement Unit,惯性测量单元)获取相机的多个旋转参数以及对应的时间戳。
其中,每个旋转参数对应的时间戳是指获取该旋转参数时的时间戳。
602、根据多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线。
其中,插值算法可以采用Slerp(Spherical Linear Interpolation,球面线性插值)算法或者其他算法。
根据多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线,该旋转参数曲线可以表示相机的旋转参数随拍摄时间的变化规律。
603、当相机拍摄到一个图像时,获取相机拍摄的图像的时间戳,获取该时间戳在旋转参数曲线中对应的旋转参数,作为相机拍摄的图像的旋转参数,根据该旋转参数确定相机的姿态。
由于图像的拍摄频率与IMU的采样频率不匹配,因此通过插值得到旋转参数曲线,根据旋转参数曲线可以进行数据对齐,从而得到图像对应的旋转参数,根据该旋转参数确定相机的姿态。
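下面用 scipy 的 Rotation/Slerp 给出一个由 IMU 旋转参数（四元数）与时间戳构造旋转参数曲线、再按图像时间戳取值的示意性草图（四元数的分量顺序、函数命名等均为示例假设）：

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def build_rotation_curve(imu_timestamps, imu_quats_xyzw):
    """由 IMU 采样的旋转参数(四元数，x,y,z,w 顺序为假设)与时间戳构造旋转参数曲线。
    要求时间戳严格递增，且查询时间落在首尾时间戳之间。"""
    rotations = Rotation.from_quat(np.asarray(imu_quats_xyzw))
    return Slerp(np.asarray(imu_timestamps, dtype=float), rotations)

def rotation_for_image(curve, image_timestamp):
    """取图像时间戳在旋转参数曲线中对应的旋转参数，作为该图像的旋转参数。"""
    return curve([image_timestamp]).as_quat()[0]
```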
实际应用中,智能设备配置有陀螺仪、加速度计和地磁传感器,通过陀螺仪和地磁传感器,可以得到在地球坐标系中唯一的旋转参数。该地图坐标系有以下特点:
1、X轴使用向量积来定义的,在智能设备当前的位置上与地面相切,并指向东方。
2、Y轴在智能设备当前的位置上与地面相切,且指向地磁场的北极。
3、Z轴指向天空,并垂直于地面。
通过该地图坐标系得到的旋转参数可以认为没有误差,而且无需依赖于IMU的参数,避免了IMU的标定问题,可以兼容多种类型的设备。
智能设备提供了获取旋转参数的接口:rotation-vector(旋转矢量)接口,可以按照IMU的采样频率调用rotation-vector接口,从而获取到旋转参数。
智能设备可以将获取到多个旋转参数以及对应的时间戳均存储至IMU队列中,通过读取IMU队列中的数据进行插值得到旋转参数曲线。或者,考虑到上述数据可能会存在噪声,因此为了保证数据的准确性,可以计算获取到的旋转参数与上一个旋转参数之间的角度差,如果该角度差大于预设阈值,可以认为获取到的旋转参数为噪声项,则将该旋转参数删除。通过上述检测可以删除噪声项,仅将通过检测的旋转参数及其对应的时间戳存储至IMU队列中。
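下面给出一个按角度差阈值剔除噪声项、再将通过检测的旋转参数及其时间戳存入 IMU 队列的示意性草图（阈值大小与数据结构均为示例假设）：

```python
import numpy as np
from collections import deque
from scipy.spatial.transform import Rotation

imu_queue = deque()                      # 存放通过检测的 (时间戳, 四元数)
ANGLE_THRESHOLD = np.deg2rad(30.0)       # 预设阈值，数值为示例假设

def push_imu_sample(timestamp, quat_xyzw):
    """计算新旋转参数与上一个旋转参数之间的角度差，大于预设阈值则视为噪声项并丢弃。"""
    if imu_queue:
        r_prev = Rotation.from_quat(imu_queue[-1][1])
        r_new = Rotation.from_quat(quat_xyzw)
        angle_diff = (r_prev.inv() * r_new).magnitude()   # 相对旋转的转角
        if angle_diff > ANGLE_THRESHOLD:
            return False                                  # 噪声项，不入队
    imu_queue.append((timestamp, quat_xyzw))
    return True
```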
本申请实施例提供的方法,通过根据IMU测量的多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线,根据旋转参数曲线可以进行数据对齐,从而根据所拍摄图像的时间戳和旋转参数曲线,获取图像的旋转参数,提高了精确度,无需依赖于IMU的参数,避免了IMU的标定问题,并且考虑到了智能设备计算能力低的问题,通过IMU获取旋转参数可以降低计算量,提高计算速度。另外,将噪声项删除,可以提高数据的准确性,进一步提高精确度。
本申请实施例的操作流程可以如图7所示,参见图7,将智能设备的各个功能划分为多个模块,操作流程如下:
1、通过模块701读取到IMU测量的数据,包括旋转参数和对应的时间戳,通过模块702检测数据是否合理,如果不合理则丢弃该数据,如果合理则通过模块703将数据存储至IMU队列中。
2、通过模块704读取拍摄的图像,通过模块705判断当前是否已设置标记图像。如果未设置标记图像,则通过模块706利用当前拍摄的图像初始化一个标记图像,如果已设置标记图像,则直接通过模块707建立与标记图像的连接,追踪标记图像的特征点。
3、通过模块708联合IMU队列中的数据以及追踪特征点得到的数据,获取到位移参数和旋转参数,计算出当前图像相对于当前标记图像的旋转位移矩阵。
4、通过模块709检测图像的旋转参数和位移参数是否合理,如果是,则送入模块710,通过模块710从当前的图像中扩展新的特征点,并通过追踪特征点计算相机拍摄图像相对于标记图像的旋转位移矩阵。如果否,返回至模块706,利用当前图像重新进行初始化。
5、通过模块711和712对获得的数据结果进行平滑并输出。平滑时可以采用kalman滤波器或者其他滤波器。
综上所述，本申请实施例提供了一套相机姿态追踪算法，将相机的运动过程看做是对标记图像的特征点的追踪过程，追踪过程中通过补充新的特征点来保持连接。针对智能设备计算能力低的特点，利用IMU得到相机相对于初始场景的旋转参数，并将真实场景的图像作为标记图像，通过追踪匹配得到相机相对于标记图像的位移参数，两者结合得到相对于初始场景的位置和姿态变化，从而实现了一套真实自然场景下稳定、快速、鲁棒的相机姿态跟踪系统，不依赖于预先给定的标记图像，在提高计算速度的同时增强了系统的鲁棒性，相机定位精度很高。同时避免了复杂的IMU和图像融合算法，也降低了对参数的敏感性。本申请实施例提供的方法能在移动端流畅运行，且不需要精确的标定。
本申请实施例对应于人眼观测三维空间的场景,旋转参数的影响较大,而假设平面上的位移不大。而在AR场景下,用户通常是在平面场景下和虚拟元素进行互动,如茶几桌子等,则可以认为相机在平面上移动,旋转参数的影响较大。因此本申请实施例非常适用于AR场景。
并且,与切换标记图像的方案相比,本申请实施例无需过于频繁地切换标记图像,而是通过实时补充特征点的方式来避免追踪失败,避免了切换标记图像带来的误差,同时保证数据结果更加平滑精确。
图8是本申请实施例提供的一种位姿确定装置的结构示意图。参见图8,该装置应用于智能设备中,该装置包括:
第一获取模块801,用于执行上述实施例中通过追踪第一特征点获取第一图像相对于标记图像的位姿参数的步骤;
特征点处理模块802,用于执行上述实施例中当第一图像不满足特征点追踪条件时,从第一图像中提取第二特征点的步骤;
第二获取模块803，用于执行上述实施例中通过追踪第一特征点和第二特征点，获取第二图像相对于标记图像的位姿参数，并根据位姿参数确定位姿的步骤。
可选地,该装置还包括:
区域划分模块,用于执行上述实施例中将标记图像划分为多个尺寸相同的栅格区域的步骤;
权重获取模块,用于执行上述实施例中获取从标记图像中提取的每个特征点的权重的步骤;
提取模块,用于执行上述实施例中在划分出的每个栅格区域中提取一个权重最高的特征点的步骤。
可选地,该装置还包括:
数量获取模块,用于执行上述实施例中获取第一图像中追踪到的第一特征点的数量的步骤;
确定模块,用于执行上述实施例中当数量未达到预设数量时,确定第一图像不满足特征点追踪条件的步骤。
可选地,特征点处理模块802用于执行上述实施例中将第一图像划分为多个尺寸相同的栅格区域,获取从第一图像中提取的每个特征点的权重,在不包括第一特征点的栅格区域中提取特征点的步骤。
可选地，第一获取模块801，用于执行上述实施例中获取第一图像相对于标记图像的单应性矩阵，进行分解得到第一图像相对于标记图像的旋转位移矩阵，从旋转位移矩阵中获取第一图像相对于标记图像的位姿参数的步骤。
可选地,第一获取模块801,用于执行上述实施例中对每个图像相对于上一个图像的单应性矩阵进行迭代处理,得到第一图像相对于标记图像的单应性矩阵的步骤。
可选地,第一获取模块801,用于执行上述实施例中对单应性矩阵进行分解,得到第一图像相对于第一坐标系中的标记图像的旋转位移矩阵的步骤,以及对第一图像相对于第一坐标系中的标记图像的旋转位移矩阵进行转换,得到第一图像相对于标记图像的旋转位移矩阵的步骤。
可选地,该装置还包括:
初始化模块,用于执行上述实施例中将拍摄的图像设置为标记图像的步骤。
可选地,位姿参数包括位移参数,该装置还包括:
插值处理模块，用于通过惯性测量单元IMU，获取相机的多个旋转参数以及对应的时间戳，根据多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线；
旋转参数获取模块,用于获取第一图像的时间戳在旋转参数曲线中对应的旋转参数,作为第一图像的旋转参数。
需要说明的是:上述实施例提供的位姿确定装置在确定位姿时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将智能设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的位姿确定装置与位姿确定方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
图9示出了本申请一个示例性实施例提供的终端900的结构框图,终端900用于执行上述方法实施例中智能设备所执行的步骤。
该终端900可以是便携式移动终端,比如:智能手机、平板电脑、MP3播放器(Moving Picture Experts Group Audio Layer III,动态影像专家压缩标准音频层面3)、MP4(Moving Picture Experts Group Audio Layer IV,动态影像专家压缩标准音频层面4)播放器、笔记本电脑或台式电脑,也可以是AR眼镜、AR头盔等AR设备。终端900还可能被称为用户设备、便携式终端、膝上型终端、台式终端等其他名称。
该终端包括:处理器901和存储器902,存储器902中存储有至少一条指令、至少一段程序、代码集或指令集,指令、程序、代码集或指令集由处理器901加载并执行以实现上述实施例中智能设备所执行的操作。
处理器901可以包括一个或多个处理核心，比如4核心处理器、5核心处理器等。处理器901可以采用DSP(Digital Signal Processing，数字信号处理)、FPGA(Field-Programmable Gate Array，现场可编程门阵列)、PLA(Programmable Logic Array，可编程逻辑阵列)中的至少一种硬件形式来实现。处理器901也可以包括主处理器和协处理器，主处理器是用于对在唤醒状态下的数据进行处理的处理器，也称CPU(Central Processing Unit，中央处理器)；协处理器是用于对在待机状态下的数据进行处理的低功耗处理器。在一些实施例中，处理器901可以集成有GPU(Graphics Processing Unit，图像处理器)，GPU用于负责显示屏所需要显示的内容的渲染和绘制。一些实施例中，处理器901还可以包括AI(Artificial Intelligence，人工智能)处理器，该AI处理器用于处理有关机器学习的计算操作。
存储器902可以包括一个或多个计算机可读存储介质，该计算机可读存储介质可以是非暂态的。存储器902还可包括高速随机存取存储器，以及非易失性存储器，比如一个或多个磁盘存储设备、闪存存储设备。在一些实施例中，存储器902中的非暂态的计算机可读存储介质用于存储至少一个指令，该至少一个指令用于被处理器901所执行以实现本申请中方法实施例提供的位姿确定方法。
在一些实施例中,终端900还可选包括有:外围设备接口903和至少一个外围设备。处理器901、存储器902和外围设备接口903之间可以通过总线或信号线相连。各个外围设备可以通过总线、信号线或电路板与外围设备接口903相连。具体地,外围设备包括:射频电路904、触摸显示屏905、摄像头906、音频电路907、定位组件908和电源909中的至少一种。
外围设备接口903可被用于将I/O(Input/Output,输入/输出)相关的至少一个外围设备连接到处理器901和存储器902。在一些实施例中,处理器901、存储器902和外围设备接口903被集成在同一芯片或电路板上;在一些其他实施例中,处理器901、存储器902和外围设备接口903中的任意一个或两个可以在单独的芯片或电路板上实现,本实施例对此不加以限定。
射频电路904用于接收和发射RF(Radio Frequency，射频)信号，也称电磁信号。射频电路904通过电磁信号与通信网络以及其他通信设备进行通信。射频电路904将电信号转换为电磁信号进行发送，或者，将接收到的电磁信号转换为电信号。可选地，射频电路904包括：天线系统、RF收发器、一个或多个放大器、调谐器、振荡器、数字信号处理器、编解码芯片组、用户身份模块卡等等。射频电路904可以通过至少一种无线通信协议来与其它终端进行通信。该无线通信协议包括但不限于：城域网、各代移动通信网络(2G、3G、4G及5G)、无线局域网和/或WiFi(Wireless Fidelity，无线保真)网络。在一些实施例中，射频电路904还可以包括NFC(Near Field Communication，近距离无线通信)有关的电路，本申请对此不加以限定。
显示屏905用于显示UI(User Interface，用户界面)。该UI可以包括图形、文本、图标、视频及其它们的任意组合。当显示屏905是触摸显示屏时，显示屏905还具有采集在显示屏905的表面或表面上方的触摸信号的能力。该触摸信号可以作为控制信号输入至处理器901进行处理。此时，显示屏905还可以用于提供虚拟按钮和/或虚拟键盘，也称软按钮和/或软键盘。在一些实施例中，显示屏905可以为一个，设置在终端900的前面板；在另一些实施例中，显示屏905可以为至少两个，分别设置在终端900的不同表面或呈折叠设计；在再一些实施例中，显示屏905可以是柔性显示屏，设置在终端900的弯曲表面上或折叠面上。甚至，显示屏905还可以设置成非矩形的不规则图形，也即异形屏。显示屏905可以采用LCD(Liquid Crystal Display，液晶显示屏)、OLED(Organic Light-Emitting Diode，有机发光二极管)等材质制备。
摄像头组件906用于采集图像或视频。可选地,摄像头组件906包括前置摄像头和后置摄像头。通常,前置摄像头设置在终端900的前面板,后置摄像头设置在终端的背面。在一些实施例中,后置摄像头为至少两个,分别为主摄像头、景深摄像头、广角摄像头、长焦摄像头中的任意一种,以实现主摄像头和景深摄像头融合实现背景虚化功能、主摄像头和广角摄像头融合实现全景拍摄以及VR(Virtual Reality,虚拟现实)拍摄功能或者其它融合拍摄功能。在一些实施例中,摄像头组件906还可以包括闪光灯。闪光灯可以是单色温闪光灯,也可以是双色温闪光灯。双色温闪光灯是指暖光闪光灯和冷光闪光灯的组合,可以用于不同色温下的光线补偿。
音频电路907可以包括麦克风和扬声器。麦克风用于采集用户及环境的声波,并将声波转换为电信号输入至处理器901进行处理,或者输入至射频电路904以实现语音通信。出于立体声采集或降噪的目的,麦克风可以为多个,分别设置在终端900的不同部位。麦克风还可以是阵列麦克风或全向采集型麦克风。扬声器则用于将来自处理器901或射频电路904的电信号转换为声波。扬声器可以是传统的薄膜扬声器,也可以是压电陶瓷扬声器。当扬声器是压电陶瓷扬声器时,不仅可以将电信号转换为人类可听见的声波,也可以将电信号转换为人类听不见的声波以进行测距等用途。在一些实施例中,音频电路907还可以包括耳机插孔。
定位组件908用于定位终端900的当前地理位置,以实现导航或LBS(Location Based Service,基于位置的服务)。定位组件908可以是基于美国的GPS(Global Positioning System,全球定位系统)、中国的北斗系统、俄罗斯的格雷纳斯系统或欧盟的伽利略系统的定位组件。
电源909用于为终端900中的各个组件进行供电。电源909可以是交流电、直流电、一次性电池或可充电电池。当电源909包括可充电电池时,该可充电电池可以支持有线充电或无线充电。该可充电电池还可以用于支持快充技术。
在一些实施例中,终端900还包括有一个或多个传感器910。该一个或多个传感器910包括但不限于:加速度传感器911、陀螺仪传感器912、压力传感器913、指纹传感器914、光学传感器915以及接近传感器916。
加速度传感器911可以检测以终端900建立的坐标系的三个坐标轴上的加速度大小。比如,加速度传感器911可以用于检测重力加速度在三个坐标轴上的分量。处理器901可以根据加速度传感器911采集的重力加速度信号,控制触摸显示屏905以横向视图或纵向视图进行用户界面的显示。加速度传感器911还可以用于游戏或者用户的运动数据的采集。
陀螺仪传感器912可以检测终端900的机体方向及转动角度,陀螺仪传感器912可以与加速度传感器911协同采集用户对终端900的3D动作。处理器901根据陀螺仪传感器912采集的数据,可以实现如下功能:动作感应(比如根据用户的倾斜操作来改变UI)、拍摄时的图像稳定、游戏控制以及惯性导航。
压力传感器913可以设置在终端900的侧边框和/或触摸显示屏905的下层。当压力传感器913设置在终端900的侧边框时,可以检测用户对终端900的握持信号,由处理器901根据压力传感器913采集的握持信号进行左右手识别或快捷操作。当压力传感器913设置在触摸显示屏905的下层时,由处理器901根据用户对触摸显示屏905的压力操作,实现对UI界面上的可操作性控件进行控制。可操作性控件包括按钮控件、滚动条控件、图标控件、菜单控件中的至少一种。
指纹传感器914用于采集用户的指纹,由处理器901根据指纹传感器914采集到的指纹识别用户的身份,或者,由指纹传感器914根据采集到的指纹识别用户的身份。在识别出用户的身份为可信身份时,由处理器901授权该用户具有相关的敏感操作,该敏感操作包括解锁屏幕、查看加密信息、下载软件、支付及更改设置等。指纹传感器914可以被设置终端900的正面、背面或侧面。当终端900上设置有物理按键或厂商Logo时,指纹传感器914可以与物理按键或厂商标志集成在一起。
光学传感器915用于采集环境光强度。在一个实施例中，处理器901可以根据光学传感器915采集的环境光强度，控制触摸显示屏905的显示亮度。具体地，当环境光强度较高时，调高触摸显示屏905的显示亮度；当环境光强度较低时，调低触摸显示屏905的显示亮度。在另一个实施例中，处理器901还可以根据光学传感器915采集的环境光强度，动态调整摄像头组件906的拍摄参数。
接近传感器916,也称距离传感器,通常设置在终端900的前面板。接近传感器916用于采集用户与终端900的正面之间的距离。在一个实施例中,当接近传感器916检测到用户与终端900的正面之间的距离逐渐变小时,由处理器901控制触摸显示屏905从亮屏状态切换为息屏状态;当接近传感器916检测到用户与终端900的正面之间的距离逐渐变大时,由处理器901控制触摸显示屏905从息屏状态切换为亮屏状态。
本领域技术人员可以理解,图9中示出的结构并不构成对终端900的限定,可以包括比图示更多或更少的组件,或者组合某些组件,或者采用不同的组件布置。
本申请实施例还提供了一种位姿确定装置，该位姿确定装置包括处理器和存储器，存储器中存储有至少一条指令、至少一段程序、代码集或指令集，指令、程序、代码集或指令集由处理器加载并执行以实现上述实施例的位姿确定方法中所具有的操作。
本申请实施例还提供了一种计算机可读存储介质，该计算机可读存储介质中存储有至少一条指令、至少一段程序、代码集或指令集，该指令、该程序、该代码集或该指令集由处理器加载并执行以实现上述实施例的位姿确定方法中所具有的操作。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本申请的可选实施例,并不用以限制本申请,凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (26)

  1. 一种位姿确定方法,其特征在于,应用于智能设备,所述方法包括:
    通过追踪第一特征点,获取相机拍摄的第一图像相对于标记图像的位姿参数,所述第一特征点通过从所述标记图像中提取特征点得到;
    当所述第一图像不满足特征点追踪条件时,从所述第一图像中提取第二特征点,所述第二特征点与所述第一特征点不同;
    通过追踪所述第一特征点和所述第二特征点,获取所述相机拍摄的第二图像相对于所述标记图像的位姿参数,并根据所述位姿参数确定所述相机的位姿,所述第二图像为所述相机在所述第一图像之后拍摄的图像。
  2. 根据权利要求1所述的方法,其特征在于,所述通过追踪第一特征点,获取相机拍摄的第一图像相对于标记图像的位姿参数之前,所述方法还包括:
    将所述标记图像划分为多个尺寸相同的栅格区域;
    获取从所述标记图像中提取的每个特征点的权重,所述特征点的权重用于表示所述特征点的梯度大小;
    在划分出的每个栅格区域中提取一个权重最高的特征点,作为第一特征点,直至所述标记图像中所有栅格区域中均提取了所述第一特征点或者直至所述标记图像中提取的所述第一特征点的数量达到预设数量时为止。
  3. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    获取所述第一图像中追踪到的所述第一特征点的数量;
    当所述数量未达到预设数量时,确定所述第一图像不满足所述特征点追踪条件。
  4. 根据权利要求1所述的方法,其特征在于,所述当所述第一图像不满足特征点追踪条件时,从所述第一图像中提取第二特征点,包括:
    将所述第一图像划分为多个尺寸相同的栅格区域;
    获取从所述第一图像中提取的每个特征点的权重,所述特征点的权重用于表示所述特征点的梯度大小;
    在不包括所述第一特征点的栅格区域中提取一个权重最高的特征点，作为第二特征点，直至所述第一图像中所有栅格区域均提取了所述第一特征点或所述第二特征点，或者直至所述第一图像中提取的所述第二特征点的数量和所述第一图像中追踪到的所述第一特征点的数量之和达到预设数量时为止。
  5. 根据权利要求1所述的方法,其特征在于,所述通过追踪第一特征点,获取相机拍摄的第一图像相对于标记图像的位姿参数,包括:
    通过追踪所述第一特征点,获取所述第一图像相对于所述标记图像的单应性矩阵;
    根据旋转位移矩阵应满足的预设约束条件,对所述单应性矩阵进行分解,得到所述第一图像相对于所述标记图像的旋转位移矩阵;
    从所述旋转位移矩阵中获取所述第一图像相对于所述标记图像的位姿参数。
  6. 根据权利要求5所述的方法,其特征在于,所述通过追踪所述第一特征点,获取所述第一图像相对于所述标记图像的单应性矩阵,包括:
    通过在从所述标记图像的下一个图像至所述第一图像的每个图像中追踪所述第一特征点,获取所述每个图像相对于上一个图像的单应性矩阵;
    对所述每个图像相对于上一个图像的单应性矩阵进行迭代处理,得到所述第一图像相对于所述标记图像的单应性矩阵。
  7. 根据权利要求5所述的方法,其特征在于,所述根据旋转位移矩阵应满足的预设约束条件,对所述单应性矩阵进行分解得到所述第一图像相对于所述标记图像的旋转位移矩阵,包括:
    根据所述预设约束条件对所述单应性矩阵进行分解,得到所述第一图像相对于第一坐标系中的所述标记图像的旋转位移矩阵,所述第一坐标系为所述标记图像的图像坐标系向z轴的负方向平移一个单位形成的坐标系;
    根据所述第一坐标系与所述标记图像的图像坐标系之间的转换关系,对所述第一图像相对于所述第一坐标系中的所述标记图像的旋转位移矩阵进行转换,得到所述第一图像相对于所述标记图像的旋转位移矩阵。
  8. 根据权利要求7所述的方法,其特征在于,所述预设约束条件包括旋转位移矩阵中旋转参数矩阵的列向量为单位矩阵,且旋转参数矩阵的第一列与第二列的乘积等于第三列;所述根据所述预设约束条件对所述单应性矩阵进行分解,得到所述第一图像相对于第一坐标系中的所述标记图像的旋转位移矩阵,包括:
    采用以下公式,对所述单应性矩阵进行分解:
    Figure PCTCN2019079342-appb-100001
    其中,
    Figure PCTCN2019079342-appb-100002
    表示所述单应性矩阵,
    Figure PCTCN2019079342-appb-100003
    Rcm表示所述第一图像相对于所述第一坐标系中的所述标记图像的旋转参数矩阵,Tcm表示所述第一图像相对于所述第一坐标系中的所述标记图像的位移参数矩阵,g表示归一化因子,P表示所述相机的透视投影参数;
    根据所述预设约束条件,计算出所述第一图像相对于所述第一坐标系中的所述标记图像的旋转参数矩阵Rcm和位移参数矩阵Tcm。
  9. 根据权利要求7所述的方法,其特征在于,所述根据所述第一坐标系与所述标记图像的图像坐标系之间的转换关系,对所述第一图像相对于所述第一坐标系中的所述标记图像的旋转位移矩阵进行转换,得到所述第一图像相对于所述标记图像的旋转位移矩阵,包括:
    采用如下公式进行转换,得到所述第一图像相对于所述标记图像的旋转位移矩阵:
    Figure PCTCN2019079342-appb-100004
    其中,Rcm表示所述第一图像相对于所述第一坐标系中的所述标记图像的旋转参数矩阵,Tcm表示所述第一图像相对于所述第一坐标系中的所述标记图像的位移参数矩阵;Rca表示所述第一图像相对于所述标记图像的旋转参数矩阵,Tca 表示所述第一图像相对于所述标记图像的位移参数矩阵。
  10. 根据权利要求5所述的方法,其特征在于,所述获取相机拍摄的第一图像相对于标记图像的位姿参数之后,所述方法还包括:
    根据所述第一图像相对于所述标记图像的旋转位移矩阵,以及所述标记图像的旋转位移矩阵,采用以下公式,获取所述第一图像的旋转位移矩阵:
    Figure PCTCN2019079342-appb-100005
    s表示所述第一图像的深度;
    R_final表示所述第一图像的旋转参数矩阵,T_final表示所述第一图像的位移参数矩阵;
    Rca表示所述第一图像相对于所述标记图像的旋转参数矩阵,Tca表示所述第一图像相对于所述标记图像的位移参数矩阵;
    R_first表示所述标记图像的旋转参数矩阵,T_first表示所述标记图像的位移参数矩阵。
  11. 根据权利要求1所述的方法,其特征在于,所述通过追踪第一特征点,获取相机拍摄的第一图像相对于标记图像的位姿参数之前,所述方法还包括:
    如果未设置标记图像,则获取所述相机拍摄的图像;
    当从所述拍摄的图像中提取到的特征点数量达到预设数量时,将所述拍摄的图像设置为所述标记图像。
  12. 根据权利要求1-11任一项所述的方法,其特征在于,所述位姿参数包括位移参数,所述方法还包括:
    通过惯性测量单元IMU,获取所述相机的多个旋转参数以及对应的时间戳,根据所述多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线;
    获取所述第一图像的时间戳在所述旋转参数曲线中对应的旋转参数,作为所述第一图像的旋转参数。
  13. 一种位姿确定装置,其特征在于,所述装置包括:
    第一获取模块,用于通过追踪第一特征点,获取相机拍摄的第一图像相对 于标记图像的位姿参数,所述第一特征点通过从所述标记图像中提取特征点得到;
    特征点处理模块,用于当所述第一图像不满足特征点追踪条件时,从所述第一图像中提取第二特征点,所述第二特征点与所述第一特征点不同;
    第二获取模块,用于通过追踪所述第一特征点和所述第二特征点,获取所述相机拍摄的第二图像相对于所述标记图像的位姿参数,并根据所述位姿参数确定所述相机的位姿,所述第二图像为所述相机在所述第一图像之后拍摄的图像。
  14. 一种智能设备,其特征在于,所述智能设备包括:处理器和存储器,所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    通过追踪第一特征点,获取相机拍摄的第一图像相对于标记图像的位姿参数,所述第一特征点通过从所述标记图像中提取特征点得到;
    当所述第一图像不满足特征点追踪条件时,从所述第一图像中提取第二特征点,所述第二特征点与所述第一特征点不同;
    通过追踪所述第一特征点和所述第二特征点,获取所述相机拍摄的第二图像相对于所述标记图像的位姿参数,并根据所述位姿参数确定所述相机的位姿,所述第二图像为所述相机在所述第一图像之后拍摄的图像。
  15. 根据权利要求14所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    将所述标记图像划分为多个尺寸相同的栅格区域;
    获取从所述标记图像中提取的每个特征点的权重,所述特征点的权重用于表示所述特征点的梯度大小;
    在划分出的每个栅格区域中提取一个权重最高的特征点,作为第一特征点,直至所述标记图像中所有栅格区域中均提取了所述第一特征点或者直至所述标记图像中提取的所述第一特征点的数量达到预设数量时为止。
  16. 根据权利要求14所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    获取所述第一图像中追踪到的所述第一特征点的数量;
    当所述数量未达到预设数量时,确定所述第一图像不满足所述特征点追踪条件。
  17. 根据权利要求14所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    将所述第一图像划分为多个尺寸相同的栅格区域;
    获取从所述第一图像中提取的每个特征点的权重,所述特征点的权重用于表示所述特征点的梯度大小;
    在不包括所述第一特征点的栅格区域中提取一个权重最高的特征点,作为第二特征点,直至所述第一图像中所有栅格区域均提取了所述第一特征点或所述第二特征点,或者直至所述第一图像中提取的所述第二特征点的数量和所述第一图像中追踪到的所述第一特征点的数量之和达到预设数量时为止。
  18. 根据权利要求14所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    通过追踪所述第一特征点,获取所述第一图像相对于所述标记图像的单应性矩阵;
    根据旋转位移矩阵应满足的预设约束条件,对所述单应性矩阵进行分解得到所述第一图像相对于所述标记图像的旋转位移矩阵,从所述旋转位移矩阵中获取所述第一图像相对于所述标记图像的位姿参数。
  19. 根据权利要求18所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    通过在从所述标记图像的下一个图像至所述第一图像的每个图像中追踪所述第一特征点,获取所述每个图像相对于上一个图像的单应性矩阵;
    对所述每个图像相对于上一个图像的单应性矩阵进行迭代处理,得到所述第一图像相对于所述标记图像的单应性矩阵。
  20. 根据权利要求18所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    根据所述预设约束条件对所述单应性矩阵进行分解,得到所述第一图像相对于第一坐标系中的所述标记图像的旋转位移矩阵,所述第一坐标系为所述标记图像的图像坐标系向z轴的负方向平移一个单位形成的坐标系;
    根据所述第一坐标系与所述标记图像的图像坐标系之间的转换关系,对所述第一图像相对于所述第一坐标系中的所述标记图像的旋转位移矩阵进行转换,得到所述第一图像相对于所述标记图像的旋转位移矩阵。
  21. 根据权利要求20所述的智能设备,其特征在于,所述预设约束条件包括旋转位移矩阵中旋转参数矩阵的列向量为单位矩阵,且旋转参数矩阵的第一列与第二列的乘积等于第三列;所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    采用以下公式,对所述单应性矩阵进行分解:
    Figure PCTCN2019079342-appb-100006
    其中,
    Figure PCTCN2019079342-appb-100007
    表示所述单应性矩阵,
    Figure PCTCN2019079342-appb-100008
    Rcm表示所述第一图像相对于所述第一坐标系中的所述标记图像的旋转参数矩阵,Tcm表示所述第一图像相对于所述第一坐标系中的所述标记图像的位移参数矩阵,g表示归一化因子,P表示所述相机的透视投影参数;
    根据所述预设约束条件,计算出所述第一图像相对于所述第一坐标系中的所述标记图像的旋转参数矩阵Rcm和位移参数矩阵Tcm。
  22. 根据权利要求20所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    采用如下公式进行转换，得到所述第一图像相对于所述标记图像的旋转位移矩阵：
    Figure PCTCN2019079342-appb-100009
    其中,Rcm表示所述第一图像相对于所述第一坐标系中的所述标记图像的旋转参数矩阵,Tcm表示所述第一图像相对于所述第一坐标系中的所述标记图像的位移参数矩阵;Rca表示所述第一图像相对于所述标记图像的旋转参数矩阵,Tca表示所述第一图像相对于所述标记图像的位移参数矩阵。
  23. 根据权利要求18所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    根据所述第一图像相对于所述标记图像的旋转位移矩阵,以及所述标记图像的旋转位移矩阵,采用以下公式,获取所述第一图像的旋转位移矩阵:
    Figure PCTCN2019079342-appb-100010
    s表示所述第一图像的深度;
    R_final表示所述第一图像的旋转参数矩阵,T_final表示所述第一图像的位移参数矩阵;
    Rca表示所述第一图像相对于所述标记图像的旋转参数矩阵,Tca表示所述第一图像相对于所述标记图像的位移参数矩阵;
    R_first表示所述标记图像的旋转参数矩阵,T_first表示所述标记图像的位移参数矩阵。
  24. 根据权利要求14所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    如果未设置标记图像,则获取所述相机拍摄的图像;
    当从所述拍摄的图像中提取到的特征点数量达到预设数量时,将所述拍摄的图像设置为所述标记图像。
  25. 根据权利要求14-24任一项所述的智能设备，其特征在于，所述位姿参数包括位移参数，所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作：
    通过惯性测量单元IMU,获取所述相机的多个旋转参数以及对应的时间戳,根据所述多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线;
    获取所述第一图像的时间戳在所述旋转参数曲线中对应的旋转参数,作为所述第一图像的旋转参数。
  26. 一种计算机可读存储介质，其特征在于，所述计算机可读存储介质中存储有至少一条指令、至少一段程序、代码集或指令集，所述指令、所述程序、所述代码集或所述指令集由处理器加载并执行以实现如权利要求1至12任一权利要求所述的位姿确定方法中所具有的操作。
PCT/CN2019/079342 2018-04-27 2019-03-22 位姿确定方法、装置、智能设备及存储介质 WO2019205851A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP19793479.7A EP3786896A4 (en) 2018-04-27 2019-03-22 METHOD AND DEVICE FOR DETERMINING POSE, INTELLIGENT DEVICE AND INFORMATION MEDIA
US16/913,144 US11222440B2 (en) 2018-04-27 2020-06-26 Position and pose determining method, apparatus, smart device, and storage medium
US17/543,515 US11798190B2 (en) 2018-04-27 2021-12-06 Position and pose determining method, apparatus, smart device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810391549.6 2018-04-27
CN201810391549.6A CN108682036B (zh) 2018-04-27 2018-04-27 位姿确定方法、装置及存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/913,144 Continuation US11222440B2 (en) 2018-04-27 2020-06-26 Position and pose determining method, apparatus, smart device, and storage medium

Publications (1)

Publication Number Publication Date
WO2019205851A1 true WO2019205851A1 (zh) 2019-10-31

Family

ID=63802533

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/079342 WO2019205851A1 (zh) 2018-04-27 2019-03-22 位姿确定方法、装置、智能设备及存储介质

Country Status (4)

Country Link
US (2) US11222440B2 (zh)
EP (1) EP3786896A4 (zh)
CN (2) CN110599549B (zh)
WO (1) WO2019205851A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114302214A (zh) * 2021-01-18 2022-04-08 海信视像科技股份有限公司 一种虚拟现实设备及防抖动录屏方法

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108876854B (zh) 2018-04-27 2022-03-08 腾讯科技(深圳)有限公司 相机姿态追踪过程的重定位方法、装置、设备及存储介质
CN110599549B (zh) * 2018-04-27 2023-01-10 腾讯科技(深圳)有限公司 界面显示方法、装置及存储介质
CN108734736B (zh) 2018-05-22 2021-10-26 腾讯科技(深圳)有限公司 相机姿态追踪方法、装置、设备及存储介质
CN110557522A (zh) * 2018-05-31 2019-12-10 阿里巴巴集团控股有限公司 一种去除视频抖动的方法及装置
CN109801379B (zh) * 2019-01-21 2023-02-17 视辰信息科技(上海)有限公司 通用的增强现实眼镜及其标定方法
CN112406608B (zh) * 2019-08-23 2022-06-21 国创移动能源创新中心(江苏)有限公司 充电桩及其自动充电装置和方法
CN111147741B (zh) * 2019-12-27 2021-08-13 Oppo广东移动通信有限公司 基于对焦处理的防抖方法和装置、电子设备、存储介质
US11195303B2 (en) * 2020-01-29 2021-12-07 Boston Polarimetrics, Inc. Systems and methods for characterizing object pose detection and measurement systems
CN113382156A (zh) * 2020-03-10 2021-09-10 华为技术有限公司 获取位姿的方法及装置
CN112037261A (zh) * 2020-09-03 2020-12-04 北京华捷艾米科技有限公司 一种图像动态特征去除方法及装置
CN112798812B (zh) * 2020-12-30 2023-09-26 中山联合汽车技术有限公司 基于单目视觉的目标测速方法
KR20220122287A (ko) * 2021-02-26 2022-09-02 삼성전자주식회사 증강 현실 제공 장치의 포즈 결정 방법 및 장치
CN113094457B (zh) * 2021-04-15 2023-11-03 成都纵横自动化技术股份有限公司 一种数字正射影像地图的增量式生成方法及相关组件
CN113390408A (zh) * 2021-06-30 2021-09-14 深圳市优必选科技股份有限公司 一种机器人定位方法、装置、机器人及存储介质
CN113628275B (zh) * 2021-08-18 2022-12-02 北京理工大学深圳汽车研究院(电动车辆国家工程实验室深圳研究院) 一种充电口位姿估计方法、系统、充电机器人及存储介质
CN113674320B (zh) * 2021-08-24 2024-03-22 湖南国科微电子股份有限公司 视觉导航特征点获取方法、装置和计算机设备
GB2616644B (en) * 2022-03-16 2024-06-19 Sony Interactive Entertainment Inc Input system
CN114723924A (zh) * 2022-03-23 2022-07-08 杭州易现先进科技有限公司 一种大场景增强现实的定位方法、系统、装置和介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120293635A1 (en) * 2011-05-17 2012-11-22 Qualcomm Incorporated Head pose estimation using rgbd camera
CN107869989A (zh) * 2017-11-06 2018-04-03 东北大学 一种基于视觉惯导信息融合的定位方法及系统
CN108682036A (zh) * 2018-04-27 2018-10-19 腾讯科技(深圳)有限公司 位姿确定方法、装置及存储介质

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002073955A1 (en) * 2001-03-13 2002-09-19 Canon Kabushiki Kaisha Image processing apparatus, image processing method, studio apparatus, storage medium, and program
NO327279B1 (no) * 2007-05-22 2009-06-02 Metaio Gmbh Kamerapositurestimeringsanordning og- fremgangsmate for foroket virkelighetsavbildning
US8406507B2 (en) * 2009-01-14 2013-03-26 A9.Com, Inc. Method and system for representing image patches
CN102819845A (zh) * 2011-06-07 2012-12-12 中兴通讯股份有限公司 一种混合特征的跟踪方法和装置
US9311883B2 (en) * 2011-11-11 2016-04-12 Microsoft Technology Licensing, Llc Recalibration of a flexible mixed reality device
GB2506338A (en) * 2012-07-30 2014-04-02 Sony Comp Entertainment Europe A method of localisation and mapping
CN103198488B (zh) * 2013-04-16 2016-08-24 北京天睿空间科技有限公司 Ptz监控摄像机实时姿态快速估算方法
CN104748746B (zh) * 2013-12-29 2017-11-03 刘进 智能机姿态测定及虚拟现实漫游方法
WO2016017253A1 (ja) * 2014-08-01 2016-02-04 ソニー株式会社 情報処理装置、および情報処理方法、並びにプログラム
US9928656B2 (en) * 2015-09-11 2018-03-27 Futurewei Technologies, Inc. Markerless multi-user, multi-object augmented reality on mobile devices
US9648303B1 (en) * 2015-12-15 2017-05-09 Disney Enterprises, Inc. Systems and methods for facilitating three-dimensional reconstruction of scenes from videos
JP2017129567A (ja) * 2016-01-20 2017-07-27 キヤノン株式会社 情報処理装置、情報処理方法、プログラム
JP6701930B2 (ja) * 2016-04-28 2020-05-27 富士通株式会社 オーサリング装置、オーサリング方法およびオーサリングプログラム
EP3264372A1 (en) * 2016-06-30 2018-01-03 Alcatel Lucent Image processing device and method
CN106843456B (zh) * 2016-08-16 2018-06-29 深圳超多维光电子有限公司 一种基于姿态追踪的显示方法、装置和虚拟现实设备
CN106373141B (zh) * 2016-09-14 2019-05-28 上海航天控制技术研究所 空间慢旋碎片相对运动角度和角速度的跟踪系统和跟踪方法
WO2018152214A1 (en) * 2017-02-14 2018-08-23 The Trustees Of The University Of Pennsylvania Event-based feature tracking
CN106920259B (zh) * 2017-02-28 2019-12-06 武汉工程大学 一种定位方法及系统
JP6924799B2 (ja) * 2019-07-05 2021-08-25 株式会社スクウェア・エニックス プログラム、画像処理方法及び画像処理システム

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120293635A1 (en) * 2011-05-17 2012-11-22 Qualcomm Incorporated Head pose estimation using rgbd camera
CN107869989A (zh) * 2017-11-06 2018-04-03 东北大学 一种基于视觉惯导信息融合的定位方法及系统
CN108682036A (zh) * 2018-04-27 2018-10-19 腾讯科技(深圳)有限公司 位姿确定方法、装置及存储介质

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MIAO JINGHUA , SUN YANKUI: "Real-time Camera Pose Tracking with Locating Image Patching Scales and Regions", JOURNAL OF IMAGE AND GRAPHICS, vol. 22, no. 7, 16 July 2017 (2017-07-16), pages 957 - 968, XP055736534, ISSN: 1006-8961, DOI: 10.11834/jig.160612 *
ZHANG PEIKE1, WU YUANXIN ,CAI QI: "Improved Solution for Relative Pose Based on Homography Matrix", COMPUTER ENGINEERING AND APPLICATIONS, vol. 53, no. 15, 31 August 2017 (2017-08-31), pages 25 - 30, XP055736538, ISSN: 1002-8331, DOI: 10.3778/j.issn.1002-8331.1704-0406 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114302214A (zh) * 2021-01-18 2022-04-08 海信视像科技股份有限公司 一种虚拟现实设备及防抖动录屏方法

Also Published As

Publication number Publication date
US20220092813A1 (en) 2022-03-24
CN108682036B (zh) 2022-10-25
EP3786896A4 (en) 2022-01-12
CN108682036A (zh) 2018-10-19
US11222440B2 (en) 2022-01-11
US11798190B2 (en) 2023-10-24
CN110599549A (zh) 2019-12-20
EP3786896A1 (en) 2021-03-03
US20200327692A1 (en) 2020-10-15
CN110599549B (zh) 2023-01-10

Similar Documents

Publication Publication Date Title
WO2019205851A1 (zh) 位姿确定方法、装置、智能设备及存储介质
WO2019205850A1 (zh) 位姿确定方法、装置、智能设备及存储介质
CN108682038B (zh) 位姿确定方法、装置及存储介质
US11205282B2 (en) Relocalization method and apparatus in camera pose tracking process and storage medium
CN108734736B (zh) 相机姿态追踪方法、装置、设备及存储介质
CN109947886B (zh) 图像处理方法、装置、电子设备及存储介质
WO2019205853A1 (zh) 相机姿态追踪过程的重定位方法、装置、设备及存储介质
US11276183B2 (en) Relocalization method and apparatus in camera pose tracking process, device, and storage medium
CN110148178B (zh) 相机定位方法、装置、终端及存储介质
WO2019154231A1 (zh) 图像处理方法、电子设备及存储介质
CN109886208B (zh) 物体检测的方法、装置、计算机设备及存储介质
WO2019134305A1 (zh) 确定姿态的方法、装置、智能设备、存储介质和程序产品
CN113160031B (zh) 图像处理方法、装置、电子设备及存储介质
CN113033590B (zh) 图像特征匹配方法、装置、图像处理设备及存储介质
CN114093020A (zh) 动作捕捉方法、装置、电子设备及存储介质
CN113409235B (zh) 一种灭点估计的方法及装置
CN116502382A (zh) 传感器数据的处理方法、装置、设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19793479

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2019793479

Country of ref document: EP