WO2019205850A1 - Pose determination method and apparatus, smart device, and storage medium

Pose determination method and apparatus, smart device, and storage medium

Info

Publication number
WO2019205850A1
WO2019205850A1 (PCT/CN2019/079341)
Authority
WO
WIPO (PCT)
Prior art keywords
image
parameter
marker
pose
camera
Prior art date
Application number
PCT/CN2019/079341
Other languages
English (en)
French (fr)
Inventor
林祥凯
乔亮
朱峰明
左宇
杨泽宇
凌永根
暴林超
Original Assignee
腾讯科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP19792476.4A, published as EP3786893A4
Publication of WO2019205850A1
Priority to US16/917,069, granted as US11158083B2

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/248 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments, involving reference images or patches
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4007 Scaling of whole images or parts thereof based on interpolation, e.g. bilinear interpolation
    • G06T 7/50 Depth or shape recovery
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30204 Marker

Definitions

  • The embodiments of the present application relate to the field of computer technologies, and in particular, to a pose determination method and apparatus, a smart device, and a storage medium.
  • AR (Augmented Reality) technology superimposes virtual elements, such as a virtual image, a video, or a three-dimensional model, onto images of a real scene, so that the virtual scene is displayed together with the actual scene; it is currently a popular research direction in the field of computer vision.
  • The related art proposes a method for determining the camera position and posture by tracking feature points in a marker image: a template image is defined in advance, and feature points are extracted from it; as the position or posture of the camera changes, the extracted feature points are tracked. Each time the camera captures a new image, the feature points of the template image are recognized in the current image, and their positions in the current image are compared with their positions in the template image to obtain the pose parameters of the current image relative to the template image, such as a rotation parameter and a displacement parameter, which indicate the position and posture of the camera when the current image was captured.
  • The inventors have found that the above related art has at least the following problem: when the position or posture of the camera changes too much and the feature points of the template image no longer appear in the current image, the feature points cannot be tracked, and it becomes impossible to determine the position and posture of the camera.
  • The embodiments of the present application provide a pose determination method and apparatus, a smart device, and a storage medium, which can solve the problems of the related art.
  • the technical solution is as follows:
  • a pose determination method comprising:
  • when the previous image of the first image satisfies the feature point tracking condition and the first image does not satisfy the feature point tracking condition, using the previous image of the first image as the second marker image;
  • a pose determining apparatus comprising:
  • a first acquiring module configured to acquire a pose parameter of an image captured by the camera by tracking a feature point of the first marked image
  • a switching module configured to: when the previous image of the first image satisfies the feature point tracking condition, and the first image does not satisfy the feature point tracking condition, use the previous image of the first image as the second marker image;
  • a second acquiring module configured to acquire a pose parameter of the image captured by the camera with respect to the second marker image by tracking feature points of the second marker image
  • the second acquiring module is further configured to acquire the pose parameter of the image according to the pose parameter of the image relative to the second marker image and the pose parameter of each marker image relative to its previous marker image, and to determine the pose of the camera according to the pose parameter.
  • A smart device is provided, comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by the processor to perform the following operations:
  • when the previous image of the first image satisfies the feature point tracking condition and the first image does not satisfy the feature point tracking condition, using the previous image of the first image as the second marker image;
  • A fourth aspect provides a computer-readable storage medium having stored therein at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded by the processor and executed to implement the pose determination method described in the first aspect.
  • With the method, the apparatus, the smart device, and the storage medium provided by the embodiments of the present application, in the process of acquiring pose parameters of images captured by the camera by tracking feature points of the first marker image, when the previous image of the first image satisfies the feature point tracking condition and the first image does not, the previous image of the first image is used as the second marker image. The feature points of the second marker image are then tracked, the pose parameter of each image is acquired according to the pose parameter of the image captured by the camera relative to the second marker image and the pose parameter of each marker image relative to its previous marker image, and the pose of the camera is determined according to the pose parameter. When the first marker image can no longer be tracked, the marker image is switched, and the position and posture of the camera are determined by tracking the feature points of the switched marker image, which avoids the problem of being unable to track feature points when the position or posture of the camera changes too much, enhances robustness, and improves camera tracking accuracy.
  • FIG. 1 is a schematic diagram of display of a scene interface provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of display of another scene interface provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a pose determination method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an image provided by an embodiment of the present application.
  • FIG. 5 is a flowchart of a pose determination method according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an operation flow provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a pose determining apparatus according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • The embodiment of the present application provides a pose determination method, applied to scenarios in which a smart device tracks the position and posture of the camera, especially AR scenarios: when the smart device uses AR technology for display, such as displaying an AR game, an AR video, and the like, the camera's position and posture need to be tracked.
  • the smart device is configured with a camera and a display unit, the camera is used to capture an image of a real scene, and the display unit is configured to display a scene interface composed of a real scene and a virtual scene.
  • The smart device can track the changes in the position and posture of the camera as the camera moves, can capture images of the real scene, and can sequentially display the captured images according to the changes in the camera's position and posture, thereby simulating the display effect of a three-dimensional interface.
  • virtual elements such as virtual images, virtual videos, or virtual three-dimensional models, may be added to the displayed image.
  • Virtual elements may be displayed in different orientations according to changes in the position and posture of the camera, thereby simulating the effect of displaying three-dimensional virtual elements.
  • the image of the real scene is displayed in combination with the virtual elements to form a scene interface, thereby simulating the effect that the real scene and the virtual element are in the same three-dimensional space.
  • For example, the smart device adds a virtual character image to a captured image that includes a table and a teacup. As the camera moves, the captured image changes and the display orientation of the virtual character changes accordingly, simulating the effect that the virtual character remains still in the scene relative to the table and the teacup while the camera captures the table, the teacup, and the virtual character from changing positions and postures, presenting a realistic stereoscopic picture to the user.
  • FIG. 3 is a flowchart of a method for determining a pose according to an embodiment of the present application.
  • The execution body of the pose determination method is a smart device; the smart device may be a terminal configured with a camera, such as a mobile phone or a tablet computer, or AR equipment configured with a camera, such as AR glasses or an AR helmet. Referring to FIG. 3, the method includes:
  • the smart device acquires an image captured by the camera, and sets the captured image as the first marker image.
  • the first mark image is the initial mark image.
  • Specifically, the smart device can capture images through the camera, acquire the image currently captured by the camera, and set that image as the first marker image, thereby initializing the marker image; subsequently, the smart device can obtain the pose parameters of each captured image by tracking the feature points of the first marker image.
  • The camera can shoot according to a preset period, capturing one image per preset period; the preset period can be, for example, 0.1 seconds or 0.01 seconds.
  • Feature points may be extracted from the image, and it is determined whether the number of extracted feature points reaches a preset number. When the number of feature points extracted from the image reaches the preset number, the image is set as the first marker image; when it does not reach the preset number, the image is not set as the first marker image, and the next image captured by the camera is acquired instead, until an image is obtained whose number of extracted feature points reaches the preset number, and that image is set as the first marker image.
  • The feature extraction algorithm used to extract the feature points may be the FAST (Features from Accelerated Segment Test) detection algorithm, the Shi-Tomasi corner detection algorithm, the Harris corner detection algorithm, or the like; the preset number can be determined according to the required tracking accuracy.
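  • As an illustration only, a minimal sketch of extracting feature points from a candidate marker image and checking them against the preset number might look like the following (OpenCV's Shi-Tomasi detector is used here; the threshold value and all names are assumptions, not values from the patent):

```python
import cv2

PRESET_NUMBER = 50  # assumed threshold; the patent leaves this to the required tracking accuracy

def try_set_first_marker(frame_gray):
    """Extract Shi-Tomasi corners and accept the frame as the first marker image
    only if enough feature points are found; otherwise wait for the next frame."""
    corners = cv2.goodFeaturesToTrack(frame_gray, maxCorners=200,
                                      qualityLevel=0.01, minDistance=10)
    if corners is None or len(corners) < PRESET_NUMBER:
        return None  # do not set this image as the first marker image
    return corners.reshape(-1, 2)  # N x 2 array of (x, y) feature points
```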
  • the mark image may be switched with the movement of the camera.
  • In the embodiment of the present application, the initial marker image is used as a reference, and the pose parameter of each image relative to the initial marker image is used as the pose parameter of that image; the pose parameter indicates the position and posture of the camera when the corresponding image was captured.
  • The second point to be noted is that the above description takes the first marker image as the initial marker image as an example. The first marker image may also be a marker image set after the initial marker image; that is, in another embodiment, before the first marker image, the smart device may have set another marker image and switched, once or several times, to the first marker image. The specific switching process is similar to the process of switching from the first marker image to the second marker image described below, and is not repeated here.
  • the feature points extracted from the first marker image are taken as target feature points to be tracked.
  • the smart device captures at least one image through the camera, and by tracking the feature points in the at least one image, a pose parameter of each image relative to the previous image is obtained.
  • For any two adjacent images, optical flow is performed using the feature points extracted from the previous image, so that matching feature points between the previous image and the next image are found and the optical flow information of the matching feature points is obtained. The optical flow information indicates the motion of the matching feature points between the two adjacent images, and the pose parameter of the second of the two adjacent images relative to the first may be determined according to the optical flow information of the matching feature points.
  • The algorithm used for optical flow can be the Lucas-Kanade optical flow algorithm or another algorithm; in addition to optical flow, descriptors or direct methods can be used to match feature points and find the matching feature points between the previous image and the next image.
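  • A minimal sketch of this frame-to-frame tracking step, using the pyramidal Lucas-Kanade optical flow implementation in OpenCV (function and parameter choices are illustrative assumptions):

```python
import cv2
import numpy as np

def track_between_frames(prev_gray, next_gray, prev_pts):
    """Track feature points from the previous image into the next image with
    pyramidal Lucas-Kanade optical flow, keeping only the matched points."""
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, prev_pts.astype(np.float32).reshape(-1, 1, 2), None,
        winSize=(21, 21), maxLevel=3)
    matched = status.reshape(-1).astype(bool)
    return prev_pts[matched], next_pts.reshape(-1, 2)[matched]
```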
  • For any image captured after the first marker image, according to the pose parameter of each image, from the first marker image to that image, relative to its previous image, the pose parameters can be chained (iterated) to determine the pose parameter of the image relative to the first marker image.
  • The pose parameter may include a displacement parameter and a rotation parameter; the displacement parameter indicates the distance between the camera's position when the image was captured and its position when the first marker image was captured, and the rotation parameter indicates the angular difference between the camera's rotation angle when the image was captured and its rotation angle when the first marker image was captured.
  • For example, after the first marker image the camera sequentially captures image 1, image 2, and image 3, and acquires the pose parameter (R1, T1) of image 1 relative to the first marker image, the pose parameter (R2, T2) of image 2 relative to image 1, and the pose parameter (R3, T3) of image 3 relative to image 2; these pose parameters can be iterated to determine the pose parameter (R3', T3') of image 3 relative to the first marker image.
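  • Under the usual convention that a pose (R, T) maps coordinates from the frame of the earlier image into the frame of the later image, the standard chained composition (stated here as an assumption, not necessarily the patent's exact expression) is R3' = R3 · R2 · R1 and T3' = R3 · R2 · T1 + R3 · T2 + T3.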
  • In the process of tracking feature points, the number of matched feature points in adjacent images may decrease as the position and posture of the camera change, so that some feature points in the previous image have no match in the next image; these unmatched feature points are excluded.
  • In a possible implementation, the smart device can also check the optical flow matching results to eliminate unreasonable feature points. That is, for any image captured by the camera after the first marker image, the change of position and posture of the feature points is simulated according to the three-dimensional coordinates of the feature points in the first marker image and the pose parameter of the image relative to the first marker image: the estimated three-dimensional coordinates of each feature point in the image are calculated, the estimated three-dimensional coordinates are transformed to obtain the estimated two-dimensional coordinates of each feature point, and the estimated two-dimensional coordinates are compared with the actual two-dimensional coordinates in the image to obtain the distance between them. If the distance between the estimated two-dimensional coordinates of a feature point and its actual two-dimensional coordinates is greater than a preset distance, the camera pose change simulated from the calculated pose parameter places the feature point too far from its actual position; the position and posture change of that feature point can be considered not to conform to a proper rotation-translation relationship, and the error is too large. To avoid affecting the subsequent feature point tracking process, the feature point is deleted.
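  • A sketch of this check: each feature point's three-dimensional coordinates in the marker image are transformed by the estimated pose, projected back into the image, and compared with the actually tracked two-dimensional coordinates; points whose error exceeds a preset distance are deleted. The intrinsics fx, fy, cx, cy and the distance threshold are placeholders, not values from the patent.

```python
import numpy as np

def prune_by_reprojection(pts_3d_marker, pts_2d_tracked, R, T,
                          fx, fy, cx, cy, max_dist=10.0):
    """Keep only feature points whose projection under the estimated pose (R, T)
    stays within max_dist pixels of the actually tracked 2D position."""
    # estimated 3D coordinates of each feature point in the current camera frame
    est_3d = (R @ pts_3d_marker.T).T + T.reshape(1, 3)
    # transform the estimated 3D coordinates into estimated 2D image coordinates
    est_2d = np.stack([fx * est_3d[:, 0] / est_3d[:, 2] + cx,
                       fy * est_3d[:, 1] / est_3d[:, 2] + cy], axis=1)
    errors = np.linalg.norm(est_2d - pts_2d_tracked, axis=1)
    keep = errors <= max_dist  # feature points with too large an error are deleted
    return pts_3d_marker[keep], pts_2d_tracked[keep]
```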
  • When the first marker image is the initial marker image, the pose parameter of an image relative to the first marker image directly indicates the position and posture of the camera when the image was captured. When the first marker image is not the initial marker image, the pose parameter of the image is obtained according to the pose parameter of the image relative to the first marker image and the pose parameter of the first marker image relative to the initial marker image, that is, the pose parameter of the image relative to the initial marker image, which represents the position and posture of the camera when the image was captured.
  • the pose parameter of the image is obtained by using the following formula:
  • where R_final represents the rotation parameter of the image, T_final represents the displacement parameter of the image, Rca represents the rotation parameter of the image relative to the first marker image, Tca represents the displacement parameter of the image relative to the first marker image, R_old represents the rotation parameter of the first marker image relative to the initial marker image, and T_old represents the displacement parameter of the first marker image relative to the initial marker image.
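  • Stated with the variables defined above, the standard rigid-body composition (an assumption consistent with those definitions, not necessarily the patent's exact notation) would be R_final = Rca · R_old and T_final = Rca · T_old + Tca.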
  • the first point to be explained is that in the above tracking process, the three-dimensional coordinates of the feature points need to be determined, and the position and posture changes of the camera in the three-dimensional space can be determined by tracking the feature points.
  • When feature points are extracted from the first marker image, after the two-dimensional coordinates of the feature points in the first marker image are determined, the homogeneous coordinates corresponding to the two-dimensional coordinates of the feature points are acquired; the homogeneous coordinates represent the two-dimensional coordinates in three dimensions, and the following coordinate transformation relationship is used to convert the homogeneous coordinates into the corresponding three-dimensional coordinates:
  • where M represents the three-dimensional coordinates, m represents the homogeneous coordinates, s represents the depth of the marker image in which the feature points are located, and fx, fy, cx, and cy represent parameters of the camera.
  • For example, the homogeneous coordinates of a feature point may be [x, y, 1], and its three-dimensional coordinates follow from the coordinate transformation relationship above.
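  • As a sketch under the standard pinhole model (an assumption; the patent's own transformation is referenced but not written out above), converting pixel coordinates into three-dimensional coordinates with an assumed common depth s could look like:

```python
import numpy as np

def homogeneous_to_3d(pts_2d, s, fx, fy, cx, cy):
    """Convert 2D feature-point coordinates (pixels) into 3D coordinates, assuming
    every feature point of the marker image lies at the same depth s."""
    x, y = pts_2d[:, 0], pts_2d[:, 1]
    X = s * (x - cx) / fx          # M = s * K^{-1} * m, written out per component
    Y = s * (y - cy) / fy
    Z = np.full_like(X, s, dtype=float)
    return np.stack([X, Y, Z], axis=1)
```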
  • the second point to be explained is that in the tracking process of each independent marker image, it is assumed that the depth of all three-dimensional feature points on the marker image is s.
  • Once the smart device has determined the marker image, the three-dimensional coordinates of the feature points, and the depth of the marker image, the PnP (Perspective-n-Point) algorithm is used for the calculation, and the pose parameters of the camera can be obtained.
  • The PnP algorithm may be direct linear transform, P3P, EPnP, UPnP, or the like, or the calculation may be performed by an algorithm other than PnP, such as the BA (Bundle Adjustment) algorithm, which optimizes the PnP result.
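  • A minimal example of recovering the camera pose from such 3D-2D correspondences with OpenCV's PnP solver (EPnP is chosen arbitrarily here from the variants listed above; names are illustrative):

```python
import cv2
import numpy as np

def solve_pose_pnp(pts_3d, pts_2d, fx, fy, cx, cy):
    """Estimate the rotation and displacement of the camera relative to the marker
    image from matched 3D marker coordinates and 2D image coordinates."""
    K = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1]], dtype=np.float64)
    ok, rvec, tvec = cv2.solvePnP(pts_3d.astype(np.float64),
                                  pts_2d.astype(np.float64), K, None,
                                  flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("PnP failed")
    R, _ = cv2.Rodrigues(rvec)   # rotation parameter as a 3x3 matrix
    return R, tvec.reshape(3)    # displacement parameter
```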
  • the previous image of the first image satisfies the feature point tracking condition, and the first image does not satisfy the feature point tracking condition, the previous image of the first image is used as the second marker image.
  • The feature point tracking condition is the condition for tracking the feature points of the current marker image. If an image captured by the smart device satisfies the feature point tracking condition, tracking may continue; if an image captured by the smart device does not satisfy the feature point tracking condition, the marker image needs to be switched to prevent tracking failure.
  • In the embodiment of the present application, when the smart device captures an image, it also determines whether the image satisfies the feature point tracking condition.
  • The camera first captures the previous image of the first image, which satisfies the feature point tracking condition, so its pose parameter is obtained through the above step 302; when the first image is then captured and does not satisfy the feature point tracking condition, the previous image of the first image is used as the second marker image, and the pose parameter of the previous image of the first image is used as the pose parameter of the second marker image.
  • The feature point tracking condition may be that the number of tracked feature points reaches a preset number: when the number of feature points of the first marker image tracked in a certain image reaches the preset number, it is determined that the image satisfies the feature point tracking condition; when it does not reach the preset number, it is determined that the image does not satisfy the feature point tracking condition.
  • the number of feature points tracked in the previous image of the first image is acquired, and when the number reaches a preset number, determining that the previous image of the first image satisfies the feature point tracking condition.
  • the number of feature points tracked in the first image is obtained. When the number does not reach the preset number, it is determined that the first image does not satisfy the feature point tracking condition.
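  • The switching decision described above can be summarized in a short sketch (the preset number and the function names are assumptions):

```python
PRESET_NUMBER = 50  # assumed value of the preset number

def satisfies_tracking_condition(num_tracked):
    """An image satisfies the feature point tracking condition when the number of
    tracked feature points of the current marker image reaches the preset number."""
    return num_tracked >= PRESET_NUMBER

def should_switch_marker(prev_count, first_count):
    """Switch when the previous image still satisfies the condition but the first
    image does not; the previous image then becomes the second marker image."""
    return satisfies_tracking_condition(prev_count) and not satisfies_tracking_condition(first_count)
```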
  • After the switch, a plurality of feature points are extracted from the second marker image as the updated target feature points. As the position or posture of the camera changes, the smart device captures at least one image through the camera and obtains the pose parameter of each image relative to its previous image by tracking the feature points of the second marker image in the at least one image.
  • For any two adjacent images, optical flow is performed using the feature points extracted from the previous image, so that matching feature points between the previous image and the next image are found and the optical flow information of the matching feature points is obtained. The optical flow information indicates the motion of the matching feature points between the two adjacent images, and the pose parameter of the second of the two adjacent images relative to the first may be determined according to the optical flow information of the matching feature points.
  • The algorithm used for optical flow can be the Lucas-Kanade optical flow algorithm or another algorithm; in addition to optical flow, the feature points can be matched by descriptors or by a direct method to find the matching feature points between the previous image and the next image.
  • According to the pose parameter of each image, from the second marker image to the second image, relative to its previous image, the pose parameters can be iterated to determine the pose parameter of the second image relative to the second marker image.
  • the pose parameter may include at least one of a displacement parameter and a rotation parameter, the displacement parameter being used to indicate a distance between a position when the camera captures the second image and a position when the second marker image is captured, the rotation The parameter is used to indicate the angular difference between the angle of rotation when the camera takes the second image and the angle of rotation when the second marker image is captured.
  • When the first marker image is the initial marker image, the pose parameter of the second image relative to the initial marker image, that is, the pose parameter of the second image, is obtained according to the pose parameter of the second image relative to the second marker image and the pose parameter of the second marker image relative to the first marker image (that is, the pose parameter of the second marker image relative to the initial marker image); the pose of the camera can then be determined according to the pose parameter.
  • When the first marker image is not the initial marker image, the pose parameter of the second image relative to the initial marker image, that is, the pose parameter of the second image, is acquired according to the pose parameter of the second image relative to the second marker image, the pose parameter of the second marker image relative to the first marker image, and the pose parameter of the first marker image relative to the initial marker image; the pose of the camera can then be determined according to the pose parameter.
  • The second image captured after the second marker image may be the first image, or any image captured after the first image. Taking acquisition of the pose parameter of the first image as an example: the pose parameter of the second marker image relative to the initial marker image is determined according to the pose parameter of the second marker image relative to the first marker image and the pose parameter of the first marker image relative to the initial marker image; the pose parameter of the first image is then obtained according to the pose parameter of the first image relative to the second marker image and the pose parameter of the second marker image relative to the initial marker image:
  • where R_final represents the rotation parameter of the first image, T_final represents the displacement parameter of the first image, Rcl represents the rotation parameter of the first image relative to the second marker image, Tcl represents the displacement parameter of the first image relative to the second marker image, R_old represents the rotation parameter of the second marker image relative to the initial marker image, and T_old represents the displacement parameter of the second marker image relative to the initial marker image.
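  • Assuming the same composition convention as before, the accumulation across marker-image switches might be sketched as follows (compose_pose and the variable names are illustrative, not from the patent):

```python
import numpy as np

def compose_pose(R_rel, T_rel, R_old, T_old):
    """Combine a pose relative to the current marker image (R_rel, T_rel) with that
    marker image's pose relative to the initial marker image (R_old, T_old)."""
    return R_rel @ R_old, R_rel @ T_old + T_rel

# On every switch, fold the new marker image's pose into (R_old, T_old); the pose of
# any later image then needs only one further composition:
#   R_old, T_old = compose_pose(R_marker2_vs_marker1, T_marker2_vs_marker1, R_old, T_old)
#   R_final, T_final = compose_pose(Rcl, Tcl, R_old, T_old)
```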
  • the first point to be explained is that in the above tracking process, the three-dimensional coordinates of the feature points need to be determined, and the position and posture changes of the camera in the three-dimensional space can be determined by tracking the feature points.
  • When feature points are extracted from the second marker image, after the two-dimensional coordinates of the feature points in the second marker image are determined, the homogeneous coordinates corresponding to the two-dimensional coordinates of the feature points are acquired; the homogeneous coordinates represent the two-dimensional coordinates in three dimensions, and the following coordinate transformation relationship is used to convert the homogeneous coordinates into the corresponding three-dimensional coordinates:
  • where M represents the three-dimensional coordinates, m represents the homogeneous coordinates, s represents the depth of the marker image in which the feature points are located, and fx, fy, cx, and cy represent parameters of the camera.
  • For example, the homogeneous coordinates of a feature point may be [x, y, 1], and its three-dimensional coordinates follow from the coordinate transformation relationship above.
  • The second point to be noted is that, in the process of tracking feature points, the number of matched feature points in adjacent images may decrease as the position and posture of the camera change, so that some feature points in the previous image have no matching feature points in the next image; these unmatched feature points are excluded.
  • the smart device can also check the optical flow matching results to eliminate unreasonable feature points.
  • Taking the second image captured by the camera after the second marker image as an example, the change of position and posture of the feature points is simulated according to the three-dimensional coordinates of the feature points in the second marker image and the pose parameter of the second image relative to the second marker image: the estimated three-dimensional coordinates of each feature point in the second image are calculated, and the estimated three-dimensional coordinates are transformed to obtain the estimated two-dimensional coordinates of each feature point in the second image. If the estimated two-dimensional coordinates of a feature point are too far from its actual position, the camera pose change simulated from the calculated pose parameter is considered not to conform to a proper rotation-translation relationship for that feature point, and the error is too large; to avoid affecting the subsequent tracking process, the feature point is deleted.
  • When transforming the estimated three-dimensional coordinates, the inverse of the coordinate transformation relationship may be applied; that is, the estimated three-dimensional coordinates are transformed into the estimated two-dimensional coordinates by using the following inverse transformation relationship:
  • where M represents the estimated three-dimensional coordinates, m represents the estimated two-dimensional coordinates, s represents the depth of the marker image in which the feature points are located, and fx, fy, cx, and cy represent the parameters of the camera.
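  • With the variables defined above, the inverse of the earlier relationship M = s · K^(-1) · m is m = (1/s) · K · M, where K is the camera matrix built from fx, fy, cx, and cy; this is the standard pinhole projection and is stated here as an assumption rather than as the patent's exact formula.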
  • After the unmatched feature points and the feature points with excessive errors are excluded, the smart device acquires the number of feature points remaining in the second image and continues to determine whether the second image satisfies the feature point tracking condition of the second marker image, thereby determining whether to switch the marker image.
  • The third point to be noted is that, to ensure continuity of depth, the depth of all feature points on the first marker image is assumed to be s during the tracking of the first marker image; during the tracking of the second marker image, not only must the depths of all feature points on the second marker image be equal, but the depth of the feature points of the first marker image must still be s. It is therefore possible to iteratively calculate the depth of each marker image during the tracking process.
  • where S_n denotes the depth of the second marker image, d denotes the depth of the feature points of the first marker image in the second marker image, S_(n-1) denotes the depth of the first marker image, and d is calculated from the pose parameter of the second marker image.
  • the depth of the feature points in the first image captured by the camera is 1.
  • According to the second marker image, the three-dimensional coordinates of the feature points extracted from the second marker image, and the depth S_n of the second marker image, the calculation is performed using the PnP algorithm, so that the displacement parameter of the camera can be tracked.
  • With the method provided by the embodiment of the present application, in the process of acquiring pose parameters of images captured by the camera by tracking feature points of the first marker image, when the previous image of the first image satisfies the feature point tracking condition and the first image does not, the previous image of the first image is used as the second marker image; then, by tracking the feature points of the second marker image, the pose parameter of each image is acquired according to the pose parameter of the image captured by the camera relative to the second marker image and the pose parameter of each marker image relative to its previous marker image, and the pose of the camera is determined according to the pose parameter. When the first marker image can no longer be tracked, the marker image is switched, and the position and posture of the camera are determined by tracking the feature points of the switched marker image, which avoids the problem of being unable to track feature points when the position or posture of the camera changes too much, enhances robustness, and improves camera tracking accuracy.
  • In addition, the method provided by the embodiment of the present application is lightweight and simple, with no complicated back-end optimization, so the calculation is fast and even real-time tracking can be achieved. Compared with a traditional SLAM (simultaneous localization and mapping) algorithm, the method provided by the embodiment of the present application is more robust and can achieve very high calculation precision.
  • For example, a plurality of images captured by the camera are shown in FIG. 4, and the tracking process includes the following steps:
  • the camera captures the first image and uses the first image as the initial marker image.
  • The pose parameter of each image captured by the camera relative to the initial marker image is acquired by tracking the feature points of the initial marker image, until the next image of image a does not satisfy the feature point tracking condition; image a is then used as the first marker image, and at this time the pose parameter of the current marker image relative to the initial marker image is the pose parameter of image a relative to the first image.
  • The pose parameters of subsequent images are acquired by tracking the feature points of the first marker image until the feature point tracking condition is no longer satisfied, and image 1 is used as the second marker image; at this time, the pose parameter (R_old, T_old) of the current marker image relative to the initial marker image is the pose parameter of image 1 relative to the first image.
  • The pose parameter may include a displacement parameter and a rotation parameter; the displacement parameter indicates the translation of the camera, from which the change of the camera's position in three-dimensional space may be determined, and the rotation parameter indicates the change of the camera's rotation angle.
  • The execution body of the pose determination method is a smart device; the smart device may be a terminal configured with a camera, such as a mobile phone or a tablet computer, or AR equipment configured with a camera, such as AR glasses or an AR helmet. Referring to FIG. 5, the method includes:
  • the timestamp corresponding to each rotation parameter refers to a timestamp when the rotation parameter is obtained.
  • The interpolation algorithm may use the Slerp (Spherical Linear Interpolation) algorithm or another algorithm.
  • the rotation parameter curve is obtained by interpolating according to the plurality of rotation parameters and the corresponding time stamp, and the rotation parameter curve can represent a variation rule of the rotation parameter of the camera with the shooting time.
  • After the rotation parameter curve is obtained by interpolation, data alignment can be performed according to the rotation parameter curve, so that the rotation parameter corresponding to an image is obtained and the posture of the camera is determined according to that rotation parameter.
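  • A minimal sketch of this interpolation and alignment step using SciPy's spherical linear interpolation (the quaternion representation and function names are assumptions):

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def build_rotation_curve(timestamps, quaternions):
    """Interpolate timestamped rotation parameters (quaternions in x, y, z, w order)
    into a continuous rotation parameter curve by spherical linear interpolation."""
    return Slerp(np.asarray(timestamps), Rotation.from_quat(np.asarray(quaternions)))

def rotation_for_image(curve, image_timestamp):
    """Sample the rotation parameter corresponding to an image's shooting timestamp
    (the timestamp must lie within the range covered by the IMU samples)."""
    return curve([image_timestamp]).as_quat()[0]
```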
  • The smart device is equipped with a gyroscope, an accelerometer, and a geomagnetic sensor; through the gyroscope and the geomagnetic sensor, a unique rotation parameter in the earth coordinate system can be obtained.
  • The earth coordinate system has the following characteristics:
  • the X-axis is defined using the vector product, which is tangent to the ground at the current position of the smart device and points to the east.
  • the Y-axis is tangent to the ground at the current position of the smart device and points to the north pole of the earth's magnetic field.
  • the Z axis points to the sky and is perpendicular to the ground.
  • The rotation parameters obtained in this coordinate system can be considered error-free; they do not depend on the parameters of the IMU, which avoids IMU calibration problems and makes the method compatible with various types of devices.
  • The smart device provides an interface for obtaining the rotation parameters, a rotation-vector interface, which can be called according to the sampling frequency of the IMU to obtain the rotation parameter.
  • The smart device can store multiple rotation parameters and their corresponding timestamps into an IMU queue, and interpolate the data in the IMU queue to obtain the rotation parameter curve. Alternatively, considering that the above data may contain noise, in order to ensure data accuracy, the angular difference between a newly obtained rotation parameter and the previous rotation parameter may be calculated; if the angular difference is greater than a preset threshold, the rotation parameter is considered a noise item and is deleted. Noise items can be removed by this check, and only the rotation parameters that pass the check, together with their corresponding timestamps, are stored in the IMU queue.
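  • A sketch of the noise check described above: a new rotation sample is discarded when its angular difference from the previous sample exceeds a preset threshold (the threshold value is an assumption):

```python
import numpy as np
from scipy.spatial.transform import Rotation

MAX_ANGLE_DIFF_DEG = 30.0  # assumed threshold for treating a sample as noise
imu_queue = []             # list of (timestamp, quaternion) pairs

def push_rotation(timestamp, quat):
    """Append a rotation-vector sample to the IMU queue unless it jumps too far
    from the previous sample, in which case it is treated as noise and dropped."""
    if imu_queue:
        prev = Rotation.from_quat(imu_queue[-1][1])
        angle = np.degrees((prev.inv() * Rotation.from_quat(quat)).magnitude())
        if angle > MAX_ANGLE_DIFF_DEG:
            return False  # noise item, not stored in the IMU queue
    imu_queue.append((timestamp, quat))
    return True
```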
  • The method provided by the embodiment of the present application interpolates the plurality of rotation parameters measured by the IMU and their corresponding timestamps to obtain a rotation parameter curve; data alignment can be performed according to the rotation parameter curve, so that the rotation parameter corresponding to a captured image is obtained according to the image's timestamp and the rotation parameter curve.
  • the operation procedure of the embodiment of the present application may be as shown in FIG. 6.
  • the functions of the smart device are divided into multiple modules, and the operation process is as follows:
  • the data measured by the IMU is read by the module 601, including the rotation parameter and the corresponding timestamp.
  • The module 602 detects whether the data is reasonable; if not, the data is discarded, and if it is reasonable, the data is stored in the IMU queue by the module 603.
  • The captured image is read by the module 604, and it is determined whether a marker image has currently been set. If no marker image has been set, a marker image is initialized with the currently captured image by the module 606; if a marker image has been set, the connection with the marker image is established directly by the module 607, and the feature points of the marker image are tracked.
  • the module 608 combines the data in the IMU queue and the data obtained by tracking the feature points, obtains the displacement parameter and the rotation parameter, and calculates a rotation translation matrix from the current image relative to the current marker image.
  • The module 609 detects whether the rotation parameter and the displacement parameter of the image are reasonable. If so, they are sent to the module 612, which converts the rotation-translation matrix of the current image relative to the current marker image into the rotation-translation matrix of the current image relative to the initial marker image. If not, the marker image is switched by the module 610, the rotation-translation matrix of the current image relative to the new current marker image is calculated, and the module 611 detects whether the result is reasonable; if so, it is sent to the module 612, and if not, control is transferred back to the module 606 to reinitialize with the current image.
  • the obtained data results are smoothed and output through modules 613 and 614.
  • A Kalman filter or another filter can be used for the smoothing.
  • The embodiment of the present application provides a camera attitude tracking algorithm, the Anchor-Switching algorithm, which divides the motion of the camera into the tracking processes of multiple marker images; each process is an independent marker image tracking process, and the processes are connected by switching the marker image to the previous frame image when tracking is about to fail.
  • The IMU is used to obtain the rotation parameter of the camera relative to the initial scene, an image of the real scene is used as the marker image, the displacement parameter of the camera relative to the current marker image is obtained by tracking, and, through marker image switching, the displacement parameters relative to the initial scene are combined; the position and attitude changes relative to the initial scene are thus obtained, realizing a stable, fast, and robust camera attitude tracking system in a real natural scene that does not depend on a predefined marker image.
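  • A high-level sketch of the Anchor-Switching idea described above, reusing the helper functions sketched earlier (all names, thresholds, and the fixed depth are illustrative assumptions; the reprojection check and the depth update of the new marker image are omitted):

```python
import cv2
import numpy as np

def anchor_switching_track(frames, intrinsics, preset_number=50, depth=1.0):
    """Yield the camera pose relative to the initial marker image for each frame,
    switching the marker image to the previous frame when too few points remain."""
    fx, fy, cx, cy = intrinsics
    R_old, T_old = np.eye(3), np.zeros(3)   # pose of the current marker w.r.t. the initial marker
    marker = frames[0]
    marker_pts = try_set_first_marker(marker)                         # sketched earlier
    marker_3d = homogeneous_to_3d(marker_pts, depth, fx, fy, cx, cy)  # sketched earlier
    prev_frame, prev_pose = marker, (np.eye(3), np.zeros(3))
    for frame in frames[1:]:
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(
            marker, frame, marker_pts.astype(np.float32).reshape(-1, 1, 2), None)
        ok = status.reshape(-1).astype(bool)
        if ok.sum() < preset_number:
            # feature point tracking condition fails: switch the marker to the previous frame
            R_old, T_old = compose_pose(*prev_pose, R_old, T_old)     # sketched earlier
            marker, prev_frame, prev_pose = prev_frame, prev_frame, (np.eye(3), np.zeros(3))
            marker_pts = cv2.goodFeaturesToTrack(marker, 200, 0.01, 10).reshape(-1, 2)
            marker_3d = homogeneous_to_3d(marker_pts, depth, fx, fy, cx, cy)
            continue
        R, T = solve_pose_pnp(marker_3d[ok], nxt.reshape(-1, 2)[ok], fx, fy, cx, cy)  # sketched earlier
        prev_frame, prev_pose = frame, (R, T)
        yield compose_pose(R, T, R_old, T_old)   # pose relative to the initial marker image
```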
  • the robustness of the system is enhanced, and the camera positioning accuracy is high.
  • complicated IMU and image fusion algorithms are avoided, and the sensitivity to parameters is also reduced.
  • the method provided by the embodiment of the present application can run smoothly on the mobile end and does not require accurate calibration.
  • The embodiment of the present application corresponds to the scenario of a human eye observing three-dimensional space, in which the influence of the rotation parameter is large and the movement in the plane is assumed to be small.
  • In an AR scenario, the user usually interacts with the virtual elements in a flat scene, such as a desktop or a coffee table, so the camera can be considered to move on a plane and the rotation parameter has the greater influence. Therefore, the embodiment of the present application is very suitable for AR scenarios.
  • FIG. 7 is a schematic structural diagram of a pose determining apparatus according to an embodiment of the present application. Referring to Figure 7, the device is applied to a smart device, the device comprising:
  • a first obtaining module 701 configured to perform the step of acquiring a pose parameter of an image captured by the camera by tracking a feature point of the first marker image in the above embodiment
  • a switching module 702 configured to perform the step of using a previous image of the first image as the second marker image in the foregoing embodiment
  • The second obtaining module 703 is configured to perform the steps, described in the above embodiments, of acquiring the pose parameter of an image captured by the camera relative to the second marker image by tracking the feature points of the second marker image, thereby acquiring the pose parameter of the image, and determining the pose of the camera according to the pose parameter.
  • the second obtaining module 703 includes:
  • An extracting unit configured to perform the step of extracting a plurality of feature points from the second marker image in the above embodiment
  • a tracking unit configured to perform the step of tracking a plurality of feature points in the above embodiment to obtain a pose parameter of each image relative to a previous image
  • a determining unit configured to perform the step of determining a pose parameter of the second image relative to the second marker image in the above embodiment.
  • the device further includes:
  • a three-dimensional coordinate calculation module configured to perform the step of calculating the estimated three-dimensional coordinates of each feature point in the second image in the above embodiment
  • a coordinate transformation module configured to perform the step of transforming the estimated three-dimensional coordinates to obtain estimated two-dimensional coordinates in the above embodiment
  • the deleting module is configured to perform the step of deleting feature points in the above embodiment.
  • The first obtaining module 701 is further configured to perform the step, in the foregoing embodiment, of acquiring the pose parameter of an image by formula, according to the pose parameter of the first marker image relative to the initial marker image and the pose parameter of the image relative to the first marker image.
  • The second obtaining module 703 is further configured to perform the step, in the foregoing embodiment, of acquiring the pose parameter of the first image by formula, according to the pose parameter of the second marker image relative to the first marker image, the pose parameter of the first marker image relative to the initial marker image, and the pose parameter of the first image relative to the second marker image.
  • the device further includes:
  • a quantity acquisition module configured to perform the step of acquiring the number of feature points in the foregoing embodiment
  • a determining module configured to perform the step of determining that the first image does not satisfy the feature point tracking condition when the number does not reach the preset number in the above embodiment.
  • the device further includes:
  • a homogeneous coordinate acquisition module configured to perform the step of acquiring the homogeneous coordinates corresponding to the two-dimensional coordinates of the feature points in the above embodiment
  • the coordinate conversion module is configured to perform the step of converting the homogeneous coordinates into the corresponding three-dimensional coordinates by using the coordinate transformation relationship in the above embodiment.
  • the device further includes:
  • a depth calculation module configured to perform the step of calculating a depth of the second mark image by using a formula in the above embodiment.
  • the device further includes:
  • an initialization module configured to perform the step of setting the captured image as the first marker image in the above embodiment.
  • the pose parameter includes a displacement parameter
  • the device further includes:
  • An interpolation module configured to perform the step of performing interpolation on the data acquired by the IMU in the above embodiment to obtain a rotation parameter curve
  • the rotation parameter acquisition module is configured to perform the step of acquiring the rotation parameter of the image according to the rotation parameter curve in the above embodiment.
  • It should be noted that, when the pose determining apparatus provided by the above embodiment determines the pose parameters, the division into the above functional modules is only used as an example for illustration; in practical applications, the functions may be assigned to different functional modules as needed, that is, the internal structure of the smart device is divided into different functional modules to complete all or part of the functions described above.
  • the posture determining device and the posture determining method embodiment provided in the above embodiments are in the same concept, and the specific implementation process is described in detail in the method embodiment, and details are not described herein again.
  • FIG. 8 is a structural block diagram of a terminal 800 according to an exemplary embodiment of the present application.
  • the terminal 800 is configured to perform the steps performed by the smart device in the foregoing method embodiment.
  • The terminal 800 can be a portable mobile terminal, such as a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, or a desktop computer, or AR devices such as AR glasses and an AR helmet. The terminal 800 may also be referred to as a user device, a portable terminal, a laptop terminal, a desktop terminal, and the like.
  • the terminal 800 includes a processor 801 and a memory 802.
  • The memory 802 stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, program, code set, or instruction set is loaded and executed by the processor 801 to implement the operations performed by the smart device in the above embodiments.
  • Processor 801 can include one or more processing cores, such as a 4-core processor, a 5-core processor, and the like.
  • The processor 801 can be implemented in hardware by at least one of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array).
  • the processor 801 may also include a main processor and a coprocessor.
  • The main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state.
  • The processor 801 can be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display.
  • the processor 801 may also include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.
  • Memory 802 can include one or more computer-readable storage media, which can be non-transitory. Memory 802 can also include high-speed random access memory, as well as non-volatile memory, such as one or more disk storage devices or flash memory devices. In some embodiments, the non-transitory computer-readable storage medium in memory 802 is used to store at least one instruction for execution by the processor 801 to implement the pose determination method provided by the method embodiments of the present application.
  • terminal 800 also optionally includes a peripheral device interface 803 and at least one peripheral device.
  • the processor 801, the memory 802, and the peripheral device interface 803 can be connected by a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 803 via a bus, signal line or circuit board.
  • the peripheral device includes at least one of a radio frequency circuit 804, a touch display screen 805, a camera 806, an audio circuit 807, a positioning component 808, and a power source 809.
  • Peripheral device interface 803 can be used to connect at least one peripheral device associated with I/O (Input/Output) to processor 801 and memory 802.
  • In some embodiments, the processor 801, the memory 802, and the peripheral device interface 803 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 801, the memory 802, and the peripheral device interface 803 can be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the radio frequency circuit 804 is configured to receive and transmit an RF (Radio Frequency) signal, also called an electromagnetic signal. Radio frequency circuit 804 communicates with the communication network and other communication devices via electromagnetic signals. The radio frequency circuit 804 converts the electrical signal into an electromagnetic signal for transmission, or converts the received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 804 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like. The radio frequency circuit 804 can communicate with other terminals via at least one wireless communication protocol.
  • The wireless communication protocol includes, but is not limited to, a metropolitan area network, various generations of mobile communication networks (2G, 3G, 4G, and 5G), a wireless local area network, and/or a WiFi (Wireless Fidelity) network.
  • the radio frequency circuit 804 may also include a circuit related to NFC (Near Field Communication), which is not limited in this application.
  • the display screen 805 is used to display a UI (User Interface).
  • the UI can include graphics, text, icons, video, and any combination thereof.
  • display 805 is a touch display
  • display 805 also has the ability to capture touch signals over the surface or surface of display 805.
  • the touch signal can be input to the processor 801 for processing as a control signal.
  • display 805 can also be used to provide virtual buttons and/or virtual keyboards, also referred to as soft buttons and/or soft keyboards.
  • In some embodiments, there may be one display screen 805, disposed on the front panel of the terminal 800; in other embodiments, there may be at least two display screens 805, respectively disposed on different surfaces of the terminal 800 or in a folded design; in still other embodiments, the display screen 805 can be a flexible display screen disposed on a curved surface or a folded surface of the terminal 800. The display screen 805 can even be set to a non-rectangular irregular shape, that is, a shaped screen.
  • the display screen 805 can be prepared by using an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
  • Camera component 806 is used to capture images or video.
  • camera assembly 806 includes a front camera and a rear camera.
  • the front camera is disposed on the front panel of the terminal 800
  • the rear camera is disposed on the back of the terminal.
  • In some embodiments, there are at least two rear cameras, each being one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to realize a background blur function by fusing the main camera and the depth-of-field camera, panoramic shooting and VR (Virtual Reality) shooting by fusing the main camera and the wide-angle camera, or other fused shooting functions.
  • camera assembly 806 can also include a flash.
  • the flash can be a monochrome temperature flash or a two-color temperature flash.
  • the two-color temperature flash is a combination of a warm flash and a cool flash that can be used for light compensation at different color temperatures.
  • the audio circuit 807 can include a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals for processing to the processor 801 for processing, or input to the radio frequency circuit 804 for voice communication.
  • the microphones may be multiple, and are respectively disposed at different parts of the terminal 800.
  • the microphone can also be an array microphone or an omnidirectional acquisition microphone.
  • the speaker is then used to convert electrical signals from the processor 801 or the RF circuit 804 into sound waves.
  • the speaker can be a conventional film speaker or a piezoelectric ceramic speaker.
  • audio circuit 807 can also include a headphone jack.
  • the location component 808 is used to locate the current geographic location of the terminal 800 to implement navigation or LBS (Location Based Service).
  • the positioning component 808 can be a positioning component based on a US-based GPS (Global Positioning System), a Chinese Beidou system, a Russian Greiner system, or an EU Galileo system.
  • Power source 809 is used to power various components in terminal 800.
  • the power source 809 can be an alternating current, a direct current, a disposable battery, or a rechargeable battery.
  • the rechargeable battery can support wired charging or wireless charging.
  • the rechargeable battery can also be used to support fast charging technology.
  • terminal 800 also includes one or more sensors 810.
  • the one or more sensors 810 include, but are not limited to, an acceleration sensor 811, a gyro sensor 812, a pressure sensor 813, a fingerprint sensor 814, an optical sensor 815, and a proximity sensor 816.
  • the acceleration sensor 811 can detect the magnitude of the acceleration on the three coordinate axes of the coordinate system established by the terminal 800.
  • the acceleration sensor 811 can be used to detect components of gravity acceleration on three coordinate axes.
  • the processor 801 can control the touch display screen 805 to display the user interface in a landscape view or a portrait view according to the gravity acceleration signal collected by the acceleration sensor 811.
  • the acceleration sensor 811 can also be used for the acquisition of game or user motion data.
  • the gyro sensor 812 can detect the body direction and the rotation angle of the terminal 800, and the gyro sensor 812 can cooperate with the acceleration sensor 811 to collect the 3D motion of the user to the terminal 800. Based on the data collected by the gyro sensor 812, the processor 801 can implement functions such as motion sensing (such as changing the UI according to the user's tilting operation), image stabilization at the time of shooting, game control, and inertial navigation.
  • functions such as motion sensing (such as changing the UI according to the user's tilting operation), image stabilization at the time of shooting, game control, and inertial navigation.
  • the pressure sensor 813 may be disposed at a side border of the terminal 800 and/or a lower layer of the touch display screen 805.
  • the pressure sensor 813 When the pressure sensor 813 is disposed on the side frame of the terminal 800, the user's holding signal to the terminal 800 can be detected, and the processor 801 performs left and right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 813.
  • the operability control on the UI interface is controlled by the processor 801 according to the user's pressure on the touch display screen 805.
  • the operability control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 814 is configured to collect the fingerprint of the user, and the processor 801 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 814, or the fingerprint sensor 814 identifies the identity of the user according to the collected fingerprint. Upon identifying that the identity of the user is a trusted identity, the processor 801 authorizes the user to have associated sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying and changing settings, and the like.
  • the fingerprint sensor 814 can be provided with the front, back or side of the terminal 800. When the physical button or vendor logo is provided on the terminal 800, the fingerprint sensor 814 can be integrated with a physical button or a vendor logo.
  • Optical sensor 815 is used to collect ambient light intensity.
  • the processor 801 can control the display brightness of the touch display screen 805 according to the ambient light intensity collected by the optical sensor 815. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 805 is raised; when the ambient light intensity is low, the display brightness of the touch display screen 805 is lowered.
  • the processor 801 can also dynamically adjust the shooting parameters of the camera assembly 806 based on the ambient light intensity acquired by the optical sensor 815.
  • Proximity sensor 816 also referred to as a distance sensor, is typically disposed on the front panel of terminal 800. Proximity sensor 816 is used to capture the distance between the user and the front of terminal 800. In one embodiment, when the proximity sensor 816 detects that the distance between the user and the front side of the terminal 800 is gradually decreasing, the touch screen 805 is controlled by the processor 801 to switch from the bright screen state to the screen state; when the proximity sensor 816 detects When the distance between the user and the front side of the terminal 800 gradually becomes larger, the processor 801 controls the touch display screen 805 to switch from the state of the screen to the bright state.
  • FIG. 8 does not constitute a limitation to the terminal 800, and may include more or less components than those illustrated, or may combine some components or adopt different component arrangements.
  • the embodiment of the present application further provides a pose determining apparatus, the pose determining apparatus includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set or a command set, an instruction, a program, a code set or The set of instructions is loaded by the processor and has operations to be implemented in the pose determining method of the above embodiment.
  • the embodiment of the present application further provides a computer readable storage medium, where the computer readable storage medium stores at least one instruction, at least one program, a code set or a set of instructions, the program, the program, the code set or the instruction
  • the set is loaded by the processor and has the operations possessed in the pose determining method of implementing the above embodiment.
  • a person skilled in the art may understand that all or part of the steps of implementing the above embodiments may be completed by hardware, or may be instructed by a program to execute related hardware, and the program may be stored in a computer readable storage medium.
  • the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

本申请实施例公开了一种位姿确定方法、装置、智能设备及存储介质,属于计算机技术领域。方法包括:通过追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数;当第一图像的上一个图像满足特征点追踪条件,而第一图像不满足特征点追踪条件时,将第一图像的上一个图像作为第二标记图像;通过追踪第二标记图像的特征点,获取相机拍摄的图像相对于第二标记图像的位姿参数;根据图像相对于第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取图像的位姿参数,根据位姿参数确定相机的位姿。通过在第一图像不满足特征点追踪条件时切换标记图像,避免了由于相机的位置或姿态的变化过多而导致无法追踪到特征点的问题,增强了鲁棒性,提高了追踪精度。

Description

位姿确定方法、装置、智能设备及存储介质
本申请要求于2018年4月27日提交、申请号为201810392212.7、发明名称为“位姿确定方法、装置及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及计算机技术领域,特别涉及一种位姿确定方法、装置、智能设备及存储介质。
背景技术
AR(Augmented Reality,增强现实)技术是一种实时地追踪相机的位置和姿态,结合虚拟的图像、视频或者三维模型进行显示的技术,能够将虚拟场景与实际场景结合显示,是目前计算机视觉领域的重要研究方向之一。AR技术中最重要的问题在于如何准确确定相机的位置和姿态。
相关技术提出了一种通过追踪模板(marker)图像中的特征点来确定相机位置和姿态的方法,先预先定义好一个模板图像,提取模板图像中的特征点,随着相机的位置或姿态的变化对提取的特征点进行追踪,每当相机当前拍摄到一个图像时,在当前图像中识别模板图像的特征点,从而能够将特征点在当前图像中的位置和姿态与特征点在模板图像中的位置和姿态进行对比,得到特征点的位姿参数,进而得到当前图像相对于模板图像的位姿参数,如旋转参数和位移参数,该位姿参数可以表示相机拍摄当前图像时的位置和姿态。
在实现本申请实施例的过程中,发明人发现上述相关技术至少存在以下问题:当相机的位置或姿态的变化过多而导致当前图像中不存在模板图像的特征点时,无法追踪到特征点,也就无法确定相机的位置和姿态。
发明内容
本申请实施例提供了一种位姿确定方法、装置、智能设备及存储介质,可 以解决相关技术的问题。所述技术方案如下:
第一方面,提供了一种位姿确定方法,所述方法包括:
通过追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数;
当第一图像的上一个图像满足特征点追踪条件,而所述第一图像不满足特征点追踪条件时,将所述第一图像的上一个图像作为第二标记图像;
通过追踪所述第二标记图像的特征点,获取所述相机拍摄的图像相对于所述第二标记图像的位姿参数;
根据所述图像相对于所述第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取所述图像的位姿参数,根据所述位姿参数确定所述相机的位姿。
第二方面,提供了一种位姿确定装置,所述装置包括:
第一获取模块,用于通过追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数;
切换模块,用于当第一图像的上一个图像满足特征点追踪条件,而所述第一图像不满足特征点追踪条件时,将所述第一图像的上一个图像作为第二标记图像;
第二获取模块,用于通过追踪所述第二标记图像的特征点,获取所述相机拍摄的图像相对于所述第二标记图像的位姿参数;
所述第二获取模块,还用于根据所述图像相对于所述第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取所述图像的位姿参数,根据所述位姿参数确定所述相机的位姿。
第三方面,提供了一种智能设备,所述智能设备包括:处理器和存储器,所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
通过追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数;
当第一图像的上一个图像满足特征点追踪条件,而所述第一图像不满足特征点追踪条件时,将所述第一图像的上一个图像作为第二标记图像;
通过追踪所述第二标记图像的特征点,获取所述相机拍摄的图像相对于所 述第二标记图像的位姿参数;
根据所述图像相对于所述第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取所述图像的位姿参数,根据所述位姿参数确定所述相机的位姿。
第四方面,提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,所述指令、所述程序、所述代码集或所述指令集由处理器加载并具有以实现如第一方面所述的位姿确定方法中所具有的操作。
本申请实施例提供的技术方案带来的有益效果至少包括:
本申请实施例提供的方法、装置、智能设备及存储介质,通过在追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数的过程中,当第一图像的上一个图像满足特征点追踪条件,而第一图像不满足特征点追踪条件时,将第一图像的上一个图像作为第二标记图像,之后通过追踪第二标记图像的特征点,根据相机拍摄的图像相对于第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取图像的位姿参数,根据位姿参数确定相机的位姿。通过在第一图像不满足特征点追踪条件时切换标记图像,通过追踪切换后的标记图像的特征点来确定相机的位置和姿态,避免了由于相机的位置或姿态的变化过多而导致无法追踪到特征点的问题,增强了鲁棒性,提高了相机的追踪精度。
另外,无需预先给定标记图像,只需拍摄当前的场景得到一个图像,设置为初始标记图像,即可实现标记图像的初始化,摆脱了必须预先给定标记图像的限制,扩展了应用范围。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是本申请实施例提供的场景界面的显示示意图;
图2是本申请实施例提供的另一场景界面的显示示意图;
图3是本申请实施例提供的一种位姿确定方法的流程图;
图4是本申请实施例提供的一种图像示意图;
图5是本申请实施例提供的一种位姿确定方法的流程图;
图6是本申请实施例提供的一种操作流程的示意图;
图7是本申请实施例提供的一种位姿确定装置的结构示意图;
图8是本申请实施例提供的一种终端的结构示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请实施例提供了一种位姿确定方法,应用于智能设备追踪相机的位置和姿态的场景下,尤其是在AR场景下,智能设备采用AR技术进行显示时,如显示AR游戏、AR视频等,需要追踪相机的位置和姿态。
其中,智能设备配置有相机和显示单元,相机用于拍摄现实场景的图像,显示单元用于显示由现实场景与虚拟场景结合构成的场景界面。智能设备随着相机的运动可以追踪相机的位置和姿态的变化,还可以拍摄现实场景的图像,按照相机的位置和姿态的变化依次显示当前拍摄到的多个图像,从而模拟出显示三维界面的效果。并且,在显示的图像中可以添加虚拟元素,如虚拟图像、虚拟视频或者虚拟三维模型等,随着相机的运动,可以按照相机的位置和姿态的变化,以不同的方位显示虚拟元素,从而模拟出显示三维虚拟元素的效果。现实场景的图像与虚拟元素结合显示,构成了场景界面,从而模拟出现实场景与虚拟元素同处于同一个三维空间的效果。
例如,参见图1和参见图2,智能设备在拍摄到的包含桌子和茶杯的图像中添加了一个虚拟人物形象,随着相机的运动,拍摄到的图像发生变化,虚拟人物形象的拍摄方位也发生变化,模拟出了虚拟人物形象在图像中相对于桌子和茶杯静止不动,而相机随着位置和姿态的变化同时拍摄桌子、茶杯和虚拟人物形象的效果,为用户呈现了一幅真实立体的画面。
图3是本申请实施例提供的一种位姿确定方法的流程图,该位姿确定方法的执行主体为智能设备,该智能设备可以为配置有相机的手机、平板电脑等终端或者为配置有相机的AR眼镜、AR头盔等AR设备,参见图3,该方法包括:
301、在未设置标记图像的情况下,智能设备获取相机拍摄的图像,将拍摄的图像设置为第一标记图像。此时第一标记图像即为初始标记图像。
本申请实施例中,为了追踪相机的位置和姿态的变化,需要以标记图像作为基准,在相机拍摄至少一个图像的过程中,通过追踪标记图像的特征点来确定相机的位姿参数。
为此,在未设置标记图像的情况下,智能设备可以通过相机拍摄第三图像,获取相机当前拍摄的图像,将该图像设置为第一标记图像,从而实现标记图像的初始化,后续智能设备继续拍摄其他图像的过程中,即可通过追踪第一标记图像的特征点来获取每个图像的位姿参数。
其中,相机可以按照预设周期进行拍摄,每隔一个预设周期拍摄一个图像,该预设周期可以为0.1秒或者0.01秒等。
在一种可能实现方式中,为了防止第一标记图像中特征点数量较少而导致追踪失败,当获取到拍摄的图像后,可以先从该图像中提取特征点,判断提取到的特征点数量是否达到预设数量,当从该图像中提取到的特征点数量达到预设数量时,再将该图像设置为第一标记图像,而当从该图像中提取到的特征点数量未达到预设数量时,可以不将该图像设置为第一标记图像,而是获取相机拍摄的下一个图像,直至获取到提取的特征点数量达到预设数量的图像时,将该提取的特征点数量达到预设数量的图像设置为第一标记图像。
其中,提取特征点时采用的特征提取算法可以为FAST(Features from Accelerated Segment Test,加速段测试特征点)检测算法、Shi-Tomasi(史托马西)角点检测算法、Harris Corner Detection(Harris角点检测)算法等,预设数量可以根据对追踪精确度的需求确定。
需要说明的第一点是,本申请实施例中随着相机的运动可能会切换标记图像,为了统一衡量标准,准确确定相机的位置和姿态的变化,将以初始标记图像作为基准,每个图像相对于初始标记图像的位姿参数即可作为相应图像的位姿参数,以该位姿参数来表示相机拍摄相应图像时的位置和姿态。
需要说明的第二点是,本申请实施例以第一标记图像作为初始标记图像为例进行说明,实际上,第一标记图像也可以为初始标记图像之后设置的标记图 像。即在另一实施例中,在第一标记图像之前,智能设备也可能已设置其他的标记图像,之后经过一次或多次切换之后,切换至该第一标记图像,具体切换过程与下述从第一标记图像切换至第二标记图像的过程类似,在此暂不做说明。
302、通过追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数。
确定第一标记图像之后,将从第一标记图像中提取的特征点作为要追踪的目标特征点。随着相机的位置或姿态的变化,智能设备通过相机拍摄至少一个图像,并且通过在该至少一个图像中追踪特征点,得到每个图像相对于上一个图像的位姿参数。
其中,对于相机拍摄的相邻两个图像,使用从上一图像中提取第一标记图像的特征点进行光流,从而找到上一图像与下一图像之间的匹配特征点,得到匹配特征点的光流信息,该光流信息用于表示匹配特征点在该相邻两个图像中的运动信息,则根据匹配特征点的光流信息可以确定相邻两个图像中第二个图像相对于第一个图像的位姿参数。进行光流时采用的算法可以为Lucas-Kanade(卢卡斯-卡纳德)光流算法或者其他算法,除光流外,也可以采用描述子或者直接法对特征点进行匹配,找到上一图像与下一图像之间的匹配特征点。
那么,对于相机在第一标记图像之后拍摄的任一图像来说,获取从第一标记图像至该图像的每个图像相对于上一个图像的位姿参数,根据每个图像相对于上一个图像的位姿参数,可以进行迭代,从而确定该图像相对于第一标记图像的位姿参数。其中,该位姿参数可以包括位移参数和旋转参数,该位移参数用于表示相机拍摄该图像时的位置与拍摄第一标记图像时的位置之间的距离,该旋转参数用于表示相机拍摄该图像时的旋转角度与拍摄第一标记图像时的旋转角度之间的角度差。
例如,从第一标记图像开始,相机依次拍摄到图像1、图像2、图像3,获取到了图像1相对于第一标记图像的位姿参数(R1,T1)、图像2相对于图像1的位姿参数(R2,T2)以及图像3相对于图像2的位姿参数(R3,T3),则根据这些位姿参数可以进行迭代,确定图像3相对于第一标记图像的位姿参数(R3’,T3’)为:
Figure PCTCN2019079341-appb-000001
需要说明的是,在追踪特征点的过程中,随着相机的位置和姿态的变化,拍摄的图像中特征点的数量可能会减少,导致上一图像中的某些特征点在下一图像中不存在匹配的特征点,此时对相邻两个图像包括的特征点进行匹配时, 会排除掉一部分不匹配的特征点。
除此之外,智能设备还可以对光流匹配结果进行检验,排除不合理的特征点。即针对相机在第一标记图像之后拍摄的的任一图像,根据多个特征点在第一标记图像中的三维坐标以及该图像相对于第一标记图像的位姿参数,对特征点的位置和姿态的变化情况进行模拟,计算每个特征点在该图像中的估计三维坐标,对每个特征点在该图像中的估计三维坐标进行变换,得到每个特征点在该图像中的估计二维坐标,将每个特征点在该图像中的估计二维坐标与实际二维坐标进行对比,获取每个特征点在该图像中的估计二维坐标与在该图像中的实际二维坐标之间的距离,当任一特征点在该图像中的估计二维坐标与在该图像中的实际二维坐标之间的距离大于预设距离时,表示按照计算出的位姿参数对由第一标记图像开始的相机位姿变化进行模拟,得到的特征点的位置与实际位置相差过大,可以认为该特征点的位置和姿态的变化情况不符合应有的旋转平移关系,误差过大,因此为了避免该特征点对后续追踪过程的影响,将该特征点删除。
本申请实施例中,第一标记图像为初始标记图像,则该图像相对于第一标记图像的位姿参数即可表示相机拍摄该图像时的位置和姿态。
而在另一实施例中,如果第一标记图像不是初始标记图像,则根据该图像相对于第一标记图像的位姿参数以及该第一标记图像相对于初始标记图像的位姿参数,获取该图像相对于初始标记图像的位姿参数,该位姿参数即可表示相机拍摄该图像时的位置和姿态。
在一种可能实现方式中,根据第一标记图像相对于初始标记图像的位姿参数,以及该图像相对于第一标记图像的位姿参数,采用以下公式,获取该图像的位姿参数:
Figure PCTCN2019079341-appb-000002
R_final表示图像的旋转参数,T_final表示图像的位移参数;Rca表示图像相对于第一标记图像的旋转参数,Tca表示图像相对于第一标记图像的位移参数;R_old表示第一标记图像相对于初始标记图像的旋转参数,T_old表示第一标记图像相对于初始标记图像的位移参数。
需要说明的第一点是,在上述追踪过程中,需要确定特征点的三维坐标,才能通过追踪特征点确定相机在三维空间内的位置和姿态的变化。为此,在第 一标记图像中提取特征点时,确定特征点在第一标记图像中的二维坐标后,获取特征点的二维坐标对应的齐次坐标,齐次坐标用于将二维坐标以三维形式表示,采用以下坐标转换关系,将齐次坐标转换为对应的三维坐标:
Figure PCTCN2019079341-appb-000003
其中,M表示三维坐标,m表示齐次坐标,s表示特征点所在的标记图像的深度,fx、fy、cx和cy表示相机的参数。
例如,特征点的齐次坐标可以为[μ,ν,1],则特征点的三维坐标可以为
Figure PCTCN2019079341-appb-000004
需要说明的第二点是,在每一个独立的标记图像的追踪过程中,均假设该标记图像上所有三维特征点的深度为s。实际应用时,智能设备可以确定好标记图像、特征点的三维坐标和标记图像的深度,对这些参数采用PnP(Pespective-n-Point,透视n点定位)算法进行计算,即可获取到相机的位姿参数。其中,PnP算法可以为直接线性变换、P3P、ePnP、uPnP等,或者也可以采用除PnP算法以外的算法进行计算,如BA(Bundle Adjustment,光束平差法)优化PnP的算法。
303、当第一图像的上一个图像满足特征点追踪条件,而第一图像不满足特征点追踪条件时,将第一图像的上一个图像作为第二标记图像。
其中,特征点追踪条件为追踪当前标记图像的特征点的条件,若智能设备拍摄到的图像满足特征点追踪条件,则可以继续追踪,而若智能设备拍摄到的图像不满足特征点追踪条件,为了防止追踪失败,需要切换标记图像。
因此,在通过追踪第一标记图像的特征点获取图像位姿参数的过程中,智能设备拍摄到图像时,还会判断图像是否满足特征点追踪条件。以相机拍摄到的第一图像与第一图像的上一个图像为例,相机先拍摄到第一图像的上一个图像,且第一图像的上一个图像满足特征点追踪条件,则通过上述步骤302获取 到该第一图像的上一个图像的位姿参数。之后相机拍摄到第一图像,但第一图像不满足特征点追踪条件,则将第一图像的上一个图像作为第二标记图像,第一图像的上一个图像的位姿参数即为第二标记图像的位姿参数。
在一种可能实现方式中,特征点追踪条件可以为追踪到的特征点的数量达到预设数量,则当某一图像中追踪到的第一标记图像的特征点的数量达到预设数量时,确定该图像满足特征点追踪条件,当某一图像中追踪到的第一标记图像的特征点的数量未达到预设数量时,确定该图像不满足特征点追踪条件。
相应地,针对第一图像的上一个图像,获取第一图像的上一个图像中追踪到的特征点的数量,当该数量达到预设数量时,确定第一图像的上一个图像满足特征点追踪条件。而针对第一图像,获取第一图像中追踪到的特征点的数量,当该数量未达到预设数量时,确定第一图像不满足特征点追踪条件。
304、通过追踪第二标记图像的特征点,获取相机拍摄的图像相对于第二标记图像的位姿参数。
从第一标记图像切换为第二标记图像后,将会从第二标记图像中提取多个特征点,作为更新后的目标特征点,随着相机的位置或姿态的变化,智能设备通过相机拍摄至少一个图像,通过在该至少一个图像中追踪该第二标记图像的特征点,得到每个图像相对于上一个图像的位姿参数。
其中,对于相机拍摄的相邻两个图像,使用从上一图像中提取第一标记图像的特征点进行光流,从而找到上一图像与下一图像之间的匹配特征点,得到匹配特征点的光流信息,该光流信息用于表示匹配特征点在该相邻两个图像中的运动信息,则根据匹配特征点的光流信息可以确定相邻两个图像中第二个图像相对于第一个图像的位姿参数。进行光流时采用的算法可以为Lucas-Kanade光流算法或者其他算法,除光流外,也可以采用描述子或者直接法对特征点进行匹配,找到上一图像与下一图像之间的匹配特征点。
那么,以相机在第二标记图像之后拍摄的第二图像为例,获取从第二标记图像至该图像的每个图像相对于上一个图像的位姿参数,根据每个图像相对于上一个图像的位姿参数,可以进行迭代,从而确定该第二图像相对于第二标记图像的位姿参数。其中,该位姿参数可以包括位移参数和旋转参数中的至少一项,该位移参数用于表示相机拍摄该第二图像时的位置与拍摄第二标记图像时的位置之间的距离,该旋转参数用于表示相机拍摄该第二图像时的旋转角度与拍摄第二标记图像时的旋转角度之间的角度差。
305、根据图像相对于第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取图像的位姿参数,根据位姿参数确定相机的位姿。
以第二图像为例,在本申请实施例中,若第一标记图像为初始标记图像,则根据第二图像相对于第二标记图像的位姿参数以及第二标记图像相对于第一标记图像的位姿参数(即第二标记图像相对于初始标记图像的位姿参数),获取第二图像相对于初始标记图像的位姿参数,即为该第二图像的位姿参数,根据位姿参数可以确定相机的位姿。
而在另一实施例中,若第一标记图像不是初始标记图像,则根据第二图像相对于第二标记图像的位姿参数、第二标记图像相对于第一标记图像的位姿参数以及第一标记图像相对于初始标记图像的位姿参数,获取第二图像相对于初始标记图像的位姿参数,即为该第二图像的位姿参数,根据位姿参数可以确定相机的位姿。
其中,第二图像为第二标记图像之后拍摄的任一图像,可以为第一图像,也可以为第一图像之后拍摄的任一图像。
针对第一图像,在获取第一图像的位姿参数时,根据第二标记图像相对于第一标记图像的位姿参数,以及第一标记图像相对于初始标记图像的位姿参数,获取第二标记图像相对于初始标记图像的位姿参数;根据第一图像相对于第二标记图像的位姿参数,以及第二标记图像相对于初始标记图像的位姿参数,采用以下公式获取第一图像的位姿参数:
Figure PCTCN2019079341-appb-000005
R_final表示第一图像的旋转参数,T_final表示第一图像的位移参数;Rcl表示第一图像相对于第二标记图像的旋转参数,Tcl表示第一图像相对于第二标记图像的位移参数;R_old表示第二标记图像相对于初始标记图像的旋转参数,T_old表示第二标记图像相对于初始标记图像的位移参数。
需要说明的第一点是,在上述追踪过程中,需要确定特征点的三维坐标,才能通过追踪特征点确定相机在三维空间内的位置和姿态的变化。为此,在第二标记图像中提取特征点时,确定特征点在第二标记图像中的二维坐标后,获取特征点的二维坐标对应的齐次坐标,齐次坐标用于将二维坐标以三维形式表示,采用以下坐标转换关系,将齐次坐标转换为对应的三维坐标:
Figure PCTCN2019079341-appb-000006
其中,M表示三维坐标,m表示齐次坐标,s表示特征点所在的标记图像的深度,fx、fy、cx和cy表示相机的参数。
例如,特征点的齐次坐标可以为[μ,ν,1],则特征点的三维坐标可以为
Figure PCTCN2019079341-appb-000007
需要说明的第二点是,在追踪特征点的过程中,随着相机的位置和姿态的变化,拍摄的相邻两个图像中特征点的数量可能会减少,导致上一图像中的某些特征点在下一图像中不存在匹配的特征点,此时对相邻两个图像包括的特征点进行匹配时,会排除掉一部分不匹配的特征点。
除此之外,智能设备还可以对光流匹配结果进行检验,排除不合理的特征点。以相机在第二标记图像之后拍摄的第二图像为例,根据多个特征点在第二标记图像中的三维坐标以及第二图像相对于第二标记图像的位姿参数,对特征点的位置和姿态的变化情况进行模拟,计算每个特征点在第二图像中的估计三维坐标,对每个特征点在第二图像中的估计三维坐标进行变换,得到每个特征点在第二图像中的估计二维坐标,将每个特征点在第二图像中的估计二维坐标与实际二维坐标进行对比,获取每个特征点在该第二图像中的估计二维坐标与在该第二图像中的实际二维坐标之间的距离,当任一特征点在第二图像中的估计二维坐标与在第二图像中的实际二维坐标之间的距离大于预设距离时,表示表示按照计算出的位姿参数对由第二标记图像开始的相机位姿变化进行模拟,得到的特征点的位置与实际位置相差过大,可以认为该特征点的位置和姿态的变化情况不符合应有的旋转平移关系,误差过大,因此为了避免该特征点对后续追踪过程的影响,将该特征点删除。
其中,在对估计三维坐标进行变换得到估计二维坐标时,可以根据上述坐标转换关系的逆转换进行,也即是采用以下逆转换关系,将估计三维坐标变换 为估计二维坐标:
Figure PCTCN2019079341-appb-000008
其中,M表示估计三维坐标,m表示估计二维坐标,s表示特征点所在的标记图像的深度,fx、fy、cx和cy表示相机的参数。
排除掉不匹配的特征点或者排除掉误差过大的特征点后,智能设备获取第二图像中特征点的数量,继续判断第二图像是否满足第二标记图像的特征点追踪条件,从而确定是否要切换标记图像。
需要说明的第三点是,为了保证深度的连续性,在第一标记图像的追踪过程中假设该第一标记图像上所有特征点的深度为s,而在第二标记图像的追踪过程中,不仅要满足第二标记图像上所有特征点的深度相等,也需要满足这些特征点在第一标记图像上的深度仍为s。因此可以通过迭代计算每个标记图像在追踪过程中的深度。
以S n表示第二标记图像的深度,d表示第一标记图像的特征点在第二标记图像中的深度,S n-1表示第一标记图像的深度,d可以通过第二标记图像的位姿参数计算得到。则采用以下公式,计算第二标记图像的深度:S n=d*S n-1。其中,可以假设相机拍摄的第一个图像中特征点的深度均为1。标记图像的深度更新为S n后,对第二标记图像、第二标记图像中提取的特征点的三维坐标和第二标记图像的深度S n采用PnP算法进行计算,即可追踪相机的位移参数。
本申请实施例提供的方法,通过在追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数的过程中,当第一图像的上一个图像满足特征点追踪条件,而第一图像不满足特征点追踪条件时,将第一图像的上一个图像作为第二标记图像,之后通过追踪第二标记图像的特征点,根据相机拍摄的图像相对于第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取图像的位姿参数,根据位姿参数确定相机的位姿。通过在第一图像不满足特征点追踪条件时切换标记图像,通过追踪切换后的标记图像的特征点来确定相机的位置和姿态,避免了由于相机的位置或姿态的变化过多而导致无法追踪到特征点的问题,增强了鲁棒性,提高了相机的追踪精度。本申请实施 例提供的方法,轻量简单,没有复杂的后端优化,因此计算速度很快,甚至可以做到实时追踪。相对于传统的slam(simultaneous localization and mapping,即时定位与地图构建)算法,本申请实施例提供的方法鲁棒性更强,可以达到非常高的计算精度。
另外,无需预先给定标记图像,只需拍摄当前的场景得到一个图像,设置为初始标记图像,即可实现标记图像的初始化,摆脱了必须预先给定标记图像的限制,扩展了应用范围。
举例来说,相机拍摄的多个图像如图4所示,追踪过程包括以下步骤:
1、相机拍摄到第一个图像,将第一个图像作为初始标记图像。
2、通过追踪初始标记图像的特征点,获取相机拍摄的图像相对于初始标记图像的位姿参数,直至图像a的下一个图像不满足特征点追踪条件时,将图像a作为第一标记图像,此时当前标记图像相对于初始标记图像的位姿参数
Figure PCTCN2019079341-appb-000009
为图像a相对于第一个图像的位姿参数。
3、通过追踪第一标记图像的特征点,获取相机拍摄的图像相对于第一标记图像的位姿参数,直至获取到图像l相对于第一标记图像的位姿参数。之后,由于图像c不满足特征点追踪条件,而导致无法获取图像c相对于第一标记图像的位姿参数。
4、将图像l作为第二标记图像,此时当前标记图像相对于初始标记图像的位姿参数(R_old,T_old)更新为图像l相对于第一个图像的位姿参数。
5、通过追踪第二标记图像的特征点,获取相机拍摄的图像相对于第二标记图像的位姿参数(Rcl,Tcl),根据第二标记图像相对于初始标记图像的位姿参数(R_old,T_old)和相机拍摄的图像相对于第二标记图像的位姿参数(Rcl,Tcl),获取相机拍摄的图像相对于初始标记图像的位姿参数(R_final,T_final),根据该位姿参数(R_final,T_final)确定相机的位姿。
本申请实施例中,位姿参数可以包括位移参数和旋转参数,位移参数用于表示相机的平移情况,可以确定相机在三维空间内位置的变化,而旋转参数用于表示相机的旋转角度的变化,可以确定相机在三维空间内姿态的变化。通过执行上述步骤可以获取到相机的位移参数和旋转参数。或者,通过执行上述步骤可以获取到相机的位移参数而不获取旋转参数,相机的旋转参数的获取过程 详见下述实施例。
图5是本申请实施例提供的一种位姿确定方法的流程图,该位姿确定方法的执行主体为智能设备,该智能设备可以为配置有相机的手机、平板电脑等终端或者为配置有相机的AR眼镜、AR头盔等AR设备,参见图5,该方法包括:
501、通过IMU(Inertial Measurement Unit,惯性测量单元)获取相机的多个旋转参数以及对应的时间戳。
其中,每个旋转参数对应的时间戳是指获取该旋转参数时的时间戳。
502、根据多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线。
其中,插值算法可以采用Slerp(Spherical Linear Interpolation,球面线性插值)算法或者其他算法。
根据多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线,该旋转参数曲线可以表示相机的旋转参数随拍摄时间的变化规律。
503、当相机拍摄到一个图像时,获取相机拍摄的图像的时间戳,获取该时间戳在旋转参数曲线中对应的旋转参数,作为相机拍摄的图像的旋转参数,根据该旋转参数确定相机的姿态。
由于图像的拍摄频率与IMU的采样频率不匹配,因此通过插值得到旋转参数曲线,根据旋转参数曲线可以进行数据对齐,从而得到图像对应的旋转参数,根据该旋转参数确定相机的姿态。
实际应用中,智能设备配置有陀螺仪、加速度计和地磁传感器,通过陀螺仪和地磁传感器,可以得到在地球坐标系中唯一的旋转参数。该地图坐标系有以下特点:
1、X轴使用向量积来定义的,在智能设备当前的位置上与地面相切,并指向东方。
2、Y轴在智能设备当前的位置上与地面相切,且指向地磁场的北极。
3、Z轴指向天空,并垂直于地面。
通过该地图坐标系得到的旋转参数可以认为没有误差,而且无需依赖于IMU的参数,避免了IMU的标定问题,可以兼容多种类型的设备。
智能设备提供了获取旋转参数的接口:rotation-vector(旋转矢量)接口,可以按照IMU的采样频率调用rotation-vector接口,从而获取到旋转参数。
智能设备可以将获取到多个旋转参数以及对应的时间戳均存储至IMU队列中,通过读取IMU队列中的数据进行插值得到旋转参数曲线。或者,考虑到上 述数据可能会存在噪声,因此为了保证数据的准确性,可以计算获取到的旋转参数与上一个旋转参数之间的角度差,如果该角度差大于预设阈值,可以认为获取到的旋转参数为噪声项,则将该旋转参数删除。通过上述检测可以删除噪声项,仅将通过检测的旋转参数及其对应的时间戳存储至IMU队列中。
本申请实施例提供的方法,通过根据IMU测量的多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线,根据旋转参数曲线可以进行数据对齐,从而根据所拍摄图像的时间戳和旋转参数曲线,获取图像的旋转参数,提高了精确度,无需依赖于IMU的参数,避免了IMU的标定问题,并且考虑到了智能设备计算能力低的问题,通过IMU获取旋转参数可以降低计算量,提高计算速度。另外,将噪声项删除,可以提高数据的准确性,进一步提高精确度。
本申请实施例的操作流程可以如图6所示,参见图6,将智能设备的各个功能划分为多个模块,操作流程如下:
1、通过模块601读取到IMU测量的数据,包括旋转参数和对应的时间戳,通过模块602检测数据是否合理,如果不合理则丢弃该数据,如果合理则通过模块603将数据存储至IMU队列中。
2、通过模块604读取拍摄的图像,判断当前是否已设置标记图像。如果未设置标记图像,则利用当前拍摄的图像初始化一个标记图像,如果已设置标记图像,则直接通过模块607建立与标记图像的连接,追踪标记图像的特征点。
3、通过模块608联合IMU队列中的数据以及追踪特征点得到的数据,获取到位移参数和旋转参数,计算出从当前图像相对于当前标记图像的旋转平移矩阵。
4、通过模块609检测图像的旋转参数和位移参数是否合理,如果是,则送入模块612,通过模块612将当前图像相对于当前标记图像的旋转平移矩阵转换为当前图像相对于初始标记图像的旋转平移矩阵;如果否,则通过模块610切换标记图像,计算出当前图像相对于当前标记图像的旋转平移矩阵,再通过模块611检测结果是否合理,如果是,则送入模块612,如果否,则转回模块606,利用当前图像重新进行初始化。
5、通过模块613和614对获得的数据结果进行平滑并输出。平滑时可以采用kalman(卡尔曼)滤波器或者其他滤波器。
综上所述,本申请实施例提供了一套相机姿态追踪算法:Anchor-Switching(切换标记图像)算法。将相机的运动过程划分成多段标记图像的追踪过程,每段过程是一次独立的标记图像追踪过程,当追踪失败时通过在上一帧图像切换标记图像来连接。针对智能设备计算能力低的特点,利用IMU得到相机相对于初始场景的旋转参数,将真实场景的图像作为标记图像,通过追踪得到相机相对于当前标记图像的位移参数,通过切换标记图像得到相对于初始场景的位移参数,两者结合得到相对于初始场景的位置和姿态变化,从而实现了一套真实自然场景下稳定、快速、鲁棒的相机姿态跟踪系统,不依赖于预先给定的标记图像,在提高计算速度的同时增强了系统的鲁棒性,相机定位精度很高。同时避免了复杂的IMU和图像融合算法,也降低了对参数的敏感性。本申请实施例提供的方法能在移动端流畅运行,且不需要精确的标定。
本申请实施例对应于人眼观测三维空间的场景,旋转参数的影响较大,而假设平面上的移动不大。而在AR场景下,用户通常是在平面场景下和虚拟元素进行互动,如茶几桌子等,则可以认为相机在平面上移动,旋转参数的影响较大。因此本申请实施例非常适用于AR场景。
图7是本申请实施例提供的一种位姿确定装置的结构示意图。参见图7,该装置应用于智能设备中,该装置包括:
第一获取模块701,用于执行上述实施例中通过追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数的步骤;
切换模块702,用于执行上述实施例中将第一图像的上一个图像作为第二标记图像的步骤;
第二获取模块703,用于执行上述实施例中通过追踪第二标记图像的特征点,获取相机拍摄的图像相对于第二标记图像的位姿参数,进而获取图像的位姿参数,根据位姿参数确定位姿的步骤。
可选地,第二获取模块703,包括:
提取单元,用于执行上述实施例中从第二标记图像中提取多个特征点的步骤;
追踪单元,用于执行上述实施例中追踪多个特征点得到每个图像相对于上一个图像的位姿参数的步骤;
确定单元,用于执行上述实施例中确定第二图像相对于第二标记图像的位 姿参数的步骤。
可选地,装置还包括:
三维坐标计算模块,用于执行上述实施例中计算每个特征点在第二图像中的估计三维坐标的步骤;
坐标变换模块,用于执行上述实施例中将估计三维坐标进行变换得到估计二维坐标的步骤;
删除模块,用于执行上述实施例中删除特征点的步骤。
可选地,第一获取模块701还用于执行上述实施例中根据第一标记图像相对于初始标记图像的位姿参数,以及图像相对于第一标记图像的位姿参数,采用公式获取图像的位姿参数的步骤。
可选地,第二获取模块703还用于执行上述实施例中根据第二标记图像相对于第一标记图像的位姿参数,以及第一标记图像相对于初始标记图像的位姿参数,以及第一图像相对于第二标记图像的位姿参数,采用公式获取第一图像的位姿参数的步骤。
可选地,装置还包括:
数量获取模块,用于执行上述实施例中获取特征点的数量的步骤;
确定模块,用于执行上述实施例中当数量达到预设数量时确定第一图像不满足特征点追踪条件的步骤。
可选地,装置还包括:
齐次坐标获取模块,用于执行上述实施例中获取特征点的二维坐标对应的齐次坐标的步骤;
坐标转换模块,用于执行上述实施例中采用坐标转换关系,将齐次坐标转换为对应的三维坐标的步骤。
可选地,装置还包括:
深度计算模块,用于执行上述实施例中采用公式计算第二标记图像的深度的步骤。
可选地,装置还包括:
初始化模块,用于执行上述实施例中将拍摄的图像设置为第一标记图像的步骤。
可选地,位姿参数包括位移参数,装置还包括:
插值模块,用于执行上述实施例中通过IMU获取的数据进行插值得到旋转 参数曲线的步骤;
旋转参数获取模块,用于执行上述实施例中根据旋转参数曲线获取图像的旋转参数的步骤。
需要说明的是:上述实施例提供的位姿确定装置在确定位姿参数时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将智能设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的位姿确定装置与位姿确定方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
图8示出了本申请一个示例性实施例提供的终端800的结构框图,终端800用于执行上述方法实施例中智能设备所执行的步骤。
该终端800可以是便携式移动终端,比如:智能手机、平板电脑、MP3播放器(Moving Picture Experts Group Audio Layer III,动态影像专家压缩标准音频层面3)、MP4(Moving Picture Experts Group Audio Layer IV,动态影像专家压缩标准音频层面4)播放器、笔记本电脑或台式电脑,也可以是AR眼镜、AR头盔等AR设备。终端800还可能被称为用户设备、便携式终端、膝上型终端、台式终端等其他名称。
该终端800包括:处理器801和存储器802,存储器802中存储有至少一条指令、至少一段程序、代码集或指令集,指令、程序、代码集或指令集由处理器801加载并执行以实现上述实施例中智能设备所执行的操作。
处理器801可以包括一个或多个处理核心,比如4核心处理器、5核心处理器等。处理器801可以采用DSP(Digital Signal Processing,数字信号处理)、FPGA(Field-Programmable Gate Array,现场可编程门阵列)、PLA(Programmable Logic Array,可编程逻辑阵列)中的至少一种硬件形式来实现。处理器801也可以包括主处理器和协处理器,主处理器是用于对在唤醒状态下的数据进行处理的处理器,也称CPU(Central Processing Unit,中央处理器);协处理器是用于对在待机状态下的数据进行处理的低功耗处理器。在一些实施例中,处理器801可以在集成有GPU(Graphics Processing Unit,图像处理器),GPU用于负责显示屏所需要显示的内容的渲染和绘制。一些实施例中,处理器801还可以包括AI(Artificial Intelligence,人工智能)处理器,该AI处理器用于处理有关机器 学习的计算操作。
存储器802可以包括一个或多个计算机可读存储介质,该计算机可读存储介质可以是非暂态的。存储器802还可包括高速随机存取存储器,以及非易失性存储器,比如一个或多个磁盘存储设备、闪存存储设备。在一些实施例中,存储器802中的非暂态的计算机可读存储介质用于存储至少一个指令,该至少一个指令用于被处理器801所具有以实现本申请中方法实施例提供的位姿确定方法。
在一些实施例中,终端800还可选包括有:外围设备接口803和至少一个外围设备。处理器801、存储器802和外围设备接口803之间可以通过总线或信号线相连。各个外围设备可以通过总线、信号线或电路板与外围设备接口803相连。具体地,外围设备包括:射频电路804、触摸显示屏805、摄像头806、音频电路807、定位组件808和电源809中的至少一种。
外围设备接口803可被用于将I/O(Input/Output,输入/输出)相关的至少一个外围设备连接到处理器801和存储器802。在一些实施例中,处理器801、存储器802和外围设备接口803被集成在同一芯片或电路板上;在一些其他实施例中,处理器801、存储器802和外围设备接口803中的任意一个或两个可以在单独的芯片或电路板上实现,本实施例对此不加以限定。
射频电路804用于接收和发射RF(Radio Frequency,射频)信号,也称电磁信号。射频电路804通过电磁信号与通信网络以及其他通信设备进行通信。射频电路804将电信号转换为电磁信号进行发送,或者,将接收到的电磁信号转换为电信号。可选地,射频电路804包括:天线系统、RF收发器、一个或多个放大器、调谐器、振荡器、数字信号处理器、编解码芯片组、用户身份模块卡等等。射频电路804可以通过至少一种无线通信协议来与其它终端进行通信。该无线通信协议包括但不限于:城域网、各代移动通信网络(2G、3G、4G及13G)、无线局域网和/或WiFi(Wireless Fidelity,无线保真)网络。在一些实施例中,射频电路804还可以包括NFC(Near Field Communication,近距离无线通信)有关的电路,本申请对此不加以限定。
显示屏805用于显示UI(User Interface,用户界面)。该UI可以包括图形、文本、图标、视频及其它们的任意组合。当显示屏805是触摸显示屏时,显示屏805还具有采集在显示屏805的表面或表面上方的触摸信号的能力。该触摸信号可以作为控制信号输入至处理器801进行处理。此时,显示屏805还可以 用于提供虚拟按钮和/或虚拟键盘,也称软按钮和/或软键盘。在一些实施例中,显示屏805可以为一个,设置终端800的前面板;在另一些实施例中,显示屏805可以为至少两个,分别设置在终端800的不同表面或呈折叠设计;在再一些实施例中,显示屏805可以是柔性显示屏,设置在终端800的弯曲表面上或折叠面上。甚至,显示屏805还可以设置成非矩形的不规则图形,也即异形屏。显示屏805可以采用LCD(Liquid Crystal Display,液晶显示屏)、OLED(Organic Light-Emitting Diode,有机发光二极管)等材质制备。
摄像头组件806用于采集图像或视频。可选地,摄像头组件806包括前置摄像头和后置摄像头。通常,前置摄像头设置在终端800的前面板,后置摄像头设置在终端的背面。在一些实施例中,后置摄像头为至少两个,分别为主摄像头、景深摄像头、广角摄像头、长焦摄像头中的任意一种,以实现主摄像头和景深摄像头融合实现背景虚化功能、主摄像头和广角摄像头融合实现全景拍摄以及VR(Virtual Reality,虚拟现实)拍摄功能或者其它融合拍摄功能。在一些实施例中,摄像头组件806还可以包括闪光灯。闪光灯可以是单色温闪光灯,也可以是双色温闪光灯。双色温闪光灯是指暖光闪光灯和冷光闪光灯的组合,可以用于不同色温下的光线补偿。
音频电路807可以包括麦克风和扬声器。麦克风用于采集用户及环境的声波,并将声波转换为电信号输入至处理器801进行处理,或者输入至射频电路804以实现语音通信。出于立体声采集或降噪的目的,麦克风可以为多个,分别设置在终端800的不同部位。麦克风还可以是阵列麦克风或全向采集型麦克风。扬声器则用于将来自处理器801或射频电路804的电信号转换为声波。扬声器可以是传统的薄膜扬声器,也可以是压电陶瓷扬声器。当扬声器是压电陶瓷扬声器时,不仅可以将电信号转换为人类可听见的声波,也可以将电信号转换为人类听不见的声波以进行测距等用途。在一些实施例中,音频电路807还可以包括耳机插孔。
定位组件808用于定位终端800的当前地理位置,以实现导航或LBS(Location Based Service,基于位置的服务)。定位组件808可以是基于美国的GPS(Global Positioning System,全球定位系统)、中国的北斗系统、俄罗斯的格雷纳斯系统或欧盟的伽利略系统的定位组件。
电源809用于为终端800中的各个组件进行供电。电源809可以是交流电、直流电、一次性电池或可充电电池。当电源809包括可充电电池时,该可充电 电池可以支持有线充电或无线充电。该可充电电池还可以用于支持快充技术。
在一些实施例中,终端800还包括有一个或多个传感器810。该一个或多个传感器810包括但不限于:加速度传感器811、陀螺仪传感器812、压力传感器813、指纹传感器814、光学传感器815以及接近传感器816。
加速度传感器811可以检测以终端800建立的坐标系的三个坐标轴上的加速度大小。比如,加速度传感器811可以用于检测重力加速度在三个坐标轴上的分量。处理器801可以根据加速度传感器811采集的重力加速度信号,控制触摸显示屏805以横向视图或纵向视图进行用户界面的显示。加速度传感器811还可以用于游戏或者用户的运动数据的采集。
陀螺仪传感器812可以检测终端800的机体方向及转动角度,陀螺仪传感器812可以与加速度传感器811协同采集用户对终端800的3D动作。处理器801根据陀螺仪传感器812采集的数据,可以实现如下功能:动作感应(比如根据用户的倾斜操作来改变UI)、拍摄时的图像稳定、游戏控制以及惯性导航。
压力传感器813可以设置在终端800的侧边框和/或触摸显示屏805的下层。当压力传感器813设置在终端800的侧边框时,可以检测用户对终端800的握持信号,由处理器801根据压力传感器813采集的握持信号进行左右手识别或快捷操作。当压力传感器813设置在触摸显示屏805的下层时,由处理器801根据用户对触摸显示屏805的压力操作,实现对UI界面上的可操作性控件进行控制。可操作性控件包括按钮控件、滚动条控件、图标控件、菜单控件中的至少一种。
指纹传感器814用于采集用户的指纹,由处理器801根据指纹传感器814采集到的指纹识别用户的身份,或者,由指纹传感器814根据采集到的指纹识别用户的身份。在识别出用户的身份为可信身份时,由处理器801授权该用户具有相关的敏感操作,该敏感操作包括解锁屏幕、查看加密信息、下载软件、支付及更改设置等。指纹传感器814可以被设置终端800的正面、背面或侧面。当终端800上设置有物理按键或厂商Logo时,指纹传感器814可以与物理按键或厂商标志集成在一起。
光学传感器815用于采集环境光强度。在一个实施例中,处理器801可以根据光学传感器815采集的环境光强度,控制触摸显示屏805的显示亮度。具体地,当环境光强度较高时,调高触摸显示屏805的显示亮度;当环境光强度较低时,调低触摸显示屏805的显示亮度。在另一个实施例中,处理器801还 可以根据光学传感器815采集的环境光强度,动态调整摄像头组件806的拍摄参数。
接近传感器816,也称距离传感器,通常设置在终端800的前面板。接近传感器816用于采集用户与终端800的正面之间的距离。在一个实施例中,当接近传感器816检测到用户与终端800的正面之间的距离逐渐变小时,由处理器801控制触摸显示屏805从亮屏状态切换为息屏状态;当接近传感器816检测到用户与终端800的正面之间的距离逐渐变大时,由处理器801控制触摸显示屏805从息屏状态切换为亮屏状态。
本领域技术人员可以理解,图8中示出的结构并不构成对终端800的限定,可以包括比图示更多或更少的组件,或者组合某些组件,或者采用不同的组件布置。
本申请实施例还提供了一种位姿确定装置,该位姿确定装置包括处理器和存储器,存储器中存储有至少一条指令、至少一段程序、代码集或指令集,指令、程序、代码集或指令集由处理器加载并具有以实现上述实施例的位姿确定方法中所具有的操作。
本申请实施例还提供了一种计算机可读存储介质,该计算机可读存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,该指令、该程序、该代码集或该指令集由处理器加载并具有以实现上述实施例的位姿确定方法中所具有的操作。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本申请的可选实施例,并不用以限制本申请,凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (24)

  1. 一种位姿确定方法,其特征在于,所述方法包括:
    通过追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数;
    当第一图像的上一个图像满足特征点追踪条件,而所述第一图像不满足特征点追踪条件时,将所述第一图像的上一个图像作为第二标记图像;
    通过追踪所述第二标记图像的特征点,获取所述相机拍摄的图像相对于所述第二标记图像的位姿参数;
    根据所述图像相对于所述第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取所述图像的位姿参数,根据所述位姿参数确定所述相机的位姿。
  2. 根据权利要求1所述的方法,其特征在于,所述通过追踪所述第二标记图像的特征点,获取所述相机拍摄的图像相对于所述第二标记图像的位姿参数,包括:
    从所述第二标记图像中提取多个特征点;
    通过在所述相机拍摄的至少一个图像中追踪所述多个特征点,得到每个图像相对于上一个图像的位姿参数;
    对于所述相机拍摄的第二图像,根据从所述第二标记图像至所述第二图像中的每个图像相对于上一个图像的位姿参数,确定所述第二图像相对于所述第二标记图像的位姿参数。
  3. 根据权利要求2所述的方法,其特征在于,所述确定所述第二图像相对于所述第二标记图像的位姿参数之后,所述方法还包括:
    根据所述多个特征点在所述第二标记图像中的三维坐标以及所述第二图像相对于所述第二标记图像的位姿参数,计算每个特征点在所述第二图像中的估计三维坐标;
    对所述每个特征点在所述第二图像中的估计三维坐标进行变换,得到所述每个特征点在所述第二图像中的估计二维坐标;
    当任一特征点在所述第二图像中的估计二维坐标与在所述第二图像中的实际二维坐标之间的距离大于预设距离时,将所述任一特征点删除。
  4. 根据权利要求1所述的方法,其特征在于,所述通过追踪第一标记图像的特征点,获取所述相机拍摄的图像的位姿参数,包括:
    根据所述第一标记图像相对于初始标记图像的位姿参数,以及所述图像相对于所述第一标记图像的位姿参数,采用以下公式,获取所述图像的位姿参数:
    Figure PCTCN2019079341-appb-100001
    R_final表示所述图像的旋转参数,T_final表示所述图像的位移参数;
    Rca表示所述图像相对于所述第一标记图像的旋转参数,Tca表示所述图像相对于所述第一标记图像的位移参数;
    R_old表示所述第一标记图像相对于所述初始标记图像的旋转参数,T_old表示所述第一标记图像相对于所述初始标记图像的位移参数。
  5. 根据权利要求1所述的方法,其特征在于,所述根据所述图像相对于所述第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取所述图像的位姿参数,包括:
    根据所述第二标记图像相对于所述第一标记图像的位姿参数,以及所述第一标记图像相对于初始标记图像的位姿参数,获取所述第二标记图像相对于所述初始标记图像的位姿参数;
    对于所述第一图像,根据所述第一图像相对于所述第二标记图像的位姿参数,以及所述第二标记图像相对于所述初始标记图像的位姿参数,采用以下公式获取所述第一图像的位姿参数:
    Figure PCTCN2019079341-appb-100002
    R_final表示所述第一图像的旋转参数,T_final表示所述第一图像的位移参数;
    Rcl表示所述第一图像相对于所述第二标记图像的旋转参数,Tcl表示所述第一图像相对于所述第二标记图像的位移参数;
    R_old表示所述第二标记图像相对于所述初始标记图像的旋转参数,T_old表示所述第二标记图像相对于所述初始标记图像的位移参数。
  6. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    获取所述第一图像中追踪到的特征点的数量;
    当所述数量未达到预设数量时,确定所述第一图像不满足所述特征点追踪条件。
  7. 根据权利要求1-6任一项所述的方法,其特征在于,所述方法还包括:
    对于所述第一标记图像或所述第二标记图像的任一特征点,获取所述特征点的二维坐标对应的齐次坐标,所述齐次坐标用于将所述二维坐标以三维形式表示;
    采用以下坐标转换关系,将所述齐次坐标转换为对应的三维坐标:
    Figure PCTCN2019079341-appb-100003
    其中,M表示所述三维坐标,m表示所述齐次坐标,s表示所述特征点所在的标记图像的深度,fx、fy、cx和cy表示所述相机的参数。
  8. 根据权利要求7所述的方法,其特征在于,所述方法还包括:
    根据所述第一标记图像的深度,以及所述第一标记图像的特征点在所述第二标记图像中的深度,采用以下公式,计算所述第二标记图像的深度:
    S n=d*S n-1
    其中,S n表示所述第二标记图像的深度,d表示所述第一标记图像的特征点在所述第二标记图像中的深度,S n-1表示所述第一标记图像的深度。
  9. 根据权利要求1所述的方法,其特征在于,所述通过追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数之前,所述方法还包括:
    如果未设置标记图像,则获取所述相机拍摄的图像;
    当从所述拍摄的图像中提取到的特征点数量达到预设数量时,将所述拍摄的图像设置为所述第一标记图像。
  10. 根据权利要求1-6任一项所述的方法,其特征在于,所述位姿参数包括 位移参数,所述方法还包括:
    通过惯性测量单元IMU,获取所述相机的多个旋转参数以及对应的时间戳,根据所述多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线;
    获取所述相机拍摄的图像的时间戳在所述旋转参数曲线中对应的旋转参数,作为所述相机拍摄的图像的旋转参数。
  11. 一种位姿确定装置,其特征在于,所述装置包括:
    第一获取模块,用于通过追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数;
    切换模块,用于当第一图像的上一个图像满足特征点追踪条件,而所述第一图像不满足特征点追踪条件时,将所述第一图像的上一个图像作为第二标记图像;
    第二获取模块,用于通过追踪所述第二标记图像的特征点,获取所述相机拍摄的图像相对于所述第二标记图像的位姿参数;
    所述第二获取模块,还用于根据所述图像相对于所述第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取所述图像的位姿参数,根据所述位姿参数确定所述相机的位姿。
  12. 根据权利要求11所述的装置,其特征在于,所述第二获取模块,包括:
    提取单元,用于从所述第二标记图像中提取多个特征点;
    追踪单元,用于通过在所述相机拍摄的至少一个图像中追踪所述多个特征点,得到每个图像相对于上一个图像的位姿参数;
    确定单元,用于对于所述相机拍摄的第二图像,根据从所述第二标记图像至所述第二图像中的每个图像相对于上一个图像的位姿参数,确定所述第二图像相对于所述第二标记图像的位姿参数。
  13. 根据权利要求11所述的装置,其特征在于,所述第二获取模块还用于:
    根据所述第二标记图像相对于所述第一标记图像的位姿参数,以及所述第一标记图像相对于初始标记图像的位姿参数,获取所述第二标记图像相对于所述初始标记图像的位姿参数;
    对于所述第一图像,根据所述第一图像相对于所述第二标记图像的位姿参数,以及所述第二标记图像相对于所述初始标记图像的位姿参数,采用以下公式获取所述第一图像的位姿参数:
    Figure PCTCN2019079341-appb-100004
    R_final表示所述第一图像的旋转参数,T_final表示所述第一图像的位移参数;
    Rcl表示所述第一图像相对于所述第二标记图像的旋转参数,Tcl表示所述第一图像相对于所述第二标记图像的位移参数;
    R_old表示所述第二标记图像相对于所述初始标记图像的旋转参数,T_old表示所述第二标记图像相对于所述初始标记图像的位移参数。
  14. 一种智能设备,其特征在于,所述智能设备包括:处理器和存储器,所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    通过追踪第一标记图像的特征点,获取相机拍摄的图像的位姿参数;
    当第一图像的上一个图像满足特征点追踪条件,而所述第一图像不满足特征点追踪条件时,将所述第一图像的上一个图像作为第二标记图像;
    通过追踪所述第二标记图像的特征点,获取所述相机拍摄的图像相对于所述第二标记图像的位姿参数;
    根据所述图像相对于所述第二标记图像的位姿参数,以及每一个标记图像相对于上一个标记图像的位姿参数,获取所述图像的位姿参数,根据所述位姿参数确定所述相机的位姿。
  15. 根据权利要求14所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    从所述第二标记图像中提取多个特征点;
    通过在所述相机拍摄的至少一个图像中追踪所述多个特征点,得到每个图像相对于上一个图像的位姿参数;
    对于所述相机拍摄的第二图像,根据从所述第二标记图像至所述第二图像 中的每个图像相对于上一个图像的位姿参数,确定所述第二图像相对于所述第二标记图像的位姿参数。
  16. 根据权利要求15所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    根据所述多个特征点在所述第二标记图像中的三维坐标以及所述第二图像相对于所述第二标记图像的位姿参数,计算每个特征点在所述第二图像中的估计三维坐标;
    对所述每个特征点在所述第二图像中的估计三维坐标进行变换,得到所述每个特征点在所述第二图像中的估计二维坐标;
    当任一特征点在所述第二图像中的估计二维坐标与在所述第二图像中的实际二维坐标之间的距离大于预设距离时,将所述任一特征点删除。
  17. 根据权利要求14所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    根据所述第一标记图像相对于初始标记图像的位姿参数,以及所述图像相对于所述第一标记图像的位姿参数,采用以下公式,获取所述图像的位姿参数:
    Figure PCTCN2019079341-appb-100005
    R_final表示所述图像的旋转参数,T_final表示所述图像的位移参数;
    Rca表示所述图像相对于所述第一标记图像的旋转参数,Tca表示所述图像相对于所述第一标记图像的位移参数;
    R_old表示所述第一标记图像相对于所述初始标记图像的旋转参数,T_old表示所述第一标记图像相对于所述初始标记图像的位移参数。
  18. 根据权利要求14所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    根据所述第二标记图像相对于所述第一标记图像的位姿参数,以及所述第一标记图像相对于初始标记图像的位姿参数,获取所述第二标记图像相对于所述初始标记图像的位姿参数;
    对于所述第一图像,根据所述第一图像相对于所述第二标记图像的位姿参 数,以及所述第二标记图像相对于所述初始标记图像的位姿参数,采用以下公式获取所述第一图像的位姿参数:
    Figure PCTCN2019079341-appb-100006
    R_final表示所述第一图像的旋转参数,T_final表示所述第一图像的位移参数;
    Rcl表示所述第一图像相对于所述第二标记图像的旋转参数,Tcl表示所述第一图像相对于所述第二标记图像的位移参数;
    R_old表示所述第二标记图像相对于所述初始标记图像的旋转参数,T_old表示所述第二标记图像相对于所述初始标记图像的位移参数。
  19. 根据权利要求14所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    获取所述第一图像中追踪到的特征点的数量;
    当所述数量未达到预设数量时,确定所述第一图像不满足所述特征点追踪条件。
  20. 根据权利要求14-19任一项所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    对于所述第一标记图像或所述第二标记图像的任一特征点,获取所述特征点的二维坐标对应的齐次坐标,所述齐次坐标用于将所述二维坐标以三维形式表示;
    采用以下坐标转换关系,将所述齐次坐标转换为对应的三维坐标:
    其中,M表示所述三维坐标,m表示所述齐次坐标,s表示所述特征点所在的标记图像的深度,fx、fy、cx和cy表示所述相机的参数。
  21. 根据权利要求20所述的智能设备,其特征在于,所述指令、所述程序、 所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    根据所述第一标记图像的深度,以及所述第一标记图像的特征点在所述第二标记图像中的深度,采用以下公式,计算所述第二标记图像的深度:
    S n=d*S n-1
    其中,S n表示所述第二标记图像的深度,d表示所述第一标记图像的特征点在所述第二标记图像中的深度,S n-1表示所述第一标记图像的深度。
  22. 根据权利要求14所述的智能设备,其特征在于,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    如果未设置标记图像,则获取所述相机拍摄的图像;
    当从所述拍摄的图像中提取到的特征点数量达到预设数量时,将所述拍摄的图像设置为所述第一标记图像。
  23. 根据权利要求14-19任一项所述的智能设备,其特征在于,所述位姿参数包括位移参数,所述指令、所述程序、所述代码集或所述指令集由所述处理器加载并执行以实现如下操作:
    通过惯性测量单元IMU,获取所述相机的多个旋转参数以及对应的时间戳,根据所述多个旋转参数以及对应的时间戳进行插值得到旋转参数曲线;
    获取所述相机拍摄的图像的时间戳在所述旋转参数曲线中对应的旋转参数,作为所述相机拍摄的图像的旋转参数。
  24. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,所述指令、所述程序、所述代码集或所述指令集由处理器加载并具有以实现如权利要求1至10任一权利要求所述的位姿确定方法中所具有的操作。
PCT/CN2019/079341 2018-04-27 2019-03-22 位姿确定方法、装置、智能设备及存储介质 WO2019205850A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19792476.4A EP3786893A4 (en) 2018-04-27 2019-03-22 METHOD AND DEVICE FOR DETERMINING POSE, INTELLIGENT DEVICE AND INFORMATION MEDIA
US16/917,069 US11158083B2 (en) 2018-04-27 2020-06-30 Position and attitude determining method and apparatus, smart device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810392212.7 2018-04-27
CN201810392212.7A CN108537845B (zh) 2018-04-27 2018-04-27 位姿确定方法、装置及存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/917,069 Continuation US11158083B2 (en) 2018-04-27 2020-06-30 Position and attitude determining method and apparatus, smart device, and storage medium

Publications (1)

Publication Number Publication Date
WO2019205850A1 true WO2019205850A1 (zh) 2019-10-31

Family

ID=63479506

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/079341 WO2019205850A1 (zh) 2018-04-27 2019-03-22 位姿确定方法、装置、智能设备及存储介质

Country Status (4)

Country Link
US (1) US11158083B2 (zh)
EP (1) EP3786893A4 (zh)
CN (2) CN110555882B (zh)
WO (1) WO2019205850A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115937305A (zh) * 2022-06-28 2023-04-07 北京字跳网络技术有限公司 图像处理方法、装置及电子设备

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108876854B (zh) 2018-04-27 2022-03-08 腾讯科技(深圳)有限公司 相机姿态追踪过程的重定位方法、装置、设备及存储介质
CN110555882B (zh) * 2018-04-27 2022-11-15 腾讯科技(深圳)有限公司 界面显示方法、装置及存储介质
CN110544280B (zh) 2018-05-22 2021-10-08 腾讯科技(深圳)有限公司 Ar系统及方法
CN109359547B (zh) * 2018-09-19 2024-04-12 上海掌门科技有限公司 一种用于记录用户的静坐过程的方法与设备
CN109685839B (zh) * 2018-12-20 2023-04-18 广州华多网络科技有限公司 图像对齐方法、移动终端以及计算机存储介质
CN111784769B (zh) * 2019-04-04 2023-07-04 舜宇光学(浙江)研究院有限公司 基于模板的空间定位方法、空间定位装置,电子设备及计算机可读存储介质
CN110310326B (zh) * 2019-06-28 2021-07-02 北京百度网讯科技有限公司 一种视觉定位数据处理方法、装置、终端及计算机可读存储介质
CN110487274B (zh) * 2019-07-30 2021-01-29 中国科学院空间应用工程与技术中心 用于弱纹理场景的slam方法、系统、导航车及存储介质
CN112734797A (zh) * 2019-10-29 2021-04-30 浙江商汤科技开发有限公司 图像特征跟踪方法、装置及电子设备
CN116797971A (zh) * 2019-12-31 2023-09-22 支付宝实验室(新加坡)有限公司 一种视频流识别方法及装置
CN113313966A (zh) * 2020-02-27 2021-08-27 华为技术有限公司 一种位姿确定方法以及相关设备
CN111292420B (zh) * 2020-02-28 2023-04-28 北京百度网讯科技有限公司 用于构建地图的方法和装置
CN113382156A (zh) * 2020-03-10 2021-09-10 华为技术有限公司 获取位姿的方法及装置
CN112333491B (zh) * 2020-09-23 2022-11-01 字节跳动有限公司 视频处理方法、显示装置和存储介质
CN112689221B (zh) * 2020-12-18 2023-05-30 Oppo广东移动通信有限公司 录音方法、录音装置、电子设备及计算机可读存储介质
CN112907662B (zh) * 2021-01-28 2022-11-04 北京三快在线科技有限公司 特征提取方法、装置、电子设备及存储介质
CN113409444B (zh) * 2021-05-21 2023-07-11 北京达佳互联信息技术有限公司 三维重建方法、装置、电子设备及存储介质
CN113223185B (zh) * 2021-05-26 2023-09-05 北京奇艺世纪科技有限公司 一种图像处理方法、装置、电子设备及存储介质
CN113436349B (zh) * 2021-06-28 2023-05-16 展讯通信(天津)有限公司 一种3d背景替换方法、装置、存储介质和终端设备
US20230049084A1 (en) * 2021-07-30 2023-02-16 Gopro, Inc. System and method for calibrating a time difference between an image processor and an intertial measurement unit based on inter-frame point correspondence
CN113689484B (zh) * 2021-08-25 2022-07-15 北京三快在线科技有限公司 深度信息的确定方法、装置、终端及存储介质
CN114399532A (zh) * 2022-01-06 2022-04-26 广东汇天航空航天科技有限公司 一种相机位姿确定方法和装置
CN115278184B (zh) * 2022-07-18 2024-03-15 峰米(重庆)创新科技有限公司 投影画面校正方法及装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105931275A (zh) * 2016-05-23 2016-09-07 北京暴风魔镜科技有限公司 基于移动端单目和imu融合的稳定运动跟踪方法和装置
WO2017027338A1 (en) * 2015-08-07 2017-02-16 Pcms Holdings, Inc. Apparatus and method for supporting interactive augmented reality functionalities
EP3264372A1 (en) * 2016-06-30 2018-01-03 Alcatel Lucent Image processing device and method
CN108537845A (zh) * 2018-04-27 2018-09-14 腾讯科技(深圳)有限公司 位姿确定方法、装置及存储介质

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPWO2005010817A1 (ja) * 2003-07-24 2006-09-14 オリンパス株式会社 画像処理装置
JP4926817B2 (ja) * 2006-08-11 2012-05-09 キヤノン株式会社 指標配置情報計測装置および方法
JP5538667B2 (ja) * 2007-04-26 2014-07-02 キヤノン株式会社 位置姿勢計測装置及びその制御方法
NO327279B1 (no) 2007-05-22 2009-06-02 Metaio Gmbh Kamerapositurestimeringsanordning og- fremgangsmate for foroket virkelighetsavbildning
CN102819845A (zh) * 2011-06-07 2012-12-12 中兴通讯股份有限公司 一种混合特征的跟踪方法和装置
KR102209008B1 (ko) * 2014-02-17 2021-01-28 삼성전자주식회사 카메라 포즈 추정 장치 및 카메라 포즈 추정 방법
CN104915965A (zh) * 2014-03-14 2015-09-16 华为技术有限公司 一种摄像机跟踪方法及装置
CN104050475A (zh) 2014-06-19 2014-09-17 樊晓东 基于图像特征匹配的增强现实的系统和方法
CN105184822B (zh) * 2015-09-29 2017-12-29 中国兵器工业计算机应用技术研究所 一种目标跟踪模板更新方法
JP2017130042A (ja) * 2016-01-20 2017-07-27 株式会社リコー 映像処理装置、映像処理方法、及びプログラム
CN106843456B (zh) * 2016-08-16 2018-06-29 深圳超多维光电子有限公司 一种基于姿态追踪的显示方法、装置和虚拟现实设备
CN106920259B (zh) * 2017-02-28 2019-12-06 武汉工程大学 一种定位方法及系统
CN107590453B (zh) * 2017-09-04 2019-01-11 腾讯科技(深圳)有限公司 增强现实场景的处理方法、装置及设备、计算机存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017027338A1 (en) * 2015-08-07 2017-02-16 Pcms Holdings, Inc. Apparatus and method for supporting interactive augmented reality functionalities
CN105931275A (zh) * 2016-05-23 2016-09-07 北京暴风魔镜科技有限公司 基于移动端单目和imu融合的稳定运动跟踪方法和装置
EP3264372A1 (en) * 2016-06-30 2018-01-03 Alcatel Lucent Image processing device and method
CN108537845A (zh) * 2018-04-27 2018-09-14 腾讯科技(深圳)有限公司 位姿确定方法、装置及存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3786893A4 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115937305A (zh) * 2022-06-28 2023-04-07 北京字跳网络技术有限公司 图像处理方法、装置及电子设备

Also Published As

Publication number Publication date
EP3786893A1 (en) 2021-03-03
EP3786893A4 (en) 2022-01-19
CN108537845B (zh) 2023-01-03
US11158083B2 (en) 2021-10-26
CN110555882A (zh) 2019-12-10
US20200334854A1 (en) 2020-10-22
CN110555882B (zh) 2022-11-15
CN108537845A (zh) 2018-09-14

Similar Documents

Publication Publication Date Title
WO2019205850A1 (zh) 位姿确定方法、装置、智能设备及存储介质
WO2019205851A1 (zh) 位姿确定方法、装置、智能设备及存储介质
CN108682038B (zh) 位姿确定方法、装置及存储介质
US11205282B2 (en) Relocalization method and apparatus in camera pose tracking process and storage medium
CN108734736B (zh) 相机姿态追踪方法、装置、设备及存储介质
WO2019205853A1 (zh) 相机姿态追踪过程的重定位方法、装置、设备及存储介质
CN109947886B (zh) 图像处理方法、装置、电子设备及存储介质
US11276183B2 (en) Relocalization method and apparatus in camera pose tracking process, device, and storage medium
CN110148178B (zh) 相机定位方法、装置、终端及存储介质
WO2019154231A1 (zh) 图像处理方法、电子设备及存储介质
CN109886208B (zh) 物体检测的方法、装置、计算机设备及存储介质
CN111897429A (zh) 图像显示方法、装置、计算机设备及存储介质
CN111862148A (zh) 实现视觉跟踪的方法、装置、电子设备及介质
CN113160031B (zh) 图像处理方法、装置、电子设备及存储介质
WO2019134305A1 (zh) 确定姿态的方法、装置、智能设备、存储介质和程序产品
CN111928861B (zh) 地图构建方法及装置
CN114093020A (zh) 动作捕捉方法、装置、电子设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19792476

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2019792476

Country of ref document: EP