WO2024087917A1 - Pose determination method and apparatus, computer readable storage medium, and electronic device - Google Patents

Pose determination method and apparatus, computer readable storage medium, and electronic device

Info

Publication number
WO2024087917A1
Authority
WO
WIPO (PCT)
Prior art keywords
camera
color image
feature points
coordinate system
dimensional feature
Prior art date
Application number
PCT/CN2023/118181
Other languages
French (fr)
Chinese (zh)
Inventor
尹赫
Original Assignee
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2024087917A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks

Definitions

  • the present disclosure relates to the field of computer vision technology, and in particular to a posture determination method, a posture determination device, a computer-readable storage medium, and an electronic device.
  • visual positioning is a technology that uses images taken by a camera to determine the camera's position in the real world. It has important application value in augmented reality, virtual reality, robotics, intelligent transportation and other fields.
  • the present disclosure provides a posture determination method, a posture determination device, a computer-readable storage medium, and an electronic device.
  • a posture determination method is provided, which is applied to a terminal device, wherein the terminal device is configured with a first camera and at least one second camera, and the posture determination method includes: obtaining a current frame color image captured by the first camera, and determining first two-dimensional feature points on the current frame color image captured by the first camera that match a previous frame color image captured by the first camera; obtaining a current frame color image captured by the second camera, and determining second two-dimensional feature points on the current frame color image captured by the second camera that match a previous frame color image captured by the second camera; converting the second two-dimensional feature points into third two-dimensional feature points in the first camera coordinate system by using a conversion matrix between a first camera coordinate system of the first camera and a second camera coordinate system of the second camera; and determining the posture of the first camera when capturing the current frame color image according to the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points of the previous frame color image captured by the first camera in the
  • a posture determination device which is configured in a terminal device, and the terminal device is further configured with a first camera and at least one second camera.
  • the posture determination device includes: a first feature point determination module, which is used to obtain a current frame color image captured by the first camera, and determine a first two-dimensional feature point on the current frame color image captured by the first camera that matches a previous frame color image captured by the first camera; a second feature point determination module, which is used to obtain a current frame color image captured by the second camera, and determine a second two-dimensional feature point on the current frame color image captured by the second camera that matches a previous frame color image captured by the second camera; a feature point conversion module, which is used to convert the second two-dimensional feature point into a third two-dimensional feature point in the first camera coordinate system by using a conversion matrix between a first camera coordinate system of the first camera and a second camera coordinate system of the second camera; and a posture determination module, which is used to determine the posture of the first camera when
  • a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the above-mentioned posture determination method is implemented.
  • an electronic device comprising a processor and a memory for storing one or more programs; when the one or more programs are executed by the processor, the processor implements the above-mentioned posture determination method.
  • FIG. 1 is a schematic diagram showing a system architecture of a posture determination system according to an embodiment of the present disclosure.
  • FIG. 2 is a schematic diagram showing a placement of dual cameras on a terminal device according to an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram showing the placement angles of the dual cameras according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram showing various processing stages involved in the posture determination solution of an embodiment of the present disclosure.
  • FIG. 5 schematically shows a flow chart of a method for determining a posture according to an exemplary embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram showing dual-camera point pair matching according to an embodiment of the present disclosure.
  • FIG. 7 shows a flowchart of a positioning initialization process according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram showing a method of determining two planes according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic diagram showing a method of determining a ground plane according to an embodiment of the present disclosure.
  • FIG. 10 is a flowchart showing a process of determining a transformation matrix between a first camera coordinate system and a world coordinate system according to an embodiment of the present disclosure.
  • FIG. 11 schematically shows a block diagram of a posture determination apparatus according to a first exemplary embodiment of the present disclosure.
  • FIG. 12 schematically shows a block diagram of a posture determination apparatus according to a second exemplary embodiment of the present disclosure.
  • FIG. 13 schematically shows a block diagram of a posture determination apparatus according to a third exemplary embodiment of the present disclosure.
  • FIG. 14 schematically shows a block diagram of a posture determination apparatus according to a fourth exemplary embodiment of the present disclosure.
  • FIG. 15 schematically shows a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
  • computer devices can autonomously perceive their own position in the environment, so as to perform any tasks proposed by the user, such as tracking, monitoring, interaction, displaying images, playing audio, etc.
  • the accuracy of positioning greatly affects the realization of computer device functions.
  • the embodiments of the present disclosure provide a new positioning solution.
  • the present application provides a posture determination method, which is applied to a terminal device, wherein the terminal device is configured with a first camera and at least one second camera, and the posture determination method includes:
  • the posture of the first camera when capturing the current frame of color image is determined based on the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points of the previous frame of color image captured by the first camera in the world coordinate system, and the three-dimensional feature points of the previous frame of color image captured by the second camera in the world coordinate system.
  • determining a first two-dimensional feature point on a current frame color image acquired by the first camera that matches a previous frame color image acquired by the first camera includes:
  • Optical flow tracking is performed using feature points of a current frame color image captured by the first camera and feature points of a previous frame color image captured by the first camera to determine the first two-dimensional feature points.
  • using a transformation matrix between a first camera coordinate system of the first camera and a second camera coordinate system of the second camera to transform the second two-dimensional feature points into third two-dimensional feature points in the first camera coordinate system includes:
  • the third two-dimensional feature point is determined according to the conversion matrix, the depth information of the second two-dimensional feature point, and the second two-dimensional feature point.
  • determining the third two-dimensional feature point according to the conversion matrix, the depth information of the second two-dimensional feature point, and the second two-dimensional feature point includes:
  • the conversion matrix, the depth information of the second two-dimensional feature point, and the second two-dimensional feature point are multiplied, and a result of the multiplication is normalized to determine the third two-dimensional feature point.
  • the first two-dimensional feature points and the third two-dimensional feature points constitute two-dimensional coordinate information
  • the three-dimensional feature points of the previous frame color image acquired by the first camera in the world coordinate system and the three-dimensional feature points of the previous frame color image acquired by the second camera in the world coordinate system constitute three-dimensional coordinate information
  • determining the position and posture of the first camera when acquiring the current frame color image includes:
  • the point pair information is used to solve the perspective n-point problem, and the pose of the first camera when capturing the current frame color image is determined in combination with the solution result.
  • the posture determination method further includes:
  • the three-dimensional feature points in the first camera coordinate system are transformed to obtain the three-dimensional feature points of the previous frame of color image captured by the first camera in the world coordinate system.
  • spatially projecting feature points of the previous frame of color image acquired by the first camera to obtain three-dimensional feature points of the previous frame of color image acquired by the first camera in the first camera coordinate system includes:
  • the predetermined depth range is determined based on a range of depth measurement.
  • the posture determination method further includes:
  • the three-dimensional feature points of the previous frame of color image acquired by the second camera in the second camera coordinate system are transformed into three-dimensional feature points in the first camera coordinate system;
  • the three-dimensional feature points in the first camera coordinate system are transformed to obtain the three-dimensional feature points of the previous frame of color image captured by the second camera in the world coordinate system.
  • Performing spatial projection on feature points of the previous frame of color image acquired by the second camera to obtain three-dimensional feature points of the previous frame of color image acquired by the second camera in the second camera coordinate system includes:
  • the predetermined depth range is determined based on a range of depth measurement.
  • the posture determination method further includes:
  • the initial positioning result of the first camera in the first camera coordinate system is transformed by using the transformation matrix between the first camera coordinate system and the world coordinate system, so as to determine the position and posture of the first camera when capturing the initial frame color image.
  • the posture determination method further includes:
  • a transformation matrix between the first camera coordinate system and the world coordinate system is determined according to a normal vector and a gravity vector of the designated plane.
  • the posture determination method further includes:
  • the designated plane is selected according to the plane information of the reference point cloud.
  • determining a reference point cloud corresponding to the first camera in combination with a reference depth image output by the first camera includes:
  • a reference point cloud corresponding to the first camera is constructed by combining the three-dimensional space point of each pixel point on the reference depth image output by the first camera.
  • combining the three-dimensional space point of each pixel point on the reference depth image output by the first camera to construct a reference point cloud corresponding to the first camera includes:
  • according to the conversion matrix between the first camera coordinate system and the second camera coordinate system, the three-dimensional space point of each pixel on the reference depth image output by the second camera is converted to obtain a converted three-dimensional space point;
  • the three-dimensional space point of each pixel point on the reference depth image output by the first camera is merged with the converted three-dimensional space point to construct a reference point cloud corresponding to the first camera.
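The point cloud construction described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes a pinhole camera model with intrinsics `fx, fy, cx, cy`, a depth value of 0 marking an invalid pixel, and a 4x4 matrix `T_c1_c2` that maps second-camera coordinates into first-camera coordinates. All function and parameter names are hypothetical.

```python
def back_project(depth, fx, fy, cx, cy):
    """Turn a depth image (depth[v][u], 0 = invalid) into 3D points in the
    coordinate system of the camera that produced it (pinhole model assumed)."""
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d > 0:
                points.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
    return points


def transform_points(points, T):
    """Apply a 4x4 rigid transform T to a list of 3D points."""
    return [
        tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))
        for x, y, z in points
    ]


def merged_point_cloud(depth_1, depth_2, intr_1, intr_2, T_c1_c2):
    """Reference point cloud in first-camera coordinates built from both depth images."""
    cloud_1 = back_project(depth_1, *intr_1)
    cloud_2 = transform_points(back_project(depth_2, *intr_2), T_c1_c2)
    return cloud_1 + cloud_2
```

Merging in the first camera coordinate system keeps a single reference frame for the later plane detection.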
  • screening the designated plane according to the plane information of the reference point cloud includes:
  • the designated plane is filtered according to distance information between the plane and the first camera included in the plane information of the reference point cloud.
  • screening the designated plane according to the distance information between the plane and the first camera included in the plane information of the reference point cloud includes:
  • the distance information includes a distance within a predetermined distance range
  • the distance threshold is within the predetermined distance range.
  • the designated plane is a ground plane.
  • FIG. 1 is a schematic diagram showing a system architecture of a position and posture determination system according to an embodiment of the present disclosure.
  • a terminal device 1 may include a processor 100, a first camera 110, and at least one second camera 120.
  • the terminal device 1 may include, for example, a robot, an intelligent monitoring device, an intelligent tracking device, etc. It may be a whole device, or a device system composed of multiple entity units.
  • the terminal device 1 may be a robot dog.
  • a robot dog is a robot form with advantages such as flexibility and strong mobility, and can perform tasks such as security patrol, transporting items, and emotional companionship.
  • the first camera 110 and the at least one second camera 120 serve as input sensors of the posture determination solution of the embodiment of the present disclosure, and can transmit the sensed color image and depth image to the processor 100.
  • the first camera 110 and the second camera 120 may be Realsense D455 cameras.
  • the Realsense D455 camera consists of an RGB camera, two IR (infrared) cameras, and an IR transmitter.
  • the RGB camera outputs a color image
  • the two IR cameras may output a dense depth map aligned with the color image.
  • the FOV (field of view) of the Realsense D455 camera is 90° horizontally and 65° vertically.
  • the terminal device 1 includes a first camera 110 and a second camera 120
  • the first camera 110 may be a left camera
  • the second camera 120 may be a right camera.
  • the left camera involved may be understood as the first camera 110
  • the right camera involved may be understood as the second camera 120.
  • “left”, “right”, “first”, and “second” are merely exemplary descriptions for distinction.
  • the first camera 110 may be a right camera
  • the second camera 120 may be a left camera, and the present disclosure does not limit this.
  • FIG. 2 shows a schematic diagram of the placement of the dual cameras on a terminal device according to an embodiment of the present disclosure. It should be understood that the placement shown in FIG. 2 is only an example; other placements are possible depending on the type of terminal device and the space available for the cameras, and the present disclosure does not limit this.
  • FIG. 3 shows a schematic diagram of the placement angles of the dual cameras according to an embodiment of the present disclosure.
  • the first camera 110 and the second camera 120 are both placed vertically, and their viewing angles are both 65°, corresponding to angles A and B in FIG. 3, respectively.
  • the leftmost line of sight of the first camera 110 can be parallel to the rightmost line of sight of the second camera 120.
  • the two cameras can obtain the maximum field of view, that is, 130°, corresponding to angle C in FIG. 3.
  • the first camera 110 and the second camera 120 are placed vertically side by side at an angle of 115°, and the combined field of view of the two cameras is 130° in the horizontal direction and 90° in the vertical direction. This maximizes the combined field of view of the two cameras, effectively widens the field of view of the terminal device 1, and helps improve the accuracy of the subsequent positioning algorithm.
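The 130° and 115° figures can be reproduced with simple angle arithmetic. The sketch below rests on one possible reading of FIG. 3, which is an assumption rather than something the text states: the two parallel lines of sight are the adjacent edge rays of the two fields of view, so the fields just touch and the optical axes diverge by 65°. The function name is hypothetical.

```python
def combined_fov_deg(fov_deg: float, axis_angle_deg: float) -> float:
    """Total angular coverage of two identical cameras whose optical axes
    diverge by axis_angle_deg, measured in the plane of that divergence."""
    if axis_angle_deg <= fov_deg:
        # The fields overlap or just touch: one contiguous span.
        return fov_deg + axis_angle_deg
    # The fields no longer touch: two separate spans with a gap between them.
    return 2.0 * fov_deg


fov = 65.0          # per-camera viewing angle (angles A and B)
axis_angle = 65.0   # assumed divergence of the optical axes (adjacent edge rays parallel)

print(combined_fov_deg(fov, axis_angle))  # 130.0, matching angle C
print(180.0 - axis_angle)                 # 115.0, one reading of the stated mounting angle
```

Under this reading, a smaller divergence would create overlap and shrink the coverage, while a larger one would leave a blind gap between the two fields.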
  • the first camera 110 and the second camera 120 support multi-camera hardware synchronization.
  • the first camera 110 and the second camera 120 can be connected by a wire, and the same pulse signal is used to trigger the two cameras to expose simultaneously, thereby realizing hardware synchronization of multiple cameras.
  • the image input into the subsequent positioning algorithm is the image taken at the same time. In this way, additional errors caused by inconsistent shooting time of multiple cameras are avoided.
  • the internal and external parameters of the two cameras can be calibrated respectively for use by subsequent algorithms.
  • the present disclosure does not limit the calibration process.
  • the processor 100 can obtain the current frame color image captured by the first camera 110, and determine the first two-dimensional feature points on the current frame color image captured by the first camera 110 that match the previous frame color image captured by the first camera 110.
  • the processor 100 may acquire a current frame color image captured by the second camera 120, and determine a second two-dimensional feature point on the current frame color image captured by the second camera 120 that matches a previous frame color image captured by the second camera 120.
  • the processor 100 converts the second two-dimensional feature point into a third two-dimensional feature point in the first camera coordinate system using a conversion matrix between the first camera coordinate system of the first camera 110 and the second camera coordinate system of the second camera 120.
  • the processor 100 can determine the posture of the first camera 110 when capturing the current frame color image based on the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points of the previous frame color image captured by the first camera 110 in the world coordinate system, and the three-dimensional feature points of the previous frame color image captured by the second camera 120 in the world coordinate system.
  • feature point data of each second camera 120 may be mapped to the first camera coordinate system of the first camera 110 for processing.
  • the placement positions of the first camera 110 and the second camera 120 on the terminal device 1 are fixed, and when the current posture of the first camera 110 is determined, the current posture of the second camera 120 and the current posture of the terminal device 1 can be obtained.
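Because the cameras are rigidly mounted, the pose of the second camera follows from the pose of the first by a single matrix product. The sketch below assumes 4x4 homogeneous matrices, with `T_w_c1` mapping first-camera coordinates to world coordinates and `T_c1_c2` mapping second-camera coordinates to first-camera coordinates. These names and conventions are illustrative assumptions, not taken from the disclosure.

```python
def matmul4(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def second_camera_pose(T_w_c1, T_c1_c2):
    """World pose of the second camera from the first camera's world pose
    and the fixed extrinsic calibration between the two cameras."""
    return matmul4(T_w_c1, T_c1_c2)
```

The pose of the terminal device itself could be obtained the same way, given a fixed transform between the device body and the first camera.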
  • any one of the cameras may be determined as the first camera 110 in algorithm implementation, and the remaining cameras may be determined as the second camera 120.
  • the feature points collected by the second camera 120 are converted to the first camera coordinate system to perform posture calculation together with the feature points collected by the first camera 110. Since the feature points come from at least two cameras and the coordinate systems are unified, more feature points are collected, that is, the feature points involved in the unified processing are more comprehensive. The determined position and posture are more accurate, which improves the accuracy of positioning. In addition, the position and posture determination process of the present disclosure takes into account the correlation between frames, combines the feature information of the previous frame image, and uses the data of the previous frame for constraints, which further improves the accuracy of positioning.
  • the processing stages involved include but are not limited to a coordinate system alignment stage, a positioning initialization stage, and a real-time positioning stage.
  • the terminal device determines the transformation matrix between the first camera coordinate system and the world coordinate system.
  • the terminal device can construct a point cloud using the depth image output by the first camera and the depth image output by the second camera, wherein the three-dimensional space points corresponding to the two depth images can be merged to obtain a point cloud of three-dimensional feature points.
  • the terminal device uses a plane detection algorithm to extract plane information from the point cloud, and selects a specified plane (such as the ground plane) based on the extracted plane information.
  • the terminal device may calculate a transformation matrix according to the normal vector and the gravity vector of the specified plane to align the first camera coordinate system with the world coordinate system.
  • the transformation matrix between the first camera coordinate system and the second camera coordinate system can be obtained.
  • the transformation matrix between the second camera coordinate system and the world coordinate system can also be obtained to achieve alignment among the first camera coordinate system, the second camera coordinate system, and the world coordinate system.
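The text does not give the formula used to derive the alignment from the plane normal and the gravity vector. One common way to do it, shown here purely as an assumption, is to compute the rotation that turns the plane normal (expressed in first-camera coordinates) onto the vertical axis of the world using Rodrigues' formula. All names are hypothetical.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def apply3(R, v):
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]


def rotation_aligning(src, dst):
    """3x3 rotation R such that R applied to unit(src) gives unit(dst)."""
    a, b = normalize(src), normalize(dst)
    c = sum(x * y for x, y in zip(a, b))      # cosine of the angle between a and b
    if c < -1.0 + 1e-9:
        # Opposite vectors: rotate 180 degrees about any axis perpendicular to a.
        helper = [1.0, 0.0, 0.0] if abs(a[0]) < 0.9 else [0.0, 1.0, 0.0]
        k = normalize(cross(a, helper))
        return [[2.0 * k[i] * k[j] - (1.0 if i == j else 0.0) for j in range(3)] for i in range(3)]
    v = cross(a, b)
    K = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]   # skew matrix of v
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    f = 1.0 / (1.0 + c)
    return [[(1.0 if i == j else 0.0) + K[i][j] + f * K2[i][j] for j in range(3)] for i in range(3)]
```

A rotation obtained this way fixes only the vertical direction; the heading about the vertical axis remains a free choice that an implementation would have to settle separately.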
  • the terminal device can determine the position and posture of the first camera when initially capturing a color image. It should be understood that the position and posture of the camera when capturing an image in the present disclosure refers to the position and posture in the world coordinate system.
  • the terminal device can determine the three-dimensional feature points corresponding to the initial frame color image captured by the first camera, and the three-dimensional feature points are feature points in the first camera coordinate system.
  • the initial rotation matrix and the initial translation vector can be set.
  • the initial rotation matrix is the identity matrix
  • the initial translation vector is [0,0,0].
  • the positioning initialization in the first camera coordinate system is completed.
  • the positioning initialization result in the first camera coordinate system can be converted into the positioning initialization result in the world coordinate system, that is, the position and posture of the first camera when capturing the initial frame color image is determined.
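The initialization step can be sketched as below. With the initial rotation set to the identity and the initial translation set to [0,0,0], converting the result into the world coordinate system reduces to the alignment transform itself. `T_w_c1` (first-camera coordinates to world coordinates) and the function names are assumptions made for this illustration.

```python
def identity4():
    """4x4 homogeneous matrix for R = identity, t = [0, 0, 0]."""
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul4(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def initial_world_pose(T_w_c1):
    """Pose of the first camera in the world frame when capturing the initial frame."""
    T_init = identity4()             # positioning initialization in first-camera coordinates
    return matmul4(T_w_c1, T_init)   # converted into the world coordinate system
```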
  • the terminal device can combine the initial pose determined in the positioning initialization stage to obtain the pose of the current frame in real time.
  • the features of the second camera can be transferred to the coordinate system of the first camera, and the pose can be solved in conjunction with the features of the first camera to complete the pose prediction of the current frame.
  • FIG. 5 schematically shows a flow chart of a method for determining a posture according to an exemplary embodiment of the present disclosure.
  • the method for determining a posture may include the following steps:
  • the current frame color image is the color image captured by the camera at the current moment
  • the previous frame color image is the color image captured by the camera at the preceding frame; the present disclosure imposes no restriction on this.
  • the terminal device may extract feature points of the current color image captured by the first camera.
  • the feature extraction algorithm used in the exemplary embodiments of the present disclosure may include but is not limited to the FAST feature point detection algorithm, the DOG feature point detection algorithm, the Harris feature point detection algorithm, the SIFT feature point detection algorithm, the SURF feature point detection algorithm, etc.
  • the feature descriptor may include but is not limited to the BRIEF feature point descriptor, the BRISK feature point descriptor, the FREAK feature point descriptor, etc.
  • the combination of the feature extraction algorithm and the feature descriptor may be a FAST feature point detection algorithm and a BRIEF feature point descriptor. According to other embodiments of the present disclosure, the combination of the feature extraction algorithm and the feature descriptor may be a DOG feature point detection algorithm and a FREAK feature point descriptor.
  • the FAST feature point detection algorithm and the BRIEF feature point descriptor can be used for feature extraction; for weak texture scenes, the DOG feature point detection algorithm and the FREAK feature point descriptor can be used for feature extraction.
  • the terminal device can use the feature points of the current color image frame captured by the first camera and the feature points of the previous color image frame captured by the first camera to determine the two-dimensional feature points that match between the two images, that is, the first two-dimensional feature points mentioned in the present disclosure.
  • the optical flow method can be used to determine the matching relationship of the feature points, that is, the feature points of the current frame color image captured by the first camera and the feature points of the previous frame color image captured by the first camera are used for optical flow tracking to determine the first two-dimensional feature points.
  • other image matching methods can also be used to determine 2D-2D feature point pairs, which is not limited in the present disclosure.
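The text does not name a specific optical flow algorithm. As an illustration only, the sketch below performs a single Lucas-Kanade step for one feature point on grayscale images stored as nested lists. A practical tracker would add image pyramids and iteration; the function name and the window size are hypothetical.

```python
def lucas_kanade_step(prev, curr, cx, cy, half=2):
    """Estimate the displacement (u, v) of the patch centred at (cx, cy),
    so that the feature at (cx, cy) in prev appears near (cx + u, cy + v) in curr."""
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0   # horizontal gradient
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0   # vertical gradient
            it = curr[y][x] - prev[y][x]                   # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None   # not enough texture in the window to solve for motion
    # Solve [[sxx, sxy], [sxy, syy]] @ [u, v] = -[sxt, syt].
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v
```

The feature at `(cx, cy)` in the previous frame and its tracked location `(cx + u, cy + v)` in the current frame form one 2D-2D pair; the tracked location in the current frame plays the role of a first two-dimensional feature point.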
  • although steps S52 and S54 both refer to a current frame color image and a previous frame color image, the images in step S52 are captured by the first camera, whereas the images in step S54 are captured by the second camera.
  • after acquiring the current frame color image captured by the second camera, the terminal device can extract feature points of that image.
  • the method of extracting feature points can be the same as the method of extracting feature points in step S52, which will not be repeated.
  • the terminal device may perform optical flow tracking using the feature points of the current frame color image captured by the second camera and the feature points of the previous frame color image captured by the second camera to determine the second two-dimensional feature points.
  • a camera coordinate system of the first camera is recorded as a first camera coordinate system
  • a camera coordinate system of the second camera is recorded as a second camera coordinate system
  • because the first camera and the second camera are placed at fixed positions on the terminal device, their relative position can be determined in advance.
  • the internal and external parameters of the two cameras are calibrated, and the conversion matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera can be determined from the calibration results.
  • the terminal device can obtain the conversion matrix between the first camera coordinate system and the second camera coordinate system and the depth information of the second two-dimensional feature point, and determine the third two-dimensional feature point according to the conversion matrix, the depth information of the second two-dimensional feature point, and the second two-dimensional feature point.
  • the third two-dimensional feature point is the two-dimensional feature point converted from the second two-dimensional feature point to the first camera coordinate system.
  • the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points can be multiplied, and the multiplication result can be normalized to determine the third two-dimensional feature points.
  • the second two-dimensional feature points in the multiplication operation refer to the position coordinate information of these feature points.
  • the third two-dimensional feature points can be determined using formula 1
  • T_lr is the transformation matrix between the first camera coordinate system and the second camera coordinate system
  • d_j is the depth value of the second two-dimensional feature point
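The multiply-then-normalize conversion described above can be sketched as follows. This is a minimal illustration, not the formula from the disclosure: it assumes the feature coordinates are normalized image coordinates (intrinsics already removed) and that T_lr maps second-camera coordinates into first-camera coordinates. The function names and the example extrinsics are hypothetical.

```python
def transform_point(T, p):
    """Apply a 4x4 homogeneous transform T (nested lists) to a 3D point p."""
    x, y, z = p
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3] for r in range(3))


def to_first_camera_2d(p2, depth, T_lr):
    """Map a second-camera feature point onto the first camera's normalized image plane.

    p2    : (x, y) normalized image coordinates in the second camera (assumption)
    depth : depth value d_j of that feature point
    T_lr  : 4x4 transform, assumed to map second-camera to first-camera coordinates
    """
    # Multiply the homogeneous point [x, y, 1] by its depth to obtain a 3D point.
    point_r = (p2[0] * depth, p2[1] * depth, depth)
    # Move the 3D point into the first camera coordinate system.
    xl, yl, zl = transform_point(T_lr, point_r)
    if zl <= 0:
        raise ValueError("point lies behind the first camera")
    # Normalize by the third component to return to two dimensions.
    return (xl / zl, yl / zl)


# Hypothetical extrinsics: a pure 0.1 m offset along x between the two cameras.
T_lr = [
    [1.0, 0.0, 0.0, 0.1],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
p3 = to_first_camera_2d((0.0, 0.0), 2.0, T_lr)
```

If the feature points are stored in pixel coordinates instead, the intrinsic matrices of both cameras would have to be applied before and after the transform.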
  • the first two-dimensional feature points and the third two-dimensional feature points constitute two-dimensional coordinate information
  • the three-dimensional feature points of the previous frame color image captured by the first camera in the world coordinate system and the three-dimensional feature points of the previous frame color image captured by the second camera in the world coordinate system constitute three-dimensional coordinate information
  • the terminal device can associate the two-dimensional coordinate information with the three-dimensional coordinate information to obtain point pair information, use the point pair information to solve the Perspective-n-Point (PnP) problem, and determine the posture of the first camera when capturing the current frame color image based on the solution result.
  • PnP is a method in the field of machine vision, which can determine the relative position of the camera based on n feature points in the scene. Specifically, the rotation matrix and translation vector of the camera can be determined based on the n feature points on the scene.
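As a rough illustration of the point pair information and of the quantity a PnP solver drives toward zero, the sketch below pairs 2D and 3D features by a shared id and evaluates the mean reprojection error of a candidate pose. It is not a PnP solver and is not taken from the disclosure. The id-keyed dictionaries, the function names and the use of normalized image coordinates are all assumptions.

```python
import math


def build_point_pairs(points_2d, points_3d):
    """Associate 2D and 3D features that share the same (hypothetical) feature id."""
    return [(points_2d[i], points_3d[i]) for i in sorted(points_2d) if i in points_3d]


def reprojection_error(R, t, pairs):
    """Mean reprojection error of a candidate world-to-camera pose (R, t)."""
    total = 0.0
    for (u, v), (X, Y, Z) in pairs:
        # Transform the world point into the camera frame.
        xc = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + t[0]
        yc = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + t[1]
        zc = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + t[2]
        # Project to normalized image coordinates and compare with the observation.
        total += math.hypot(xc / zc - u, yc / zc - v)
    return total / len(pairs)


# 3D points in the world frame (from the previous frame) and 2D observations
# in the current frame; feature 4 has no 3D counterpart and is dropped.
points_3d = {1: (0.0, 0.0, 2.0), 2: (1.0, 0.0, 2.0), 3: (0.0, 1.0, 4.0)}
points_2d = {1: (0.0, 0.0), 2: (0.5, 0.0), 3: (0.0, 0.25), 4: (0.3, 0.3)}
pairs = build_point_pairs(points_2d, points_3d)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
err_true = reprojection_error(identity, (0.0, 0.0, 0.0), pairs)
err_shifted = reprojection_error(identity, (0.2, 0.0, 0.0), pairs)
```

A PnP solver searches for the rotation and translation that minimize this error over all point pairs. In this toy data the identity pose reproduces the observations exactly, while the shifted pose does not.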
  • the process of determining the three-dimensional feature points of the previous frame color image in the world coordinate system in the present disclosure can be performed during the processing of the current frame or during the processing of the previous frame, and the present disclosure does not impose any limitation on this.
  • the terminal device can obtain the last frame of color image captured by the first camera, and extract the feature points of the last frame of color image captured by the first camera.
  • the process of extracting feature points is the same as the process in step S52, which will not be repeated here.
  • the terminal device can use the previous frame depth image aligned with the previous frame color image captured by the first camera to perform spatial projection on the feature points of the previous frame color image captured by the first camera to obtain the three-dimensional feature points of the previous frame color image captured by the first camera in the first camera coordinate system.
  • the previous frame depth image can be output by the first camera, or can be obtained by other depth cameras equipped by the terminal device, and the present disclosure does not limit this.
  • the spatial projection process can also be constrained.
  • the terminal device can use the previous frame of depth image aligned with the previous frame of color image captured by the first camera to perform spatial projection on the feature points within a predetermined depth range among the feature points of the previous frame of color image captured by the first camera, so as to obtain the three-dimensional feature points of the previous frame of color image captured by the first camera in the first camera coordinate system.
  • the predetermined depth range is determined based on the range of the depth measurement.
  • the value of the predetermined depth range may vary depending on the type and model of the depth camera.
  • the present disclosure does not limit the specific value of the predetermined depth range. For example, feature points with a depth value greater than 0.5m and less than 6m are spatially projected.
  • the terminal device can transform the three-dimensional feature points in the first camera coordinate system according to the posture when the first camera captured the last frame of color image, so as to obtain the three-dimensional feature points in the world coordinate system of the last frame of color image captured by the first camera.
  • T_w_last is the position and posture of the first camera when capturing the last frame of color image.
  • the position and posture of the first camera when capturing the previous color image can be determined during the processing of the previous image, that is, during the processing of the current frame, the position and posture corresponding to the previous frame is known.
  • the initial position and posture are explained in the process of positioning initialization of the present disclosure.
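The constrained spatial projection and the subsequent conversion to the world coordinate system can be sketched as below. This is an illustration under assumptions, not the disclosure's implementation: a pinhole camera with hypothetical intrinsics (fx, fy, cx, cy), features given as (u, v, depth) triples read from the aligned depth image, and T_w_last given as a 4x4 camera-to-world transform. The 0.5 m and 6 m limits are the example values from the text.

```python
def lift_features_to_world(features, intrinsics, T_w_last, d_min=0.5, d_max=6.0):
    """Back-project in-range features and express them in the world frame.

    features   : iterable of (u, v, depth) with depth from the aligned depth image
    intrinsics : (fx, fy, cx, cy) of a pinhole camera (assumption)
    T_w_last   : 4x4 camera-to-world pose of the previous frame (assumption)
    """
    fx, fy, cx, cy = intrinsics
    world_points = []
    for u, v, d in features:
        # Keep only features whose depth lies inside the predetermined range.
        if not (d_min < d < d_max):
            continue
        # Spatial projection into the camera coordinate system.
        pc = ((u - cx) / fx * d, (v - cy) / fy * d, d)
        # Conversion from the camera coordinate system to the world coordinate system.
        pw = tuple(
            T_w_last[r][0] * pc[0] + T_w_last[r][1] * pc[1]
            + T_w_last[r][2] * pc[2] + T_w_last[r][3]
            for r in range(3)
        )
        world_points.append(pw)
    return world_points


# Hypothetical values for illustration only.
intrinsics = (500.0, 500.0, 320.0, 240.0)
T_w_last = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
features = [(320.0, 240.0, 2.0), (420.0, 240.0, 1.0), (320.0, 240.0, 0.3), (320.0, 240.0, 7.0)]
world_points = lift_features_to_world(features, intrinsics, T_w_last)
```

The last two features are discarded because their depth values fall outside the 0.5 m to 6 m range.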
  • the terminal device can obtain the last frame of color image captured by the second camera, and extract feature points of the last frame of color image captured by the second camera.
  • the process of extracting feature points is the same as the process in step S52, which will not be repeated here.
  • the terminal device can use the previous frame of depth image aligned with the previous frame of color image captured by the second camera to perform spatial projection on the feature points of the previous frame of color image captured by the second camera to obtain the three-dimensional feature points of the previous frame of color image captured by the second camera in the second camera coordinate system.
  • the previous frame of depth image can be output by the second camera, or can be obtained by other depth cameras equipped by the terminal device, and the present disclosure does not limit this.
  • the spatial projection process can also be constrained.
  • the terminal device can use the previous frame of depth image aligned with the previous frame of color image captured by the second camera to perform spatial projection on the feature points within a predetermined depth range among the feature points of the previous frame of color image captured by the second camera, so as to obtain the three-dimensional feature points of the previous frame of color image captured by the second camera in the second camera coordinate system.
  • the predetermined depth range is determined based on the range of the depth measurement.
  • the value of the predetermined depth range may vary depending on the type and model of the depth camera.
  • the present disclosure does not limit the specific value of the predetermined depth range. For example, feature points with a depth value greater than 0.5m and less than 6m are spatially projected.
  • the terminal device may use the transformation matrix between the first camera coordinate system and the second camera coordinate system to transform the three-dimensional feature points of the previous frame color image captured by the second camera in the second camera coordinate system into the three-dimensional feature points in the first camera coordinate system.
  • the terminal device can convert the three-dimensional feature points in the converted first camera coordinate system again according to the posture of the first camera when capturing the previous frame of color image, so as to obtain the three-dimensional feature points of the previous frame of color image captured by the second camera in the world coordinate system.
  • T_w_last is the position and posture when the first camera captured the last frame color image
  • T_lr is the transformation matrix between the first camera coordinate system and the second camera coordinate system.
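The two conversions applied to the second camera's feature points can be written compactly as below. This is a sketch rather than the disclosure's own formula: it assumes that T_lr maps second-camera coordinates into first-camera coordinates and that T_w_last maps first-camera coordinates into world coordinates. The symbols P_r, P_l and P_w (a homogeneous 3D point in the second camera, first camera and world coordinate systems) are notation introduced here for illustration.

```latex
P_l = T_{lr} \, P_r, \qquad
P_w = T_{w\_last} \, P_l = T_{w\_last} \, T_{lr} \, P_r
```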
  • FIG6 shows a schematic diagram of point pair matching between the first camera and the second camera to achieve PnP pose solution, which involves the matching relationship of 2D-2D feature points of the current frame and the matching relationship of 3D-2D feature points.
  • the position and posture of the first camera when capturing the previous color image is used.
  • the process of determining the initial position and posture of the first camera is described below.
  • the terminal device may obtain an initial frame color image captured by the first camera, and extract feature points of the initial frame color image captured by the first camera.
  • the process of extracting feature points is the same as the process in step S52, and will not be repeated here.
  • the terminal device can use the initial frame depth image aligned with the initial frame color image captured by the first camera to spatially project the feature points of the initial frame color image captured by the first camera to obtain the three-dimensional feature points of the initial frame color image captured by the first camera in the first camera coordinate system.
  • the spatial projection process can also be constrained.
  • the terminal device can use the feature points in the initial frame color image captured by the first camera that are within a predetermined depth range for spatial projection to obtain the three-dimensional feature points of the initial frame color image captured by the first camera in the first camera coordinate system.
  • the predetermined depth range is determined based on the range of the depth measurement.
  • the value of the predetermined depth range may vary depending on the type and model of the depth camera.
  • the present disclosure does not limit the specific value of the predetermined depth range. For example, feature points with a depth value greater than 0.5m and less than 6m are spatially projected.
  • the terminal device can determine the initial positioning result of the first camera in the first camera coordinate system based on the three-dimensional feature points of the initial frame color image captured by the first camera in the first camera coordinate system, the initial rotation matrix and the initial translation vector.
  • the initial rotation matrix may be set to the identity matrix, and the translation vector may be set to [0, 0, 0].
  • the terminal device may transform the initial positioning result of the first camera in the first camera coordinate system using the transformation matrix between the first camera coordinate system and the world coordinate system, so as to determine the posture of the first camera when capturing the initial frame color image.
  • the process of determining the initial position and posture of the first camera may also be combined with feature data of the second camera, and this process is described below.
  • the terminal device can determine the three-dimensional feature points of the initial frame color image captured by the first camera in the first camera coordinate system.
  • the terminal device can obtain the initial frame color image captured by the second camera, and extract feature points of the initial frame color image captured by the second camera.
  • the process of extracting feature points is the same as the process in step S52, which will not be repeated here.
  • the terminal device can use the initial frame depth image aligned with the initial frame color image captured by the second camera to spatially project the feature points of the initial frame color image captured by the second camera to obtain the three-dimensional feature points of the initial frame color image captured by the second camera in the second camera coordinate system.
  • the spatial projection process can also be constrained.
  • the terminal device can use the feature points in the initial frame color image captured by the second camera that are within a predetermined depth range to perform spatial projection to obtain the three-dimensional feature points of the initial frame color image captured by the second camera in the second camera coordinate system.
  • the predetermined depth range is determined based on the range of the depth measurement.
  • the value of the predetermined depth range may vary depending on the type and model of the depth camera.
  • the present disclosure does not limit the specific value of the predetermined depth range. For example, feature points with a depth value greater than 0.5m and less than 6m are spatially projected.
  • the terminal device can use the transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera to transform the three-dimensional feature points of the initial frame color image captured by the second camera in the second camera coordinate system into the three-dimensional feature points in the first camera coordinate system.
  • the converted 3D feature points and the 3D feature points of the initial frame color image captured by the first camera in the first camera coordinate system can be combined to obtain combined 3D feature points. It can be understood that the combined 3D feature points are 3D feature points in the first camera coordinate system.
  • the terminal device can determine the initial positioning result of the first camera in the first camera coordinate system according to the combined three-dimensional feature points, the initial rotation matrix and the initial translation vector.
  • the initial rotation matrix can be set to the identity matrix and the translation vector can be set to [0, 0, 0].
  • the terminal device can use the conversion matrix between the first camera coordinate system and the world coordinate system to transform the initial positioning result of the first camera in the first camera coordinate system, so as to determine the position and posture of the first camera when it captures the initial frame color image.
  • the terminal device may acquire an initial frame color image captured by the first camera, and extract feature points of the initial frame color image captured by the first camera.
  • the terminal device may perform spatial projection in combination with the depth image aligned with the initial frame color image captured by the first camera to obtain the three-dimensional feature points of the initial frame color image captured by the first camera in the first camera coordinate system.
  • the three-dimensional feature points determined in step S704 may also include the three-dimensional feature points corresponding to the initial frame color image captured by the second camera.
  • the terminal device may determine an initial positioning result of the first camera in the first camera coordinate system according to the three-dimensional feature points, the initial rotation matrix, and the initial translation vector determined in step S704.
  • the terminal device may transform the initial positioning result using the transformation matrix between the first camera coordinate system and the world coordinate system to determine the position and posture of the first camera when capturing the initial frame color image, thereby completing the positioning initialization.
  • the transformation matrix between the first camera coordinate system and the world coordinate system is used.
  • the embodiment of the present disclosure provides a coordinate system alignment solution. Specifically, the coordinate system alignment is achieved in combination with the depth information.
  • in describing the coordinate system alignment process, the depth image used for alignment is referred to as the reference depth image.
  • the terminal device can obtain a reference depth image output by the first camera.
  • the terminal device can determine the transformation matrix between the first camera coordinate system and the world coordinate system according to the normal vector and gravity vector of the designated plane.
  • the gravity vector can be N_g = (0, 0, 1), in which case the designated plane is usually the ground plane, which matches the scenario where the terminal device is, for example, a robot dog.
  • the designated plane can also be a plane manually designated in a specific scenario, such as a wall, a desktop, etc., and the present disclosure does not limit this.
  • R_wc is the transformation matrix between the first camera coordinate system and the world coordinate system
  • the rotation angle of R_wc can be obtained by multiplying N_g and n_c, as shown in Formula 5:
  • the rotation axis and the rotation angle together constitute the rotation vector between the first camera coordinate system and the world coordinate system.
  • the terminal device can calculate the transformation matrix R_wc between the first camera coordinate system and the world coordinate system from this rotation vector.
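One common way to turn a plane normal and a gravity vector into a rotation matrix is the axis-angle (Rodrigues) construction sketched below. This is an illustration under assumptions, not the formulas of the disclosure: the axis is taken as the cross product of the two unit vectors, the angle comes from their dot product, and the resulting matrix is assumed to rotate the plane normal n_c (in camera coordinates) onto the gravity vector N_g. All function names are hypothetical.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def apply(R, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return tuple(dot(row, v) for row in R)


def rotation_aligning(n_c, n_g=(0.0, 0.0, 1.0)):
    """Rotation matrix that rotates the direction of n_c onto the direction of n_g."""
    a, b = normalize(n_c), normalize(n_g)
    axis = cross(a, b)
    s = math.sqrt(dot(axis, axis))          # sine of the rotation angle
    c = max(-1.0, min(1.0, dot(a, b)))      # cosine of the rotation angle
    if s < 1e-12:
        if c > 0:
            return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        # Opposite vectors: rotate by pi about any axis perpendicular to a.
        helper = (1.0, 0.0, 0.0) if abs(a[0]) < 0.9 else (0.0, 1.0, 0.0)
        k, angle = normalize(cross(a, helper)), math.pi
    else:
        k, angle = tuple(x / s for x in axis), math.atan2(s, c)
    kx, ky, kz = k
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    K2 = [[sum(K[i][m] * K[m][j] for m in range(3)) for j in range(3)] for i in range(3)]
    sa, ca = math.sin(angle), math.cos(angle)
    # Rodrigues formula: R = I + sin(angle) K + (1 - cos(angle)) K^2
    return [
        [(1.0 if i == j else 0.0) + sa * K[i][j] + (1.0 - ca) * K2[i][j] for j in range(3)]
        for i in range(3)
    ]


# Hypothetical ground normal seen in the camera frame.
n_c = (0.0, -1.0, 0.0)
R_wc = rotation_aligning(n_c)
aligned = apply(R_wc, n_c)
```

Aligning a single normal with gravity fixes only two rotational degrees of freedom; the rotation about the gravity direction itself is left unconstrained by this construction.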
  • the thread of coordinate system alignment ends.
  • the terminal device may return to the step of acquiring the reference depth image, reacquire the reference depth image, and perform a process of determining whether the designated plane exists.
  • the terminal device can determine the point cloud corresponding to the first camera in combination with the reference depth image output by the first camera, which is recorded as the reference point cloud.
  • the terminal device determines the three-dimensional space point of each pixel on the reference depth image output by the first camera according to the pixel, the depth value of the pixel and the camera internal parameter of the first camera.
  • P represents the three-dimensional space point projected into the space
  • z represents the depth value of the pixel point
  • K^-1 represents the inverse of the camera intrinsic parameter matrix
  • p represents the coordinate position of the pixel point.
  • a reference point cloud corresponding to the first camera may be constructed from the three-dimensional space points obtained through this process.
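Using the symbols defined above, the back-projection of a pixel can be written as follows. The relation P = z K^-1 p follows directly from those definitions. The expanded form additionally assumes a standard pinhole intrinsic matrix with focal lengths f_x, f_y and principal point (c_x, c_y), and a pixel p = (u, v, 1) in homogeneous coordinates; these names are introduced here for illustration only.

```latex
P = z \, K^{-1} p, \qquad
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}, \quad
p = \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
\;\Longrightarrow\;
P = \begin{bmatrix} z \, (u - c_x) / f_x \\ z \, (v - c_y) / f_y \\ z \end{bmatrix}
```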
  • the terminal device determines, for each pixel point on the reference depth image output by the first camera, the three-dimensional spatial point of each pixel point on the reference depth image according to the pixel point, the depth value of the pixel point and the camera intrinsic parameters of the first camera.
  • the terminal device can obtain the reference depth image output by the second camera, and determine the three-dimensional space point of each pixel on the reference depth image output by the second camera in combination with the above formula 6.
  • the terminal device can transform the three-dimensional space point of each pixel on the reference depth image output by the second camera according to the transformation matrix between the first camera coordinate system and the second camera coordinate system to obtain the transformed three-dimensional space point.
  • PC_mixture is the determined reference point cloud
  • PC_right is the three-dimensional space point of each pixel point on the reference depth image output by the second camera
  • PC_left is the three-dimensional space point of each pixel point on the reference depth image output by the first camera
  • T_lr is the transformation matrix between the first camera coordinate system and the second camera coordinate system.
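The merging of the two point sets can be sketched as below, assuming that T_lr maps second-camera coordinates into first-camera coordinates and that each point cloud is a plain list of (x, y, z) tuples. The function names and the example extrinsics are hypothetical.

```python
def transform_points(T, points):
    """Apply a 4x4 homogeneous transform T to every 3D point in a list."""
    return [
        tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3] for r in range(3))
        for x, y, z in points
    ]


def merge_point_clouds(pc_left, pc_right, T_lr):
    """Build the reference point cloud in the first camera coordinate system."""
    # Points from the second camera are first moved into the first camera frame,
    # then appended to the points that are already expressed in that frame.
    return list(pc_left) + transform_points(T_lr, pc_right)


# Hypothetical extrinsics: a pure 0.1 m offset along x between the two cameras.
T_lr = [
    [1.0, 0.0, 0.0, 0.1],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
pc_left = [(0.0, 0.0, 1.0)]
pc_right = [(0.0, 0.0, 1.0), (0.5, 0.0, 2.0)]
pc_mixture = merge_point_clouds(pc_left, pc_right, T_lr)
```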
  • the construction of the reference point cloud incorporates information of the depth image output by the second camera, thereby making the spatial feature points more comprehensive and improving the accuracy of the algorithm.
  • the terminal device can extract the plane information of the reference point cloud.
  • the present disclosure does not limit the plane extraction method, and can adopt the RANSAC fitting method, the normal vector region growing method, the hierarchical clustering method, etc., as long as the plane information in the scene can be extracted.
  • Some embodiments of the present disclosure adopt the plane extraction algorithm PEAC based on hierarchical clustering. Referring to Figure 8, two planes can be extracted using this algorithm. Figure 8 is only an example. All planes in the scene can be extracted using the above algorithm.
  • the extracted plane information includes but is not limited to the plane ID, the plane normal vector, the distance of the plane from the camera, etc.
  • the terminal device may screen out the designated plane according to the plane information of the reference point cloud. Specifically, the screening may use the distance of each plane from the first camera, which is included in the plane information of the reference point cloud.
  • the terminal device may determine the candidate planes whose distance from the first camera falls within a predetermined distance range; the number of candidate planes determined in this way is one or more.
  • when only one candidate plane is determined, the terminal device may take that candidate plane as the designated plane.
  • when multiple candidate planes are determined, the terminal device may take the candidate plane whose distance from the first camera is closest to a distance threshold as the designated plane, wherein the distance threshold is within the above-mentioned predetermined distance range.
  • FIG. 9 is a schematic diagram showing the screening of the ground plane. Compared with the result of plane detection, planes such as the ceiling are eliminated through the above distance-based screening process.
  • the terminal device is a robot dog.
  • the terminal device is equipped with a first camera and a second camera.
  • the configuration positions of the two cameras are fixed.
  • the robot dog is controlled to move for a short period of time and only moves on the ground plane.
  • the position of the ground plane in the coordinate system of the first camera is basically fixed.
  • the height of the ground plane from the camera is equivalent to the height of the robot dog, which is about 0.3 m. Therefore, the above-mentioned predetermined distance range can be set to 0.25 m to 0.35 m when screening for the ground plane. If multiple candidate planes are screened out, the plane whose distance is closest to 0.3 m is used as the ground plane.
  • the terminal device is controlled to continuously repeat the above-mentioned process of determining the plane using the depth image and plane screening until the terminal device detects the ground plane.
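The distance-based screening in the robot dog example can be sketched as follows. The dictionary layout of the plane information and the function name are hypothetical. The 0.25 m to 0.35 m range and the 0.3 m target are the example values from the text.

```python
def select_designated_plane(planes, d_min=0.25, d_max=0.35, d_target=0.3):
    """Pick the plane whose camera distance is inside the range and closest to the target.

    planes : list of dicts with at least a "distance" entry (hypothetical layout)
    Returns None when no plane qualifies, so the caller can acquire a new depth image.
    """
    # Candidate planes: distance from the first camera inside the predetermined range.
    candidates = [p for p in planes if d_min <= p["distance"] <= d_max]
    if not candidates:
        return None
    # With several candidates, keep the one closest to the distance threshold.
    return min(candidates, key=lambda p: abs(p["distance"] - d_target))


# Hypothetical plane extraction result: a ceiling and two near-ground planes.
planes = [
    {"id": 0, "normal": (0.0, 1.0, 0.0), "distance": 2.40},
    {"id": 1, "normal": (0.0, -1.0, 0.0), "distance": 0.31},
    {"id": 2, "normal": (0.0, -1.0, 0.0), "distance": 0.27},
]
ground = select_designated_plane(planes)
```

Returning None for an empty candidate set corresponds to the case where the device has to repeat depth image acquisition and plane detection.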
  • in step S1002, the terminal device obtains a reference depth image output by the first camera, and back-projects the reference depth image to obtain three-dimensional space points.
  • in step S1004, the terminal device obtains a reference depth image output by the second camera, and back-projects the reference depth image to obtain three-dimensional space points.
  • in step S1006, the terminal device converts the three-dimensional space points obtained in step S1004 into three-dimensional space points in the first camera coordinate system.
  • in step S1008, the terminal device merges the three-dimensional space points obtained in step S1002 with the three-dimensional space points obtained in step S1006 to obtain a reference point cloud corresponding to the first camera.
  • the terminal device may extract plane information based on the reference point cloud.
  • the terminal device may screen the extracted planes to determine the ground plane
  • the terminal device may determine a transformation matrix between the first camera coordinate system and the world coordinate system using the normal vector of the ground plane and the gravity vector to complete the alignment of the first camera coordinate system and the world coordinate system.
  • the transformation matrix between the second camera coordinate system and the world coordinate system can also be obtained to achieve alignment of the first camera coordinate system, the second camera coordinate system, and the world coordinate system. Therefore, the coordinate system alignment result can be applied to the above-mentioned posture determination process of the present disclosure.
  • this exemplary embodiment also provides a posture determination device, which is configured in a terminal device, and the terminal device is also configured with a first camera and at least one second camera.
  • FIG11 schematically shows a block diagram of a posture determination device according to an exemplary embodiment of the present disclosure.
  • the posture determination device 11 may include a first feature point determination module 111, a second feature point determination module 113, a feature point conversion module 115, and a posture determination module 117.
  • the first feature point determination module 111 can be used to obtain the current frame color image captured by the first camera, and determine the first two-dimensional feature points on the current frame color image captured by the first camera that match the previous frame color image captured by the first camera;
  • the second feature point determination module 113 can be used to obtain the current frame color image captured by the second camera, and determine the second two-dimensional feature points on the current frame color image captured by the second camera that match the previous frame color image captured by the second camera;
  • the feature point conversion module 115 can be used to use the conversion matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera to convert the second two-dimensional feature points into third two-dimensional feature points in the first camera coordinate system;
  • the posture determination module 117 can be used to determine the posture of the first camera when capturing the current frame color image based on the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points of the previous frame color image captured by the first camera in the world coordinate system, and the three-dimensional feature points of the previous frame color image captured by the second camera in the world coordinate system.
  • the first feature point determination module 111 can be configured to perform: extracting feature points of the current frame color image captured by the first camera; performing optical flow tracking using the feature points of the current frame color image captured by the first camera and the feature points of the previous frame color image captured by the first camera to determine the first two-dimensional feature points.
  • the feature point conversion module 115 can be configured to perform: obtaining the conversion matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera and the depth information of the second two-dimensional feature point; determining the third two-dimensional feature point based on the conversion matrix, the depth information of the second two-dimensional feature point and the second two-dimensional feature point.
  • the feature point conversion module 115 may be configured to perform: multiplying the conversion matrix, the depth information of the second two-dimensional feature point, and the second two-dimensional feature point, and normalizing the multiplication result to determine a third two-dimensional feature point.
  • the first two-dimensional feature point and the third two-dimensional feature point constitute two-dimensional coordinate information
  • the three-dimensional feature points of the previous frame color image captured by the first camera in the world coordinate system and the three-dimensional feature points of the previous frame color image captured by the second camera in the world coordinate system constitute three-dimensional coordinate information
  • the posture determination module 117 can be configured to perform: associating the two-dimensional coordinate information with the three-dimensional coordinate information to obtain point pair information; solving the perspective n-point problem using the point pair information, and determining the posture of the first camera when capturing the current frame color image in combination with the solution result.
  • the posture determination device 12 may further include a third feature point determination module 121.
  • the third feature point determination module 121 can be configured to execute: obtaining the last frame of color image captured by the first camera, and extracting the feature points of the last frame of color image captured by the first camera; using the last frame of depth image aligned with the last frame of color image captured by the first camera, spatially projecting the feature points of the last frame of color image captured by the first camera to obtain the three-dimensional feature points of the last frame of color image captured by the first camera in the first camera coordinate system; according to the posture of the first camera when capturing the last frame of color image, transforming the three-dimensional feature points in the first camera coordinate system to obtain the three-dimensional feature points of the last frame of color image captured by the first camera in the world coordinate system.
  • the third feature point determination module 121 can be configured to execute: utilizing a previous frame depth image aligned with a previous frame color image captured by the first camera, and spatially projecting feature points within a predetermined depth range among the feature points of the previous frame color image captured by the first camera to obtain three-dimensional feature points of the previous frame color image captured by the first camera in the first camera coordinate system; wherein the predetermined depth range is determined based on the range of the depth measurement.
  • the third feature point determination module 121 can also be configured to execute: obtaining the previous frame of color image captured by the second camera, and extracting the feature points of the previous frame of color image captured by the second camera; using the previous frame of depth image aligned with the previous frame of color image captured by the second camera, spatially projecting the feature points of the previous frame of color image captured by the second camera to obtain the three-dimensional feature points of the previous frame of color image captured by the second camera in the second camera coordinate system; using the transformation matrix between the first camera coordinate system and the second camera coordinate system to convert the three-dimensional feature points of the previous frame of color image captured by the second camera in the second camera coordinate system into the three-dimensional feature points in the first camera coordinate system; according to the posture of the first camera when capturing the previous frame of color image, converting the three-dimensional feature points in the first camera coordinate system to obtain the three-dimensional feature points of the previous frame of color image captured by the second camera in the world coordinate system.
  • the third feature point determination module 121 can also be configured to execute: utilizing a previous frame depth image aligned with a previous frame color image captured by the second camera, spatially projecting feature points within a predetermined depth range among the feature points of the previous frame color image captured by the second camera, so as to obtain three-dimensional feature points of the previous frame color image captured by the second camera in the second camera coordinate system; wherein the predetermined depth range is determined based on the range of the depth measurement.
  • the posture determination device 13 may further include a positioning initialization module 131.
  • the positioning initialization module 131 can be configured to execute: obtaining an initial frame color image captured by the first camera, and extracting feature points of the initial frame color image captured by the first camera; using an initial frame depth image aligned with the initial frame color image captured by the first camera, spatially projecting the feature points of the initial frame color image captured by the first camera to obtain three-dimensional feature points of the initial frame color image captured by the first camera in the first camera coordinate system; determining an initial positioning result of the first camera in the first camera coordinate system based on the three-dimensional feature points, initial rotation matrix and initial translation vector of the initial frame color image captured by the first camera in the first camera coordinate system; using a transformation matrix between the first camera coordinate system and the world coordinate system, transforming the initial positioning result of the first camera in the first camera coordinate system to determine the posture of the first camera when the initial frame color image is captured.
  • the posture determination device 14 may further include a transformation matrix determination module 141 .
  • the transformation matrix determination module 141 may be configured to execute: obtaining a reference depth image output by the first camera; determining a specified plane in combination with the reference depth image output by the first camera, and determining the transformation matrix between the first camera coordinate system and the world coordinate system according to the normal vector of the specified plane and the gravity vector.
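As a minimal, non-limiting sketch of how a plane normal and a gravity direction can fix the rotation part of a camera-to-world transformation, the following computes a rotation that maps one direction onto another. The function names are hypothetical, and the inputs are assumed to be the normal of the specified plane in the first camera coordinate system and the target axis in the world coordinate system (for example, the direction opposite to gravity). The source does not state how the remaining rotation about that axis or the translation is chosen, so neither is covered here.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return [x / n for x in v]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def rotation_aligning(src, dst):
    """Return a 3x3 rotation R (nested lists) such that R @ unit(src) == unit(dst)."""
    a, b = normalize(src), normalize(dst)
    c = dot(a, b)
    if c < -1.0 + 1e-9:
        # Opposite directions: rotate 180 degrees about an axis perpendicular to src.
        helper = [1.0, 0.0, 0.0] if abs(a[0]) < 0.9 else [0.0, 1.0, 0.0]
        u = normalize(cross(a, helper))
        return [[2.0 * u[i] * u[j] - (1.0 if i == j else 0.0) for j in range(3)]
                for i in range(3)]
    v = cross(a, b)
    vx = [[0.0, -v[2], v[1]],
          [v[2], 0.0, -v[0]],
          [-v[1], v[0], 0.0]]
    vx2 = [[sum(vx[i][m] * vx[m][j] for m in range(3)) for j in range(3)]
           for i in range(3)]
    k = 1.0 / (1.0 + c)
    # Rodrigues form: R = I + [v]x + [v]x^2 / (1 + c)
    return [[(1.0 if i == j else 0.0) + vx[i][j] + k * vx2[i][j] for j in range(3)]
            for i in range(3)]


# Hypothetical example: the ground normal points along -y in the camera frame,
# and the world "up" axis (opposite to gravity) is assumed to be +z.
R_world_cam = rotation_aligning([0.0, -1.0, 0.0], [0.0, 0.0, 1.0])
```

Aligning two directions leaves one degree of freedom (rotation about the aligned axis) undetermined; this sketch simply takes the shortest rotation.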
  • the transformation matrix determination module 141 can be configured to perform: determining a reference point cloud corresponding to the first camera in combination with a reference depth image output by the first camera; extracting plane information of the reference point cloud; and filtering a specified plane according to the plane information of the reference point cloud.
  • the process of determining the reference point cloud by the transformation matrix determination module 141 can be configured to perform: for each pixel point on the reference depth image output by the first camera, determine the three-dimensional space point of the pixel point according to the pixel point, the depth value of the pixel point and the camera intrinsic parameters of the first camera; and construct a reference point cloud corresponding to the first camera in combination with the three-dimensional space point of each pixel point on the reference depth image output by the first camera.
  • the process of determining the reference point cloud by the transformation matrix determination module 141 can also be configured to execute: obtaining a reference depth image output by the second camera; determining the three-dimensional space point of each pixel point on the reference depth image output by the second camera; transforming the three-dimensional space point of each pixel point on the reference depth image output by the second camera according to the transformation matrix between the first camera coordinate system and the second camera coordinate system to obtain the transformed three-dimensional space point; merging the three-dimensional space point of each pixel point on the reference depth image output by the first camera with the transformed three-dimensional space point to construct a reference point cloud corresponding to the first camera.
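The construction of the reference point cloud described above may be sketched as follows, assuming a pinhole camera model with intrinsic parameters (fx, fy, cx, cy), depth images given as nested lists in metric units, and a 4x4 homogeneous matrix `T_c1_c2` that maps points from the second camera coordinate system to the first. All names are hypothetical, and plain Python lists are used only to keep the sketch dependency-free.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project every pixel with a valid (positive) depth to a 3D point."""
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d > 0:
                points.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
    return points


def transform_points(T, points):
    """Apply a 4x4 homogeneous transform T (nested lists) to a list of 3D points."""
    return [tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))
            for x, y, z in points]


def build_reference_cloud(depth1, intr1, depth2, intr2, T_c1_c2):
    """Merge both depth images into one cloud in the first camera coordinate system.

    intr1 and intr2 are (fx, fy, cx, cy) tuples for the first and second camera.
    """
    cloud1 = depth_to_points(depth1, *intr1)
    cloud2 = transform_points(T_c1_c2, depth_to_points(depth2, *intr2))
    return cloud1 + cloud2
```

Pixels with a depth of zero are treated as invalid here, which is a common convention for depth sensors but is an assumption of this sketch.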
  • the process of selecting the designated plane by the transformation matrix determination module 141 may be configured to perform: selecting the designated plane according to the distance information of the plane from the first camera included in the plane information of the reference point cloud.
  • the process of the transformation matrix determination module 141 screening the designated plane can be configured to perform: when the distance information contains a distance within a predetermined distance range, determining a candidate plane corresponding to the distance in the distance information within the predetermined distance range; when the number of candidate planes is one, determining the candidate plane as the designated plane; when the number of candidate planes is multiple, determining the candidate plane whose distance from the first camera is closest to a distance threshold as the designated plane; wherein the distance threshold is within the predetermined distance range.
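The screening rule above can be illustrated with a short sketch. The plane representation (a dictionary with a `distance` entry) and the function name are hypothetical; only the selection logic follows the text.

```python
def select_designated_plane(planes, distance_range, distance_threshold):
    """Pick the designated plane from candidate planes by camera distance.

    planes: list of dicts, each with a "distance" entry (distance from the first camera).
    distance_range: (low, high) predetermined distance range.
    distance_threshold: value inside the range; with several candidates, the plane
        whose distance is closest to it is chosen.
    Returns the chosen plane, or None when no plane lies within the range.
    """
    low, high = distance_range
    if not low <= distance_threshold <= high:
        raise ValueError("distance threshold must lie within the distance range")
    candidates = [p for p in planes if low <= p["distance"] <= high]
    if not candidates:
        return None
    # With a single candidate, min() returns it directly.
    return min(candidates, key=lambda p: abs(p["distance"] - distance_threshold))
```

For example, with planes at 0.7, 1.4 and 3.0 from the camera, a range of (0.5, 2.0) and a threshold of 1.5, the plane at 1.4 would be selected.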
  • the designated plane is a ground plane.
  • FIG. 15 shows a schematic diagram of an electronic device suitable for implementing an exemplary embodiment of the present disclosure.
  • the terminal device of the exemplary embodiment of the present disclosure may be configured as shown in FIG. 15. It should be noted that the electronic device shown in FIG. 15 is only an example and should not bring any limitation to the functions and scope of use of the embodiments of the present disclosure.
  • the electronic device of the present disclosure includes at least a processor and a memory, and the memory is used to store one or more programs.
  • the processor can implement the posture determination method of the exemplary embodiment of the present disclosure.
  • the electronic device 150 at least includes: a processor 1510, an internal memory 1521, an external memory interface 1522, a Universal Serial Bus (USB) interface 1530, a charging management module 1540, a power management module 1541, a battery 1542, an antenna, a wireless communication module 1550, an audio module 1560, a display screen 1570, a sensor module 1580, a camera module 1590, etc.
  • the sensor module 1580 may include a depth sensor, a pressure sensor, a gyroscope sensor, an air pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, etc.
  • the structure illustrated in the embodiment of the present disclosure does not constitute a specific limitation on the electronic device 150.
  • the electronic device 150 may include more or fewer components than shown in the figure, or combine some components, or split some components, or arrange the components differently.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 1510 may include one or more processing units, for example, the processor 1510 may include an application processor (AP), a modem processor, a graphics processor (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor and/or a neural network processor (NPU). Different processing units may be independent devices or integrated in one or more processors.
  • a memory may be provided in the processor 1510 for storing instructions and data.
  • the electronic device 150 can implement the shooting function through the ISP, the camera module 1590, the video codec, the GPU, the display screen 1570 and the application processor.
  • the electronic device 150 may include at least two camera modules 1590.
  • one camera module is determined as the reference camera, and the feature data collected by the other camera modules is transferred to the coordinate system of the reference camera for processing.
  • the electronic device 150 is configured with two RealSense D455 cameras.
  • the internal memory 1521 can be used to store computer executable program codes, which include instructions.
  • the internal memory 1521 can include a program storage area and a data storage area.
  • the external memory interface 1522 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 150.
  • the present disclosure also provides a computer-readable storage medium, which may be included in the electronic device described in the above embodiments; or may exist independently without being assembled into the electronic device.
  • Computer-readable storage media may be, for example, but not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices or components, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memories (RAM), read-only memories (ROM), erasable programmable read-only memories (EPROM or flash memory), optical fibers, portable compact disk read-only memories (CD-ROMs), optical storage devices, magnetic storage devices, or any suitable combination thereof.
  • computer-readable storage media may be any tangible medium containing or storing a program that may be used by or in conjunction with an instruction execution system, apparatus, or device.
  • Computer-readable storage media can send, propagate or transmit programs for use by or in conjunction with an instruction execution system, apparatus or device.
  • the program code contained on the computer-readable storage medium can be transmitted using any appropriate medium, including but not limited to: wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
  • the computer-readable storage medium carries one or more programs.
  • the electronic device implements the method described in the embodiments of the present disclosure.
  • each box in the flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that in some alternative implementations, the box may be a module, a program segment, or a portion of code.
  • the functions noted in the figures may also occur in a different order than that noted in the figures. For example, two blocks shown in succession may actually be executed substantially in parallel, or they may sometimes be executed in the opposite order, depending on the functions involved.
  • each block in a block diagram or flow chart, and combinations of blocks in a block diagram or flow chart may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or may be implemented by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure may be implemented by software or hardware, and the units described may also be arranged in a processor.
  • the names of these units do not constitute limitations on the units themselves in some cases.
  • the technical solution according to the implementation of the present disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a mobile hard disk, etc.) or on a network, including several instructions to enable a computing device (which can be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the implementation of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

A pose determination method, comprising: determining first two-dimensional feature points on a current frame of color image captured by a first camera that match a previous frame of color image captured by the first camera; determining second two-dimensional feature points on a current frame of color image captured by a second camera that match a previous frame of color image captured by the second camera, and converting the second two-dimensional feature points into third two-dimensional feature points in a first camera coordinate system; and determining the pose of the first camera when it captures the current frame of color image according to the first two-dimensional feature points, the third two-dimensional feature points, and the three-dimensional feature points, in a world coordinate system, of the previous frames of color images respectively captured by the first camera and the second camera.

Description

Pose determination method and apparatus, computer-readable storage medium, and electronic device
This application claims priority to Chinese patent application No. 202211337302.9, filed with the Patent Office on October 28, 2022 and entitled “Pose determination method and apparatus, computer-readable storage medium, and electronic device”, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer vision technology, and in particular to a pose determination method, a pose determination apparatus, a computer-readable storage medium, and an electronic device.
Background
In the field of computer vision, visual positioning is a technology that uses images captured by a camera to determine the camera's pose in the real world. It has important application value in augmented reality, virtual reality, robotics, intelligent transportation, and other fields.
Summary
The present disclosure provides a pose determination method, a pose determination apparatus, a computer-readable storage medium, and an electronic device.
According to a first aspect of the present disclosure, a pose determination method is provided, which is applied to a terminal device configured with a first camera and at least one second camera. The pose determination method includes: obtaining a current frame color image captured by the first camera, and determining first two-dimensional feature points on the current frame color image captured by the first camera that match a previous frame color image captured by the first camera; obtaining a current frame color image captured by the second camera, and determining second two-dimensional feature points on the current frame color image captured by the second camera that match a previous frame color image captured by the second camera; converting the second two-dimensional feature points into third two-dimensional feature points in the first camera coordinate system by using a transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera; and determining the pose of the first camera when capturing the current frame color image according to the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points of the previous frame color image captured by the first camera in the world coordinate system, and the three-dimensional feature points of the previous frame color image captured by the second camera in the world coordinate system.
According to a second aspect of the present disclosure, a pose determination apparatus is provided, which is configured in a terminal device that is further configured with a first camera and at least one second camera. The pose determination apparatus includes: a first feature point determination module, configured to obtain a current frame color image captured by the first camera, and determine first two-dimensional feature points on the current frame color image captured by the first camera that match a previous frame color image captured by the first camera; a second feature point determination module, configured to obtain a current frame color image captured by the second camera, and determine second two-dimensional feature points on the current frame color image captured by the second camera that match a previous frame color image captured by the second camera; a feature point conversion module, configured to convert the second two-dimensional feature points into third two-dimensional feature points in the first camera coordinate system by using a transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera; and a pose determination module, configured to determine the pose of the first camera when capturing the current frame color image according to the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points of the previous frame color image captured by the first camera in the world coordinate system, and the three-dimensional feature points of the previous frame color image captured by the second camera in the world coordinate system.
According to a third aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when the program is executed by a processor, the above pose determination method is implemented.
According to a fourth aspect of the present disclosure, an electronic device is provided, including a processor and a memory for storing one or more programs; when the one or more programs are executed by the processor, the processor implements the above pose determination method.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure. Obviously, the drawings described below show only some embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic diagram of the system architecture of a pose determination system according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of the placement of dual cameras on a terminal device according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the placement angles of the dual cameras according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of the processing stages involved in the pose determination solution according to an embodiment of the present disclosure;
FIG. 5 schematically shows a flowchart of a pose determination method according to an exemplary embodiment of the present disclosure;
FIG. 6 is a schematic diagram of dual-camera point pair matching according to an embodiment of the present disclosure;
FIG. 7 is a flowchart of a positioning initialization process according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of determining two planes according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of determining a ground plane according to an embodiment of the present disclosure;
FIG. 10 is a flowchart of a process of determining the transformation matrix between the first camera coordinate system and the world coordinate system according to an embodiment of the present disclosure;
FIG. 11 schematically shows a block diagram of a pose determination apparatus according to a first exemplary embodiment of the present disclosure;
FIG. 12 schematically shows a block diagram of a pose determination apparatus according to a second exemplary embodiment of the present disclosure;
FIG. 13 schematically shows a block diagram of a pose determination apparatus according to a third exemplary embodiment of the present disclosure;
FIG. 14 schematically shows a block diagram of a pose determination apparatus according to a fourth exemplary embodiment of the present disclosure;
FIG. 15 schematically shows a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in a variety of forms and should not be construed as being limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure will be thorough and complete, and will fully convey the concepts of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in one or more embodiments in any suitable manner. In the following description, many specific details are provided to give a full understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced while omitting one or more of the specific details, or that other methods, components, devices, steps, etc. may be adopted. In other cases, well-known technical solutions are not shown or described in detail to avoid obscuring aspects of the present disclosure.
In addition, the drawings are only schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and their repeated description will be omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities can be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flowcharts shown in the accompanying drawings are only exemplary and do not necessarily include all the steps. For example, some steps may be decomposed, while others may be combined or partially combined, so the actual execution order may change according to the actual situation. In addition, the terms "first", "second", "third", etc. used below are only for the purpose of distinction and should not be taken as limitations of the present disclosure.
Visual positioning technology enables a computer device to autonomously perceive its own pose in the environment, so that it can perform any task requested by the user, such as tracking, monitoring, interaction, displaying images, and playing audio. The accuracy of positioning greatly affects how well the functions of the computer device are realized.
In order to improve the accuracy of visual positioning of a device, the embodiments of the present disclosure provide a new positioning solution.
An embodiment of the present application provides a pose determination method, applied to a terminal device configured with a first camera and at least one second camera, the pose determination method including:
obtaining a current frame color image captured by the first camera, and determining first two-dimensional feature points on the current frame color image captured by the first camera that match a previous frame color image captured by the first camera;
obtaining a current frame color image captured by the second camera, and determining second two-dimensional feature points on the current frame color image captured by the second camera that match a previous frame color image captured by the second camera;
converting the second two-dimensional feature points into third two-dimensional feature points in the first camera coordinate system by using a transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera;
determining the pose of the first camera when capturing the current frame color image according to the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points of the previous frame color image captured by the first camera in the world coordinate system, and the three-dimensional feature points of the previous frame color image captured by the second camera in the world coordinate system.
In one embodiment, determining the first two-dimensional feature points on the current frame color image captured by the first camera that match the previous frame color image captured by the first camera includes:
extracting feature points of the current frame color image captured by the first camera;
performing optical flow tracking using the feature points of the current frame color image captured by the first camera and the feature points of the previous frame color image captured by the first camera, to determine the first two-dimensional feature points.
In one embodiment, converting the second two-dimensional feature points into the third two-dimensional feature points in the first camera coordinate system by using the transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera includes:
obtaining the transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera, and the depth information of the second two-dimensional feature points;
determining the third two-dimensional feature points according to the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points.
In one embodiment, determining the third two-dimensional feature points according to the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points includes:
multiplying the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points, and normalizing the product to determine the third two-dimensional feature points.
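The multiply-and-normalize step above may be sketched as follows. This is only an illustration under stated assumptions: the second two-dimensional feature point is taken to be in normalized image coordinates (x, y), so that depth times (x, y, 1) is the point in the second camera coordinate system; `T_c1_c2` is a hypothetical 4x4 homogeneous matrix from the second to the first camera coordinate system; and normalization is taken to mean division by the resulting z component. The source does not spell out these conventions.

```python
def to_first_camera_2d(T_c1_c2, point_2d, depth):
    """Convert a 2D feature point of the second camera into the first camera's frame.

    T_c1_c2: 4x4 homogeneous transform (nested lists), second camera -> first camera.
    point_2d: (x, y) in normalized image coordinates of the second camera (assumed).
    depth: depth of the feature point along the second camera's optical axis.
    Returns (x, y) in normalized image coordinates of the first camera.
    """
    x, y = point_2d
    # Lift to a homogeneous 3D point in the second camera coordinate system.
    p_cam2 = [depth * x, depth * y, depth, 1.0]
    # Transform into the first camera coordinate system.
    p_cam1 = [sum(T_c1_c2[i][j] * p_cam2[j] for j in range(4)) for i in range(3)]
    if p_cam1[2] <= 0.0:
        raise ValueError("point does not lie in front of the first camera")
    # Normalize by z to get back to two-dimensional coordinates.
    return (p_cam1[0] / p_cam1[2], p_cam1[1] / p_cam1[2])
```

If the feature points are stored in pixel coordinates instead, the camera intrinsic parameters would have to be applied before and after this step.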
In one embodiment, the first two-dimensional feature points and the third two-dimensional feature points constitute two-dimensional coordinate information, and the three-dimensional feature points of the previous frame color image captured by the first camera in the world coordinate system and the three-dimensional feature points of the previous frame color image captured by the second camera in the world coordinate system constitute three-dimensional coordinate information; wherein determining the pose of the first camera when capturing the current frame color image according to the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points of the previous frame color image captured by the first camera in the world coordinate system, and the three-dimensional feature points of the previous frame color image captured by the second camera in the world coordinate system includes:
associating the two-dimensional coordinate information with the three-dimensional coordinate information to obtain point pair information;
solving the Perspective-n-Point (PnP) problem using the point pair information, and determining the pose of the first camera when capturing the current frame color image in combination with the solution result.
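A complete PnP solver is beyond a short sketch, and the source does not say which solver is used. The following hypothetical helpers only illustrate how the point pairs may be assembled from both cameras and how the mean reprojection error of a candidate pose can be evaluated, which is the quantity a PnP solution is expected to make small. It assumes two-dimensional points in normalized image coordinates of the first camera and a candidate pose (R, t) that maps world coordinates into the first camera coordinate system.

```python
import math


def build_point_pairs(pts2d_cam1, pts2d_from_cam2, pts3d_cam1, pts3d_cam2):
    """Associate 2D points (both cameras, first camera frame) with world 3D points.

    The i-th 2D point is assumed to correspond to the i-th 3D point, as produced
    by frame-to-frame tracking.
    """
    points_2d = list(pts2d_cam1) + list(pts2d_from_cam2)
    points_3d = list(pts3d_cam1) + list(pts3d_cam2)
    if len(points_2d) != len(points_3d):
        raise ValueError("2D and 3D point lists must have the same length")
    return list(zip(points_2d, points_3d))


def mean_reprojection_error(pairs, R, t):
    """Mean distance between observed 2D points and projected 3D points.

    R (3x3 nested lists) and t (length 3) map world points into the first camera frame.
    """
    total = 0.0
    for (x, y), X in pairs:
        pc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
        total += math.hypot(pc[0] / pc[2] - x, pc[1] / pc[2] - y)
    return total / len(pairs)
```

In practice the pairs would be handed to an existing solver, for example OpenCV's `cv2.solvePnP`, and an error of this kind could serve as a sanity check on the returned pose.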
在一实施例中,所述位姿确定方法还包括:In one embodiment, the posture determination method further includes:
获取所述第一相机采集的上一帧彩色图像,提取所述第一相机采集的上一帧彩色图像的特征点;Acquire a previous frame of color image captured by the first camera, and extract feature points of the previous frame of color image captured by the first camera;
利用与所述第一相机采集的上一帧彩色图像对齐的上一帧深度图像,对所述第一相机采集的上一帧彩色图像的特征点进行空间投射,以得到所述第一相机采集的上一帧彩色图像在所述第一相机坐标系下的三维特征点;Using a previous frame of depth image aligned with the previous frame of color image acquired by the first camera, spatially projecting feature points of the previous frame of color image acquired by the first camera to obtain three-dimensional feature points of the previous frame of color image acquired by the first camera in the first camera coordinate system;
根据所述第一相机采集上一帧彩色图像时的位姿,对所述第一相机坐标系下的三维特征点进行转换,以得到所述第一相机采集的上一帧彩色图像在世界坐标系下的三维特征点。According to the posture of the first camera when capturing the last frame of color image, the three-dimensional feature points in the first camera coordinate system are transformed to obtain the three-dimensional feature points of the last frame of color image captured by the first camera in the world coordinate system.
In one embodiment, spatially projecting the feature points of the previous frame color image captured by the first camera by using the previous frame depth image aligned with that color image, to obtain the three-dimensional feature points of the previous frame color image captured by the first camera in the first camera coordinate system, includes:
Spatially projecting, by using the previous frame depth image aligned with the previous frame color image captured by the first camera, those feature points of the previous frame color image captured by the first camera that lie within a predetermined depth range, to obtain the three-dimensional feature points of the previous frame color image captured by the first camera in the first camera coordinate system;
Wherein the predetermined depth range is determined based on the measuring range of the depth measurement.
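Restricting spatial projection to feature points within the predetermined depth range could look like the sketch below. The range limits used in the example (0.4 and 6.0) are placeholder values chosen for illustration, not values stated in the disclosure, and `filter_by_depth` is a hypothetical helper name.

```python
def filter_by_depth(features, depth_image, d_min, d_max):
    """Keep only feature points whose aligned depth lies in [d_min, d_max].

    features:    list of integer pixel coordinates (u, v)
    depth_image: 2D list indexed as depth_image[v][u], aligned with the color image
    Returns a list of (u, v, depth) tuples ready for spatial projection.
    """
    kept = []
    for (u, v) in features:
        d = depth_image[v][u]
        if d_min <= d <= d_max:
            kept.append((u, v, d))
    return kept


# Tiny 2x3 depth image; 0.0, 7.2 and 0.1 fall outside the assumed range.
depth_image = [[0.0, 1.5, 7.2],
               [2.0, 0.1, 3.3]]
features = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
kept = filter_by_depth(features, depth_image, 0.4, 6.0)
```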
In one embodiment, the pose determination method further includes:
Acquiring the previous frame color image captured by the second camera, and extracting feature points of the previous frame color image captured by the second camera;
Spatially projecting the feature points of the previous frame color image captured by the second camera by using the previous frame depth image aligned with that color image, to obtain three-dimensional feature points of the previous frame color image captured by the second camera in the second camera coordinate system;
Transforming, by using the transformation matrix between the first camera coordinate system and the second camera coordinate system, the three-dimensional feature points of the previous frame color image captured by the second camera from the second camera coordinate system into three-dimensional feature points in the first camera coordinate system;
Transforming the three-dimensional feature points in the first camera coordinate system according to the pose of the first camera when capturing the previous frame color image, to obtain three-dimensional feature points of the previous frame color image captured by the second camera in the world coordinate system.
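The chain of transformations described above (second camera coordinate system to first camera coordinate system, then to the world coordinate system) can be illustrated with 4x4 homogeneous matrices. The matrices `T_c1_c2` and `T_w_c1` below are invented example values, and the helper name is hypothetical.

```python
def apply_transform(T, p):
    """Apply a 4x4 homogeneous transform T to a 3D point p."""
    x, y, z = p
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))


# Assumed extrinsic calibration: second camera -> first camera
# (90 degree rotation about the y axis plus a 0.1 offset along x).
T_c1_c2 = [[0.0, 0.0, 1.0, 0.1],
           [0.0, 1.0, 0.0, 0.0],
           [-1.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]

# Assumed pose of the first camera at the previous frame: first camera -> world.
T_w_c1 = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 2.0],
          [0.0, 0.0, 0.0, 1.0]]

p_c2 = (0.0, 0.0, 1.0)                 # 3D feature point in the second camera frame
p_c1 = apply_transform(T_c1_c2, p_c2)  # same point in the first camera frame
p_w = apply_transform(T_w_c1, p_c1)    # same point in the world frame
```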
In one embodiment, spatially projecting the feature points of the previous frame color image captured by the second camera by using the previous frame depth image aligned with that color image, to obtain the three-dimensional feature points of the previous frame color image captured by the second camera in the second camera coordinate system, includes:
Spatially projecting, by using the previous frame depth image aligned with the previous frame color image captured by the second camera, those feature points of the previous frame color image captured by the second camera that lie within a predetermined depth range, to obtain the three-dimensional feature points of the previous frame color image captured by the second camera in the second camera coordinate system;
Wherein the predetermined depth range is determined based on the measuring range of the depth measurement.
In one embodiment, the pose determination method further includes:
Acquiring an initial frame color image captured by the first camera, and extracting feature points of the initial frame color image captured by the first camera;
Spatially projecting the feature points of the initial frame color image captured by the first camera by using an initial frame depth image aligned with that color image, to obtain three-dimensional feature points of the initial frame color image captured by the first camera in the first camera coordinate system;
Determining an initial positioning result of the first camera in the first camera coordinate system according to the three-dimensional feature points of the initial frame color image captured by the first camera in the first camera coordinate system, an initial rotation matrix, and an initial translation vector;
Transforming the initial positioning result of the first camera in the first camera coordinate system by using the transformation matrix between the first camera coordinate system and the world coordinate system, to determine the pose of the first camera when capturing the initial frame color image.
In one embodiment, the pose determination method further includes:
Acquiring a reference depth image output by the first camera;
When a designated plane is determined in combination with the reference depth image output by the first camera, determining the transformation matrix between the first camera coordinate system and the world coordinate system according to the normal vector of the designated plane and the gravity vector.
In one embodiment, the pose determination method further includes:
Determining a reference point cloud corresponding to the first camera in combination with the reference depth image output by the first camera;
Extracting plane information from the reference point cloud;
Selecting the designated plane according to the plane information of the reference point cloud.
In one embodiment, determining the reference point cloud corresponding to the first camera in combination with the reference depth image output by the first camera includes:
For each pixel on the reference depth image output by the first camera, determining the three-dimensional space point of the pixel according to the pixel, the depth value of the pixel, and the camera intrinsic parameters of the first camera;
Constructing the reference point cloud corresponding to the first camera from the three-dimensional space points of all pixels on the reference depth image output by the first camera.
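Building a point cloud from a depth image and camera intrinsics, as described above, can be sketched like this. The pinhole model and the choice to skip non-positive depth values (treated here as invalid measurements) are assumptions of the sketch rather than details given in the text.

```python
def depth_to_point_cloud(depth_image, fx, fy, cx, cy):
    """Convert every valid pixel of a depth image into a 3D point (camera frame).

    depth_image is indexed as depth_image[v][u]; fx, fy, cx, cy are the
    camera intrinsics of an assumed pinhole model.
    """
    cloud = []
    for v, row in enumerate(depth_image):
        for u, d in enumerate(row):
            if d <= 0:  # assumption: non-positive depth means "no measurement"
                continue
            cloud.append(((u - cx) / fx * d, (v - cy) / fy * d, d))
    return cloud


# Toy 2x2 depth image with unit focal length and principal point at the origin.
cloud = depth_to_point_cloud([[1.0, 0.0],
                              [2.0, 2.0]], 1.0, 1.0, 0.0, 0.0)
```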
In one embodiment, constructing the reference point cloud corresponding to the first camera from the three-dimensional space points of all pixels on the reference depth image output by the first camera includes:
Acquiring a reference depth image output by the second camera;
Determining the three-dimensional space point of each pixel on the reference depth image output by the second camera;
Transforming the three-dimensional space point of each pixel on the reference depth image output by the second camera according to the transformation matrix between the first camera coordinate system and the second camera coordinate system, to obtain transformed three-dimensional space points;
Merging the three-dimensional space points of the pixels on the reference depth image output by the first camera with the transformed three-dimensional space points, to construct the reference point cloud corresponding to the first camera.
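Merging the two per-camera point sets into one reference point cloud in the first camera coordinate system might be written as below. The extrinsic matrix `T_c1_c2` and the sample points are made-up illustration values.

```python
def transform_cloud(T, cloud):
    """Apply a 4x4 homogeneous transform T to every point of a cloud."""
    return [
        tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))
        for (x, y, z) in cloud
    ]


def merge_reference_cloud(cloud_c1, cloud_c2, T_c1_c2):
    """Express the second camera's points in the first camera frame, then merge."""
    return list(cloud_c1) + transform_cloud(T_c1_c2, cloud_c2)


# Assumed extrinsics: pure translation of 0.1 along x between the two cameras.
T_c1_c2 = [[1.0, 0.0, 0.0, 0.1],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]

merged = merge_reference_cloud([(0.0, 0.0, 1.0)], [(0.0, 0.0, 1.0)], T_c1_c2)
```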
In one embodiment, selecting the designated plane according to the plane information of the reference point cloud includes:
Selecting the designated plane according to the distance information, contained in the plane information of the reference point cloud, between each plane and the first camera.
In one embodiment, selecting the designated plane according to the distance information, contained in the plane information of the reference point cloud, between each plane and the first camera includes:
When the distance information contains a distance within a predetermined distance range, determining the candidate plane corresponding to each distance that falls within the predetermined distance range;
When there is one candidate plane, determining that candidate plane as the designated plane;
When there are multiple candidate planes, determining the candidate plane whose distance from the first camera is closest to a distance threshold as the designated plane;
Wherein the distance threshold lies within the predetermined distance range.
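The selection rule above can be expressed compactly. In this sketch each detected plane is represented as a hypothetical `(plane_id, distance)` tuple, and returning `None` when no plane falls inside the range is an assumption, since the text does not state what happens in that case.

```python
def select_designated_plane(planes, d_min, d_max, d_threshold):
    """Pick the designated plane from (plane_id, distance_to_first_camera) tuples.

    d_threshold is expected to lie within [d_min, d_max].
    """
    candidates = [p for p in planes if d_min <= p[1] <= d_max]
    if not candidates:
        return None                      # assumption: no suitable plane found
    if len(candidates) == 1:
        return candidates[0]             # single candidate is taken directly
    # Several candidates: take the one whose distance is closest to the threshold.
    return min(candidates, key=lambda p: abs(p[1] - d_threshold))


planes = [("wall", 3.0), ("floor", 0.45), ("table", 0.8)]
chosen = select_designated_plane(planes, 0.3, 1.0, 0.5)
```

Here `"wall"` is rejected by the range check, and `"floor"` wins over `"table"` because 0.45 is closer to the threshold 0.5 than 0.8 is.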
In one embodiment, the designated plane is the ground plane.
FIG. 1 is a schematic diagram of the system architecture of a pose determination system according to an embodiment of the present disclosure. Referring to FIG. 1, a terminal device 1 may include a processor 100, a first camera 110, and at least one second camera 120.
The terminal device 1 may be, for example, a robot, an intelligent monitoring device, or an intelligent tracking device. It may be a single integrated device or a device system composed of multiple physical units.
For example, the terminal device 1 may be a robot dog. A robot dog is a form of robot that is agile and highly mobile, and can perform tasks such as security patrol, item delivery, and emotional companionship.
The first camera 110 and the at least one second camera 120 serve as input sensors for the pose determination scheme of the embodiments of the present disclosure, and can transmit the sensed color images and depth images to the processor 100.
For example, the first camera 110 and the second camera 120 may be RealSense D455 cameras. A RealSense D455 camera consists of one RGB camera, two IR (infrared) cameras, and one IR emitter. The RGB camera outputs a color image, and the two IR cameras can output a dense depth map aligned with the color image. The FOV (field of view) of the RealSense D455 camera is 90° in the horizontal direction and 65° in the vertical direction.
When the terminal device 1 includes the first camera 110 and one second camera 120, the first camera 110 may be a left camera and the second camera 120 may be a right camera. In the following embodiments, the left camera can be understood as the first camera 110 and the right camera as the second camera 120. However, it should be understood that "left", "right", "first", and "second" are used only to distinguish the cameras. In other embodiments of the present disclosure, the first camera 110 may be the right camera and the second camera 120 may be the left camera, and the present disclosure does not limit this.
Taking two cameras in total, namely the first camera 110 and one second camera 120, as an example, FIG. 2 is a schematic diagram of how the two cameras are placed on the terminal device according to an embodiment of the present disclosure. It should be understood that the placement shown in FIG. 2 is merely illustrative; depending on the type of terminal device and the space available for mounting cameras, many other placements are possible, and the present disclosure does not limit this.
FIG. 3 is a schematic diagram of the placement angles of the two cameras according to an embodiment of the present disclosure. The first camera 110 and the second camera 120 are both placed vertically, and their viewing angles are both 65°, corresponding to angle A and angle B in FIG. 3 respectively. During placement, the leftmost line of sight of the first camera 110 can be made parallel to the rightmost line of sight of the second camera 120. In this arrangement the two cameras obtain the maximum field of view, namely 130°, corresponding to angle C in FIG. 3, and a narrow common-view region exists between the first camera 110 and the second camera 120. From these angles, the included angle at which the first camera 110 and the second camera 120 are placed can be determined to be 115°, corresponding to angle D in FIG. 3.
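The angle arithmetic can be checked with a few lines. The sketch rests on one reading of FIG. 3, stated here as an assumption: making the two edge rays parallel turns the optical axes apart by exactly one viewing angle, and angle D is taken to be the supplement of the angle between the optical axes. The variable names are illustrative.

```python
fov_per_camera = 65.0  # angle A = angle B, in degrees

# Assumption: parallel edge rays mean the optical axes differ by one full FOV.
axis_separation = fov_per_camera

# Total coverage spans half a FOV beyond each optical axis.
combined_fov = axis_separation + fov_per_camera   # angle C

# Assumption: angle D is measured between the camera bodies, i.e. the
# supplement of the angle between the two optical axes.
mounting_angle = 180.0 - axis_separation          # angle D
```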
Thus, the first camera 110 and the second camera 120 are placed vertically side by side at an included angle of 115°, and the combined field of view of the two cameras is 130° in the horizontal direction and 90° in the vertical direction. This maximizes the combined coverage of the two cameras' fields of view, effectively enlarges the field of view of the terminal device 1, and provides more ample and accurate input for the subsequent positioning algorithm.
In addition, the first camera 110 and the second camera 120 support multi-camera hardware synchronization. The first camera 110 and the second camera 120 can be connected by a wire, and the same pulse signal can be used to trigger both cameras to expose simultaneously, thereby achieving hardware synchronization of multiple cameras. Once hardware synchronization is configured, the images fed to the subsequent positioning algorithm are images captured at the same moment. This avoids the additional error that would be caused by inconsistent capture times across cameras.
After the first camera and the second camera are placed in the above manner, the intrinsic and extrinsic parameters of the two cameras can be calibrated for use by subsequent algorithms. The present disclosure does not limit the calibration process.
In the pose determination scheme of the embodiments of the present disclosure, the processor 100 can acquire the current frame color image captured by the first camera 110, and determine first two-dimensional feature points on the current frame color image captured by the first camera 110 that match the previous frame color image captured by the first camera 110.
The processor 100 can acquire the current frame color image captured by the second camera 120, and determine second two-dimensional feature points on the current frame color image captured by the second camera 120 that match the previous frame color image captured by the second camera 120. The processor 100 converts the second two-dimensional feature points into third two-dimensional feature points in the first camera coordinate system by using the transformation matrix between the first camera coordinate system of the first camera 110 and the second camera coordinate system of the second camera 120.
Next, the processor 100 can determine the pose of the first camera 110 when capturing the current frame color image according to the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points of the previous frame color image captured by the first camera 110 in the world coordinate system, and the three-dimensional feature points of the previous frame color image captured by the second camera 120 in the world coordinate system.
When the terminal device 1 is configured with multiple second cameras 120, the feature point data of each second camera 120 can be mapped into the first camera coordinate system of the first camera 110 for processing.
It can be understood that the first camera 110 and the second camera 120 are mounted at fixed positions on the terminal device 1, so once the current pose of the first camera 110 is determined, the current pose of the second camera 120 and the current pose of the terminal device 1 can be obtained.
In addition, when the terminal device 1 is configured with more than two cameras, any one of the cameras can be designated as the first camera 110 for the purposes of the algorithm, and the remaining cameras designated as second cameras 120.
Based on the pose determination scheme of the embodiments of the present disclosure, the feature points captured by the second camera 120 are converted into the first camera coordinate system and used together with the feature points captured by the first camera 110 for pose calculation. Because the feature points come from at least two cameras and are unified into one coordinate system, more feature points are available, that is, the set of feature points processed together is more comprehensive, so the determined pose is more accurate and positioning precision is improved. In addition, the pose determination process of the present disclosure takes inter-frame correlation into account: it incorporates feature information from the previous frame image and uses the previous frame's data as a constraint, which further improves positioning accuracy.
Implementing the pose determination of the embodiments of the present disclosure involves multiple processing stages. Referring to FIG. 4, the processing stages include, but are not limited to, a coordinate system alignment stage, a positioning initialization stage, and a real-time positioning stage.
In the coordinate system alignment stage, the terminal device determines the transformation matrix between the first camera coordinate system and the world coordinate system.
First, the terminal device can construct a point cloud using the depth image output by the first camera and the depth image output by the second camera. Specifically, the three-dimensional space points corresponding to the two depth images can be merged to obtain a point cloud of three-dimensional feature points.
Next, the terminal device uses a plane detection algorithm to extract plane information from the point cloud, and selects the designated plane (for example, the ground plane) according to the extracted plane information.
Then, the terminal device can calculate the transformation matrix according to the normal vector of the designated plane and the gravity vector, so as to align the first camera coordinate system with the world coordinate system.
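One common way to obtain a rotation that aligns a plane normal with a target direction is Rodrigues' rotation formula. The sketch below is offered under that assumption and is not necessarily the computation used in the disclosure; the example vectors (ground normal along the camera's negative y axis, world "up" along z as the direction opposite to gravity) are illustrative.

```python
import math


def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def rotation_aligning(a, b):
    """Return the 3x3 rotation R with R * a = b (a, b are normalized internally)."""
    a, b = normalize(a), normalize(b)
    v = cross(a, b)
    c = sum(x * y for x, y in zip(a, b))  # cosine of the angle between a and b
    if c < -1.0 + 1e-9:
        # Opposite vectors: rotate 180 degrees about any axis perpendicular to a.
        helper = [1.0, 0.0, 0.0] if abs(a[0]) < 0.9 else [0.0, 1.0, 0.0]
        u = normalize(cross(a, helper))
        return [[2 * u[i] * u[j] - (1.0 if i == j else 0.0) for j in range(3)]
                for i in range(3)]
    K = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    # Rodrigues form: R = I + K + K^2 / (1 + c)
    return [[(1.0 if i == j else 0.0) + K[i][j] + K2[i][j] / (1.0 + c)
             for j in range(3)] for i in range(3)]


normal_cam = [0.0, -1.0, 0.0]   # assumed ground normal in the camera frame
up_world = [0.0, 0.0, 1.0]      # assumed world axis opposite to gravity
R_wc = rotation_aligning(normal_cam, up_world)
aligned = [sum(R_wc[i][k] * normal_cam[k] for k in range(3)) for i in range(3)]
```

A single direction constrains only two of the three rotational degrees of freedom, so the rotation about the gravity axis remains a free choice in this sketch.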
In addition, it can be understood that the transformation matrix between the first camera coordinate system and the second camera coordinate system is known from the prior calibration of the intrinsic and extrinsic parameters. The transformation matrix between the second camera coordinate system and the world coordinate system can therefore also be obtained, achieving alignment among the first camera coordinate system, the second camera coordinate system, and the world coordinate system.
In the positioning initialization stage, the terminal device can determine the pose of the first camera when it captures the initial color image. It should be understood that, in the present disclosure, the pose of a camera when capturing an image refers to its pose in the world coordinate system.
On the one hand, the terminal device can determine the three-dimensional feature points corresponding to the initial frame color image captured by the first camera; these three-dimensional feature points are expressed in the first camera coordinate system.
On the other hand, an initial rotation matrix and an initial translation vector can be set. For example, the initial rotation matrix is the identity matrix, and the initial translation vector is [0, 0, 0].
Once the three-dimensional feature points corresponding to the initial frame color image, the initial rotation matrix, and the initial translation vector have been determined, positioning initialization in the first camera coordinate system is complete.
Next, using the transformation matrix between the first camera coordinate system and the world coordinate system determined in the coordinate system alignment stage, the positioning initialization result in the first camera coordinate system can be converted into a positioning initialization result in the world coordinate system, that is, the pose of the first camera when capturing the initial frame color image is determined.
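With an identity initial rotation and a zero initial translation, converting the initialization result into the world coordinate system amounts to a single matrix product. The alignment matrix `T_w_c` below is a made-up example standing in for the output of the coordinate system alignment stage.

```python
def mat_mul(A, B):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


# Initialization in the first camera coordinate system:
# identity rotation and translation [0, 0, 0].
T_c_init = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# Assumed camera-to-world matrix from the alignment stage (example values).
T_w_c = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, -1.0, 0.0, 0.4],
         [0.0, 0.0, 0.0, 1.0]]

# Pose of the first camera in the world frame at the initial frame.
T_w_init = mat_mul(T_w_c, T_c_init)
initial_rotation = [row[:3] for row in T_w_init[:3]]
initial_translation = [row[3] for row in T_w_init[:3]]
```

Because the camera-frame initialization is the identity, the initial world pose in this sketch coincides with the alignment matrix itself.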
In the real-time positioning stage, the terminal device can start from the initial pose determined in the positioning initialization stage and obtain the pose of the current frame in real time. In this process, the features of the second camera can be transferred into the first camera coordinate system and combined with the features of the first camera to solve for the pose, completing the pose prediction for the current frame.
The pose determination method of the embodiments of the present disclosure is described below by way of example.
FIG. 5 schematically shows a flowchart of a pose determination method according to an exemplary embodiment of the present disclosure. Referring to FIG. 5, the pose determination method may include the following steps:
S52. Acquire the current frame color image captured by the first camera, and determine first two-dimensional feature points on the current frame color image captured by the first camera that match the previous frame color image captured by the first camera.
In the exemplary embodiments of the present disclosure, the current frame color image is the color image captured by the camera at the current moment, and the previous frame color image is the color image captured by the camera at the preceding frame. The present disclosure places no limitation on the image size, the captured scene, and the like.
After acquiring the current frame color image captured by the first camera, the terminal device can extract the feature points of the current frame color image captured by the first camera.
The feature extraction algorithms used in the exemplary embodiments of the present disclosure may include, but are not limited to, the FAST, DoG, Harris, SIFT, and SURF feature point detection algorithms. The feature descriptors may include, but are not limited to, the BRIEF, BRISK, and FREAK feature point descriptors.
According to one embodiment of the present disclosure, the combination of feature extraction algorithm and feature descriptor may be the FAST feature point detection algorithm with the BRIEF feature point descriptor. According to other embodiments of the present disclosure, the combination may be the DoG feature point detection algorithm with the FREAK feature point descriptor.
It should be understood that different combinations can also be used for scenes with different texture. For example, for strongly textured scenes, the FAST feature point detection algorithm and the BRIEF feature point descriptor can be used for feature extraction; for weakly textured scenes, the DoG feature point detection algorithm and the FREAK feature point descriptor can be used.
Feature points were likewise extracted when the previous frame color image corresponding to the current frame color image was processed. The terminal device can therefore use the feature points of the current frame color image captured by the first camera and the feature points of the previous frame color image captured by the first camera to determine the two-dimensional feature points that match between the two images, that is, the first two-dimensional feature points referred to in the present disclosure.
Specifically, an optical flow method can be used to determine the matching relationship between feature points, that is, optical flow tracking is performed using the feature points of the current frame color image captured by the first camera and the feature points of the previous frame color image captured by the first camera to determine the first two-dimensional feature points. Other image matching methods can also be used to determine the 2D-2D feature point pairs, and the present disclosure does not limit this.
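As an illustration of one of the "other image matching methods", and explicitly not of optical flow tracking itself, the sketch below matches binary descriptors (such as BRIEF produces) between two frames by brute-force Hamming distance. Descriptors are modeled as plain Python integers, and `max_distance` is a hypothetical acceptance threshold.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(prev_desc, curr_desc, max_distance):
    """For each previous-frame descriptor, find the nearest current-frame one.

    Returns (prev_index, curr_index) pairs whose distance is within max_distance.
    curr_desc must be non-empty.
    """
    matches = []
    for i, d_prev in enumerate(prev_desc):
        j_best = min(range(len(curr_desc)),
                     key=lambda j: hamming(d_prev, curr_desc[j]))
        if hamming(d_prev, curr_desc[j_best]) <= max_distance:
            matches.append((i, j_best))
    return matches


# Toy 8-bit descriptors; real binary descriptors are typically much longer.
prev_desc = [0b10110010, 0b00001111]
curr_desc = [0b00001110, 0b10110011, 0b11110000]
matches = match_descriptors(prev_desc, curr_desc, max_distance=2)
```

Each resulting index pair identifies one 2D-2D feature point correspondence between the previous frame and the current frame.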
S54. Acquire the current frame color image captured by the second camera, and determine second two-dimensional feature points on the current frame color image captured by the second camera that match the previous frame color image captured by the second camera.
It should be understood that, although both steps refer to a current frame color image and a previous frame color image, the current frame and previous frame color images in step S52 are captured by the first camera, whereas those in step S54 are captured by the second camera.
After acquiring the current frame color image captured by the second camera, the terminal device can extract the feature points of the current frame color image captured by the second camera. The feature points can be extracted in the same way as in step S52, which is not repeated here.
The terminal device can perform optical flow tracking using the feature points of the current frame color image captured by the second camera and the feature points of the previous frame color image captured by the second camera to determine the second two-dimensional feature points.
S56. Convert the second two-dimensional feature points into third two-dimensional feature points in the first camera coordinate system by using the transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera.
In the exemplary embodiments of the present disclosure, for the purpose of distinction, the camera coordinate system of the first camera is referred to as the first camera coordinate system, and the camera coordinate system of the second camera is referred to as the second camera coordinate system.
When the first camera and the second camera are mounted at fixed positions on the terminal device, the intrinsic and extrinsic parameters of the first camera and the second camera are calibrated in advance, and the transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera can be determined from the calibration results.
The terminal device can obtain the transformation matrix between the first camera coordinate system and the second camera coordinate system as well as the depth information of the second two-dimensional feature points, and determine the third two-dimensional feature points according to the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points. The third two-dimensional feature points are the second two-dimensional feature points converted into the first camera coordinate system.
Specifically, the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points may be multiplied together, and the product normalized, to determine the third two-dimensional feature points. In this multiplication, the second two-dimensional feature points refer to the position coordinates of those feature points. The third two-dimensional feature points may be determined using Formula 1:
$p_j^{l} = \pi\left(T_{lr} \cdot d_j \cdot p_j^{r}\right)$ (Formula 1)
where $p_j^{l}$ is the third two-dimensional feature point, $\pi(\cdot)$ denotes the normalization, $T_{lr}$ is the transformation matrix between the first camera coordinate system and the second camera coordinate system, $d_j$ is the depth value of the second two-dimensional feature point, and $p_j^{r}$ is the second two-dimensional feature point.
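The transfer described by Formula 1 can be sketched as follows. The text only states that the transformation matrix, the depth, and the point coordinates are multiplied and then normalized; this sketch additionally assumes a pinhole model with intrinsics passed as hypothetical tuples `(fx, fy, cx, cy)` for each camera, and a 4x4 homogeneous `T_lr` that maps second-camera coordinates to first-camera coordinates. The function name `transfer_point` is illustrative only.

```python
def transfer_point(uv, depth, T_lr, K_r, K_l):
    """Map a pixel of the second camera into the image of the first camera.

    uv    : (u, v) pixel in the second camera image
    depth : depth of that pixel in the second camera frame
    T_lr  : 4x4 nested list, second camera frame -> first camera frame
    K_r   : (fx, fy, cx, cy) of the second camera (assumed pinhole model)
    K_l   : (fx, fy, cx, cy) of the first camera (assumed pinhole model)
    """
    fx_r, fy_r, cx_r, cy_r = K_r
    fx_l, fy_l, cx_l, cy_l = K_l
    u, v = uv
    # Scale the viewing ray by the depth to get a homogeneous 3D point
    # in the second camera frame.
    p_r = [(u - cx_r) / fx_r * depth, (v - cy_r) / fy_r * depth, depth, 1.0]
    # Apply the extrinsic transform between the two cameras.
    x, y, z = (sum(T_lr[i][k] * p_r[k] for k in range(4)) for i in range(3))
    if z <= 0:
        return None  # the point lies behind the first camera
    # Normalize by the new depth and map back to pixel coordinates.
    return (fx_l * x / z + cx_l, fy_l * y / z + cy_l)
```

With an identity `T_lr` and identical intrinsics the pixel maps onto itself, which is a convenient sanity check for the direction of the calibration matrix.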
S58. Determine the pose of the first camera at the time it captured the current-frame color image, based on the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points in the world coordinate system of the previous-frame color image captured by the first camera, and the three-dimensional feature points in the world coordinate system of the previous-frame color image captured by the second camera.
In an exemplary embodiment of the present disclosure, the first two-dimensional feature points and the third two-dimensional feature points together constitute the two-dimensional coordinate information, and the world-coordinate three-dimensional feature points of the previous-frame color images captured by the first camera and by the second camera together constitute the three-dimensional coordinate information.
The terminal device may associate the two-dimensional coordinate information with the three-dimensional coordinate information to obtain point-pair information, use the point-pair information to solve a Perspective-n-Point (PnP) problem, and determine from the solution the pose of the first camera when it captured the current-frame color image.
PnP is a method from the field of machine vision that determines the relative pose of a camera from n feature points in the scene. Specifically, the rotation matrix and translation vector of the camera can be determined from the n feature points.
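As a rough illustration of the point-pair construction, the sketch below concatenates the two sets of 3D-2D correspondences and evaluates the mean reprojection error that a PnP solver seeks to minimize. It is not a PnP solver itself. It assumes the lists are already index-aligned by feature matching, that `(R, t)` maps world coordinates to first-camera coordinates, and that intrinsics are a hypothetical tuple `(fx, fy, cx, cy)`; all function names are illustrative.

```python
import math

def project(point_w, R, t, K):
    """Project a world point into the image with pose (R, t): world -> camera."""
    fx, fy, cx, cy = K
    xc, yc, zc = (sum(R[i][k] * point_w[k] for k in range(3)) + t[i]
                  for i in range(3))
    return (fx * xc / zc + cx, fy * yc / zc + cy)

def build_point_pairs(pts2d_first, pts3d_first, pts2d_third, pts3d_second):
    """Pair each world-frame 3D point with its matched 2D point.

    The first-camera pairs and the second-camera pairs (whose 2D points have
    been transferred into the first camera) are concatenated into one list.
    """
    pts3d = list(pts3d_first) + list(pts3d_second)
    pts2d = list(pts2d_first) + list(pts2d_third)
    return list(zip(pts3d, pts2d))

def mean_reprojection_error(pairs, R, t, K):
    """Average pixel distance between projected 3D points and observed 2D points."""
    errors = [math.dist(project(P, R, t, K), uv) for P, uv in pairs]
    return sum(errors) / len(errors)
```

In practice the pairs would be handed to an existing solver (for example a RANSAC-based PnP routine), and the camera pose in the world frame is the inverse of the world-to-camera transform that such a solver returns.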
It should be noted that the three-dimensional feature points of the previous-frame color image in the world coordinate system may be determined either while processing the current frame or while processing the previous frame; the present disclosure does not limit this.
The process of determining the three-dimensional feature points, in the world coordinate system, of the previous-frame color image captured by the first camera is described below.
First, the terminal device may obtain the previous-frame color image captured by the first camera and extract its feature points. The feature extraction is the same as in step S52 and is not repeated here.
Next, the terminal device may use the previous-frame depth image aligned with the previous-frame color image captured by the first camera to spatially project the feature points of that color image, obtaining the three-dimensional feature points of the previous-frame color image in the first camera coordinate system. The previous-frame depth image may be output by the first camera or obtained from another depth camera on the terminal device; the present disclosure does not limit this.
In addition, to further improve positioning accuracy, the spatial projection may be constrained. Specifically, using the previous-frame depth image aligned with the previous-frame color image captured by the first camera, the terminal device may spatially project only those feature points of the color image that lie within a predetermined depth range, obtaining the three-dimensional feature points of the previous-frame color image in the first camera coordinate system.
The predetermined depth range is determined from the measuring range of the depth measurement. Its value may differ across depth camera types and models, and the present disclosure does not limit its specific value. For example, only feature points with a depth value greater than 0.5 m and less than 6 m are spatially projected.
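The depth-constrained spatial projection can be sketched as below. The sketch assumes a pinhole camera with intrinsics given as a hypothetical tuple `(fx, fy, cx, cy)`, a depth image stored as a nested list in metres and indexed as `depth_image[row][column]`, and the example range of 0.5 m to 6 m from the text; the function name is illustrative.

```python
def project_features(features, depth_image, K, d_min=0.5, d_max=6.0):
    """Lift 2D feature points to 3D camera coordinates, keeping only
    those whose depth lies strictly inside (d_min, d_max).

    Returns a list of ((u, v), (x, y, z)) so each 3D point stays
    associated with the feature it came from.
    """
    fx, fy, cx, cy = K
    kept = []
    for u, v in features:
        # Look up the depth at the nearest pixel of the aligned depth image.
        d = depth_image[int(round(v))][int(round(u))]
        if d_min < d < d_max:
            kept.append(((u, v), ((u - cx) / fx * d, (v - cy) / fy * d, d)))
    return kept
```

Returning the surviving 2D features alongside the 3D points keeps the indices aligned for the later 3D-2D matching step.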
Then, the terminal device may transform the three-dimensional feature points in the first camera coordinate system according to the pose of the first camera when it captured the previous-frame color image, obtaining the three-dimensional feature points of that image in the world coordinate system. See Formula 2:
$P_w^{l} = T_{w\_last} \cdot P_c^{l}$ (Formula 2)
where $P_w^{l}$ denotes the three-dimensional feature points, in the world coordinate system, of the previous-frame color image captured by the first camera, $P_c^{l}$ denotes the three-dimensional feature points of that image in the first camera coordinate system, and $T_{w\_last}$ is the pose of the first camera when it captured the previous-frame color image.
It should be noted that the pose of the first camera when it captured the previous-frame color image is determined while processing the previous frame; that is, during processing of the current frame, the pose corresponding to the previous frame is already known. The initial pose is described in the positioning initialization process of the present disclosure.
The process of determining the three-dimensional feature points, in the world coordinate system, of the previous-frame color image captured by the second camera is described below.
First, the terminal device may obtain the previous-frame color image captured by the second camera and extract its feature points. The feature extraction is the same as in step S52 and is not repeated here.
Next, the terminal device may use the previous-frame depth image aligned with the previous-frame color image captured by the second camera to spatially project the feature points of that color image, obtaining the three-dimensional feature points of the previous-frame color image in the second camera coordinate system. The previous-frame depth image may be output by the second camera or obtained from another depth camera on the terminal device; the present disclosure does not limit this.
Similarly, to further improve positioning accuracy, the spatial projection may be constrained. Specifically, using the previous-frame depth image aligned with the previous-frame color image captured by the second camera, the terminal device may spatially project only those feature points of the color image that lie within a predetermined depth range, obtaining the three-dimensional feature points of the previous-frame color image in the second camera coordinate system.
The predetermined depth range is determined from the measuring range of the depth measurement. Its value may differ across depth camera types and models, and the present disclosure does not limit its specific value. For example, only feature points with a depth value greater than 0.5 m and less than 6 m are spatially projected.
Subsequently, the terminal device may use the transformation matrix between the first camera coordinate system and the second camera coordinate system to transform the three-dimensional feature points of the previous-frame color image captured by the second camera from the second camera coordinate system into the first camera coordinate system.
Then, the terminal device may transform these converted three-dimensional feature points once more, according to the pose of the first camera when it captured the previous-frame color image, obtaining the three-dimensional feature points, in the world coordinate system, of the previous-frame color image captured by the second camera.
The above process is described below with reference to Formula 3:
$P_w^{r} = T_{w\_last} \cdot T_{lr} \cdot P_c^{r}$ (Formula 3)
where $P_w^{r}$ denotes the three-dimensional feature points, in the world coordinate system, of the previous-frame color image captured by the second camera, $P_c^{r}$ denotes the three-dimensional feature points of that image in the second camera coordinate system, $T_{w\_last}$ is the pose of the first camera when it captured the previous-frame color image, and $T_{lr}$ is the transformation matrix between the first camera coordinate system and the second camera coordinate system.
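The two coordinate conversions described by Formula 2 and Formula 3 can be sketched together as follows. The sketch assumes that the previous-frame pose and the inter-camera transform are 4x4 homogeneous matrices stored as nested lists, with `T_w_last` mapping first-camera coordinates to world coordinates and `T_lr` mapping second-camera coordinates to first-camera coordinates; the function names are illustrative.

```python
def matmul4(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply_transform(T, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    h = (point[0], point[1], point[2], 1.0)
    return tuple(sum(T[i][k] * h[k] for k in range(4)) for i in range(3))

def first_camera_to_world(T_w_last, points_c1):
    """Formula 2: first-camera points -> world points."""
    return [apply_transform(T_w_last, p) for p in points_c1]

def second_camera_to_world(T_w_last, T_lr, points_c2):
    """Formula 3: second-camera points -> first camera -> world."""
    T_w_r = matmul4(T_w_last, T_lr)  # compose once, reuse for every point
    return [apply_transform(T_w_r, p) for p in points_c2]
```

Composing the two matrices once before looping avoids transforming every second-camera point twice.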
Based on the point-pair matching relationships above, FIG. 6 is a schematic diagram of point-pair matching for the first camera and the second camera leading to the PnP pose solution. It involves the 2D-2D feature point matching of the current frame and the 3D-2D feature point matching.
The above process of determining the three-dimensional feature points of the previous-frame color image in the world coordinate system uses the pose of the first camera when it captured the previous-frame color image. The process of determining the initial pose of the first camera is described below.
According to some embodiments of the present disclosure, first, the terminal device may obtain the initial-frame color image captured by the first camera and extract its feature points. The feature extraction is the same as in step S52 and is not repeated here.
Next, the terminal device may use the initial-frame depth image aligned with the initial-frame color image captured by the first camera to spatially project the feature points of that color image, obtaining the three-dimensional feature points of the initial-frame color image in the first camera coordinate system.
Similarly, to further improve positioning accuracy, the spatial projection may be constrained. Specifically, using the initial-frame depth image aligned with the initial-frame color image captured by the first camera, the terminal device may spatially project only those feature points of the color image that lie within a predetermined depth range, obtaining the three-dimensional feature points of the initial-frame color image in the first camera coordinate system.
The predetermined depth range is determined from the measuring range of the depth measurement. Its value may differ across depth camera types and models, and the present disclosure does not limit its specific value. For example, only feature points with a depth value greater than 0.5 m and less than 6 m are spatially projected.
Subsequently, the terminal device may determine an initial positioning result of the first camera in the first camera coordinate system from the three-dimensional feature points of the initial-frame color image in the first camera coordinate system, an initial rotation matrix, and an initial translation vector.
In one embodiment of the present disclosure, the initial rotation matrix may be set to the identity matrix and the initial translation vector to [0, 0, 0].
It should be noted that, given the three-dimensional feature points of the initial-frame color image in the first camera coordinate system, the initial rotation matrix, and the initial translation vector, what is determined at this point is only the pose of the first camera in the first camera coordinate system. To obtain a pose that can be used in subsequent current-frame processing, this pose must be transformed into the pose of the first camera in the world coordinate system.
Specifically, the terminal device may use the transformation matrix between the first camera coordinate system and the world coordinate system to transform the initial positioning result of the first camera in the first camera coordinate system, thereby determining the pose of the first camera when it captured the initial-frame color image.
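A minimal sketch of this initialization step follows. It assumes poses are 4x4 homogeneous matrices, and, because the text only defines a rotation between the first camera coordinate system and the world coordinate system, it further assumes that this transform has zero translation (the world origin coincides with the initial camera position). The names `make_pose` and `initial_world_pose` are illustrative.

```python
def make_pose(R, t):
    """Assemble a 4x4 homogeneous pose from a 3x3 rotation and a translation."""
    return [list(R[i]) + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def initial_world_pose(R_wc):
    """Pose of the first camera in the world frame for the initial frame.

    The initial positioning result in the camera frame uses the identity
    rotation and a zero translation; it is then carried into the world
    frame by the camera-to-world rotation R_wc (zero translation assumed).
    """
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    T_init = make_pose(identity, [0.0, 0.0, 0.0])
    T_wc = make_pose(R_wc, [0.0, 0.0, 0.0])
    return [[sum(T_wc[i][k] * T_init[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]
```

Under these assumptions the initial world pose is simply the camera-to-world rotation embedded in a homogeneous matrix.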
According to other embodiments of the present disclosure, the initial pose of the first camera may also be determined using feature data from the second camera. This process is described below.
On the one hand, the terminal device may determine the three-dimensional feature points of the initial-frame color image captured by the first camera in the first camera coordinate system.
On the other hand, the terminal device may obtain the initial-frame color image captured by the second camera and extract its feature points. The feature extraction is the same as in step S52 and is not repeated here.
The terminal device may use the initial-frame depth image aligned with the initial-frame color image captured by the second camera to spatially project the feature points of that color image, obtaining the three-dimensional feature points of the initial-frame color image in the second camera coordinate system.
Similarly, the spatial projection may be constrained. Specifically, using the initial-frame depth image aligned with the initial-frame color image captured by the second camera, the terminal device may spatially project only those feature points of the color image that lie within a predetermined depth range, obtaining the three-dimensional feature points of the initial-frame color image in the second camera coordinate system.
The predetermined depth range is determined from the measuring range of the depth measurement. Its value may differ across depth camera types and models, and the present disclosure does not limit its specific value. For example, only feature points with a depth value greater than 0.5 m and less than 6 m are spatially projected.
Next, the terminal device may use the transformation matrix between the first camera coordinate system and the second camera coordinate system to transform the three-dimensional feature points of the initial-frame color image captured by the second camera from the second camera coordinate system into the first camera coordinate system.
The transformed three-dimensional feature points may be merged with the three-dimensional feature points of the initial-frame color image captured by the first camera in the first camera coordinate system to obtain merged three-dimensional feature points. It can be understood that the merged three-dimensional feature points are expressed in the first camera coordinate system.
Subsequently, the terminal device may determine the initial positioning result of the first camera in the first camera coordinate system from the merged three-dimensional feature points, the initial rotation matrix, and the initial translation vector. For example, the initial rotation matrix may be set to the identity matrix and the initial translation vector to [0, 0, 0].
Then, the terminal device may use the transformation matrix between the first camera coordinate system and the world coordinate system to transform the initial positioning result of the first camera in the first camera coordinate system, thereby determining the pose of the first camera when it captured the initial-frame color image.
The positioning initialization process of an embodiment of the present disclosure is described below with reference to FIG. 7.
In step S702, the terminal device may obtain the initial-frame color image captured by the first camera and extract its feature points.
In step S704, the terminal device may perform spatial projection using the depth image aligned with the initial-frame color image captured by the first camera, obtaining the three-dimensional feature points of the initial-frame color image in the first camera coordinate system. As explained in the above embodiments, the three-dimensional feature points determined in step S704 may also include three-dimensional feature points corresponding to the initial-frame color image captured by the second camera.
In step S706, the terminal device may determine the initial positioning result of the first camera in the first camera coordinate system from the three-dimensional feature points determined in step S704, the initial rotation matrix, and the initial translation vector.
In step S708, the terminal device may transform the initial positioning result using the transformation matrix between the first camera coordinate system and the world coordinate system to determine the pose of the first camera when it captured the initial-frame color image, completing the positioning initialization.
The above processing uses the transformation matrix between the first camera coordinate system and the world coordinate system. To determine this matrix in advance, embodiments of the present disclosure provide a coordinate system alignment scheme that relies on depth information. For distinction, the following embodiments use the term "reference depth image" when describing the coordinate system alignment process.
First, the terminal device may obtain a reference depth image output by the first camera.
Next, if it is determined from the reference depth image output by the first camera that a designated plane exists in the scene, the terminal device may determine the transformation matrix between the first camera coordinate system and the world coordinate system from the normal vector of the designated plane and the gravity vector.
The gravity vector may be $N_g = (0, 0, 1)$. In this case the designated plane is typically the ground plane, which matches scenarios in which the terminal device is, for example, a robot dog. It can be understood, however, that the designated plane may also be a plane designated manually for a particular scenario, such as a wall or a tabletop; the present disclosure does not limit this.
If the normal vector of the designated plane is denoted $n_c$, then rotating $n_c$ by $R_{wc}$ makes it coincide with $N_g$, which aligns the first camera coordinate system with the world coordinate system. Here $R_{wc}$ is the transformation matrix between the first camera coordinate system and the world coordinate system, and the rotation axis $\omega$ of $R_{wc}$ can be obtained as the cross product of $N_g$ and $n_c$, as shown in Formula 4:
$\omega = N_g \times n_c$ (Formula 4)
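The rotation angle $\theta$ of $R_{wc}$ can be obtained from the dot product of $N_g$ and $n_c$, as shown in Formula 5:
$\theta = \arccos\left(\dfrac{N_g \cdot n_c}{\lVert N_g \rVert \, \lVert n_c \rVert}\right)$ (Formula 5)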
The rotation axis $\omega$ and the rotation angle $\theta$ together form the rotation vector between the first camera coordinate system and the world coordinate system. Using the Rodrigues formula, the terminal device can compute the transformation matrix $R_{wc}$ between the first camera coordinate system and the world coordinate system. The coordinate system alignment thread then ends.
In the above processing, if no designated plane exists in the scene, the terminal device may return to the step of obtaining the reference depth image, obtain a new reference depth image, and again determine whether a designated plane exists.
The process of determining the designated plane is described below.
First, the terminal device may use the reference depth image output by the first camera to determine the point cloud corresponding to the first camera, referred to as the reference point cloud.
According to some embodiments of the present disclosure, for each pixel of the reference depth image output by the first camera, the terminal device determines the corresponding three-dimensional space point from the pixel, the depth value of the pixel, and the camera intrinsic parameters of the first camera. Formula 6 gives the way the three-dimensional space point is determined:
$P = z \cdot K^{-1} \cdot p$ (Formula 6)
where $P$ is the three-dimensional space point obtained by projection into space, $z$ is the depth value of the pixel, $K^{-1}$ is the inverse of the camera intrinsic matrix, and $p$ is the coordinate position of the pixel.
In these embodiments, the reference point cloud corresponding to the first camera may be constructed from the three-dimensional space points obtained in this way.
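Formula 6 applied to a whole depth image can be sketched as below. The sketch assumes a standard pinhole intrinsic matrix with parameters `(fx, fy, cx, cy)` and no skew, so its inverse can be written in closed form; it takes $p$ as the homogeneous pixel coordinate $(u, v, 1)$ and skips pixels with non-positive depth, which is an added assumption about how invalid measurements are encoded.

```python
def backproject_depth_image(depth_image, K):
    """Build a point cloud from a depth image via P = z * K^-1 * p.

    depth_image : nested list indexed as depth_image[v][u]
    K           : (fx, fy, cx, cy) pinhole intrinsics (assumed, zero skew)
    """
    fx, fy, cx, cy = K
    # Closed-form inverse of [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
    K_inv = [[1.0 / fx, 0.0, -cx / fx],
             [0.0, 1.0 / fy, -cy / fy],
             [0.0, 0.0, 1.0]]
    cloud = []
    for v, row in enumerate(depth_image):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid depth at this pixel
            p = (u, v, 1.0)  # homogeneous pixel coordinate
            cloud.append(tuple(z * sum(K_inv[i][k] * p[k] for k in range(3))
                               for i in range(3)))
    return cloud
```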
According to other embodiments of the present disclosure, on the one hand, for each pixel of the reference depth image output by the first camera, the terminal device determines the corresponding three-dimensional space point from the pixel, the depth value of the pixel, and the camera intrinsic parameters of the first camera.
On the other hand, the terminal device may obtain the reference depth image output by the second camera and use Formula 6 to determine the three-dimensional space point of each pixel of that image.
The terminal device may transform the three-dimensional space point of each pixel of the reference depth image output by the second camera using the transformation matrix between the first camera coordinate system and the second camera coordinate system, obtaining transformed three-dimensional space points.
The three-dimensional space points of the pixels of the reference depth image output by the first camera are then merged with the transformed three-dimensional space points to construct the reference point cloud corresponding to the first camera. See Formula 7:
$PC\_mixture = PC\_left + T_{lr} \cdot PC\_right$ (Formula 7)
where $PC\_mixture$ is the resulting reference point cloud, $PC\_right$ denotes the three-dimensional space points of the pixels of the reference depth image output by the second camera, $PC\_left$ denotes the three-dimensional space points of the pixels of the reference depth image output by the first camera, $T_{lr}$ is the transformation matrix between the first camera coordinate system and the second camera coordinate system, and $+$ denotes merging the two point sets.
In these embodiments, the construction of the reference point cloud incorporates information from the depth image output by the second camera, so the spatial feature points are more complete and the accuracy of the algorithm is improved.
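Formula 7 amounts to a transform followed by a concatenation, as in the short sketch below. It assumes `T_lr` is a 4x4 homogeneous matrix mapping second-camera coordinates to first-camera coordinates and that each cloud is a list of `(x, y, z)` tuples; the function name is illustrative.

```python
def merge_point_clouds(pc_left, pc_right, T_lr):
    """Return PC_mixture = PC_left + T_lr * PC_right in the first camera frame."""
    def to_left(p):
        h = (p[0], p[1], p[2], 1.0)  # homogeneous coordinates
        return tuple(sum(T_lr[i][k] * h[k] for k in range(4)) for i in range(3))
    # "+" in Formula 7 is a merge of the two point sets, i.e. concatenation.
    return list(pc_left) + [to_left(p) for p in pc_right]
```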
After determining the reference point cloud corresponding to the first camera, the terminal device may extract plane information from the reference point cloud. The present disclosure does not limit the plane extraction method: RANSAC fitting, normal-vector region growing, hierarchical clustering, and so on may be used, as long as the plane information in the scene can be extracted. Some embodiments of the present disclosure use PEAC, a plane extraction algorithm based on hierarchical clustering. FIG. 8 shows two planes extracted with this algorithm. FIG. 8 is only an example; the algorithm can extract all planes in the scene.
It can be understood that the extracted plane information includes, but is not limited to, the plane id, the normal vector of the plane, and the distance from the plane to the camera.
After planes have been extracted from the reference point cloud, the terminal device may select the designated plane according to the plane information of the reference point cloud. Specifically, the terminal device may select the designated plane according to the distance from each plane to the first camera contained in the plane information.
If the distance information contains a distance within a predetermined distance range, the terminal device may determine the candidate plane corresponding to that distance. One or more candidate planes may be determined in this way.
If there is one candidate plane, the terminal device may determine that candidate plane as the designated plane.
If there are multiple candidate planes, the terminal device may determine as the designated plane the candidate plane whose distance to the first camera is closest to a distance threshold. The distance threshold lies within the predetermined distance range.
FIG. 9 is a schematic diagram of the selected ground plane. Compared with the plane detection result, the distance-based selection above has removed planes such as the ceiling.
Take a robot dog as an example of the terminal device. The terminal device is equipped with a first camera and a second camera at fixed positions. When the scheme is carried out, the robot dog is controlled to move for a short time, and only on the ground plane. Under this prior condition, the position of the ground plane in the first camera coordinate system is essentially fixed. The height of the camera above the ground plane is comparable to the height of the robot dog, about 0.3 m. The predetermined distance range can therefore be set to 0.25 m to 0.35 m, and a plane within this range is taken as the ground plane. If multiple candidate planes are selected, the plane whose distance is closest to 0.3 m is taken as the ground plane.
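The distance-based selection can be sketched as follows, using the robot-dog values from the text (range 0.25 m to 0.35 m, threshold 0.3 m) as defaults. Representing each extracted plane as a dictionary with a `"distance"` key is an assumption made for illustration, as is the function name.

```python
def select_designated_plane(planes, d_min=0.25, d_max=0.35, d_target=0.30):
    """Pick the designated (ground) plane from extracted planes.

    planes : list of dicts, each with at least a "distance" entry giving the
             plane-to-camera distance in metres (hypothetical layout)
    Returns the chosen plane, or None if no candidate lies in the range.
    """
    # Candidate planes are those whose distance falls in the predetermined range.
    candidates = [p for p in planes if d_min <= p["distance"] <= d_max]
    if not candidates:
        return None  # caller should acquire a new reference depth image
    # With several candidates, keep the one closest to the distance threshold.
    return min(candidates, key=lambda p: abs(p["distance"] - d_target))
```

Returning `None` when no candidate qualifies lets the caller repeat the acquisition and detection loop until a ground plane is found.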
It should be understood that if no ground plane is detected during this process, the terminal device is controlled to repeat the above process of determining planes from the depth image and screening them, until the terminal device detects the ground plane.
The coordinate system alignment process of the embodiments of the present disclosure is described below with reference to FIG. 10.
In step S1002, the terminal device obtains the reference depth image output by the first camera and back-projects it to obtain three-dimensional space points.
In step S1004, the terminal device obtains the reference depth image output by the second camera and back-projects it to obtain three-dimensional space points.
In step S1006, the terminal device converts the three-dimensional space points obtained in step S1004 into three-dimensional space points in the first camera coordinate system.
In step S1008, the terminal device merges the three-dimensional space points obtained in step S1002 with those obtained in step S1006 to obtain the reference point cloud corresponding to the first camera.
In step S1010, the terminal device may extract plane information from the reference point cloud.
In step S1012, the terminal device may screen the extracted planes to determine the ground plane.
In step S1014, the terminal device may use the normal vector of the ground plane and the gravity vector to determine the transformation matrix between the first camera coordinate system and the world coordinate system, thereby aligning the first camera coordinate system with the world coordinate system.
In addition, since the relationship between the first camera coordinate system and the second camera coordinate system has already been determined through calibration, the transformation matrix between the second camera coordinate system and the world coordinate system can also be obtained, so that the first camera coordinate system, the second camera coordinate system, and the world coordinate system are all aligned. The coordinate system alignment result can then be applied in the pose determination process of the present disclosure described above.
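The alignment in step S1014 and its extension to the second camera can be sketched as follows. This is a sketch under stated assumptions, not the method of the disclosure: it covers only the rotational part of the transformation, it uses Rodrigues' formula to rotate the ground normal onto a world axis, and it assumes that the gravity vector defines the world z axis with the ground normal oriented along +z. All function names and numeric values are hypothetical.

```python
import math


def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(R, v):
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def rotation_aligning(src, dst):
    """Rotation matrix R such that R @ src is parallel to dst."""
    a, b = normalize(src), normalize(dst)
    c = sum(x * y for x, y in zip(a, b))  # cosine of the angle between a and b
    if c < -1.0 + 1e-9:
        # Antiparallel vectors: rotate 180 degrees about any axis
        # perpendicular to a.
        helper = [1.0, 0.0, 0.0] if abs(a[0]) < 0.9 else [0.0, 1.0, 0.0]
        u = normalize(cross(a, helper))
        return [[2.0 * u[i] * u[j] - (1.0 if i == j else 0.0)
                 for j in range(3)] for i in range(3)]
    v = cross(a, b)
    K = [[0.0, -v[2], v[1]],
         [v[2], 0.0, -v[0]],
         [-v[1], v[0], 0.0]]  # skew-symmetric matrix of v
    K2 = matmul(K, K)
    # Rodrigues' formula: R = I + K + K^2 / (1 + c)
    return [[(1.0 if i == j else 0.0) + K[i][j] + K2[i][j] / (1.0 + c)
             for j in range(3)] for i in range(3)]


# Hypothetical values: ground normal measured in the first camera frame,
# and the world axis assumed to be defined by the gravity vector.
ground_normal_cam1 = [0.0, -1.0, 0.0]
world_up = [0.0, 0.0, 1.0]
R_world_cam1 = rotation_aligning(ground_normal_cam1, world_up)

# Calibrated rotation from the second camera frame to the first camera frame
# (hypothetical value). Chaining it yields the second camera's alignment.
R_cam1_cam2 = [[0.0, 0.0, 1.0],
               [0.0, 1.0, 0.0],
               [-1.0, 0.0, 0.0]]
R_world_cam2 = matmul(R_world_cam1, R_cam1_cam2)
```

A single direction constrains only two of the three rotational degrees of freedom, so the rotation about the gravity axis remains a free choice in this sketch.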
It should be noted that although the steps of the method in the present disclosure are depicted in the drawings in a specific order, this does not require or imply that the steps must be performed in that order, or that all of the illustrated steps must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Further, this exemplary embodiment also provides a pose determination apparatus. The pose determination apparatus is provided in a terminal device, and the terminal device is also equipped with a first camera and at least one second camera.
FIG. 11 schematically shows a block diagram of a pose determination apparatus according to an exemplary embodiment of the present disclosure. Referring to FIG. 11, the pose determination apparatus 11 according to an exemplary embodiment of the present disclosure may include a first feature point determination module 111, a second feature point determination module 113, a feature point conversion module 115, and a pose determination module 117.
Specifically, the first feature point determination module 111 may be used to obtain the current frame color image captured by the first camera and determine the first two-dimensional feature points on that image that match the previous frame color image captured by the first camera. The second feature point determination module 113 may be used to obtain the current frame color image captured by the second camera and determine the second two-dimensional feature points on that image that match the previous frame color image captured by the second camera. The feature point conversion module 115 may be used to convert the second two-dimensional feature points into third two-dimensional feature points in the first camera coordinate system, using the transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera. The pose determination module 117 may be used to determine the pose of the first camera when it captures the current frame color image, according to the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points in the world coordinate system of the previous frame color image captured by the first camera, and the three-dimensional feature points in the world coordinate system of the previous frame color image captured by the second camera.
According to an exemplary embodiment of the present disclosure, the first feature point determination module 111 may be configured to: extract feature points of the current frame color image captured by the first camera; and perform optical flow tracking using the feature points of the current frame color image and the feature points of the previous frame color image captured by the first camera, to determine the first two-dimensional feature points.
According to an exemplary embodiment of the present disclosure, the feature point conversion module 115 may be configured to: obtain the transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera, and the depth information of the second two-dimensional feature points; and determine the third two-dimensional feature points according to the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points.
According to an exemplary embodiment of the present disclosure, the feature point conversion module 115 may be configured to: multiply the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points, and normalize the product to determine the third two-dimensional feature points.
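The multiply-and-normalize conversion performed by the feature point conversion module can be sketched as follows. The sketch assumes that the second two-dimensional feature point is given in normalized image coordinates (camera intrinsics already removed) and that the transformation is supplied as a rotation `R_c1c2` and a translation `t_c1c2` mapping the second camera frame to the first. These names and conventions are assumptions for illustration, not details confirmed by the disclosure.

```python
def convert_feature_point(p2, depth, R_c1c2, t_c1c2):
    """Map a 2D feature point of the second camera into the first camera frame.

    p2      -- (x, y) in normalized image coordinates of the second camera
    depth   -- depth of the feature point along the second camera's z axis
    R_c1c2  -- 3x3 rotation, second camera frame -> first camera frame
    t_c1c2  -- translation, second camera frame -> first camera frame
    """
    x, y = p2
    # Scale the homogeneous point (x, y, 1) by its depth to get a 3D point.
    P_c2 = [depth * x, depth * y, depth]
    # Apply the extrinsic transformation between the two cameras.
    P_c1 = [sum(R_c1c2[i][j] * P_c2[j] for j in range(3)) + t_c1c2[i]
            for i in range(3)]
    if P_c1[2] <= 0.0:
        raise ValueError("point lies behind the first camera")
    # Normalize by the z component to return to 2D coordinates.
    return (P_c1[0] / P_c1[2], P_c1[1] / P_c1[2])


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
p3 = convert_feature_point((0.2, -0.1), 2.0, identity, [0.1, 0.0, 0.0])
```

The resulting point is not required to fall inside the first camera's physical field of view; it only needs to be expressed in that camera's coordinate system so that it can be used together with the first two-dimensional feature points.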
According to an exemplary embodiment of the present disclosure, the first two-dimensional feature points and the third two-dimensional feature points constitute two-dimensional coordinate information, and the three-dimensional feature points in the world coordinate system of the previous frame color images captured by the first camera and by the second camera constitute three-dimensional coordinate information. In this case, the pose determination module 117 may be configured to: associate the two-dimensional coordinate information with the three-dimensional coordinate information to obtain point pair information; and solve the perspective-n-point (PnP) problem using the point pair information, and determine, from the solution, the pose of the first camera when it captures the current frame color image.
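The association of two-dimensional and three-dimensional coordinates into point pairs can be sketched as follows, together with the reprojection error that a PnP solver typically minimizes. This sketch does not solve the PnP problem itself; in practice a solver from a vision library would be used. It assumes that the 2D points are in normalized image coordinates, that the 2D and 3D lists are already in matching order, and that `R_cw` and `t_cw` map world coordinates to the first camera frame. All names are hypothetical.

```python
import math


def build_point_pairs(first_2d, third_2d, first_3d_world, second_3d_world):
    """Associate 2D coordinate information with 3D coordinate information."""
    pts_2d = list(first_2d) + list(third_2d)
    pts_3d = list(first_3d_world) + list(second_3d_world)
    if len(pts_2d) != len(pts_3d):
        raise ValueError("2D and 3D point lists must have the same length")
    return list(zip(pts_2d, pts_3d))


def mean_reprojection_error(pairs, R_cw, t_cw):
    """Mean distance between observed 2D points and projected 3D points."""
    total = 0.0
    for (u, v), P_w in pairs:
        # Transform the world point into the first camera frame.
        P_c = [sum(R_cw[i][j] * P_w[j] for j in range(3)) + t_cw[i]
               for i in range(3)]
        # Project onto the normalized image plane and compare.
        total += math.hypot(P_c[0] / P_c[2] - u, P_c[1] / P_c[2] - v)
    return total / len(pairs)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pairs = build_point_pairs(
    first_2d=[(0.25, -0.1)],
    third_2d=[(0.25, 0.25)],
    first_3d_world=[(0.5, -0.2, 2.0)],
    second_3d_world=[(1.0, 1.0, 4.0)],
)
err_true_pose = mean_reprojection_error(pairs, identity, [0.0, 0.0, 0.0])
err_wrong_pose = mean_reprojection_error(pairs, identity, [0.1, 0.0, 0.0])
```

A candidate pose with a lower mean reprojection error explains the point pairs better, which is the criterion by which a PnP solution can be checked.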
According to an exemplary embodiment of the present disclosure, referring to FIG. 12, compared with the pose determination apparatus 11, the pose determination apparatus 12 may further include a third feature point determination module 121.
Specifically, the third feature point determination module 121 may be configured to: obtain the previous frame color image captured by the first camera and extract its feature points; spatially project those feature points using the previous frame depth image aligned with the previous frame color image captured by the first camera, to obtain the three-dimensional feature points of the previous frame color image in the first camera coordinate system; and transform the three-dimensional feature points in the first camera coordinate system according to the pose of the first camera when it captured the previous frame color image, to obtain the three-dimensional feature points of the previous frame color image captured by the first camera in the world coordinate system.
According to an exemplary embodiment of the present disclosure, the third feature point determination module 121 may be configured to: using the previous frame depth image aligned with the previous frame color image captured by the first camera, spatially project those feature points of the previous frame color image that lie within a predetermined depth range, to obtain the three-dimensional feature points of the previous frame color image captured by the first camera in the first camera coordinate system, where the predetermined depth range is determined based on the measuring range of the depth measurement.
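The spatial projection of feature points within a predetermined depth range, followed by the transformation to the world coordinate system, can be sketched as follows. The sketch assumes a pinhole camera model with intrinsics `(fx, fy, cx, cy)`, integer pixel coordinates, and a pose `(R_wc, t_wc)` mapping the first camera frame to the world frame. The default depth range is a placeholder, not a value taken from the disclosure.

```python
def lift_features_to_world(features_px, depth_image, intrinsics,
                           R_wc, t_wc, depth_range=(0.4, 6.0)):
    """Return (pixel, world point) pairs for features with a usable depth."""
    fx, fy, cx, cy = intrinsics
    d_min, d_max = depth_range
    lifted = []
    for (u, v) in features_px:
        d = depth_image[v][u]  # depth aligned with the color image, in meters
        if not (d_min <= d <= d_max):
            continue  # outside the predetermined depth range
        # Spatial projection into the camera coordinate system.
        P_c = [(u - cx) / fx * d, (v - cy) / fy * d, d]
        # Transform into the world coordinate system with the camera pose.
        P_w = [sum(R_wc[i][j] * P_c[j] for j in range(3)) + t_wc[i]
               for i in range(3)]
        lifted.append(((u, v), P_w))
    return lifted


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
depth = [[0.0, 1.0, 2.0],
         [10.0, 2.0, 1.0],
         [1.0, 1.0, 1.0]]
features = [(0, 0), (1, 1), (0, 1), (2, 0)]  # (u, v) pixel coordinates
world_points = lift_features_to_world(
    features, depth, (100.0, 100.0, 1.0, 1.0), identity, [0.0, 0.0, 0.3])
```

Discarding features whose depth falls outside the sensor's reliable range keeps unreliable three-dimensional points out of the later pose computation.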
According to an exemplary embodiment of the present disclosure, the third feature point determination module 121 may also be configured to: obtain the previous frame color image captured by the second camera and extract its feature points; spatially project those feature points using the previous frame depth image aligned with the previous frame color image captured by the second camera, to obtain the three-dimensional feature points of the previous frame color image in the second camera coordinate system; convert these three-dimensional feature points from the second camera coordinate system into the first camera coordinate system using the transformation matrix between the first camera coordinate system and the second camera coordinate system; and transform the three-dimensional feature points in the first camera coordinate system according to the pose of the first camera when it captured the previous frame color image, to obtain the three-dimensional feature points of the previous frame color image captured by the second camera in the world coordinate system.
According to an exemplary embodiment of the present disclosure, the third feature point determination module 121 may also be configured to: using the previous frame depth image aligned with the previous frame color image captured by the second camera, spatially project those feature points of the previous frame color image that lie within a predetermined depth range, to obtain the three-dimensional feature points of the previous frame color image captured by the second camera in the second camera coordinate system, where the predetermined depth range is determined based on the measuring range of the depth measurement.
According to an exemplary embodiment of the present disclosure, referring to FIG. 13, compared with the pose determination apparatus 11, the pose determination apparatus 13 may further include a positioning initialization module 131.
Specifically, the positioning initialization module 131 may be configured to: obtain the initial frame color image captured by the first camera and extract its feature points; spatially project those feature points using the initial frame depth image aligned with the initial frame color image captured by the first camera, to obtain the three-dimensional feature points of the initial frame color image in the first camera coordinate system; determine the initial positioning result of the first camera in the first camera coordinate system according to these three-dimensional feature points, an initial rotation matrix, and an initial translation vector; and transform the initial positioning result using the transformation matrix between the first camera coordinate system and the world coordinate system, to determine the pose of the first camera when it captured the initial frame color image.
According to an exemplary embodiment of the present disclosure, referring to FIG. 14, compared with the pose determination apparatus 13, the pose determination apparatus 14 may further include a transformation matrix determination module 141.
Specifically, the transformation matrix determination module 141 may be configured to: obtain the reference depth image output by the first camera; and, when a designated plane is determined from the reference depth image output by the first camera, determine the transformation matrix between the first camera coordinate system and the world coordinate system according to the normal vector of the designated plane and the gravity vector.
According to an exemplary embodiment of the present disclosure, the transformation matrix determination module 141 may be configured to: determine the reference point cloud corresponding to the first camera from the reference depth image output by the first camera; extract plane information from the reference point cloud; and screen for the designated plane according to the plane information of the reference point cloud.
According to an exemplary embodiment of the present disclosure, the process by which the transformation matrix determination module 141 determines the reference point cloud may be configured to: for each pixel of the reference depth image output by the first camera, determine the three-dimensional space point of the pixel according to the pixel, its depth value, and the camera intrinsic parameters of the first camera; and construct the reference point cloud corresponding to the first camera from the three-dimensional space points of all pixels of the reference depth image output by the first camera.
According to an exemplary embodiment of the present disclosure, the process by which the transformation matrix determination module 141 determines the reference point cloud may also be configured to: obtain the reference depth image output by the second camera; determine the three-dimensional space point of each pixel of the reference depth image output by the second camera; convert these three-dimensional space points according to the transformation matrix between the first camera coordinate system and the second camera coordinate system, to obtain converted three-dimensional space points; and merge the three-dimensional space points of the pixels of the reference depth image output by the first camera with the converted three-dimensional space points, to construct the reference point cloud corresponding to the first camera.
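The construction of the reference point cloud from the two reference depth images can be sketched as follows. The sketch assumes a pinhole model with intrinsics `(fx, fy, cx, cy)` and a rotation and translation mapping the second camera frame to the first. Skipping pixels with non-positive depth is an added assumption for handling invalid measurements and is not stated in the disclosure. All names are hypothetical.

```python
def backproject(depth_image, intrinsics):
    """Back-project every valid pixel of a depth image to a 3D point."""
    fx, fy, cx, cy = intrinsics
    points = []
    for v, row in enumerate(depth_image):
        for u, d in enumerate(row):
            if d <= 0.0:
                continue  # assumption: non-positive depth marks an invalid pixel
            points.append(((u - cx) / fx * d, (v - cy) / fy * d, d))
    return points


def transform(points, R, t):
    """Apply a rigid transformation to a list of 3D points."""
    return [tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i]
                  for i in range(3))
            for p in points]


def build_reference_cloud(depth1, intr1, depth2, intr2, R_c1c2, t_c1c2):
    """Merge both cameras' points in the first camera coordinate system."""
    cloud1 = backproject(depth1, intr1)
    cloud2_in_c1 = transform(backproject(depth2, intr2), R_c1c2, t_c1c2)
    return cloud1 + cloud2_in_c1


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cloud = build_reference_cloud(
    depth1=[[1.0, 0.0]], intr1=(1.0, 1.0, 0.0, 0.0),
    depth2=[[2.0]], intr2=(1.0, 1.0, 0.0, 0.0),
    R_c1c2=identity, t_c1c2=[0.5, 0.0, 0.0],
)
```

Merging the two clouds in a single coordinate system widens the observed area, which makes it more likely that the ground plane appears in the plane extraction step.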
According to an exemplary embodiment of the present disclosure, the process by which the transformation matrix determination module 141 screens for the designated plane may be configured to: screen for the designated plane according to the distance information, contained in the plane information of the reference point cloud, of each plane from the first camera.
According to an exemplary embodiment of the present disclosure, the process by which the transformation matrix determination module 141 screens for the designated plane may be configured to: when the distance information includes a distance within the predetermined distance range, determine the candidate plane corresponding to that distance; when there is one candidate plane, determine that candidate plane as the designated plane; and when there are multiple candidate planes, determine the candidate plane whose distance from the first camera is closest to a distance threshold as the designated plane, where the distance threshold lies within the predetermined distance range.
According to an exemplary embodiment of the present disclosure, the designated plane is the ground plane.
Since the functional modules of the pose determination apparatus of the embodiments of the present disclosure are the same as those in the method embodiments described above, they are not described again here.
FIG. 15 shows a schematic diagram of an electronic device suitable for implementing exemplary embodiments of the present disclosure. The terminal device of the exemplary embodiments of the present disclosure may be configured in the form shown in FIG. 15. It should be noted that the electronic device shown in FIG. 15 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
The electronic device of the present disclosure includes at least a processor and a memory. The memory stores one or more programs which, when executed by the processor, enable the processor to implement the pose determination method of the exemplary embodiments of the present disclosure.
Specifically, as shown in FIG. 15, the electronic device 150 includes at least: a processor 1510, an internal memory 1521, an external memory interface 1522, a Universal Serial Bus (USB) interface 1530, a charging management module 1540, a power management module 1541, a battery 1542, an antenna, a wireless communication module 1550, an audio module 1560, a display screen 1570, a sensor module 1580, a camera module 1590, and so on. The sensor module 1580 may include a depth sensor, a pressure sensor, a gyroscope sensor, an air pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, and so on.
It can be understood that the structure illustrated in the embodiments of the present disclosure does not constitute a specific limitation on the electronic device 150. In other embodiments of the present disclosure, the electronic device 150 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 1510 may include one or more processing units. For example, the processor 1510 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors. In addition, a memory may be provided in the processor 1510 for storing instructions and data.
The electronic device 150 can implement the shooting function through the ISP, the camera module 1590, the video codec, the GPU, the display screen 1570, the application processor, and so on. In some embodiments, the electronic device 150 may include at least two camera modules 1590. When the solution of the present disclosure is implemented, one camera module is designated as the reference camera, and the feature data collected by the other camera modules is transferred into the coordinate system of the reference camera for processing. For example, the electronic device 150 is equipped with two RealSense D455 cameras.
The internal memory 1521 may be used to store computer-executable program code, which includes instructions. The internal memory 1521 may include a program storage area and a data storage area. The external memory interface 1522 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 150.
The present disclosure also provides a computer-readable storage medium. The computer-readable storage medium may be included in the electronic device described in the above embodiments, or it may exist on its own without being assembled into the electronic device.
The computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in combination with an instruction execution system, apparatus, or device.
The computer-readable storage medium can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
The computer-readable storage medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the method described in the embodiments of the present disclosure.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and each combination of blocks in a block diagram or flowchart, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware, and the described units may also be provided in a processor. In some cases, the names of these units do not constitute a limitation on the units themselves.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server, a terminal apparatus, or a network device) to execute the method according to the embodiments of the present disclosure.
In addition, the above drawings are only schematic illustrations of the processing included in the method according to the exemplary embodiments of the present disclosure and are not intended to be limiting. It is readily understood that the processing shown in the drawings does not indicate or limit the chronological order of that processing. It is also readily understood that the processing may be performed synchronously or asynchronously, for example in multiple modules.
It should be noted that although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to the embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
Other embodiments of the present disclosure will readily occur to those skilled in the art after considering the specification and practicing what is disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of the present disclosure are indicated by the claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (20)

  1. A pose determination method, applied to a terminal device configured with a first camera and at least one second camera, the pose determination method comprising:
    acquiring a current frame color image captured by the first camera, and determining first two-dimensional feature points on the current frame color image captured by the first camera that match a previous frame color image captured by the first camera;
    acquiring a current frame color image captured by the second camera, and determining second two-dimensional feature points on the current frame color image captured by the second camera that match a previous frame color image captured by the second camera;
    converting the second two-dimensional feature points into third two-dimensional feature points in a first camera coordinate system of the first camera by using a transformation matrix between the first camera coordinate system and a second camera coordinate system of the second camera;
    determining a pose of the first camera when capturing the current frame color image according to the first two-dimensional feature points, the third two-dimensional feature points, three-dimensional feature points, in a world coordinate system, of the previous frame color image captured by the first camera, and three-dimensional feature points, in the world coordinate system, of the previous frame color image captured by the second camera.
  2. The pose determination method according to claim 1, wherein determining the first two-dimensional feature points on the current frame color image captured by the first camera that match the previous frame color image captured by the first camera comprises:
    extracting feature points of the current frame color image captured by the first camera;
    performing optical flow tracking using the feature points of the current frame color image captured by the first camera and feature points of the previous frame color image captured by the first camera, to determine the first two-dimensional feature points.
  3. The pose determination method according to claim 1, wherein converting the second two-dimensional feature points into the third two-dimensional feature points in the first camera coordinate system by using the transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera comprises:
    acquiring the transformation matrix between the first camera coordinate system of the first camera and the second camera coordinate system of the second camera, and depth information of the second two-dimensional feature points;
    determining the third two-dimensional feature points according to the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points.
  4. The pose determination method according to claim 3, wherein determining the third two-dimensional feature points according to the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points comprises:
    multiplying the transformation matrix, the depth information of the second two-dimensional feature points, and the second two-dimensional feature points, and normalizing the result of the multiplication to determine the third two-dimensional feature points.
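As a non-limiting illustration of the multiply-and-normalize step of claim 4, the following Python sketch shows one possible reading. It assumes that the second two-dimensional feature point is given in normalized image coordinates, that the transformation matrix is a 4x4 homogeneous matrix mapping the second camera coordinate system to the first, and that normalization means division by the resulting depth. The names `transfer_point` and `T_c1_c2` are hypothetical and do not appear in the disclosure.

```python
def transfer_point(T_c1_c2, point_c2, depth):
    """Map a normalized 2D point of camera 2 onto camera 1's normalized plane.

    T_c1_c2 is an assumed 4x4 homogeneous transform (camera 2 -> camera 1),
    given as nested lists; point_c2 is (x, y); depth is the point's depth
    measured in camera 2.
    """
    x, y = point_c2
    # Scale the homogeneous ray (x, y, 1) by depth: a 3D point in camera 2.
    p2 = (x * depth, y * depth, depth, 1.0)
    # Apply the first three rows of the transform to get the point in camera 1.
    p1 = [sum(T_c1_c2[r][c] * p2[c] for c in range(4)) for r in range(3)]
    if p1[2] <= 0:
        raise ValueError("point falls behind camera 1")
    # Normalize by the new depth to return to 2D coordinates.
    return (p1[0] / p1[2], p1[1] / p1[2])
```

If the feature points are stored as pixel coordinates instead, the camera intrinsics would have to be removed before and reapplied after this step.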
  5. The pose determination method according to claim 1, wherein the first two-dimensional feature points and the third two-dimensional feature points constitute two-dimensional coordinate information, and the three-dimensional feature points, in the world coordinate system, of the previous frame color image captured by the first camera and the three-dimensional feature points, in the world coordinate system, of the previous frame color image captured by the second camera constitute three-dimensional coordinate information; and wherein determining the pose of the first camera when capturing the current frame color image according to the first two-dimensional feature points, the third two-dimensional feature points, the three-dimensional feature points, in the world coordinate system, of the previous frame color image captured by the first camera, and the three-dimensional feature points, in the world coordinate system, of the previous frame color image captured by the second camera comprises:
    associating the two-dimensional coordinate information with the three-dimensional coordinate information to obtain point pair information;
    solving a perspective-n-point problem using the point pair information, and determining, based on the solution, the pose of the first camera when capturing the current frame color image.
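As a non-limiting illustration of claim 5, the following Python sketch assembles the point pair information and evaluates the reprojection error that a perspective-n-point solver would minimize. It does not implement a solver; a library routine such as OpenCV's `cv2.solvePnP` could take that role. The function names, the use of normalized image coordinates, and the world-to-camera pose convention `(R, t)` are assumptions made for the example.

```python
def build_point_pairs(pts2d_cam1, pts3d_cam1, pts2d_from_cam2, pts3d_cam2):
    """Pair each world-frame 3D point with its matched 2D observation.

    The 2D lists hold the first and third two-dimensional feature points;
    the 3D lists hold the corresponding previous-frame world points.
    """
    if len(pts2d_cam1) != len(pts3d_cam1) or len(pts2d_from_cam2) != len(pts3d_cam2):
        raise ValueError("each 2D point needs exactly one matching 3D point")
    points_2d = list(pts2d_cam1) + list(pts2d_from_cam2)
    points_3d = list(pts3d_cam1) + list(pts3d_cam2)
    return list(zip(points_3d, points_2d))


def reprojection_error(R, t, pairs):
    """Mean squared reprojection error of a candidate pose over the pairs.

    R (3x3) and t (3,) are assumed to map world coordinates into the first
    camera coordinate system; 2D points are in normalized coordinates.
    """
    total = 0.0
    for X, uv in pairs:
        pc = [sum(R[r][c] * X[c] for c in range(3)) + t[r] for r in range(3)]
        u, v = pc[0] / pc[2], pc[1] / pc[2]
        total += (u - uv[0]) ** 2 + (v - uv[1]) ** 2
    return total / len(pairs)
```

A pose that explains all pairs exactly yields an error of zero; in practice a robust solver would also reject outlier pairs.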
  6. The pose determination method according to claim 1, wherein the pose determination method further comprises:
    acquiring the previous frame color image captured by the first camera, and extracting feature points of the previous frame color image captured by the first camera;
    spatially projecting the feature points of the previous frame color image captured by the first camera by using a previous frame depth image aligned with the previous frame color image captured by the first camera, to obtain three-dimensional feature points, in the first camera coordinate system, of the previous frame color image captured by the first camera;
    transforming the three-dimensional feature points in the first camera coordinate system according to the pose of the first camera when capturing the previous frame color image, to obtain the three-dimensional feature points, in the world coordinate system, of the previous frame color image captured by the first camera.
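As a non-limiting illustration of claim 6, the following Python sketch back-projects a feature point using its aligned depth value and then moves it into the world coordinate system. It assumes a pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and assumes that the previous-frame pose is expressed as a camera-to-world rotation `R_wc` and translation `t_wc`; none of these names come from the disclosure.

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with a known depth to a 3D point in the camera frame."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def camera_to_world(p_cam, R_wc, t_wc):
    """Apply an assumed camera-to-world pose (R_wc, t_wc) to a camera-frame point."""
    return tuple(
        sum(R_wc[r][c] * p_cam[c] for c in range(3)) + t_wc[r] for r in range(3)
    )


# Usage: a feature 100 pixels right of the principal point, 2 units away.
p_cam = back_project(420.0, 240.0, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
p_world = camera_to_world(p_cam, identity, (1.0, 0.0, 0.0))
```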
  7. The pose determination method according to claim 6, wherein spatially projecting the feature points of the previous frame color image captured by the first camera by using the previous frame depth image aligned with the previous frame color image captured by the first camera, to obtain the three-dimensional feature points, in the first camera coordinate system, of the previous frame color image captured by the first camera, comprises:
    spatially projecting, by using the previous frame depth image aligned with the previous frame color image captured by the first camera, those feature points of the previous frame color image captured by the first camera that lie within a predetermined depth range, to obtain the three-dimensional feature points, in the first camera coordinate system, of the previous frame color image captured by the first camera;
    wherein the predetermined depth range is determined based on the measuring range of the depth measurement.
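As a non-limiting illustration of the depth-range screening in claim 7, the following Python sketch keeps only the feature points whose aligned depth value lies within a given range. It assumes integer pixel coordinates and a depth image stored as a list of rows; the function name and the example range are hypothetical.

```python
def filter_by_depth(features, depth_image, min_depth, max_depth):
    """Keep features whose depth lies in [min_depth, max_depth].

    features is a list of integer pixel coordinates (u, v); depth_image is
    indexed as depth_image[v][u]. Returns (u, v, depth) triples.
    """
    kept = []
    for u, v in features:
        d = depth_image[v][u]
        if min_depth <= d <= max_depth:
            kept.append((u, v, d))
    return kept
```

The bounds would typically be taken from the reliable measuring range of the depth sensor, as the claim indicates.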
  8. The pose determination method according to claim 1, wherein the pose determination method further comprises:
    acquiring the previous frame color image captured by the second camera, and extracting feature points of the previous frame color image captured by the second camera;
    spatially projecting the feature points of the previous frame color image captured by the second camera by using a previous frame depth image aligned with the previous frame color image captured by the second camera, to obtain three-dimensional feature points, in the second camera coordinate system, of the previous frame color image captured by the second camera;
    converting the three-dimensional feature points, in the second camera coordinate system, of the previous frame color image captured by the second camera into three-dimensional feature points in the first camera coordinate system by using the transformation matrix between the first camera coordinate system and the second camera coordinate system;
    transforming the three-dimensional feature points in the first camera coordinate system according to the pose of the first camera when capturing the previous frame color image, to obtain the three-dimensional feature points, in the world coordinate system, of the previous frame color image captured by the second camera.
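As a non-limiting illustration of the two chained transformations in claim 8, the following Python sketch moves a three-dimensional point from the second camera coordinate system into the first, and from there into the world coordinate system. It assumes that both the inter-camera transformation and the previous-frame pose of the first camera are available as 4x4 homogeneous matrices; the names `T_c1_c2` and `T_w_c1` are hypothetical.

```python
def apply_transform(T, p):
    """Apply a 4x4 homogeneous transform (nested lists) to a 3D point."""
    ph = (p[0], p[1], p[2], 1.0)
    return tuple(sum(T[r][c] * ph[c] for c in range(4)) for r in range(3))


def cam2_point_to_world(p_c2, T_c1_c2, T_w_c1):
    """Camera 2 frame -> camera 1 frame -> world frame."""
    p_c1 = apply_transform(T_c1_c2, p_c2)
    return apply_transform(T_w_c1, p_c1)
```

Because only the first camera's pose is tracked, points seen by the second camera reach the world coordinate system through the first camera coordinate system.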
  9. The pose determination method according to claim 8, wherein spatially projecting the feature points of the previous frame color image captured by the second camera by using the previous frame depth image aligned with the previous frame color image captured by the second camera, to obtain the three-dimensional feature points, in the second camera coordinate system, of the previous frame color image captured by the second camera, comprises:
    spatially projecting, by using the previous frame depth image aligned with the previous frame color image captured by the second camera, those feature points of the previous frame color image captured by the second camera that lie within a predetermined depth range, to obtain the three-dimensional feature points, in the second camera coordinate system, of the previous frame color image captured by the second camera;
    wherein the predetermined depth range is determined based on the measuring range of the depth measurement.
  10. The pose determination method according to any one of claims 1 to 9, wherein the pose determination method further comprises:
    acquiring an initial frame color image captured by the first camera, and extracting feature points of the initial frame color image captured by the first camera;
    spatially projecting the feature points of the initial frame color image captured by the first camera by using an initial frame depth image aligned with the initial frame color image captured by the first camera, to obtain three-dimensional feature points, in the first camera coordinate system, of the initial frame color image captured by the first camera;
    determining an initial positioning result of the first camera in the first camera coordinate system according to the three-dimensional feature points, in the first camera coordinate system, of the initial frame color image captured by the first camera, an initial rotation matrix, and an initial translation vector;
    transforming the initial positioning result of the first camera in the first camera coordinate system by using a transformation matrix between the first camera coordinate system and the world coordinate system, to determine the pose of the first camera when capturing the initial frame color image.
  11. The pose determination method according to claim 10, wherein the pose determination method further comprises:
    acquiring a reference depth image output by the first camera;
    in a case where a designated plane is determined based on the reference depth image output by the first camera, determining the transformation matrix between the first camera coordinate system and the world coordinate system according to a normal vector of the designated plane and a gravity vector.
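As a non-limiting illustration of claim 11, the following Python sketch computes a rotation that aligns one direction with another, which is one possible way to derive the rotational part of the transformation from a plane normal and a gravity direction. The disclosure does not state how the matrix is built; the use of Rodrigues' rotation formula, the function names, and the handling of the degenerate opposite-direction case are assumptions made for the example.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return [x / n for x in v]


def rotation_aligning(a, b):
    """Return the 3x3 rotation R with R @ a_unit == b_unit (Rodrigues' formula)."""
    a, b = normalize(a), normalize(b)
    # Rotation axis (unnormalized) and cosine of the rotation angle.
    v = [a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0]]
    c = sum(x * y for x, y in zip(a, b))
    if c < -1.0 + 1e-9:
        raise ValueError("vectors are opposite; the rotation axis is ambiguous")
    # Skew-symmetric cross-product matrix K of v, and its square.
    K = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    s = 1.0 / (1.0 + c)
    # R = I + K + K^2 / (1 + c)
    return [[(1.0 if i == j else 0.0) + K[i][j] + s * K2[i][j] for j in range(3)]
            for i in range(3)]
```

Here `a` would be the normal vector of the designated plane in the first camera coordinate system and `b` the gravity-aligned axis of the world coordinate system. The translation part of the transformation is left out of this sketch.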
  12. The pose determination method according to claim 11, wherein the pose determination method further comprises:
    determining a reference point cloud corresponding to the first camera based on the reference depth image output by the first camera;
    extracting plane information of the reference point cloud;
    selecting the designated plane according to the plane information of the reference point cloud.
  13. The pose determination method according to claim 12, wherein determining the reference point cloud corresponding to the first camera based on the reference depth image output by the first camera comprises:
    for each pixel on the reference depth image output by the first camera, determining a three-dimensional space point of the pixel according to the pixel, a depth value of the pixel, and camera intrinsic parameters of the first camera;
    constructing the reference point cloud corresponding to the first camera from the three-dimensional space points of the pixels on the reference depth image output by the first camera.
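As a non-limiting illustration of claim 13, the following Python sketch builds a point cloud from a depth image under a pinhole camera model. The intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical parameter names, and the convention that a non-positive depth marks an invalid pixel is an assumption that the disclosure does not state.

```python
def depth_to_point_cloud(depth_image, fx, fy, cx, cy):
    """Convert a depth image (list of rows) into a list of 3D points.

    Pixel (u, v) with depth d maps to ((u - cx) * d / fx, (v - cy) * d / fy, d).
    Pixels with non-positive depth are skipped (assumed invalid).
    """
    cloud = []
    for v, row in enumerate(depth_image):
        for u, d in enumerate(row):
            if d <= 0:
                continue
            cloud.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
    return cloud
```

The resulting list can then be passed to a plane extraction step such as the one referred to in claim 12.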
  14. The pose determination method according to claim 13, wherein constructing the reference point cloud corresponding to the first camera from the three-dimensional space points of the pixels on the reference depth image output by the first camera comprises:
    acquiring a reference depth image output by the second camera;
    determining a three-dimensional space point of each pixel on the reference depth image output by the second camera;
    transforming the three-dimensional space point of each pixel on the reference depth image output by the second camera according to the transformation matrix between the first camera coordinate system and the second camera coordinate system, to obtain transformed three-dimensional space points;
    merging the three-dimensional space points of the pixels on the reference depth image output by the first camera with the transformed three-dimensional space points, to construct the reference point cloud corresponding to the first camera.
  15. The pose determination method according to claim 12, wherein selecting the designated plane according to the plane information of the reference point cloud comprises:
    selecting the designated plane according to distance information, contained in the plane information of the reference point cloud, on the distance between each plane and the first camera.
  16. The pose determination method according to claim 15, wherein selecting the designated plane according to the distance information, contained in the plane information of the reference point cloud, on the distance between each plane and the first camera comprises:
    in a case where the distance information contains a distance within a predetermined distance range, determining a candidate plane corresponding to each distance in the distance information that is within the predetermined distance range;
    in a case where there is one candidate plane, determining the candidate plane as the designated plane;
    in a case where there are multiple candidate planes, determining the candidate plane whose distance from the first camera is closest to a distance threshold as the designated plane;
    wherein the distance threshold is within the predetermined distance range.
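As a non-limiting illustration of the selection rule in claim 16, the following Python sketch picks the designated plane from a list of plane-to-camera distances. The function name, the return convention (an index, or `None` when no plane qualifies), and the example numbers are assumptions made for the example.

```python
def select_designated_plane(plane_distances, dist_range, dist_threshold):
    """Return the index of the designated plane, or None if no plane qualifies.

    plane_distances[i] is the distance from plane i to the first camera;
    dist_range is (low, high); dist_threshold must lie inside that range.
    """
    low, high = dist_range
    if not (low <= dist_threshold <= high):
        raise ValueError("distance threshold must lie within the distance range")
    # Candidate planes are those whose distance falls inside the range.
    candidates = [(i, d) for i, d in enumerate(plane_distances) if low <= d <= high]
    if not candidates:
        return None
    # With one candidate this returns it; with several, the one whose
    # distance is closest to the threshold.
    best = min(candidates, key=lambda item: abs(item[1] - dist_threshold))
    return best[0]


# Usage: planes at 0.3, 1.2, 1.6 and 3.0 units; range 1.0 to 2.0, threshold 1.5.
chosen = select_designated_plane([0.3, 1.2, 1.6, 3.0], (1.0, 2.0), 1.5)
```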
  17. The pose determination method according to claim 11, wherein the designated plane is a ground plane.
  18. A pose determination apparatus, configured in a terminal device that is further configured with a first camera and at least one second camera, the pose determination apparatus comprising:
    a first feature point determination module, configured to acquire a current frame color image captured by the first camera, and determine first two-dimensional feature points on the current frame color image captured by the first camera that match a previous frame color image captured by the first camera;
    a second feature point determination module, configured to acquire a current frame color image captured by the second camera, and determine second two-dimensional feature points on the current frame color image captured by the second camera that match a previous frame color image captured by the second camera;
    a feature point conversion module, configured to convert the second two-dimensional feature points into third two-dimensional feature points in a first camera coordinate system of the first camera by using a transformation matrix between the first camera coordinate system and a second camera coordinate system of the second camera;
    a pose determination module, configured to determine a pose of the first camera when capturing the current frame color image according to the first two-dimensional feature points, the third two-dimensional feature points, three-dimensional feature points, in a world coordinate system, of the previous frame color image captured by the first camera, and three-dimensional feature points, in the world coordinate system, of the previous frame color image captured by the second camera.
  19. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the pose determination method according to any one of claims 1 to 17.
  20. An electronic device, comprising:
    a processor;
    a memory, configured to store one or more programs which, when executed by the processor, cause the processor to implement the pose determination method according to any one of claims 1 to 17.
PCT/CN2023/118181 2022-10-28 2023-09-12 Pose determination method and apparatus, computer readable storage medium, and electronic device WO2024087917A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211337302.9A CN117994333A (en) 2022-10-28 2022-10-28 Pose determination method and device, computer readable storage medium and electronic equipment
CN202211337302.9 2022-10-28

Publications (1)

Publication Number Publication Date
WO2024087917A1 (en) 2024-05-02

Family

ID=90829950

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/118181 WO2024087917A1 (en) 2022-10-28 2023-09-12 Pose determination method and apparatus, computer readable storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN117994333A (en)
WO (1) WO2024087917A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10388029B1 (en) * 2017-09-07 2019-08-20 Northrop Grumman Systems Corporation Multi-sensor pose-estimate system
CN111415387A (en) * 2019-01-04 2020-07-14 南京人工智能高等研究院有限公司 Camera pose determining method and device, electronic equipment and storage medium
CN114897988A (en) * 2022-07-14 2022-08-12 苏州魔视智能科技有限公司 Multi-camera positioning method, device and equipment in hinge type vehicle

Also Published As

Publication number Publication date
CN117994333A (en) 2024-05-07

Similar Documents

Publication Publication Date Title
US10311648B2 (en) Systems and methods for scanning three-dimensional objects
WO2019170164A1 (en) Depth camera-based three-dimensional reconstruction method and apparatus, device, and storage medium
WO2021017882A1 (en) Image coordinate system conversion method and apparatus, device and storage medium
US9177381B2 (en) Depth estimate determination, systems and methods
US9129435B2 (en) Method for creating 3-D models by stitching multiple partial 3-D models
WO2020236307A1 (en) Image-based localization
WO2019238114A1 (en) Three-dimensional dynamic model reconstruction method, apparatus and device, and storage medium
WO2021082801A1 (en) Augmented reality processing method and apparatus, system, storage medium and electronic device
CN108495089A (en) vehicle monitoring method, device, system and computer readable storage medium
WO2018112788A1 (en) Image processing method and device
CN108958469B (en) Method for adding hyperlinks in virtual world based on augmented reality
WO2021136386A1 (en) Data processing method, terminal, and server
CN112927362A (en) Map reconstruction method and device, computer readable medium and electronic device
CN110296686A (en) Vision-based localization method, device and equipment
CN112927363A (en) Voxel map construction method and device, computer readable medium and electronic equipment
WO2022237048A1 (en) Pose acquisition method and apparatus, and electronic device, storage medium and program
WO2023169281A1 (en) Image registration method and apparatus, storage medium, and electronic device
CN110310325B (en) Virtual measurement method, electronic device and computer readable storage medium
WO2019000464A1 (en) Image display method and device, storage medium, and terminal
CN111079470B (en) Method and device for detecting human face living body
CN113362467B (en) Point cloud preprocessing and ShuffleNet-based mobile terminal three-dimensional pose estimation method
JP2001291108A (en) Device and method for image processing and program recording medium
CN112073640B (en) Panoramic information acquisition pose acquisition method, device and system
CN112365530A (en) Augmented reality processing method and device, storage medium and electronic equipment
WO2024087917A1 (en) Pose determination method and apparatus, computer readable storage medium, and electronic device