WO2018205803A1 - Pose estimation method and apparatus - Google Patents

Pose estimation method and apparatus

Info

Publication number
WO2018205803A1
Authority
WO
WIPO (PCT)
Prior art keywords
depth
depth image
determining
pixel point
pixel
Application number
PCT/CN2018/083376
Other languages
French (fr)
Chinese (zh)
Inventor
孙志明
张潮
李雨倩
吴迪
樊晨
李政
贾士伟
李祎翔
张连川
刘新月
Original Assignee
北京京东尚科信息技术有限公司
北京京东世纪贸易有限公司
Application filed by 北京京东尚科信息技术有限公司 and 北京京东世纪贸易有限公司
Publication of WO2018205803A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Definitions

  • the present application relates to the field of computer vision technology, in particular to the field of pose estimation, and more particularly to a pose estimation method and apparatus.
  • Pose estimation, especially visual pose estimation, involves knowledge of many disciplines such as image processing, computer vision, inertial navigation, mathematical statistics, and optimization. It is a basic technology of many emerging industries and will play an important role in current and future production and daily life.
  • existing pose estimation methods usually need to extract feature points of the image and establish descriptors, and therefore consume a large amount of computing resources, so the real-time performance of the pose estimation is poor.
  • the purpose of the present application is to propose a pose estimation method and apparatus to solve the technical problems mentioned in the background art section above.
  • an embodiment of the present application provides a pose estimation method, comprising: acquiring a depth image video from a depth sensor; selecting a first depth image and a second depth image from the frame depth images of the depth image video, wherein the first depth image and the second depth image share at least one pixel point indicating the same object; determining a first pixel point set in the first depth image and a second pixel point set in the second depth image, wherein the pixel points in the first pixel point set are in one-to-one correspondence with the pixel points in the second pixel point set, and the corresponding two pixel points indicate the same object; and, for any pixel point in the first pixel point set, determining a pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
  • in some embodiments, before selecting the first depth image and the second depth image from the frame depth images, the method further includes: for each frame depth image, deleting the pixel points in the frame depth image that meet a preset condition; and smoothing the depth image after the deletion.
  • in some embodiments, deleting the pixel points in the frame depth image that meet the preset condition includes: detecting the depth value of each pixel point; and deleting the pixel points whose depth value is greater than a first preset value and whose depth value is less than a second preset value.
  • in some embodiments, deleting the pixel points in the frame depth image that meet the preset condition includes: determining a first partial derivative of the frame depth image in the horizontal direction and a second partial derivative in the vertical direction; determining geometric edge pixel points in the frame depth image according to the first partial derivative and the second partial derivative; and deleting the geometric edge pixel points.
  • in some embodiments, deleting the pixel points in the frame depth image that meet the preset condition includes: determining failed pixel points in the frame depth image for which no depth value exists; and deleting the failed pixel points and the pixel points adjacent to the failed pixel points.
  • in some embodiments, determining the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image includes: mapping the first two-dimensional coordinates of the pixel point in the first depth image to first three-dimensional space coordinates in the coordinate system to which the first depth image belongs; transforming the first three-dimensional space coordinates to the coordinate system to which the second depth image belongs to obtain second three-dimensional space coordinates; mapping the second three-dimensional space coordinates into the second depth image to obtain second two-dimensional coordinates; determining a second depth value of the second two-dimensional coordinates in the second depth image; and determining the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
  • in some embodiments, determining the pose transformation parameter of the depth sensor based on the first depth value and the second depth value includes: determining a depth difference between the first depth image and the second depth image according to the first depth value and the second depth value; determining the depth difference as a depth residual and, based on the depth residual, performing the following iterative steps: determining a pose estimation increment based on the depth residual; determining whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulating the pose estimation increment and the first depth value to determine a pose estimation value, and determining the pose transformation parameter of the depth sensor according to the pose estimation value; and, in response to the depth residual being greater than or equal to the preset threshold, determining the accumulated pose estimation increment as the depth residual and continuing to perform the iterative steps.
  • in some embodiments, the method further includes: acquiring angular velocity and acceleration from an inertial measurement device physically bound to the depth sensor; determining a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration; and fusing the pose transformation parameter of the depth sensor and the pose transformation parameter of the inertial measurement device to determine an integrated pose transformation parameter.
  • in some embodiments, determining the pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration includes: determining a first pose transformation parameter of the inertial measurement device according to the angular velocity; determining a second pose transformation parameter of the inertial measurement device according to the acceleration; and fusing the first pose transformation parameter and the second pose transformation parameter to determine the pose transformation parameter of the inertial measurement device.
  • an embodiment of the present application provides a pose estimation apparatus, the apparatus comprising: a first acquiring unit configured to acquire a depth image video from a depth sensor; an image selecting unit configured to select a first depth image and a second depth image from the frame depth images of the depth image video, wherein the first depth image and the second depth image share at least one pixel point indicating the same object; a pixel point set determining unit configured to determine a first pixel point set in the first depth image and a second pixel point set in the second depth image, wherein the pixel points in the first pixel point set are in one-to-one correspondence with the pixel points in the second pixel point set, and the corresponding two pixel points indicate the same object; and a first parameter determining unit configured to, for any pixel point in the first pixel point set, determine a pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
  • in some embodiments, the apparatus further includes a pre-processing unit, the pre-processing unit including a pixel point deletion module and a smoothing module; the pixel point deletion module is configured to, before the image selecting unit selects the first depth image and the second depth image from the frame depth images, delete, for each frame depth image, the pixel points in the frame depth image that meet the preset condition; the smoothing module is configured to smooth the depth image after the deletion.
  • in some embodiments, the pixel point deletion module is further configured to: detect the depth value of each pixel point; and delete the pixel points whose depth value is greater than the first preset value and whose depth value is smaller than the second preset value.
  • in some embodiments, the pixel point deletion module is further configured to: determine a first partial derivative of the frame depth image in the horizontal direction and a second partial derivative in the vertical direction; determine geometric edge pixel points in the frame depth image according to the first partial derivative and the second partial derivative; and delete the geometric edge pixel points.
  • in some embodiments, the pixel point deletion module is further configured to: determine failed pixel points in the frame depth image for which no depth value exists; and delete the failed pixel points and the pixel points adjacent to the failed pixel points.
  • in some embodiments, the first parameter determining unit includes: a first mapping module configured to map the first two-dimensional coordinates of the pixel point in the first depth image to first three-dimensional space coordinates in the coordinate system to which the first depth image belongs; a transformation module configured to transform the first three-dimensional space coordinates to the coordinate system to which the second depth image belongs to obtain second three-dimensional space coordinates; a second mapping module configured to map the second three-dimensional space coordinates into the second depth image to obtain second two-dimensional coordinates; a depth value determining module configured to determine a second depth value of the second two-dimensional coordinates in the second depth image; and a first parameter determining module configured to determine the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
  • in some embodiments, the first parameter determining module is further configured to: determine a depth difference between the first depth image and the second depth image according to the first depth value and the second depth value; determine the depth difference as a depth residual and, based on the depth residual, perform the iterative steps of: determining a pose estimation increment based on the depth residual; determining whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulating the pose estimation increment and the first depth value to determine a pose estimation value, and determining the pose transformation parameter of the depth sensor according to the pose estimation value; and, in response to the depth residual being greater than or equal to the preset threshold, determining the accumulated pose estimation increment as the depth residual and continuing to perform the iterative steps.
  • in some embodiments, the apparatus further includes: a second acquiring unit configured to acquire angular velocity and acceleration from an inertial measurement device physically bound to the depth sensor; a second parameter determining unit configured to determine a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration; and a parameter fusion unit configured to fuse the pose transformation parameter of the depth sensor and the pose transformation parameter of the inertial measurement device to determine an integrated pose transformation parameter.
  • in some embodiments, the second parameter determining unit includes: a first sub-parameter determining module configured to determine a first pose transformation parameter of the inertial measurement device according to the angular velocity; a second sub-parameter determining module configured to determine a second pose transformation parameter of the inertial measurement device according to the acceleration; and a fusion module configured to fuse the first pose transformation parameter and the second pose transformation parameter to determine the pose transformation parameter of the inertial measurement device.
  • an embodiment of the present application provides a server, including: one or more processors; and a storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in any of the above embodiments.
  • an embodiment of the present application provides a computer readable storage medium on which a computer program is stored; when the program is executed by a processor, the method described in any one of the foregoing embodiments is implemented.
  • the pose estimation method and apparatus acquire the depth image video collected by the depth sensor, select from it two depth images sharing at least one pixel point indicating the same object, determine the mutually corresponding first pixel point set and second pixel point set in the two frame depth images, and then, for each pixel point in the first pixel point set, determine the pose transformation parameter of the depth sensor according to the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
  • the pose estimation method of the present application uses the depth image to perform pose estimation, which reduces the consumption of computing resources, improves the calculation efficiency, and ensures the real-time performance of the pose estimation.
  • FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
  • FIG. 2 is a flow chart of one embodiment of a pose estimation method in accordance with the present application.
  • FIG. 3 is a schematic diagram of an application scenario of a pose estimation method according to the present application.
  • FIG. 4 is a flowchart of determining a pose transformation parameter of a depth sensor in a pose estimation method according to the present application.
  • FIG. 5 is a schematic diagram of a principle of pose transformation in a pose estimation method according to the present application.
  • FIG. 6 is a flow chart of another embodiment of a pose estimation method according to the present application.
  • FIG. 7 is a schematic structural view of an embodiment of a pose estimating apparatus according to the present application.
  • FIG. 8 is a block diagram showing the structure of a computer system suitable for implementing the server of the embodiment of the present application.
  • FIG. 1 illustrates an exemplary system architecture 100 in which embodiments of the pose estimation method or pose estimation apparatus of the present application may be applied.
  • system architecture 100 can include depth sensor 101, network 102, and server 103.
  • Network 102 is used to provide a medium for the communication link between depth sensor 101 and server 103.
  • Network 102 can include a variety of connection types, such as wired, wireless communication links, fiber optic cables, and the like.
  • the depth sensor 101 interacts with the server 103 via the network 102 to transmit depth image video or the like.
  • the depth sensor 101 can be mounted on various moving objects, such as an unmanned vehicle, a robot, an unmanned delivery vehicle, a smart wearable device, a virtual reality device, and the like.
  • the depth sensor 101 may be various depth sensors capable of continuously acquiring multi-frame depth images.
  • the server 103 may be a server that provides various services, such as a background server that processes depth image video acquired by the depth sensor 101.
  • the background server can analyze and process data such as the received depth image video.
  • the pose estimation method provided by the embodiment of the present application is generally performed by the server 103. Accordingly, the pose estimation apparatus is generally disposed in the server 103.
  • depth sensors, networks, and servers in Figure 1 are merely illustrative. Depending on the implementation needs, there can be any number of depth sensors, networks, and servers.
  • the pose estimation method of this embodiment includes the following steps:
  • Step 201 Acquire a depth image video from the depth sensor.
  • the electronic device on which the pose estimation method operates can acquire the depth image video from the depth sensor by a wired connection or a wireless connection.
  • Each frame image in the depth image video is a depth image.
  • the depth image, also called a range image, is an image whose pixel values are the distances (depths) from the image collector to the points in the scene; it directly reflects the geometry of the visible surfaces of the scene.
  • Each pixel in the depth image represents the distance between the object at a particular coordinate and the camera plane of the depth sensor in the field of view of the depth sensor.
  • wireless connection manners may include, but are not limited to, 3G/4G connection, WiFi connection, Bluetooth connection, WiMAX connection, Zigbee connection, UWB (ultra wideband) connection, and other wireless connection methods now known or developed in the future.
  • Step 202 Select a first depth image and a second depth image from each frame depth image of the depth image video.
  • the first depth image and the second depth image may be two adjacent frame depth images in the depth image video, or may be two frame depth images whose sequence numbers in the depth image video differ by less than a preset value.
  • Step 203 Determine a first pixel point set in the first depth image and a second pixel point set in the second depth image.
  • Each pixel point in the first pixel point set corresponds one-to-one with a pixel point in the second pixel point set, and the corresponding two pixel points indicate the same object; more specifically, the corresponding two pixel points indicate the same location of the same object. It can be understood that the number of pixel points in the first pixel point set is equal to the number of pixel points in the second pixel point set, and both are equal to the number of pixel points indicating the same object shared by the first depth image and the second depth image.
  • Step 204 For any pixel point in the first pixel point set, determine the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
  • the server may determine the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates, in the first depth image, of each pixel point in the first pixel point set and the first depth value, in the second depth image, of the corresponding pixel point in the second pixel point set. It can be understood that the first two-dimensional coordinates are the coordinates of the pixel point in the image coordinate system of the first depth image, and do not include the depth value of the pixel point.
  • for each pixel point in the first pixel point set, a corresponding pixel point exists in the second pixel point set. Since each pixel point has a depth value, the first depth value of the corresponding pixel point may be determined from the second depth image.
  • the pose transformation parameter may be a pose transformation parameter between the first depth image and the second depth image.
  • FIG. 3 is a schematic diagram of an application scenario of the pose estimation method according to the present embodiment.
  • the depth sensor 301 is installed on the unmanned vehicle 302.
  • the depth sensor 301 collects the depth image video and sends the collected depth image video to the server 303.
  • the server 303 determines the pose transformation parameter of the depth sensor 301 and then sends the pose transformation parameter to the unmanned vehicle 302, and the unmanned vehicle 302 can navigate and avoid obstacles according to the pose transformation parameter.
  • the pose estimation method acquires the depth image video collected by the depth sensor from the depth sensor, and selects two depth images of at least one pixel point indicating the same object from the two frames, and then determines the two-frame depth image. And corresponding to the first pixel point set and the second pixel point set, and then for each pixel point in the first pixel point set, according to the first two-dimensional coordinate in the first depth image and corresponding to the pixel point.
  • the first depth value of the corresponding pixel in the second depth image determines the pose transformation parameter of the depth sensor, reduces the consumption of computing resources, improves the calculation efficiency, and ensures the real-time performance of the pose estimation.
  • the foregoing method further includes the following steps not shown in FIG. 2:
  • for each frame depth image, the pixel points in the frame depth image that meet the preset condition are deleted, and the depth image after the deletion is smoothed.
  • the depth sensor typically emits probe light (e.g., infrared, laser, radar) and receives the light reflected from the surface of an object to determine the distance between the object and the depth sensor. Due to occlusion by objects, absorption of the probe light by object surfaces, and diffuse reflection, the depth sensor cannot always receive the reflected light completely, so many pixel points in the depth image have no depth value or have inaccurate depth values. In this implementation, in order to ensure the accuracy of the pose estimation, the pixel points that meet the preset condition are deleted from each frame depth image.
  • the depth image from which the pixel is deleted can be smoothed.
  • the above smoothing processing may include linear smoothing, interpolation smoothing, convolution smoothing, Gaussian filtering, bilateral filtering, and the like.
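  • purely as an illustrative sketch (not part of the patent text), a NaN-aware box filter in Python/NumPy could serve as such a smoothing step, assuming deleted pixels are marked as NaN; the window size k is a hypothetical parameter:

```python
import numpy as np

def smooth_depth(depth: np.ndarray, k: int = 3) -> np.ndarray:
    """NaN-aware box smoothing: replace each valid pixel by the mean of the
    valid depth values in its k x k neighbourhood."""
    h, w = depth.shape
    pad = k // 2
    padded = np.pad(depth, pad, mode="edge")
    valid = ~np.isnan(padded)
    vals = np.where(valid, padded, 0.0)
    acc = np.zeros((h, w))
    cnt = np.zeros((h, w))
    for dy in range(k):
        for dx in range(k):
            acc += vals[dy:dy + h, dx:dx + w]
            cnt += valid[dy:dy + h, dx:dx + w]
    out = np.where(cnt > 0, acc / np.maximum(cnt, 1), np.nan)
    out[np.isnan(depth)] = np.nan  # pixels deleted by the preprocessing stay deleted
    return out
```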
  • the depth value of each pixel point may be detected first, and the pixel point whose depth value is greater than the first preset value and smaller than the second preset value may be deleted.
  • the values of the first preset value and the second preset value are related to the model of the depth sensor, which is not limited in this implementation manner.
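  • as a minimal sketch of this range test, under one plausible reading of the two preset values as the far and near bounds of the sensor's trusted range (the claim itself leaves their relation open), the deletion could look as follows; d_far and d_near are hypothetical parameters:

```python
import numpy as np

def delete_out_of_range(depth: np.ndarray, d_far: float, d_near: float) -> np.ndarray:
    """Delete (mark as NaN) pixels whose depth falls outside [d_near, d_far]."""
    out = depth.astype(float)
    with np.errstate(invalid="ignore"):  # ignore comparisons against NaN
        bad = (out > d_far) | (out < d_near)
    out[bad] = np.nan
    return out
```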
  • the first partial derivative Zu of the depth image in the horizontal (u) direction and the second partial derivative Zv in the vertical (v) direction may be determined first; the geometric edge pixel points in the frame depth image are then determined according to Zu and Zv, and the determined geometric edge pixel points are deleted.
  • the depth value of a pixel at the edge of an object has a high degree of uncertainty, and the depth values of the pixels on both sides of the edge jump; therefore, in order to ensure the accuracy of the pose estimation, the above geometric edge pixel points can be deleted.
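  • a minimal sketch of this edge deletion, assuming central differences for Zu and Zv and a hypothetical gradient threshold (the patent fixes neither choice):

```python
import numpy as np

def delete_geometric_edges(depth: np.ndarray, grad_thresh: float) -> np.ndarray:
    """Delete pixels where the depth gradient magnitude indicates a geometric edge."""
    Zu = np.full_like(depth, np.nan)
    Zv = np.full_like(depth, np.nan)
    Zu[:, 1:-1] = (depth[:, 2:] - depth[:, :-2]) / 2.0  # first partial derivative (u)
    Zv[1:-1, :] = (depth[2:, :] - depth[:-2, :]) / 2.0  # second partial derivative (v)
    with np.errstate(invalid="ignore"):
        edge = np.hypot(Zu, Zv) > grad_thresh
    out = depth.copy()
    out[edge] = np.nan
    return out
```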
  • the failed pixel points where the depth value does not exist in each depth image may be determined, and then the failed pixel points and the pixel points adjacent to the failed pixel points are deleted.
  • in some cases, the receiver cannot receive the probe light reflected back from the object, and thus the depth value of the pixel cannot be determined; these pixels are called failed pixel points.
  • since the depth values of pixel points adjacent to a failed pixel point are also unreliable, the adjacent pixel points are deleted as well.
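  • a sketch of this step under the same NaN convention; the patent does not specify the neighbourhood, so deleting the 4-neighbours of each failed pixel is an assumption:

```python
import numpy as np

def delete_failed_pixels(depth: np.ndarray) -> np.ndarray:
    """Delete failed pixels (no depth value) and the pixels adjacent to them."""
    invalid = np.isnan(depth)
    grown = invalid.copy()
    grown[1:, :] |= invalid[:-1, :]   # vertical neighbours
    grown[:-1, :] |= invalid[1:, :]
    grown[:, 1:] |= invalid[:, :-1]   # horizontal neighbours
    grown[:, :-1] |= invalid[:, 1:]
    out = depth.copy()
    out[grown] = np.nan
    return out
```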
  • the pose transformation parameter of the depth sensor can be determined by the following steps:
  • Step 401 Map the first two-dimensional coordinates of the pixel in the first depth image to the first three-dimensional coordinate of the coordinate system to which the first depth image belongs.
  • the first two-dimensional coordinates (x1, y1) of the pixel point in the first depth image may be determined first, and then the first two-dimensional coordinates (x1, y1) are mapped to the first three-dimensional space coordinates.
  • the first three-dimensional space coordinates (x1′, y1′, z1′) in the coordinate system to which the first depth image belongs may be obtained by the π⁻¹ mapping in the pinhole camera model.
  • the pinhole camera model includes a mapping π that projects a three-dimensional space point to two-dimensional coordinates on the pixel plane, and an inverse mapping π⁻¹ that maps a two-dimensional coordinate on the image, together with its depth, back to a three-dimensional space point.
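  • for an ideal pinhole camera, the two mappings might be sketched as follows; the intrinsic parameters fx, fy, cx, cy are assumptions, as the patent does not spell out the camera model's internals:

```python
import numpy as np

def project(P: np.ndarray, fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """π: map a 3-D point (x, y, z) in the camera frame to 2-D pixel coordinates."""
    x, y, z = P
    return np.array([fx * x / z + cx, fy * y / z + cy])

def back_project(p: np.ndarray, z: float, fx: float, fy: float,
                 cx: float, cy: float) -> np.ndarray:
    """π⁻¹: map a pixel (u, v) with depth z back to a 3-D point in the camera frame."""
    u, v = p
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
```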
  • Step 402 Transform the first three-dimensional space coordinate into a coordinate system to which the second depth image belongs, to obtain a second three-dimensional space coordinate.
  • the pose transformation parameter between the first depth image and the second depth image is recorded as T1→2, as shown in FIG. 5.
  • a pedestrian in the world coordinate system xw-yw-zw is denoted as point P.
  • the depth sensor is located on the left side at time t1 and on the right side at time t2.
  • Point P is the point P1 in the depth image obtained at time t1; its coordinates are (x1, y1), its depth value is Z1, and its associated coordinate system is xc1-yc1-zc1.
  • Point P is the point P2 in the depth image obtained at time t2; its coordinates are (x2, y2), its depth value is Z2, and its associated coordinate system is xc2-yc2-zc2.
  • the pose transformation parameter between the coordinate system xc1-yc1-zc1 and the coordinate system xc2-yc2-zc2 is T1→2.
  • the above pose transformation parameter T1→2 is represented in the Lie algebra se(3) as ξ, and in matrix form as the Lie group element T(ξ).
  • a Lie group element T(ξ) may be preset, and the preset T(ξ) is used to complete the transformation to obtain the second three-dimensional space coordinates (x2′, y2′, z2′).
  • Step 403 Map the second three-dimensional space coordinates into the second depth image to obtain the second two-dimensional coordinates.
  • the second two-dimensional coordinates (x2, y2) can be obtained by using the π mapping in the pinhole camera model.
  • Step 404 Determine a second depth value of the second two-dimensional coordinates in the second depth image.
  • the second depth value of the second two-dimensional coordinates (x2, y2) is determined in the second depth image.
  • in theory, the second depth value should be the same as the first depth value of the corresponding pixel point in the second pixel point set; in practice, however, the two are often different.
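  • steps 401 to 404 can be read as one warping operation; a sketch reusing the project/back_project helpers from the pinhole sketch above, where T is a 4x4 matrix such as one produced by the se3_exp sketch:

```python
import numpy as np

def warp_pixel(p1, Z1, T, fx, fy, cx, cy):
    """Warp pixel p1 with depth Z1 from the first image into the second image.
    Returns the second 2-D coordinates and the depth predicted by the transform."""
    P1 = back_project(np.asarray(p1, float), Z1, fx, fy, cx, cy)  # step 401: π⁻¹
    P2 = T[:3, :3] @ P1 + T[:3, 3]                                # step 402: T(ξ)
    p2 = project(P2, fx, fy, cx, cy)                              # step 403: π
    return p2, P2[2]  # step 404 then reads the observed depth Z2 at p2
```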
  • Step 405 Determine a pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
  • after the second depth value is obtained, the pose transformation parameter of the depth sensor may be determined in combination with the first depth value.
  • step 405 can be implemented by the following steps not shown in FIG. 4: determining the depth difference between the first depth image and the second depth image according to the first depth value and the second depth value; determining the depth difference as the depth residual and, based on the depth residual, performing the iterative steps of: determining a pose estimation increment based on the depth residual; determining whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulating the pose estimation increment and the first depth value to determine a pose estimation value, and determining the pose transformation parameter of the depth sensor according to the pose estimation value; and, in response to the depth residual being greater than or equal to the preset threshold, determining the accumulated pose estimation increment as the depth residual and continuing to perform the iterative steps.
  • first, the difference between the first depth value and the second depth value is determined as the depth difference between the first depth image and the second depth image.
  • the pose estimation increment is then determined based on the depth residual, after which it is determined whether the depth residual is less than the preset threshold. If it is, the pose estimation increment and the first depth value are accumulated to obtain the pose estimation value; the difference between the pose estimation value and the second depth value is then within an acceptable range, so the pose transformation parameter of the depth sensor can be determined directly according to the pose estimation value. If the depth residual is greater than or equal to the preset threshold, the pose estimation increments obtained in each iteration are accumulated, the accumulated value is taken as the new depth residual, and the iterative steps continue.
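  • read literally, the control flow just described might be transcribed as follows; compute_increment is a hypothetical placeholder for however the pose estimation increment is derived from the residual (e.g., one Gauss-Newton step), which the patent leaves unspecified:

```python
def iterate_pose(depth_residual: float, first_depth_value: float,
                 compute_increment, threshold: float, max_iters: int = 100) -> float:
    """Literal transcription of the iterative step: accumulate increments until
    the residual drops below the preset threshold."""
    residual = depth_residual
    accumulated_increment = 0.0
    for _ in range(max_iters):
        increment = compute_increment(residual)   # pose estimation increment
        if residual < threshold:
            # accumulate the increment with the first depth value -> pose estimate
            return first_depth_value + accumulated_increment + increment
        accumulated_increment += increment
        residual = accumulated_increment          # accumulated increment becomes
                                                  # the new depth residual
    return first_depth_value + accumulated_increment
```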
  • the pose transformation parameter of the depth sensor may be determined according to the following formula:
  • ⁇ * arg min ⁇
  • ⁇ * is an estimated value of the pose transformation parameter
  • Z 2 is a second depth image
  • P 1 is a pixel point in the first depth image
  • P 2 ' is a corresponding pixel point of P 1 in the second depth image
  • Z 2 (P 2 ') is the depth value of the P 2 ' point in the second depth image (ie, the first depth value)
  • T is the Lie group
  • is the pose transformation parameter
  • T( ⁇ ) is the preset bit.
  • ⁇ -1 is the ⁇ -1 map in the pinhole camera model
  • [T( ⁇ ) ⁇ -1 (P 1 )] Z is the point where P 1 is transformed into a three-dimensional space point by ⁇ mapping and then The space is transformed to the depth value of the pixel point obtained by the ⁇ -1 mapping of the coordinate system to which the second depth image belongs
  • the arg min function is such that ⁇
  • the ⁇ value at the minimum value is denoted as ⁇ *.
  • the pose estimation method provided by the above embodiment of the present application uses the depth residual between the depth images to solve for the pose transformation parameters, which avoids the complicated process of extracting feature points and establishing descriptors in the prior art, saves computing resources, and ensures the real-time performance of the calculation.
  • the pose estimation method of the embodiment may further include the following steps after obtaining the pose transformation parameter of the depth sensor:
  • Step 601 Obtain angular velocity and acceleration from an inertial measurement device physically bound to the depth sensor.
  • an inertial measurement unit may be physically bound to the depth sensor.
  • the above physical binding can be understood as the inertial measurement device being aligned with the center of the depth sensor and fixed together with it.
  • the inertial measurement device measures the angular velocity and acceleration of the object's movement.
  • the server performing the pose estimation method can acquire the angular velocity and the acceleration from the inertial measurement device by wire or wirelessly.
  • Step 602 Determine a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration.
  • the server may determine the pose transformation parameters of the inertial measurement device after acquiring the angular velocity and acceleration described above.
  • step 602 may be implemented by the following steps not shown in FIG. 6:
  • determining the first pose transformation parameter from the angular velocity and determining the second pose transformation parameter from the acceleration are well known to those skilled in the art, and details are not described herein again.
  • after the first pose transformation parameter and the second pose transformation parameter are obtained, the two can be fused to determine the pose transformation parameter of the inertial measurement device.
  • Step 603 Fuse the pose transformation parameter of the depth sensor and the pose transformation parameter of the inertial measurement device to determine the integrated pose transformation parameter.
  • various coupling methods can be used to fuse the two and determine the integrated pose transformation parameters.
  • the acceleration and angular velocity may be first filtered prior to step 602.
  • a complementary filter can be used to remove noise from the acceleration and angular velocity, improving the accuracy of the pose transformation parameters.
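  • as a minimal single-axis sketch of such a complementary filter (the blend factor alpha is a hypothetical tuning parameter; the patent gives no formula):

```python
def complementary_filter(prev_angle: float, gyro_rate: float,
                         accel_angle: float, dt: float, alpha: float = 0.98) -> float:
    """Blend high-frequency gyro integration with the low-frequency
    accelerometer tilt estimate to suppress noise and drift."""
    return alpha * (prev_angle + gyro_rate * dt) + (1.0 - alpha) * accel_angle
```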
  • the pose estimation method provided by the above embodiment of the present application can improve the accuracy of the pose estimation parameter.
  • the present application provides an embodiment of a pose estimation apparatus, the apparatus embodiment corresponding to the method embodiment shown in FIG. 2; the apparatus can be specifically applied to a variety of electronic devices.
  • the pose estimating apparatus 700 of the present embodiment includes a first acquiring unit 701, an image selecting unit 702, a pixel point set determining unit 703, and a first parameter determining unit 704.
  • the first obtaining unit 701 is configured to acquire a depth image video from the depth sensor.
  • the image selecting unit 702 is configured to select the first depth image and the second depth image from each frame depth image of the depth image video.
  • At least one pixel point indicating the same object is shared in the first depth image and the second depth image.
  • the pixel point set determining unit 703 is configured to determine a first pixel point set in the first depth image and a second pixel point set in the second depth image.
  • the pixel points in the first pixel point set are in one-to-one correspondence with the pixel points in the second pixel point set, and the corresponding two pixel points indicate the same object.
  • the first parameter determining unit 704 is configured to, for any pixel point in the first pixel point set, determine the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
  • the foregoing apparatus 700 may further include a pre-processing unit not shown in FIG. 7.
  • the pre-processing unit includes a pixel point deletion module and a smoothing module.
  • the pixel point deletion module is configured to, before the image selecting unit 702 selects the first depth image and the second depth image from the frame depth images, delete, for each frame depth image, the pixel points in the frame depth image that meet the preset condition.
  • the smoothing module is configured to smooth the depth image after the deletion.
  • the pixel point deletion module may be further configured to: detect the depth value of each pixel point; and delete the pixel points whose depth value is greater than the first preset value and whose depth value is smaller than the second preset value.
  • the pixel point deletion module may be further configured to: determine a first partial derivative of the frame depth image in the horizontal direction and a second partial derivative in the vertical direction; determine the geometric edge pixel points in the frame depth image according to the first partial derivative and the second partial derivative; and delete the geometric edge pixel points.
  • the pixel point deletion module may be further configured to: determine the failed pixel points in the frame depth image for which no depth value exists; and delete the failed pixel points and the pixel points adjacent to the failed pixel points.
  • the first parameter determining unit 704 may further include a first mapping module, a transformation module, a second mapping module, a depth value determining module, and a first parameter determining module.
  • the first mapping module is configured to map the first two-dimensional coordinates of the pixel in the first depth image to the first three-dimensional coordinate of the coordinate system to which the first depth image belongs.
  • a transform module configured to transform the first three-dimensional space coordinate to a coordinate system to which the second depth image belongs, to obtain a second three-dimensional space coordinate.
  • the second mapping module is configured to map the second three-dimensional space coordinates into the second depth image to obtain the second two-dimensional coordinates.
  • a depth value determining module configured to determine a second depth value of the second two-dimensional coordinate in the second depth image.
  • the first parameter determining module is configured to determine a pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
  • the first parameter determining module may be further configured to: determine the depth difference between the first depth image and the second depth image according to the first depth value and the second depth value; determine the depth difference as the depth residual and, based on the depth residual, perform the following iterative steps: determining a pose estimation increment based on the depth residual; determining whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulating the pose estimation increment and the first depth value to determine a pose estimation value, and determining the pose transformation parameter of the depth sensor according to the pose estimation value; and, in response to the depth residual being greater than or equal to the preset threshold, determining the accumulated pose estimation increment as the depth residual and continuing the above iterative steps.
  • the pose estimating apparatus 700 may further include a second acquiring unit, a second parameter determining unit, and a parameter fusion unit not shown in FIG. 7.
  • a second acquisition unit is configured to acquire angular velocity and acceleration from an inertial measurement device physically bound to the depth sensor.
  • the second parameter determining unit is configured to determine a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration.
  • the parameter fusion unit is configured to combine the pose transformation parameter of the depth sensor and the pose transformation parameter of the inertial measurement device to determine the integrated pose transformation parameter.
  • the second parameter determining unit may further include a first sub-parameter determining module, a second sub-parameter determining module, and a converging module, which are not illustrated in FIG. 7 .
  • the first sub-parameter determining module is configured to determine a first pose change parameter of the inertial measurement device according to the angular velocity.
  • the second sub-parameter determining module is configured to determine a second pose change parameter of the inertial measurement device according to the acceleration.
  • the fusion module is configured to combine the first pose transformation parameter and the second pose transformation parameter to determine a pose transformation parameter of the inertial measurement device.
  • the pose estimating apparatus provided by the above embodiment of the present application acquires the depth image video collected by the depth sensor, selects two depth images sharing at least one pixel point indicating the same object, determines the mutually corresponding first pixel point set and second pixel point set in the two frame depth images, and then, for each pixel point in the first pixel point set, determines the pose transformation parameter of the depth sensor according to the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image, which reduces the consumption of computing resources, improves the calculation efficiency, and ensures the real-time performance of the pose estimation.
  • the units 701 to 704 described in the pose estimating apparatus 700 correspond to respective steps in the method described with reference to FIG. 2, respectively.
  • the operations and features described above for the pose estimation method are equally applicable to the apparatus 700 and the units contained therein, and are not described herein again.
  • Corresponding units of device 700 may cooperate with units in the server to implement the solution of the embodiments of the present application.
  • Referring to FIG. 8, a block diagram of a computer system 800 suitable for implementing the server of an embodiment of the present application is shown.
  • the server shown in FIG. 8 is merely an example, and should not impose any limitation on the function and scope of use of the embodiments of the present application.
  • the computer system 800 includes a central processing unit (CPU) 801 that can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage portion 808 into a random access memory (RAM) 803.
  • in the RAM 803, various programs and data required for the operation of the system 800 are also stored.
  • the CPU 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
  • An input/output (I/O) interface 805 is also coupled to bus 804.
  • the following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output portion 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like; a storage portion 808 including a hard disk and the like; and a communication portion 809 including a network interface card such as a LAN card or a modem. The communication portion 809 performs communication processing via a network such as the Internet.
  • a drive 810 is also coupled to the I/O interface 805 as needed.
  • a removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like, is mounted on the drive 810 as needed so that a computer program read therefrom is installed into the storage portion 808 as needed.
  • an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a machine readable medium, the computer program comprising program code for executing the method illustrated in the flowchart.
  • the computer program can be downloaded and installed from the network via communication portion 809, and/or installed from removable media 811.
  • when the computer program is executed by the central processing unit (CPU) 801, the above-described functions defined in the method of the present application are performed.
  • the computer readable medium described herein may be a computer readable signal medium or a computer readable storage medium or any combination of the two.
  • the computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer readable storage media may include, but are not limited to, electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable Programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the foregoing.
  • a computer readable storage medium may be any tangible medium that can contain or store a program, which can be used by or in connection with an instruction execution system, apparatus or device.
  • a computer readable signal medium may include a data signal that is propagated in the baseband or as part of a carrier, carrying computer readable program code. Such propagated data signals can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer readable signal medium can also be any computer readable medium other than a computer readable storage medium, which can transmit, propagate, or transport a program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
  • each block of the flowcharts or block diagrams can represent a module, a program segment, or a portion of code that includes one or more executable instructions for implementing the specified logic functions.
  • it should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two successively represented blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts can be implemented in a dedicated hardware-based system that performs the specified function or operation. Or it can be implemented by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present application may be implemented by software or by hardware.
  • the described units may also be disposed in a processor, for example described as: a processor including a first acquiring unit, an image selecting unit, a pixel point set determining unit, and a first parameter determining unit.
  • the names of these units do not constitute a limitation on the unit itself under certain circumstances.
  • the first acquisition unit may also be described as “a unit that acquires depth image video from the depth sensor”.
  • the present application also provides a computer readable medium, which may be included in the apparatus described in the above embodiments, or may be separately present and not incorporated into the apparatus.
  • the computer readable medium carries one or more programs that, when executed by the device, cause the device to: acquire a depth image video from a depth sensor; select a first depth image and a second depth image from the frame depth images of the depth image video, wherein the first depth image and the second depth image share at least one pixel point indicating the same object; determine a first pixel point set in the first depth image and a second pixel point set in the second depth image, wherein the pixel points in the first pixel point set are in one-to-one correspondence with the pixel points in the second pixel point set, and the corresponding two pixel points indicate the same object; and, for any pixel point in the first pixel point set, determine the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.

Abstract

Disclosed in the present invention are a pose estimation method and an apparatus. A specific embodiment of the method comprises: acquiring a range image video from a range sensor; selecting from among the individual range image frames a first range image and a second range image, the first range image and the second range image sharing at least one pixel point indicating the same object; determining a first pixel point set of the first range image and a second pixel point set of the second range image, each pixel point of the first pixel point set corresponding one-to-one with a pixel point of the second pixel point set, two corresponding pixel points indicating the same object; and, for any pixel point of the first pixel point set, determining a pose transformation parameter on the basis of the first two-dimensional coordinates of the pixel point in the first range image and a first range value of the corresponding pixel point in the second range image. The present embodiment decreases consumption of computational resources, increasing computational efficiency and ensuring real-time pose estimation.

Description

位姿估计方法和装置Position estimation method and device
相关申请的交叉引用Cross-reference to related applications
本专利申请要求于2017年5月9日提交的、申请号为201710321322.X、申请人为北京京东尚科信息技术有限公司和北京京东世纪贸易有限公司、发明名称为“位姿估计方法和装置”的中国专利申请的优先权,该申请的全文以引用的方式并入本申请中。This patent application is filed on May 9, 2017, the application number is 201710321322.X, the applicant is Beijing Jingdong Shangke Information Technology Co., Ltd. and Beijing Jingdong Century Trading Co., Ltd., and the invention name is “Position Estimation Method and Device”. Priority of the Chinese Patent Application, the entire contents of which is hereby incorporated by reference.
技术领域Technical field
本申请涉及计算机视觉技术领域,具体涉及位姿估计领域,尤其涉及一种位姿估计方法和装置。The present application relates to the field of computer vision technology, and in particular to the field of pose estimation, and in particular to a pose estimation method and apparatus.
背景技术Background technique
近年来随着传感器、处理器等硬件的快速更新换代和定位、重建、学习等先进算法涌现,推动了无人机和机器人等相关行业的快速发展,这些行业相关的核心技术主要包括:位姿估计、三维重建、路径规划、机器学习等。位姿估计尤其是视觉位姿估计涉及到图像处理、计算机视觉、惯性导航、数理统计、最优化等多个学科的知识,是许多新兴产业和行业的基础技术,在人们当前及今后的生产和生活中将发挥重要作用。In recent years, with the rapid updating of sensors, processors and other hardware and the emergence of advanced algorithms such as positioning, reconstruction, and learning, the rapid development of related industries such as drones and robots has been promoted. The core technologies related to these industries mainly include: poses. Estimation, 3D reconstruction, path planning, machine learning, etc. Pose estimation, especially visual pose estimation, involves knowledge of many disciplines such as image processing, computer vision, inertial navigation, mathematical statistics, optimization, etc. It is the basic technology of many emerging industries and industries, in current and future production and Life will play an important role.
现有的位姿估计方法,通常需要提取图像的特征点,并建立描述子,因此具有消耗大量计算资源的特点,位姿估计的实时性较差。The existing pose estimation method usually needs to extract the feature points of the image and establish a descriptor, so it has the characteristics of consuming a large amount of computing resources, and the real-time performance of the pose estimation is poor.
发明内容Summary of the invention
本申请的目的在于提出一种位姿估计方法和装置,来解决以上背景技术部分提到的技术问题。The purpose of the present application is to propose a pose estimation method and apparatus to solve the technical problems mentioned in the background art section above.
第一方面,本申请实施例提供了一种位姿估计方法,从深度传感器处获取深度图像视频;从所述深度图像视频的各帧深度图像中选取 第一深度图像和第二深度图像,其中,所述第一深度图像和所述第二深度图像中共有至少一个指示同一物体的像素点;确定所述第一深度图像中的第一像素点集合以及所述第二深度图像中的第二像素点集合,其中,所述第一像素点集合中的各像素点与所述第二像素点集合中的各像素点一一对应,且对应的两个像素点指示同一物体;对于所述第一像素点集合中的任一像素点,基于该像素点在所述第一深度图像中的第一二维坐标及与该像素点对应的对应像素点在所述第二深度图像中的第一深度值,确定所述深度传感器的位姿变换参数。In a first aspect, an embodiment of the present application provides a pose estimation method for acquiring a depth image video from a depth sensor, and selecting a first depth image and a second depth image from each frame depth image of the depth image video, where Determining at least one pixel point indicating the same object in the first depth image and the second depth image; determining a first pixel point set in the first depth image and a second one in the second depth image a set of pixel points, wherein each pixel point in the first set of pixel points is in one-to-one correspondence with each pixel point in the second set of pixel points, and corresponding two pixel points indicate the same object; Any one of the pixels in the set of pixels, based on the first two-dimensional coordinates of the pixel in the first depth image and the first pixel corresponding to the pixel in the second depth image a depth value that determines a pose transformation parameter of the depth sensor.
在一些实施例中,在所述从所述各帧深度图像中选取第一深度图像和第二深度图像之前,所述方法还包括:对于每帧深度图像,删除该帧深度图像中符合预设条件的像素点;对删除后的深度图像进行平滑处理。In some embodiments, before the selecting the first depth image and the second depth image from the frame depth images, the method further includes deleting, for each frame depth image, the preset in the frame depth image The pixel of the condition; smoothing the deleted depth image.
在一些实施例中,所述删除该帧深度图像中符合预设条件的像素点,包括:检测每个像素点的深度值;将深度值大于第一预设值,且深度值小于第二预设值的像素点删除。In some embodiments, the deleting the pixel points in the frame depth image that meet the preset condition comprises: detecting a depth value of each pixel point; and the depth value is greater than the first preset value, and the depth value is less than the second pre- The pixel of the set value is deleted.
在一些实施例中,所述删除该帧深度图像中符合预设条件的像素点,包括:确定该帧深度图像在水平方向的第一偏导数和在竖直方向上的第二偏导数;根据所述第一偏导数和所述第二偏导数,确定该帧深度图像中的几何边缘像素点;将所述几何边缘像素点删除。In some embodiments, the deleting a pixel point in the frame depth image that meets a preset condition comprises: determining a first partial derivative of the frame depth image in a horizontal direction and a second partial derivative in a vertical direction; Determining, by the first partial derivative and the second partial derivative, a geometric edge pixel in the frame depth image; deleting the geometric edge pixel.
在一些实施例中,所述删除该帧深度图像中符合预设条件的像素点,包括:确定该帧深度图像中深度值不存在的失效像素点;将所述失效像素点以及与所述失效像素点相邻的像素点删除。In some embodiments, the deleting a pixel point in the frame depth image that meets a preset condition includes: determining a failed pixel point that the depth value does not exist in the frame depth image; and the invalidating pixel point and the invalidation The pixels adjacent to the pixel are deleted.
In some embodiments, determining the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image includes: mapping the first two-dimensional coordinates of the pixel point in the first depth image to first three-dimensional space coordinates in the coordinate system to which the first depth image belongs; transforming the first three-dimensional space coordinates into the coordinate system to which the second depth image belongs to obtain second three-dimensional space coordinates; mapping the second three-dimensional space coordinates into the second depth image to obtain second two-dimensional coordinates; determining a second depth value of the second two-dimensional coordinates in the second depth image; and determining the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
In some embodiments, determining the pose transformation parameter of the depth sensor based on the first depth value and the second depth value includes: determining a depth difference between the first depth image and the second depth image according to the first depth value and the second depth value; taking the depth difference as a depth residual and, based on the depth residual, performing the following iterative steps: determining a pose estimation increment based on the depth residual; determining whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulating the pose estimation increment with the first depth value to determine a pose estimate, and determining the pose transformation parameter of the depth sensor according to the pose estimate; and, in response to the depth residual being greater than or equal to the preset threshold, taking the accumulated pose estimation increments as the depth residual and continuing the iterative steps.
In some embodiments, the method further includes: acquiring an angular velocity and an acceleration from an inertial measurement device physically bound to the depth sensor; determining a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration; and fusing the pose transformation parameter of the depth sensor with the pose transformation parameter of the inertial measurement device to determine an integrated pose transformation parameter.
In some embodiments, determining the pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration includes: determining a first pose transformation parameter of the inertial measurement device according to the angular velocity; determining a second pose transformation parameter of the inertial measurement device according to the acceleration; and fusing the first pose transformation parameter with the second pose transformation parameter to determine the pose transformation parameter of the inertial measurement device.
In a second aspect, an embodiment of the present application provides a pose estimation apparatus, including: a first acquisition unit configured to acquire a depth image video from a depth sensor; an image selection unit configured to select a first depth image and a second depth image from the frames of the depth image video, where the first depth image and the second depth image share at least one pixel point indicating the same object; a pixel point set determination unit configured to determine a first pixel point set in the first depth image and a second pixel point set in the second depth image, where the pixel points in the first set correspond one-to-one with the pixel points in the second set and each corresponding pair of pixel points indicates the same object; and a first parameter determination unit configured to determine, for any pixel point in the first pixel point set, a pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of that pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
In some embodiments, the apparatus further includes a preprocessing unit comprising a pixel point deletion module and a smoothing module. The pixel point deletion module is configured to delete, for each frame of depth image and before the image selection unit selects the first depth image and the second depth image, the pixel points in that frame that meet a preset condition; the smoothing module is configured to smooth the depth image after the deletion.
In some embodiments, the pixel point deletion module is further configured to: detect the depth value of each pixel point; and delete the pixel points whose depth value is greater than the first preset value and less than the second preset value.
In some embodiments, the pixel point deletion module is further configured to: determine the first partial derivative of the frame of depth image in the horizontal direction and the second partial derivative in the vertical direction; determine the geometric edge pixel points in the frame according to the first partial derivative and the second partial derivative; and delete the geometric edge pixel points.
In some embodiments, the pixel point deletion module is further configured to: determine the invalid pixel points in the frame of depth image for which no depth value exists; and delete the invalid pixel points and the pixel points adjacent to the invalid pixel points.
In some embodiments, the first parameter determination unit includes: a first mapping module configured to map the first two-dimensional coordinates of the pixel point in the first depth image to first three-dimensional space coordinates in the coordinate system to which the first depth image belongs; a transformation module configured to transform the first three-dimensional space coordinates into the coordinate system to which the second depth image belongs to obtain second three-dimensional space coordinates; a second mapping module configured to map the second three-dimensional space coordinates into the second depth image to obtain second two-dimensional coordinates; a depth value determination module configured to determine a second depth value of the second two-dimensional coordinates in the second depth image; and a first parameter determination module configured to determine the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
In some embodiments, the first parameter determination module is further configured to: determine the depth difference between the first depth image and the second depth image according to the first depth value and the second depth value; take the depth difference as the depth residual and, based on the depth residual, perform the following iterative steps: determine a pose estimation increment based on the depth residual; determine whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulate the pose estimation increment with the first depth value to determine a pose estimate, and determine the pose transformation parameter of the depth sensor according to the pose estimate; and, in response to the depth residual being greater than or equal to the preset threshold, take the accumulated pose estimation increments as the depth residual and continue the iterative steps.
In some embodiments, the apparatus further includes: a second acquisition unit configured to acquire an angular velocity and an acceleration from an inertial measurement device physically bound to the depth sensor; a second parameter determination unit configured to determine a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration; and a parameter fusion unit configured to fuse the pose transformation parameter of the depth sensor with the pose transformation parameter of the inertial measurement device to determine an integrated pose transformation parameter.
In some embodiments, the second parameter determination unit includes: a first sub-parameter determination module configured to determine a first pose transformation parameter of the inertial measurement device according to the angular velocity; a second sub-parameter determination module configured to determine a second pose transformation parameter of the inertial measurement device according to the acceleration; and a fusion module configured to fuse the first pose transformation parameter with the second pose transformation parameter to determine the pose transformation parameter of the inertial measurement device.
In a third aspect, an embodiment of the present application provides a server, including: one or more processors; and a storage device storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method described in any of the above embodiments.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, the program implementing the method described in any of the above embodiments when executed by a processor.
The pose estimation method and apparatus provided by the present application acquire, from a depth sensor, the depth image video it has collected; select from it two frames that share at least one pixel point indicating the same object; determine the mutually corresponding first pixel point set and second pixel point set in those two frames; and then, for each pixel point in the first set, determine the pose transformation parameter of the depth sensor according to its first two-dimensional coordinates in the first depth image and the first depth value of the corresponding pixel point in the second depth image. By performing pose estimation directly on depth images, the method reduces the consumption of computing resources, improves computational efficiency, and ensures real-time pose estimation.
BRIEF DESCRIPTION OF THE DRAWINGS
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
FIG. 1 is an exemplary system architecture diagram to which the present application may be applied;
FIG. 2 is a flowchart of an embodiment of a pose estimation method according to the present application;
FIG. 3 is a schematic diagram of an application scenario of the pose estimation method according to the present application;
FIG. 4 is a flowchart of determining the pose transformation parameter of the depth sensor in the pose estimation method according to the present application;
FIG. 5 is a schematic diagram of the pose transformation principle in the pose estimation method according to the present application;
FIG. 6 is a flowchart of another embodiment of the pose estimation method according to the present application;
FIG. 7 is a schematic structural diagram of an embodiment of a pose estimation apparatus according to the present application;
FIG. 8 is a schematic structural diagram of a computer system suitable for implementing the server of an embodiment of the present application.
DETAILED DESCRIPTION
The present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the relevant invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with each other. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 in which embodiments of the pose estimation method or pose estimation apparatus of the present application may be applied.
As shown in FIG. 1, the system architecture 100 may include a depth sensor 101, a network 102, and a server 103. The network 102 serves as the medium providing a communication link between the depth sensor 101 and the server 103 and may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.
The depth sensor 101 interacts with the server 103 through the network 102, for example to send depth image video. The depth sensor 101 may be mounted on various moving objects, for example unmanned vehicles, robots, unmanned delivery vehicles, smart wearable devices, or virtual reality devices.
The depth sensor 101 may be any depth sensor capable of continuously capturing multiple frames of depth images.
The server 103 may be a server providing various services, for example a back-end server that processes the depth image video captured by the depth sensor 101. The back-end server may analyze and otherwise process the received depth image video and other data.
It should be noted that the pose estimation method provided by the embodiments of the present application is generally performed by the server 103; accordingly, the pose estimation apparatus is generally disposed in the server 103.
It should be understood that the numbers of depth sensors, networks, and servers in FIG. 1 are merely illustrative. Any number of depth sensors, networks, and servers may be provided according to implementation needs.
With continued reference to FIG. 2, a flow 200 of an embodiment of a pose estimation method according to the present application is shown. The pose estimation method of this embodiment includes the following steps:
Step 201: acquire a depth image video from the depth sensor.
In this embodiment, the electronic device on which the pose estimation method runs (for example, the server shown in FIG. 1) may acquire the depth image video from the depth sensor through a wired or wireless connection. Each frame of the depth image video is a depth image. A depth image, also called a range image, is an image whose pixel values are the distances (depths) from the image collector to points in the scene; it directly reflects the geometry of the visible surfaces of the scene. Each pixel point in a depth image represents the distance, within the depth sensor's field of view, between the object at particular coordinates and the camera plane of the depth sensor.
It should be noted that the above wireless connection may include, but is not limited to, 3G/4G, WiFi, Bluetooth, WiMAX, Zigbee, and UWB (ultra wideband) connections, as well as other wireless connection methods now known or developed in the future.
Step 202: select a first depth image and a second depth image from the frames of the depth image video.
The first depth image and the second depth image share at least one pixel point indicating the same object; in other words, at least one object appears in both images. For example, the first depth image and the second depth image may be two adjacent frames of the depth image video, or two frames whose sequence numbers in the video differ by less than a preset value.
Step 203: determine a first pixel point set in the first depth image and a second pixel point set in the second depth image.
The pixel points in the first set correspond one-to-one with the pixel points in the second set, and each corresponding pair of pixel points indicates the same object; more specifically, the same location on the same object. It can be understood that the first and second pixel point sets contain equal numbers of pixel points, and that this number equals the number of pixel points indicating the same object that the first and second depth images share.
Step 204: for any pixel point in the first pixel point set, determine the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of that pixel point in the first depth image and the first depth value, in the second depth image, of the corresponding pixel point.
The server may determine the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates, in the first depth image, of each pixel point in the first set and on the first depth value, in the second depth image, of the corresponding pixel point in the second set. It can be understood that the first two-dimensional coordinates are the coordinates of the pixel point in the image coordinate system of the first depth image and do not include its depth value. The corresponding pixel point belongs to the second pixel point set; since every pixel point has a depth value, the first depth value of the corresponding pixel point can be determined from the second depth image. The pose transformation parameter may be the pose transformation parameter between the first depth image and the second depth image.
With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the pose estimation method of this embodiment. In the scenario of FIG. 3, a depth sensor 301 is mounted on an unmanned vehicle 302. As the vehicle travels, the depth sensor 301 captures depth image video and sends it to a server 303. After receiving the video and determining the pose transformation parameter of the depth sensor 301, the server 303 sends this parameter to the unmanned vehicle 302, which can then use it for navigation and obstacle avoidance.
The pose estimation method provided by the above embodiment of the present application acquires the depth image video collected by a depth sensor, selects from it two frames sharing at least one pixel point indicating the same object, determines the mutually corresponding first and second pixel point sets in those frames, and then, for each pixel point in the first set, determines the pose transformation parameter of the depth sensor according to its first two-dimensional coordinates in the first depth image and the first depth value of the corresponding pixel point in the second depth image. This reduces the consumption of computing resources, improves computational efficiency, and ensures real-time pose estimation.
In some optional implementations of this embodiment, the method further includes the following steps, not shown in FIG. 2:
For each frame of depth image, delete the pixel points in that frame that meet a preset condition, and smooth the depth image after the deletion.
A depth sensor typically emits probing light (for example infrared, laser, or radar) and receives the probing light reflected from object surfaces to determine the distance between the objects and the sensor. Because of occlusion and because object surfaces absorb or diffusely reflect the probing light, the sensor cannot fully receive the reflected light, so many pixel points in a depth image lack a depth value or carry an inaccurate one. In this implementation, to ensure the accuracy of the pose estimation, the pixel points in each frame that meet the preset condition are deleted. At the same time, to improve the robustness of the depth values and suppress their noise, the depth image from which pixel points have been deleted may be smoothed. The smoothing may include linear smoothing, interpolation smoothing, convolution smoothing, Gaussian filtering, bilateral filtering, and the like.
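As one illustration of the last option (a sketch, not part of the claimed embodiments), a bilateral filter suppresses noise while preserving depth discontinuities. The sketch below assumes OpenCV, a single-channel float32 depth image with deleted pixels marked NaN, and placeholder filter parameters:

```python
import cv2
import numpy as np

def smooth_depth(depth: np.ndarray) -> np.ndarray:
    """Edge-preserving smoothing of a float32 depth image (deleted pixels = NaN)."""
    invalid = ~np.isfinite(depth)
    # Fill NaNs with 0 so the filter has valid input; restore them afterwards.
    filled = np.where(invalid, 0.0, depth).astype(np.float32)
    # d, sigmaColor, sigmaSpace are illustrative values, not taken from the patent.
    smoothed = cv2.bilateralFilter(filled, d=5, sigmaColor=0.1, sigmaSpace=5.0)
    smoothed[invalid] = np.nan  # keep deleted pixels deleted
    return smoothed
```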
In some optional implementations of this embodiment, the depth value of each pixel point may first be detected, and the pixel points whose depth value is greater than a first preset value and less than a second preset value may be deleted.
Due to the limitations of the depth sensor itself, pixel points whose depth value lies between the first preset value and the second preset value have very high uncertainty, so these pixel points need to be deleted. It can be understood that the first preset value and the second preset value depend on the model of the depth sensor, which this implementation does not limit.
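A minimal sketch of this range-based deletion, assuming a float32 depth image in meters; the thresholds are placeholders, and "deleted" pixels are marked NaN rather than removed from the array:

```python
import numpy as np

def delete_unreliable_range(depth: np.ndarray,
                            first_preset: float,
                            second_preset: float) -> np.ndarray:
    """Invalidate pixels whose depth lies in the unreliable (first, second) band."""
    out = depth.copy()
    unreliable = (out > first_preset) & (out < second_preset)
    out[unreliable] = np.nan  # "delete" by invalidating the depth value
    return out
```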
In some optional implementations of this embodiment, a first partial derivative Zu of each frame of depth image in the horizontal direction u and a second partial derivative Zv in the vertical direction v may be determined; the geometric edge pixel points in the frame are then determined from Zu and Zv, and the determined geometric edge pixel points are deleted.
Because the probing-light emitter and the probing-light receiver of the depth sensor are not at the same position, the depth values of the pixel points at object edges are highly uncertain, and the depth values of the pixel points on either side of such edges jump. To ensure the accuracy of the pose estimation, these geometric edge pixel points can be deleted.
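The sketch below detects such edges from the gradient magnitude of the depth image; np.gradient supplies the partial derivatives Zv and Zu along the image rows and columns, and the threshold value is an assumed placeholder:

```python
import numpy as np

def delete_geometric_edges(depth: np.ndarray, grad_thresh: float = 0.1) -> np.ndarray:
    """Invalidate pixels whose depth gradient magnitude marks a geometric edge."""
    Zv, Zu = np.gradient(depth)           # partial derivatives along v (rows) and u (cols)
    edge = np.hypot(Zu, Zv) > grad_thresh  # large depth jumps indicate object edges
    out = depth.copy()
    out[edge] = np.nan
    return out
```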
In some optional implementations of this embodiment, the invalid pixel points for which no depth value exists in each frame of depth image may be determined, and these invalid pixel points, together with the pixel points adjacent to them, are deleted.
If the probing light emitted by the depth sensor's emitter is blocked or absorbed by an object, the receiver cannot receive the light reflected back from the object, so the depth value of the corresponding pixel point cannot be determined; such pixel points are called invalid pixel points. To further improve the accuracy of the pose estimation, the pixel points adjacent to the invalid pixel points are deleted as well.
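One way to remove invalid pixels and their neighbors is to grow the invalid mask by one pixel, as sketched below with SciPy; the one-step dilation with its default 4-neighborhood is an assumption, since the patent does not fix the neighborhood used:

```python
import numpy as np
from scipy.ndimage import binary_dilation

def delete_invalid_and_neighbors(depth: np.ndarray) -> np.ndarray:
    """Invalidate pixels with no depth value plus their adjacent pixels."""
    invalid = ~np.isfinite(depth) | (depth == 0)    # many sensors report 0 for "no return"
    grown = binary_dilation(invalid, iterations=1)  # add the adjacent pixels
    out = depth.astype(np.float32).copy()
    out[grown] = np.nan
    return out
```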
With continued reference to FIG. 4, a flow 400 of determining the pose transformation parameter of the depth sensor in the pose estimation method according to the present application is shown. As shown in FIG. 4, in this embodiment the pose transformation parameter of the depth sensor can be determined through the following steps:
Step 401: map the first two-dimensional coordinates of the pixel point in the first depth image to first three-dimensional space coordinates in the coordinate system to which the first depth image belongs.
For any pixel point in the first pixel point set, its first two-dimensional coordinates (x₁, y₁) in the first depth image may first be determined and then mapped to first three-dimensional space coordinates. During the mapping, the π⁻¹ mapping of the pinhole camera model yields the first three-dimensional space coordinates (x₁′, y₁′, z₁′) in the coordinate system to which the first depth image belongs.
The pinhole camera model comprises the mapping π, which projects a three-dimensional space point to two-dimensional coordinates on the pixel plane, and the mapping π⁻¹, which maps the two-dimensional coordinates of an image point carrying a depth value to a three-dimensional space point.
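A minimal sketch of π and π⁻¹ under the standard pinhole model; the intrinsics fx, fy, cx, cy are assumed values, since the patent does not specify them:

```python
import numpy as np

fx, fy, cx, cy = 525.0, 525.0, 319.5, 239.5  # assumed camera intrinsics

def pi_inv(u: float, v: float, z: float) -> np.ndarray:
    """pi^-1: pixel (u, v) with depth z -> 3D point in the camera frame."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

def pi(p: np.ndarray) -> tuple:
    """pi: 3D camera-frame point -> pixel coordinates on the image plane."""
    x, y, z = p
    return fx * x / z + cx, fy * y / z + cy
```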
Step 402: transform the first three-dimensional space coordinates into the coordinate system to which the second depth image belongs, obtaining second three-dimensional space coordinates.
In this embodiment, the pose transformation parameter from the first depth image to the second depth image is denoted T₁→₂; see FIG. 5. In FIG. 5, a pedestrian in the world coordinate system x_w-y_w-z_w is denoted as point P; the depth sensor is on the left at time t₁ and on the right at time t₂. In the depth image obtained at t₁, P appears as point P₁ with coordinates (x₁, y₁) and depth value Z₁, in the coordinate system x_c1-y_c1-z_c1. In the depth image obtained at t₂, P appears as point P₂ with coordinates (x₂, y₂) and depth value Z₂, in the coordinate system x_c2-y_c2-z_c2. The pose transformation parameter between the coordinate systems x_c1-y_c1-z_c1 and x_c2-y_c2-z_c2 is T₁→₂.
The pose transformation parameter T₁→₂ is represented in the Lie algebra se(3) as ξ, and ξ expressed as a matrix is the Lie group element T(ξ). When transforming the first three-dimensional space coordinates into the coordinate system to which the second depth image belongs, a Lie group element T(ξ) can be preset and used to perform the transformation, yielding the second three-dimensional coordinates (x₂′, y₂′, z₂′).
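A sketch of building T(ξ) from ξ and applying it to a point; the se(3) exponential map via Rodrigues' formula is textbook Lie-group machinery under the parameterization ξ = (ρ, φ), not text quoted from the patent:

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_se3(xi):
    """xi = (rho, phi) in R^6 -> 4x4 rigid transform T(xi)."""
    rho, phi = xi[:3], xi[3:]
    theta = np.linalg.norm(phi)
    W = hat(phi)
    if theta < 1e-9:
        R, V = np.eye(3) + W, np.eye(3)
    else:
        R = (np.eye(3) + np.sin(theta) / theta * W
             + (1 - np.cos(theta)) / theta**2 * W @ W)
        V = (np.eye(3) + (1 - np.cos(theta)) / theta**2 * W
             + (theta - np.sin(theta)) / theta**3 * W @ W)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, V @ rho
    return T

def transform(T, p):
    """Apply a 4x4 rigid transform to a 3D point."""
    return (T @ np.append(p, 1.0))[:3]
```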
Step 403: map the second three-dimensional space coordinates into the second depth image, obtaining second two-dimensional coordinates.
In this embodiment, the second two-dimensional coordinates (x₂, y₂) can be obtained using the π mapping of the pinhole camera model.
Step 404: determine the second depth value of the second two-dimensional coordinates in the second depth image.
The second depth value of the second two-dimensional coordinates (x₂, y₂) is determined in the second depth image. Ideally, this second depth value equals the first depth value of the corresponding pixel point in the second pixel point set; because of noise, however, the two usually differ.
Step 405: determine the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
After the second depth value is obtained, it can be combined with the first depth value to determine the pose transformation parameter of the depth sensor.
In some optional implementations of this embodiment, step 405 may be implemented through the following steps, not shown in FIG. 4:
Determine the depth difference between the first depth image and the second depth image according to the first depth value and the second depth value; take the depth difference as the depth residual and, based on the depth residual, perform the following iterative steps: determine a pose estimation increment based on the depth residual; determine whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulate the pose estimation increment with the first depth value to determine a pose estimate, and determine the pose transformation parameter of the depth sensor according to the pose estimate; in response to the depth residual being greater than or equal to the preset threshold, take the accumulated pose estimation increments as the depth residual and continue the iterative steps.
In this implementation, if the first depth value and the second depth value are not equal, their difference is taken as the depth difference between the first depth image and the second depth image. This depth difference is used as the depth residual, and the iterative steps are performed: a pose estimation increment is determined from the depth residual, and the residual is compared with a preset threshold. If the residual is below the threshold, the pose estimation increment is accumulated with the first depth value to obtain a pose estimate; the difference between this pose estimate and the second depth value is then within an acceptable range, so the pose transformation parameter of the depth sensor can be determined directly from the pose estimate. If the depth residual is greater than or equal to the preset threshold, the pose estimation increments obtained in each iteration are accumulated, the accumulated value is taken as the new depth residual, and the iterative steps continue.
In this implementation, the pose transformation parameter of the depth sensor can be determined according to the following formula:
ξ* = arg min_ξ Σ |Z₂(P₂′) − [T(ξ)·π⁻¹(P₁)]_Z|
where ξ* is the estimate of the pose transformation parameter; Z₂ is the second depth image; P₁ is a pixel point in the first depth image; P₂′ is the pixel point corresponding to P₁ in the second depth image; Z₂(P₂′) is the depth value of the point P₂′ in the second depth image (i.e., the first depth value); T is a Lie group element, ξ is the pose transformation parameter, and T(ξ) is the preset Lie group element of the pose transformation; π⁻¹ is the π⁻¹ mapping of the pinhole camera model; [T(ξ)·π⁻¹(P₁)]_Z is the depth value (i.e., the second depth value) obtained by mapping P₁ to a three-dimensional space point via π⁻¹, transforming it into the coordinate system to which the second depth image belongs, and taking its Z component; and the arg min function denotes the value of ξ that minimizes Σ |Z₂(P₂′) − [T(ξ)·π⁻¹(P₁)]_Z|, denoted ξ*.
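To make the objective concrete, the sketch below evaluates the summed depth residual for a candidate ξ and hands it to a generic optimizer; it reuses pi, pi_inv, exp_se3, and transform from the earlier sketches. The optimizer call stands in for the increment-accumulation iteration described above, whose exact update rule the patent does not give in closed form, and points_1, depths_1, and depth_image_2 are hypothetical inputs:

```python
import numpy as np
from scipy.optimize import minimize

def depth_residual(xi, points_1, depths_1, depth_image_2):
    """Sum of |Z2(P2') - [T(xi) * pi^-1(P1)]_Z| over the pixel correspondences."""
    T = exp_se3(np.asarray(xi))
    total = 0.0
    for (u, v), z1 in zip(points_1, depths_1):
        p_cam2 = transform(T, pi_inv(u, v, z1))  # 3D point in the second camera frame
        u2, v2 = pi(p_cam2)                      # project into the second depth image
        r, c = int(round(v2)), int(round(u2))
        if 0 <= r < depth_image_2.shape[0] and 0 <= c < depth_image_2.shape[1]:
            z2 = depth_image_2[r, c]
            if np.isfinite(z2):
                total += abs(z2 - p_cam2[2])     # [.]_Z is the Z component
    return total

# xi_star approximates the pose transformation parameter estimate:
# res = minimize(depth_residual, np.zeros(6),
#                args=(points_1, depths_1, depth_image_2), method="Nelder-Mead")
```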
The pose estimation method provided by the above embodiment of the present application uses the depth residual between depth images to solve for the pose transformation parameter, avoiding the laborious prior-art process of extracting feature points and building descriptors, saving computing resources, and ensuring real-time computation.
With further reference to FIG. 6, a flow 600 of another embodiment of the pose estimation method according to the present application is shown. As shown in FIG. 6, after the pose transformation parameter of the depth sensor is obtained, the pose estimation method of this embodiment may further include the following steps:
Step 601: acquire an angular velocity and an acceleration from an inertial measurement device physically bound to the depth sensor.
In this embodiment, to further ensure the accuracy of the pose estimation, an inertial measurement unit (IMU) may be physically bound to the depth sensor. Physical binding can be understood as aligning the center of the inertial measurement device with that of the depth sensor and fixing the two together. The inertial measurement device measures the angular velocity and acceleration of a moving object. In this embodiment, the server performing the pose estimation method may acquire the angular velocity and acceleration from the inertial measurement device by wire or wirelessly.
Step 602: determine the pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration.
After acquiring the angular velocity and acceleration, the server may determine the pose transformation parameter of the inertial measurement device.
In some optional implementations of this embodiment, step 602 may be implemented through the following steps, not shown in FIG. 6:
Determine a first pose transformation parameter of the inertial measurement device according to the angular velocity; determine a second pose transformation parameter of the inertial measurement device according to the acceleration; and fuse the first pose transformation parameter with the second pose transformation parameter to determine the pose transformation parameter of the inertial measurement device.
In this implementation, determining the first pose transformation parameter from the angular velocity and the second pose transformation parameter from the acceleration is well known to those skilled in the art and is not described further here. After the first and second pose transformation parameters are obtained, the two can be fused to determine the pose transformation parameter of the inertial measurement device.
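As one common realization of this fusion (an illustration, not mandated by the patent), a complementary blend combines the gyro-integrated attitude, which drifts slowly, with the accelerometer-derived tilt, which is noisy but drift-free. The single-axis (roll) form below is a simplification, and the blend weight alpha is an assumed tuning parameter:

```python
import numpy as np

def fuse_attitude(angle_prev, gyro_rate, accel, dt, alpha=0.98):
    """Blend the gyro-integrated angle with the accelerometer tilt angle (roll axis)."""
    angle_gyro = angle_prev + gyro_rate * dt      # first estimate: gyro integration
    angle_accel = np.arctan2(accel[1], accel[2])  # second estimate: gravity direction
    return alpha * angle_gyro + (1.0 - alpha) * angle_accel
```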
Step 603: fuse the pose transformation parameter of the depth sensor with the pose transformation parameter of the inertial measurement device to determine an integrated pose transformation parameter.
After the pose transformation parameters of the depth sensor and of the inertial measurement device are obtained, they can be fused using a coupling method (for example loose coupling or tight coupling) to determine the integrated pose transformation parameter.
In some optional implementations of this embodiment, to reduce the influence of noise on the acceleration and angular velocity values, the acceleration and angular velocity may first be filtered before step 602. In this implementation, a complementary filter may be used to remove the noise of the acceleration and angular velocity and improve the accuracy of the pose transformation parameters.
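A sketch of such pre-filtering under common assumptions: a first-order low-pass on the accelerometer, whose noise is mostly high-frequency vibration, paired with a first-order high-pass on the gyroscope, whose error is mostly low-frequency bias drift; the coefficient beta is a placeholder:

```python
def prefilter(accel_prev_f, accel_raw, gyro_prev_f, gyro_prev_raw, gyro_raw, beta=0.9):
    """One step of complementary pre-filtering on raw IMU samples."""
    accel_f = beta * accel_prev_f + (1.0 - beta) * accel_raw  # low-pass on acceleration
    gyro_f = beta * (gyro_prev_f + gyro_raw - gyro_prev_raw)  # high-pass on angular rate
    return accel_f, gyro_f
```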
The pose estimation method provided by the above embodiment of the present application can improve the accuracy of the pose estimation parameters.
With further reference to FIG. 7, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a pose estimation apparatus. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied to various electronic devices.
As shown in FIG. 7, the pose estimation apparatus 700 of this embodiment includes: a first acquisition unit 701, an image selection unit 702, a pixel point set determination unit 703, and a first parameter determination unit 704.
The first acquisition unit 701 is configured to acquire a depth image video from the depth sensor.
The image selection unit 702 is configured to select a first depth image and a second depth image from the frames of the depth image video.
The first depth image and the second depth image share at least one pixel point indicating the same object.
The pixel point set determination unit 703 is configured to determine a first pixel point set in the first depth image and a second pixel point set in the second depth image.
The pixel points in the first set correspond one-to-one with the pixel points in the second set, and each corresponding pair of pixel points indicates the same object.
The first parameter determination unit 704 is configured to determine, for any pixel point in the first pixel point set, the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of that pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
In some optional implementations of this embodiment, the apparatus 700 may further include a preprocessing unit, not shown in FIG. 7, which includes a pixel point deletion module and a smoothing module.
The pixel point deletion module is configured to delete, for each frame of depth image and before the image selection unit 702 selects the first and second depth images from the frames, the pixel points in that frame that meet a preset condition.
The smoothing module is configured to smooth the depth image after the deletion.
In some optional implementations of this embodiment, the pixel point deletion module may be further configured to: detect the depth value of each pixel point; and delete the pixel points whose depth value is greater than the first preset value and less than the second preset value.
In some optional implementations of this embodiment, the pixel point deletion module may be further configured to: determine the first partial derivative of the frame of depth image in the horizontal direction and the second partial derivative in the vertical direction; determine the geometric edge pixel points in the frame according to the first partial derivative and the second partial derivative; and delete the geometric edge pixel points.
In some optional implementations of this embodiment, the pixel point deletion module may be further configured to: determine the invalid pixel points for which no depth value exists in the frame of depth image; and delete the invalid pixel points and the pixel points adjacent to them.
In some optional implementations of this embodiment, the first parameter determination unit 704 may further include a first mapping module, a transformation module, a second mapping module, a depth value determination module, and a first parameter determination module, none of which is shown in FIG. 7.
The first mapping module is configured to map the first two-dimensional coordinates of the pixel point in the first depth image to first three-dimensional space coordinates in the coordinate system to which the first depth image belongs.
The transformation module is configured to transform the first three-dimensional space coordinates into the coordinate system to which the second depth image belongs, obtaining second three-dimensional space coordinates.
The second mapping module is configured to map the second three-dimensional space coordinates into the second depth image, obtaining second two-dimensional coordinates.
The depth value determination module is configured to determine the second depth value of the second two-dimensional coordinates in the second depth image.
The first parameter determination module is configured to determine the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
In some optional implementations of this embodiment, the first parameter determination module may be further configured to: determine the depth difference between the first depth image and the second depth image according to the first depth value and the second depth value; take the depth difference as the depth residual and, based on the depth residual, perform the following iterative steps: determine a pose estimation increment based on the depth residual; determine whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulate the pose estimation increment with the first depth value to determine a pose estimate, and determine the pose transformation parameter of the depth sensor according to the pose estimate; in response to the depth residual being greater than or equal to the preset threshold, take the accumulated pose estimation increments as the depth residual and continue the iterative steps.
In some optional implementations of this embodiment, the pose estimation apparatus 700 may further include a second acquisition unit, a second parameter determination unit, and a parameter fusion unit, none of which is shown in FIG. 7.
The second acquisition unit is configured to acquire an angular velocity and an acceleration from the inertial measurement device physically bound to the depth sensor.
The second parameter determination unit is configured to determine the pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration.
The parameter fusion unit is configured to fuse the pose transformation parameter of the depth sensor with the pose transformation parameter of the inertial measurement device to determine an integrated pose transformation parameter.
In some optional implementations of this embodiment, the second parameter determination unit may further include a first sub-parameter determination module, a second sub-parameter determination module, and a fusion module, none of which is shown in FIG. 7.
The first sub-parameter determination module is configured to determine the first pose transformation parameter of the inertial measurement device according to the angular velocity.
The second sub-parameter determination module is configured to determine the second pose transformation parameter of the inertial measurement device according to the acceleration.
The fusion module is configured to fuse the first pose transformation parameter with the second pose transformation parameter to determine the pose transformation parameter of the inertial measurement device.
The pose estimation apparatus provided by the above embodiment of the present application acquires the depth image video collected by a depth sensor, selects from it two frames sharing at least one pixel point indicating the same object, determines the mutually corresponding first and second pixel point sets in those frames, and then, for each pixel point in the first set, determines the pose transformation parameter of the depth sensor according to its first two-dimensional coordinates in the first depth image and the first depth value of the corresponding pixel point in the second depth image, reducing the consumption of computing resources, improving computational efficiency, and ensuring real-time pose estimation.
It should be understood that the units 701 to 704 described in the pose estimation apparatus 700 correspond respectively to the steps of the method described with reference to FIG. 2. Accordingly, the operations and features described above for the pose estimation method apply equally to the apparatus 700 and the units it contains and are not repeated here. The corresponding units of the apparatus 700 may cooperate with units in the server to implement the solutions of the embodiments of the present application.
Referring now to FIG. 8, a schematic structural diagram of a computer system 800 suitable for implementing the server of the embodiments of the present application is shown. The server shown in FIG. 8 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 8, the computer system 800 includes a central processing unit (CPU) 801, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage portion 808 into a random access memory (RAM) 803. The RAM 803 also stores various programs and data required for the operation of the system 800. The CPU 801, the ROM 802, and the RAM 803 are connected to one another through a bus 804, to which an input/output (I/O) interface 805 is also connected.
The following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output portion 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 808 including a hard disk and the like; and a communication portion 809 including a network interface card such as a LAN card or a modem. The communication portion 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read from it can be installed into the storage portion 808 as needed.
特别地,根据本公开的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在机器可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信部分809从网络上被下载和安装,和/或从可拆卸介质811被安装。在该计算机程序被中央处理单元(CPU)801执行时,执行本申请的方法中限定的上述功能。In particular, the processes described above with reference to the flowcharts may be implemented as a computer software program in accordance with an embodiment of the present disclosure. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a machine readable medium, the computer program comprising program code for executing the method illustrated in the flowchart. In such an embodiment, the computer program can be downloaded and installed from the network via communication portion 809, and/or installed from removable media 811. When the computer program is executed by the central processing unit (CPU) 801, the above-described functions defined in the method of the present application are performed.
需要说明的是,本申请所述的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、 只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本申请中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。而在本申请中,计算机可读的信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读的信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:无线、电线、光缆、RF等等,或者上述的任意合适的组合。It should be noted that the computer readable medium described herein may be a computer readable signal medium or a computer readable storage medium or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer readable storage media may include, but are not limited to, electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable Programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program, which can be used by or in connection with an instruction execution system, apparatus or device. In the present application, a computer readable signal medium may include a data signal that is propagated in the baseband or as part of a carrier, carrying computer readable program code. Such propagated data signals can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing. The computer readable signal medium can also be any computer readable medium other than a computer readable storage medium, which can transmit, propagate, or transport a program for use by or in connection with the instruction execution system, apparatus, or device. . Program code embodied on a computer readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor; for example, they may be described as: a processor comprising a first acquisition unit, an image selection unit, a pixel point set determination unit, and a first parameter determination unit. The names of these units do not, under certain circumstances, constitute a limitation on the units themselves; for example, the first acquisition unit may also be described as "a unit that acquires a depth image video from a depth sensor".
In another aspect, the present application also provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs that, when executed by the apparatus, cause the apparatus to: acquire a depth image video from a depth sensor; select a first depth image and a second depth image from the frame depth images of the depth image video, where the first depth image and the second depth image share at least one pixel point indicating the same object; determine a first pixel point set in the first depth image and a second pixel point set in the second depth image, where the pixel points in the first pixel point set are in one-to-one correspondence with the pixel points in the second pixel point set, and each pair of corresponding pixel points indicates the same object; and, for any pixel point in the first pixel point set, determine a pose transformation parameter of the depth sensor based on first two-dimensional coordinates of the pixel point in the first depth image and a first depth value of its corresponding pixel point in the second depth image.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, a technical solution formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (20)

  1. A pose estimation method, characterized in that the method comprises:
    acquiring a depth image video from a depth sensor;
    selecting a first depth image and a second depth image from the frame depth images of the depth image video, wherein the first depth image and the second depth image share at least one pixel point indicating the same object;
    determining a first pixel point set in the first depth image and a second pixel point set in the second depth image, wherein the pixel points in the first pixel point set are in one-to-one correspondence with the pixel points in the second pixel point set, and each pair of corresponding pixel points indicates the same object; and
    for any pixel point in the first pixel point set, determining a pose transformation parameter of the depth sensor based on first two-dimensional coordinates of the pixel point in the first depth image and a first depth value of its corresponding pixel point in the second depth image.
  2. The method according to claim 1, characterized in that, before the selecting a first depth image and a second depth image from the frame depth images, the method further comprises:
    for each frame depth image, deleting pixel points in the frame depth image that meet a preset condition; and
    smoothing the depth image after the deletion.
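A minimal sketch of the smoothing step, assuming a median filter and NaN as the marker for deleted pixels; the application fixes neither the smoothing kernel nor the window size, so `smooth_depth` and `size=5` are illustrative choices only:

```python
import numpy as np
from scipy.ndimage import median_filter

def smooth_depth(depth, size=5):
    """Median-smooth a depth image in which deleted pixels are NaN.

    NaNs are temporarily filled with 0 so the filter can run, then restored,
    so deleted pixels stay deleted and are not smeared into their neighbours.
    (Near large holes the filled zeros bias the median; a production version
    would mask the window contents properly.)
    """
    invalid = ~np.isfinite(depth)
    filled = np.where(invalid, 0.0, depth)
    smoothed = median_filter(filled, size=size)
    smoothed[invalid] = np.nan
    return smoothed
```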
  3. The method according to claim 2, characterized in that the deleting pixel points in the frame depth image that meet a preset condition comprises:
    detecting the depth value of each pixel point; and
    deleting pixel points whose depth value is greater than a first preset value and whose depth value is less than a second preset value.
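A NumPy sketch of this range filter under one plausible reading of the claim, namely that readings outside the sensor's trusted working range are discarded (depth greater than a far limit or less than a near limit); the claim does not fix how the two preset values are ordered, and the function name, limits, and NaN bookkeeping below are assumptions:

```python
import numpy as np

def remove_untrusted_depths(depth, first_preset=8.0, second_preset=0.3):
    """Invalidate depth readings outside a trusted working range.

    Pixels deeper than `first_preset` (taken here as a far limit) or
    shallower than `second_preset` (a near limit) are marked NaN,
    i.e. "deleted".
    """
    filtered = depth.astype(np.float64).copy()
    mask = (filtered > first_preset) | (filtered < second_preset)
    filtered[mask] = np.nan
    return filtered

# Example: keep only readings between 0.3 m and 8.0 m.
depth = np.random.uniform(0.0, 12.0, size=(480, 640))
clean = remove_untrusted_depths(depth)
```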
  4. The method according to claim 2, characterized in that the deleting pixel points in the frame depth image that meet a preset condition comprises:
    determining a first partial derivative of the frame depth image in the horizontal direction and a second partial derivative of the frame depth image in the vertical direction;
    determining geometric edge pixel points in the frame depth image according to the first partial derivative and the second partial derivative; and
    deleting the geometric edge pixel points.
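A sketch of one way to realize this step, assuming a "geometric edge" is a pixel where the magnitude of the depth gradient exceeds a threshold; neither the detector nor the threshold value is fixed by the application:

```python
import numpy as np

def delete_geometric_edges(depth, grad_threshold=0.05):
    """Remove pixels on geometric edges, where depth jumps between
    neighbouring pixels and sensor readings tend to be least reliable."""
    # Central-difference partial derivatives: axis 1 is the horizontal
    # direction (first partial derivative), axis 0 the vertical (second).
    dz_dv, dz_du = np.gradient(depth)
    grad_mag = np.hypot(dz_du, dz_dv)
    edges = grad_mag > grad_threshold      # geometric edge pixel points
    filtered = depth.astype(np.float64).copy()
    filtered[edges] = np.nan               # delete them
    return filtered
```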
  5. The method according to claim 2, characterized in that the deleting pixel points in the frame depth image that meet a preset condition comprises:
    determining invalid pixel points in the frame depth image at which no depth value exists; and
    deleting the invalid pixel points and the pixel points adjacent to the invalid pixel points.
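Deleting the invalid pixels together with their neighbours amounts to dilating the invalid-pixel mask by one step. The sketch assumes a 4-connected neighbourhood and NaN/zero as the "no depth value" marker, neither of which the application specifies:

```python
import numpy as np
from scipy.ndimage import binary_dilation

def delete_invalid_and_neighbours(depth):
    """Remove pixels with no depth reading plus the pixels adjacent to them,
    since readings bordering a hole are typically unreliable too."""
    invalid = ~np.isfinite(depth) | (depth <= 0.0)
    # One binary dilation with the default cross-shaped structuring element
    # also selects the 4-connected neighbours of every invalid pixel.
    to_delete = binary_dilation(invalid)
    filtered = depth.astype(np.float64).copy()
    filtered[to_delete] = np.nan
    return filtered
```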
  6. The method according to claim 1, characterized in that the determining a pose transformation parameter of the depth sensor based on first two-dimensional coordinates of the pixel point in the first depth image and a first depth value of its corresponding pixel point in the second depth image comprises:
    mapping the first two-dimensional coordinates of the pixel point in the first depth image to first three-dimensional space coordinates in the coordinate system to which the first depth image belongs;
    transforming the first three-dimensional space coordinates into the coordinate system to which the second depth image belongs to obtain second three-dimensional space coordinates;
    mapping the second three-dimensional space coordinates into the second depth image to obtain second two-dimensional coordinates;
    determining a second depth value of the second two-dimensional coordinates in the second depth image; and
    determining the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
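The three mappings in this claim are the standard pinhole back-projection, rigid-body transform, and projection. A sketch assuming known intrinsics (fx, fy, cx, cy) and a candidate pose given as a rotation R and translation t; these symbols are chosen here for illustration and are not taken from the application:

```python
import numpy as np

def reproject_pixel(u, v, z, K, R, t):
    """Map pixel (u, v) with depth z from the first depth image into the
    second depth image, returning the second two-dimensional coordinates
    and the depth the point should have there."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    # First 2D coordinates + first depth -> first 3D space coordinates.
    p1 = np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
    # Transform into the coordinate system of the second depth image.
    p2 = R @ p1 + t
    # Project to the second 2D coordinates.
    u2 = fx * p2[0] / p2[2] + cx
    v2 = fy * p2[1] / p2[2] + cy
    return u2, v2, p2[2]

# With the identity pose a pixel maps back onto itself.
K = np.array([[525.0, 0.0, 319.5],
              [0.0, 525.0, 239.5],
              [0.0, 0.0, 1.0]])
u2, v2, z2 = reproject_pixel(100.0, 80.0, 1.5, K, np.eye(3), np.zeros(3))
```

The second depth value is then read from the second depth image at (u2, v2), typically with bilinear interpolation, and compared against z2.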
  7. The method according to claim 6, characterized in that the determining the pose transformation parameter of the depth sensor based on the first depth value and the second depth value comprises:
    determining a depth difference between the first depth image and the second depth image according to the first depth value and the second depth value;
    determining the depth difference as a depth residual and, based on the depth residual, performing the following iterative steps: determining a pose estimation increment based on the depth residual; determining whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulating the pose estimation increment with the first depth value to determine a pose estimation value; and determining the pose transformation parameter of the depth sensor according to the pose estimation value; and
    in response to the depth residual being greater than or equal to the preset threshold, determining the accumulated pose estimation increment as the depth residual and continuing to perform the iterative steps.
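The loop in this claim has the shape of an iterative residual-minimisation (Gauss–Newton-style) solver over the depth residual. A schematic sketch in which the residual and increment computations are supplied as callables, since the application fixes neither the solver, the pose parameterisation, nor the threshold value:

```python
import numpy as np

def estimate_pose(compute_residual, compute_increment, initial_pose,
                  threshold=1e-4, max_iters=50):
    """Refine a pose until the depth residual drops below a preset threshold.

    compute_residual(pose): scalar depth residual between the first depth
        values and the reprojected second depth values for that pose.
    compute_increment(pose, residual): small pose update reducing the residual.
    """
    pose = np.asarray(initial_pose, dtype=np.float64).copy()
    for _ in range(max_iters):
        residual = compute_residual(pose)
        if residual < threshold:
            break                        # converged
        increment = compute_increment(pose, residual)
        pose = pose + increment          # accumulate the pose estimation increment
    return pose
```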
  8. The method according to claim 1, characterized in that the method further comprises:
    acquiring an angular velocity and an acceleration from an inertial measurement device physically bound to the depth sensor;
    determining a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration; and
    fusing the pose transformation parameter of the depth sensor and the pose transformation parameter of the inertial measurement device to determine an integrated pose transformation parameter.
  9. The method according to claim 8, characterized in that the determining a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration comprises:
    determining a first pose transformation parameter of the inertial measurement device according to the angular velocity;
    determining a second pose transformation parameter of the inertial measurement device according to the acceleration; and
    fusing the first pose transformation parameter and the second pose transformation parameter to determine the pose transformation parameter of the inertial measurement device.
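One common realisation of this two-source scheme is a complementary filter, sketched here with the problem collapsed to pitch and roll for brevity; the blending weight, the angle parameterisation, and the function names are illustrative assumptions, as the application does not prescribe the fusion rule:

```python
import numpy as np

def imu_orientation_step(prev_pitch_roll, gyro_rates, accel, dt, alpha=0.98):
    """Fuse a gyro-propagated orientation (first pose transformation
    parameter) with an accelerometer-derived one (second pose
    transformation parameter). Angles in radians, as (pitch, roll)."""
    # First parameter: integrate the angular velocity over the time step.
    from_gyro = np.asarray(prev_pitch_roll) + np.asarray(gyro_rates) * dt
    # Second parameter: orientation implied by the measured gravity vector.
    ax, ay, az = accel
    from_accel = np.array([np.arctan2(-ax, np.hypot(ay, az)),  # pitch
                           np.arctan2(ay, az)])                # roll
    # Complementary fusion: the gyro dominates at high frequency,
    # the accelerometer corrects the slow gyro drift.
    return alpha * from_gyro + (1.0 - alpha) * from_accel
```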
  10. A pose estimation apparatus, characterized in that the apparatus comprises:
    a first acquisition unit, configured to acquire a depth image video from a depth sensor;
    an image selection unit, configured to select a first depth image and a second depth image from the frame depth images of the depth image video, wherein the first depth image and the second depth image share at least one pixel point indicating the same object;
    a pixel point set determination unit, configured to determine a first pixel point set in the first depth image and a second pixel point set in the second depth image, wherein the pixel points in the first pixel point set are in one-to-one correspondence with the pixel points in the second pixel point set, and each pair of corresponding pixel points indicates the same object; and
    a first parameter determination unit, configured to, for any pixel point in the first pixel point set, determine a pose transformation parameter of the depth sensor based on first two-dimensional coordinates of the pixel point in the first depth image and a first depth value of its corresponding pixel point in the second depth image.
  11. The apparatus according to claim 10, characterized in that the apparatus further comprises a preprocessing unit, the preprocessing unit comprising a pixel point deletion module and a smoothing module,
    the pixel point deletion module being configured to, for each frame depth image, delete pixel points in the frame depth image that meet a preset condition before the image selection unit selects the first depth image and the second depth image from the frame depth images; and
    the smoothing module being configured to smooth the depth image after the deletion.
  12. The apparatus according to claim 11, characterized in that the pixel point deletion module is further configured to:
    detect the depth value of each pixel point; and
    delete pixel points whose depth value is greater than a first preset value and whose depth value is less than a second preset value.
  13. The apparatus according to claim 11, characterized in that the pixel point deletion module is further configured to:
    determine a first partial derivative of the frame depth image in the horizontal direction and a second partial derivative of the frame depth image in the vertical direction;
    determine geometric edge pixel points in the frame depth image according to the first partial derivative and the second partial derivative; and
    delete the geometric edge pixel points.
  14. The apparatus according to claim 11, characterized in that the pixel point deletion module is further configured to:
    determine invalid pixel points in the frame depth image at which no depth value exists; and
    delete the invalid pixel points and the pixel points adjacent to the invalid pixel points.
  15. The apparatus according to claim 10, characterized in that the first parameter determination unit comprises:
    a first mapping module, configured to map the first two-dimensional coordinates of the pixel point in the first depth image to first three-dimensional space coordinates in the coordinate system to which the first depth image belongs;
    a transformation module, configured to transform the first three-dimensional space coordinates into the coordinate system to which the second depth image belongs to obtain second three-dimensional space coordinates;
    a second mapping module, configured to map the second three-dimensional space coordinates into the second depth image to obtain second two-dimensional coordinates;
    a depth value determination module, configured to determine a second depth value of the second two-dimensional coordinates in the second depth image; and
    a first parameter determination module, configured to determine the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
  16. The apparatus according to claim 15, characterized in that the first parameter determination module is further configured to:
    determine a depth difference between the first depth image and the second depth image according to the first depth value and the second depth value;
    determine the depth difference as a depth residual and, based on the depth residual, perform the following iterative steps: determining a pose estimation increment based on the depth residual; determining whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulating the pose estimation increment with the first depth value to determine a pose estimation value; and determining the pose transformation parameter of the depth sensor according to the pose estimation value; and
    in response to the depth residual being greater than or equal to the preset threshold, determine the accumulated pose estimation increment as the depth residual and continue to perform the iterative steps.
  17. The apparatus according to claim 10, characterized in that the apparatus further comprises:
    a second acquisition unit, configured to acquire an angular velocity and an acceleration from an inertial measurement device physically bound to the depth sensor;
    a second parameter determination unit, configured to determine a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration; and
    a parameter fusion unit, configured to fuse the pose transformation parameter of the depth sensor and the pose transformation parameter of the inertial measurement device to determine an integrated pose transformation parameter.
  18. The apparatus according to claim 17, characterized in that the second parameter determination unit comprises:
    a first sub-parameter determination module, configured to determine a first pose transformation parameter of the inertial measurement device according to the angular velocity;
    a second sub-parameter determination module, configured to determine a second pose transformation parameter of the inertial measurement device according to the acceleration; and
    a fusion module, configured to fuse the first pose transformation parameter and the second pose transformation parameter to determine the pose transformation parameter of the inertial measurement device.
  19. A server, characterized by comprising:
    one or more processors; and
    a storage device, configured to store one or more programs,
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
  20. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-9.
PCT/CN2018/083376 2017-05-09 2018-04-17 Pose estimation method and apparatus WO2018205803A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710321322.XA CN107123142B (en) 2017-05-09 2017-05-09 Pose estimation method and device
CN201710321322.X 2017-05-09

Publications (1)

Publication Number Publication Date
WO2018205803A1 2018-11-15

Family

ID=59726877

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/083376 WO2018205803A1 (en) 2017-05-09 2018-04-17 Pose estimation method and apparatus

Country Status (2)

Country Link
CN (1) CN107123142B (en)
WO (1) WO2018205803A1 (en)

Cited By (1)

Publication number Priority date Publication date Assignee Title
CN112146578A (en) * 2019-06-28 2020-12-29 顺丰科技有限公司 Scale ratio calculation method, device, equipment and storage medium

Families Citing this family (12)

Publication number Priority date Publication date Assignee Title
CN107123142B (en) * 2017-05-09 2020-05-01 北京京东尚科信息技术有限公司 Pose estimation method and device
CN108399643A (en) * 2018-03-15 2018-08-14 南京大学 A kind of outer ginseng calibration system between laser radar and camera and method
CN110914867A (en) * 2018-07-17 2020-03-24 深圳市大疆创新科技有限公司 Pose determination method, pose determination device and computer readable storage medium
CN110800023A (en) * 2018-07-24 2020-02-14 深圳市大疆创新科技有限公司 Image processing method and equipment, camera device and unmanned aerial vehicle
CN109186596B (en) * 2018-08-14 2020-11-10 深圳清华大学研究院 IMU measurement data generation method, system, computer device and readable storage medium
CN109544629B (en) * 2018-11-29 2021-03-23 南京人工智能高等研究院有限公司 Camera position and posture determining method and device and electronic equipment
CN109470149B (en) * 2018-12-12 2020-09-29 北京理工大学 Method and device for measuring position and posture of pipeline
CN111435086B (en) * 2019-01-13 2022-03-25 北京魔门塔科技有限公司 Navigation method and device based on splicing map
CN109650292B (en) * 2019-02-02 2019-11-05 北京极智嘉科技有限公司 The location regulation method and medium of a kind of intelligent forklift and intelligent forklift
CN110054121B (en) * 2019-04-25 2021-04-20 北京极智嘉科技有限公司 Intelligent forklift and container pose deviation detection method
CN112907164A (en) * 2019-12-03 2021-06-04 北京京东乾石科技有限公司 Object positioning method and device
CN112070052A (en) * 2020-09-16 2020-12-11 青岛维感科技有限公司 Interval monitoring method, device and system and storage medium

Citations (6)

Publication number Priority date Publication date Assignee Title
CN102609942A (en) * 2011-01-31 2012-07-25 微软公司 Mobile camera localization using depth maps
CN104933755A (en) * 2014-03-18 2015-09-23 华为技术有限公司 Static object reconstruction method and system
US20160171703A1 (en) * 2013-07-09 2016-06-16 Samsung Electronics Co., Ltd. Camera pose estimation apparatus and method
CN106403924A (en) * 2016-08-24 2017-02-15 智能侠(北京)科技有限公司 Method for robot fast positioning and attitude estimation based on depth camera
CN106529538A (en) * 2016-11-24 2017-03-22 腾讯科技(深圳)有限公司 Method and device for positioning aircraft
CN107123142A (en) * 2017-05-09 2017-09-01 北京京东尚科信息技术有限公司 Position and orientation estimation method and device

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
CN102289809A (en) * 2011-07-25 2011-12-21 清华大学 Method and device for estimating pose of camera
US10339389B2 (en) * 2014-09-03 2019-07-02 Sharp Laboratories Of America, Inc. Methods and systems for vision-based motion estimation
CN104361575B (en) * 2014-10-20 2015-08-19 湖南戍融智能科技有限公司 Automatic floor in depth image detects and video camera relative pose estimation method
CN106157367B (en) * 2015-03-23 2019-03-08 联想(北京)有限公司 Method for reconstructing three-dimensional scene and equipment
CN105045263B (en) * 2015-07-06 2016-05-18 杭州南江机器人股份有限公司 A kind of robot method for self-locating based on Kinect depth camera
CN105698765B (en) * 2016-02-22 2018-09-18 天津大学 Object pose method under double IMU monocular visions measurement in a closed series noninertial systems
CN105976353B (en) * 2016-04-14 2020-01-24 南京理工大学 Spatial non-cooperative target pose estimation method based on model and point cloud global matching


Also Published As

Publication number Publication date
CN107123142A (en) 2017-09-01
CN107123142B (en) 2020-05-01


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18798239

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17.03.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18798239

Country of ref document: EP

Kind code of ref document: A1