WO2018205803A1 - Pose estimation method and apparatus - Google Patents

Pose estimation method and apparatus

Info

Publication number
WO2018205803A1
WO2018205803A1 PCT/CN2018/083376
Authority
WO
WIPO (PCT)
Prior art keywords
depth
depth image
determining
pixel point
pixel
Prior art date
Application number
PCT/CN2018/083376
Other languages
English (en)
Chinese (zh)
Inventor
孙志明
张潮
李雨倩
吴迪
樊晨
李政
贾士伟
李祎翔
张连川
刘新月
Original Assignee
北京京东尚科信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东尚科信息技术有限公司, 北京京东世纪贸易有限公司
Publication of WO2018205803A1 publication Critical patent/WO2018205803A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Definitions

  • The present application relates to the field of computer vision technology, in particular to pose estimation, and more particularly to a pose estimation method and apparatus.
  • Pose estimation, especially visual pose estimation, involves knowledge of many disciplines such as image processing, computer vision, inertial navigation, mathematical statistics, and optimization. It is a foundational technology of many emerging industries and will play an important role in current and future production and daily life.
  • Existing pose estimation methods usually need to extract feature points from the image and build descriptors, which consumes a large amount of computing resources and makes the real-time performance of the pose estimation poor.
  • the purpose of the present application is to propose a pose estimation method and apparatus to solve the technical problems mentioned in the background art section above.
  • An embodiment of the present application provides a pose estimation method, including: acquiring a depth image video from a depth sensor; selecting a first depth image and a second depth image from the frames of the depth image video, where the first depth image and the second depth image share at least one pixel point indicating the same object; determining a first pixel point set in the first depth image and a second pixel point set in the second depth image, where the pixel points in the first pixel point set correspond one-to-one with the pixel points in the second pixel point set and each pair of corresponding pixel points indicates the same object; and, for any pixel point in the first pixel point set, determining a pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
  • In some embodiments, before the first depth image and the second depth image are selected from the frames of depth image, the method further includes: for each frame of depth image, deleting the pixel points in that frame that meet a preset condition; and smoothing the depth image after the deletion.
  • In some embodiments, deleting the pixel points in the frame depth image that meet the preset condition includes: detecting the depth value of each pixel point; and deleting the pixel points whose depth value is greater than a first preset value, as well as the pixel points whose depth value is less than a second preset value.
  • In some embodiments, deleting the pixel points in the frame depth image that meet the preset condition includes: determining a first partial derivative of the frame depth image in the horizontal direction and a second partial derivative in the vertical direction; determining the geometric edge pixel points in the frame depth image from the first partial derivative and the second partial derivative; and deleting the geometric edge pixel points.
  • In some embodiments, deleting the pixel points in the frame depth image that meet the preset condition includes: determining the invalid pixel points in the frame depth image for which no depth value exists; and deleting the invalid pixel points together with the pixel points adjacent to them.
  • In some embodiments, determining the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image includes: mapping the first two-dimensional coordinates of the pixel point in the first depth image to first three-dimensional space coordinates in the coordinate system of the first depth image; transforming the first three-dimensional space coordinates into the coordinate system of the second depth image to obtain second three-dimensional space coordinates; mapping the second three-dimensional space coordinates into the second depth image to obtain second two-dimensional coordinates; determining a second depth value at the second two-dimensional coordinates in the second depth image; and determining the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
  • In some embodiments, determining the pose transformation parameter of the depth sensor based on the first depth value and the second depth value includes: determining a depth difference between the first depth image and the second depth image from the first depth value and the second depth value; taking the depth difference as a depth residual and, based on the depth residual, performing the following iterative steps: determining a pose estimation increment based on the depth residual; determining whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulating the pose estimation increment and the first depth value to determine a pose estimation value, and determining the pose transformation parameter of the depth sensor from the pose estimation value; and, in response to the depth residual being greater than or equal to the preset threshold, determining the accumulated pose estimation increments as the new depth residual and continuing the iterative steps.
  • In some embodiments, the method further includes: obtaining angular velocity and acceleration from an inertial measurement device physically bound to the depth sensor; determining a pose transformation parameter of the inertial measurement device based on the angular velocity and the acceleration; and fusing the pose transformation parameter of the depth sensor with the pose transformation parameter of the inertial measurement device to determine an integrated pose transformation parameter.
  • In some embodiments, determining the pose transformation parameter of the inertial measurement device from the angular velocity and the acceleration includes: determining a first pose transformation parameter of the inertial measurement device from the angular velocity; determining a second pose transformation parameter of the inertial measurement device from the acceleration; and fusing the first pose transformation parameter and the second pose transformation parameter to determine the pose transformation parameter of the inertial measurement device.
  • An embodiment of the present application provides a pose estimation apparatus, including: a first acquisition unit, configured to acquire a depth image video from a depth sensor; an image selection unit, configured to select a first depth image and a second depth image from the frames of the depth image video, where the first depth image and the second depth image share at least one pixel point indicating the same object; a pixel point set determination unit, configured to determine a first pixel point set in the first depth image and a second pixel point set in the second depth image, where the pixel points in the first pixel point set correspond one-to-one with the pixel points in the second pixel point set and each pair of corresponding pixel points indicates the same object; and a first parameter determination unit, configured to determine, for any pixel point in the first pixel point set, the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
  • In some embodiments, the apparatus further includes a pre-processing unit, which includes a pixel point deletion module and a smoothing module. The pixel point deletion module is configured to delete, for each frame of depth image, the pixel points in that frame that meet the preset condition, before the image selection unit selects the first depth image and the second depth image. The smoothing module is configured to smooth the depth image after the deletion.
  • In some embodiments, the pixel point deletion module is further configured to: detect the depth value of each pixel point; and delete the pixel points whose depth value is greater than the first preset value, as well as the pixel points whose depth value is less than the second preset value.
  • In some embodiments, the pixel point deletion module is further configured to: determine the first partial derivative of the frame depth image in the horizontal direction and the second partial derivative in the vertical direction; determine the geometric edge pixel points in the frame depth image from the first partial derivative and the second partial derivative; and delete the geometric edge pixel points.
  • In some embodiments, the pixel point deletion module is further configured to: determine the invalid pixel points in the frame depth image for which no depth value exists; and delete the invalid pixel points together with the pixel points adjacent to them.
  • In some embodiments, the first parameter determination unit includes: a first mapping module, configured to map the first two-dimensional coordinates of the pixel point in the first depth image to first three-dimensional space coordinates in the coordinate system of the first depth image; a transformation module, configured to transform the first three-dimensional space coordinates into the coordinate system of the second depth image to obtain second three-dimensional space coordinates; a second mapping module, configured to map the second three-dimensional space coordinates into the second depth image to obtain second two-dimensional coordinates; a depth value determination module, configured to determine a second depth value at the second two-dimensional coordinates in the second depth image; and a first parameter determination module, configured to determine the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
  • In some embodiments, the first parameter determination module is further configured to: determine a depth difference between the first depth image and the second depth image from the first depth value and the second depth value; take the depth difference as a depth residual and, based on the depth residual, perform the following iterative steps: determine a pose estimation increment based on the depth residual; determine whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulate the pose estimation increment and the first depth value to determine a pose estimation value, and determine the pose transformation parameter of the depth sensor from the pose estimation value; and, in response to the depth residual being greater than or equal to the preset threshold, determine the accumulated pose estimation increments as the new depth residual and continue the iterative steps.
  • In some embodiments, the apparatus further includes: a second acquisition unit, configured to acquire angular velocity and acceleration from an inertial measurement device physically bound to the depth sensor; a second parameter determination unit, configured to determine a pose transformation parameter of the inertial measurement device from the angular velocity and the acceleration; and a parameter fusion unit, configured to fuse the pose transformation parameter of the depth sensor with the pose transformation parameter of the inertial measurement device to determine an integrated pose transformation parameter.
  • In some embodiments, the second parameter determination unit includes: a first sub-parameter determination module, configured to determine a first pose transformation parameter of the inertial measurement device from the angular velocity; a second sub-parameter determination module, configured to determine a second pose transformation parameter of the inertial measurement device from the acceleration; and a fusion module, configured to fuse the first pose transformation parameter and the second pose transformation parameter to determine the pose transformation parameter of the inertial measurement device.
  • An embodiment of the present application provides a server, including: one or more processors; and a storage device, configured to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method described in any of the above embodiments.
  • An embodiment of the present application provides a computer readable storage medium on which a computer program is stored; when the program is executed by a processor, the method described in any of the foregoing embodiments is implemented.
  • The pose estimation method and apparatus acquire the depth image video collected by a depth sensor, select from it two frames of depth image that share at least one pixel point indicating the same object, determine the mutually corresponding first pixel point set and second pixel point set in the two frames of depth image, and then, for each pixel point in the first pixel point set, determine the pose transformation parameter of the depth sensor from the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
  • the pose estimation method of the present application uses the depth image to perform pose estimation, reduces the consumption of computing resources, improves the calculation efficiency, and ensures the real-time performance of the pose estimation.
  • FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
  • FIG. 2 is a flow chart of one embodiment of a pose estimation method in accordance with the present application.
  • FIG. 3 is a schematic diagram of an application scenario of a pose estimation method according to the present application.
  • FIG. 4 is a flowchart of determining a pose transformation parameter of a depth sensor in a pose estimation method according to the present application
  • FIG. 5 is a schematic diagram of a principle of pose transformation in a pose estimation method according to the present application.
  • FIG. 6 is a flow chart of another embodiment of a pose estimation method according to the present application.
  • FIG. 7 is a schematic structural view of an embodiment of a pose estimating apparatus according to the present application.
  • FIG. 8 is a block diagram showing the structure of a computer system suitable for implementing the server of the embodiment of the present application.
  • FIG. 1 illustrates an exemplary system architecture 100 in which embodiments of the pose estimation method or pose estimation apparatus of the present application may be applied.
  • system architecture 100 can include depth sensor 101, network 102, and server 103.
  • Network 102 is used to provide a medium for the communication link between depth sensor 101 and server 103.
  • Network 102 can include a variety of connection types, such as wired, wireless communication links, fiber optic cables, and the like.
  • the depth sensor 101 interacts with the server 103 via the network 102 to transmit depth image video or the like.
  • the depth sensor 101 can be mounted on various moving objects, such as an unmanned vehicle, a robot, an unmanned delivery vehicle, a smart wearable device, a virtual reality device, and the like.
  • the depth sensor 101 may be various depth sensors capable of continuously acquiring multi-frame depth images.
  • the server 103 may be a server that provides various services, such as a background server that processes depth image video acquired by the depth sensor 101.
  • the background server can analyze and process data such as the received depth image video.
  • the pose estimation method provided by the embodiment of the present application is generally performed by the server 103. Accordingly, the pose estimation apparatus is generally disposed in the server 103.
  • depth sensors, networks, and servers in Figure 1 are merely illustrative. Depending on the implementation needs, there can be any number of depth sensors, networks, and servers.
  • the pose estimation method of this embodiment includes the following steps:
  • Step 201 Acquire a depth image video from the depth sensor.
  • the electronic device on which the pose estimation method operates can acquire the depth image video from the depth sensor by a wired connection or a wireless connection.
  • Each frame image in the depth image video is a depth image.
  • A depth image, also called a range image, is an image whose pixel values are the distances (depths) from the image collector to the points in the scene; it directly reflects the geometry of the visible surfaces of the scene.
  • Each pixel in the depth image represents the distance between the object at a particular coordinate and the camera plane of the depth sensor in the field of view of the depth sensor.
  • wireless connection manners may include, but are not limited to, 3G/4G connection, WiFi connection, Bluetooth connection, WiMAX connection, Zigbee connection, UWB (ultra wideband) connection, and other wireless connection methods now known or developed in the future.
  • Step 202 Select a first depth image and a second depth image from each frame depth image of the depth image video.
  • The first depth image and the second depth image may be two adjacent frames of the depth image video, or two frames whose sequence numbers in the depth image video differ by less than a preset value.
  • Step 203 Determine a first pixel point set in the first depth image and a second pixel point set in the second depth image.
  • Each pixel point in the first pixel point set corresponds one-to-one with a pixel point in the second pixel point set, and each pair of corresponding pixel points indicates the same object, more specifically the same location on the same object. It can be understood that the number of pixel points in the first pixel point set is equal to the number of pixel points in the second pixel point set, and both are equal to the number of pixel points indicating the same object that are shared by the first depth image and the second depth image.
  • Step 204 For any pixel point in the first pixel point set, determine the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.
  • In this embodiment, the server may determine the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of each pixel point of the first pixel point set in the first depth image and the first depth value of the corresponding pixel point of the second pixel point set in the second depth image. It can be understood that the first two-dimensional coordinates are the coordinates of the pixel point in the image coordinate system of the first depth image, and the depth value of the pixel point is not included in the first two-dimensional coordinates.
  • For each pixel point in the first pixel point set, a corresponding pixel point exists in the second pixel point set. Since each pixel point has a depth value, the first depth value of the corresponding pixel point can be determined from the second depth image.
  • the pose transformation parameter may be a pose transformation parameter between the first depth image and the second depth image.
  • FIG. 3 is a schematic diagram of an application scenario of the pose estimation method according to the present embodiment.
  • the depth sensor 301 is installed on the unmanned vehicle 302.
  • the depth sensor 301 collects the depth image video and sends the collected depth image video to the server 303.
  • The server 303 determines the pose transformation parameter of the depth sensor 301 and then sends the pose transformation parameter to the unmanned vehicle 302, and the unmanned vehicle 302 can perform navigation and obstacle avoidance according to the pose transformation parameter.
  • The pose estimation method acquires the depth image video collected by the depth sensor, selects from it two frames of depth image that share at least one pixel point indicating the same object, determines the mutually corresponding first pixel point set and second pixel point set in the two frames of depth image, and then, for each pixel point in the first pixel point set, determines the pose transformation parameter of the depth sensor from the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image. This reduces the consumption of computing resources, improves computational efficiency, and ensures the real-time performance of the pose estimation.
  • In some optional implementations of this embodiment, the foregoing method further includes the following steps, not shown in FIG. 2: for each frame of depth image, deleting the pixel points in that frame that meet the preset condition; and smoothing the depth image after the deletion.
  • The depth sensor typically emits probe light (e.g., infrared, laser, radar) and receives the light reflected from object surfaces to determine the distance between the object and the depth sensor. Because of occlusion, absorption of the probe light by object surfaces, and diffuse reflection, the depth sensor cannot always receive the reflected light completely, so many pixel points in the depth image have no depth value or have an inaccurate depth value. In this implementation, in order to ensure the accuracy of the pose estimation, the pixel points in each frame of depth image that meet the preset condition need to be deleted.
  • In this implementation, the depth image from which the pixel points have been deleted can be smoothed.
  • the above smoothing processing may include linear smoothing, interpolation smoothing, convolution smoothing, Gaussian filtering, bilateral filtering, and the like.
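  • Purely as an illustration and not as part of the patent disclosure, the smoothing step described above could be sketched as follows in Python, assuming the depth image is a NumPy array in which deleted pixels are marked as NaN and using a mask-normalised Gaussian filter as one of the possible smoothing methods:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_depth(depth: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Smooth a depth image after invalid pixels have been removed (marked as NaN)."""
    valid = np.isfinite(depth).astype(np.float32)
    filled = np.where(np.isfinite(depth), depth, 0.0).astype(np.float32)
    num = gaussian_filter(filled, sigma)   # smoothed depth weighted by the validity mask
    den = gaussian_filter(valid, sigma)    # smoothed validity mask
    out = np.full_like(filled, np.nan)
    ok = den > 1e-6
    out[ok] = num[ok] / den[ok]            # normalised result; NaN where no valid support
    return out
```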
  • In some optional implementations, the depth value of each pixel point may be detected first, and the pixel points whose depth value is greater than the first preset value, as well as the pixel points whose depth value is less than the second preset value, may be deleted.
  • the values of the first preset value and the second preset value are related to the model of the depth sensor, which is not limited in this implementation manner.
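  • As a hedged illustration only (the concrete thresholds depend on the depth sensor model, as noted above, and the values below are assumed), the range filtering could look like this:

```python
import numpy as np

def delete_out_of_range(depth: np.ndarray,
                        max_depth: float = 8.0,    # hypothetical first preset value (metres)
                        min_depth: float = 0.3):   # hypothetical second preset value (metres)
    """Mark pixels whose depth is greater than max_depth or less than min_depth as invalid."""
    out = depth.astype(np.float32)
    out[(out > max_depth) | (out < min_depth)] = np.nan
    return out
```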
  • In some optional implementations, the first partial derivative Zu of the depth image in the horizontal direction (the u direction) and the second partial derivative Zv in the vertical direction (the v direction) may be determined, and the geometric edge pixel points in the frame depth image may then be determined from Zu and Zv and deleted. The depth value of a pixel at the edge of an object is highly uncertain, and the depth values of the pixels on the two sides of the edge jump; therefore, to ensure the accuracy of the pose estimation, these geometric edge pixel points can be deleted.
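  • A minimal sketch of this edge removal, using NumPy gradients as stand-ins for the partial derivatives Zu and Zv (the gradient-magnitude threshold is an assumed parameter, not a value from the patent):

```python
import numpy as np

def delete_geometric_edges(depth: np.ndarray, grad_thresh: float = 0.1) -> np.ndarray:
    """Delete pixels whose depth gradient magnitude indicates a geometric edge."""
    zv, zu = np.gradient(depth)             # zv: vertical (v) derivative, zu: horizontal (u) derivative
    grad_mag = np.sqrt(zu ** 2 + zv ** 2)
    out = depth.copy()
    out[grad_mag > grad_thresh] = np.nan    # mark edge pixels as invalid
    return out
```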
  • In some optional implementations, the invalid pixel points for which no depth value exists may be determined in each frame of depth image, and then the invalid pixel points and the pixel points adjacent to them are deleted.
  • When the receiver cannot receive the probe light reflected back from an object, the depth value of the corresponding pixel cannot be determined; such pixels are called invalid pixel points.
  • The pixel points adjacent to an invalid pixel point are also deleted.
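  • A sketch of removing the invalid pixel points together with their neighbours, using binary dilation of the invalid mask (the 3×3 neighbourhood is an assumption for illustration):

```python
import numpy as np
from scipy.ndimage import binary_dilation

def delete_invalid_and_neighbors(depth: np.ndarray) -> np.ndarray:
    """Mark invalid pixels (no usable depth value) and their 8-connected neighbours as invalid."""
    invalid = ~np.isfinite(depth) | (depth <= 0)
    grown = binary_dilation(invalid, structure=np.ones((3, 3), dtype=bool))
    out = depth.astype(np.float32)
    out[grown] = np.nan
    return out
```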
  • the pose transformation parameter of the depth sensor can be determined by the following steps:
  • Step 401 Map the first two-dimensional coordinates of the pixel in the first depth image to the first three-dimensional coordinate of the coordinate system to which the first depth image belongs.
  • In this embodiment, the first two-dimensional coordinates (x1, y1) of the pixel point in the first depth image may be determined first, and then the first two-dimensional coordinates (x1, y1) are mapped to the first three-dimensional space coordinates.
  • In this embodiment, the first three-dimensional space coordinates (x1', y1', z1') in the coordinate system of the first depth image may be obtained through the π⁻¹ mapping of the pinhole camera model.
  • The pinhole camera model includes a mapping π, which projects a three-dimensional space point to two-dimensional coordinates on the pixel plane, and an inverse mapping π⁻¹, which maps a two-dimensional image point with a depth value back to a three-dimensional space point.
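  • To make the π / π⁻¹ notation concrete, the following is a small sketch of pinhole projection and back-projection; the intrinsic parameters fx, fy, cx, cy are assumed for illustration and are not values from the patent:

```python
import numpy as np

# Assumed camera intrinsics, for illustration only.
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

def pi(point_3d):
    """pi: project a 3-D camera-frame point (X, Y, Z) to pixel coordinates (u, v)."""
    x, y, z = point_3d
    return np.array([FX * x / z + CX, FY * y / z + CY])

def pi_inv(pixel, depth):
    """pi^-1: back-project pixel (u, v) with depth Z to a 3-D camera-frame point."""
    u, v = pixel
    return np.array([(u - CX) * depth / FX, (v - CY) * depth / FY, depth])
```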
  • Step 402 Transform the first three-dimensional space coordinate into a coordinate system to which the second depth image belongs, to obtain a second three-dimensional space coordinate.
  • In this embodiment, the pose transformation parameter between the first depth image and the second depth image is denoted as T1→2, as shown in FIG. 5.
  • In FIG. 5, a pedestrian in the world coordinate system xw-yw-zw is denoted as point P.
  • The depth sensor is located on the left side at time t1 and on the right side at time t2.
  • Point P appears as point P1 in the depth image obtained at time t1; its coordinates are (x1, y1), its depth value is Z1, and the associated coordinate system is xc1-yc1-zc1.
  • Point P appears as point P2 in the depth image obtained at time t2; its coordinates are (x2, y2), its depth value is Z2, and the associated coordinate system is xc2-yc2-zc2.
  • The pose transformation parameter between the coordinate system xc1-yc1-zc1 and the coordinate system xc2-yc2-zc2 is T1→2.
  • In this embodiment, the above pose transformation parameter T1→2 is represented in the Lie algebra se(3) as ξ, and in matrix form as the Lie group element T(ξ).
  • In this embodiment, a Lie group element T(ξ) may be preset, and the preset T(ξ) is used to complete the transformation to obtain the second three-dimensional space coordinates (x2', y2', z2').
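  • As a hedged illustration of step 402, the sketch below builds a 4×4 rigid transform from a 6-vector ξ (three translation components plus an axis-angle rotation, one common parameterisation of a rigid motion, not necessarily the exact se(3) exponential map) and applies it to the first three-dimensional space coordinates:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def T_of_xi(xi: np.ndarray) -> np.ndarray:
    """Build a 4x4 rigid transform from xi = (tx, ty, tz, rx, ry, rz), rotation as axis-angle."""
    T = np.eye(4)
    T[:3, :3] = Rotation.from_rotvec(xi[3:]).as_matrix()
    T[:3, 3] = xi[:3]
    return T

def transform_point(T: np.ndarray, p_cam1: np.ndarray) -> np.ndarray:
    """Transform a 3-D point from the first camera frame into the second camera frame."""
    return (T @ np.append(p_cam1, 1.0))[:3]   # homogeneous coordinates
```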
  • Step 403 Map the second three-dimensional space coordinates into the second depth image to obtain the second two-dimensional coordinates.
  • In this embodiment, the second two-dimensional coordinates (x2, y2) can be obtained using the π mapping of the pinhole camera model.
  • Step 404 Determine a second depth value of the second two-dimensional coordinates in the second depth image.
  • In this embodiment, the second depth value at the second two-dimensional coordinates (x2, y2) is determined in the second depth image.
  • In theory, the second depth value should be the same as the first depth value of the corresponding pixel point in the second pixel point set; in practice, however, the two are often different.
  • Step 405 Determine the pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
  • In this embodiment, after the second depth value is obtained, the pose transformation parameter of the depth sensor may be determined in combination with the first depth value.
  • step 405 can be implemented by the following steps not shown in FIG. 4:
  • First, a depth difference between the first depth image and the second depth image is determined from the first depth value and the second depth value, and the depth difference is taken as the depth residual. Then the following iterative steps are performed: a pose estimation increment is determined based on the depth residual; whether the depth residual is less than a preset threshold is determined; in response to the depth residual being less than the preset threshold, the pose estimation increment and the first depth value are accumulated to obtain a pose estimation value, the difference between this pose estimation value and the second depth value being within an acceptable range, so the pose transformation parameter of the depth sensor can be determined directly from the pose estimation value; in response to the depth residual being greater than or equal to the preset threshold, the pose estimation increments obtained in each iteration are accumulated, the accumulated value is taken as the new depth residual, and the iterative steps are continued.
  • In this embodiment, the pose transformation parameter of the depth sensor may be determined according to the following formula:
  • ξ* = arg min_ξ ‖ Z2(P2′) − [T(ξ) π⁻¹(P1)]_Z ‖
  • where ξ* is the estimated value of the pose transformation parameter; Z2 is the second depth image; P1 is a pixel point in the first depth image; P2′ is the pixel point corresponding to P1 in the second depth image; Z2(P2′) is the depth value of point P2′ in the second depth image (i.e., the first depth value); ξ is the pose transformation parameter and T(ξ) is the preset Lie group element representing it; π⁻¹ is the inverse mapping of the pinhole camera model; [T(ξ) π⁻¹(P1)]_Z is the depth (Z) component obtained after P1 is mapped by π⁻¹ to a three-dimensional space point and then transformed into the coordinate system of the second depth image; and arg min denotes taking the value of ξ at which the norm reaches its minimum, which is denoted ξ*.
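  • For illustration only, the depth-residual objective above could be evaluated and minimised numerically as sketched below, reusing the hypothetical pi, pi_inv, T_of_xi and transform_point helpers from the earlier sketches and SciPy's least-squares solver; this is a generic numerical stand-in, not the iterative pose-increment procedure prescribed by the patent.

```python
import numpy as np
from scipy.optimize import least_squares

def depth_residuals(xi, pixels1, depths1, depth_image2):
    """Residuals Z2(P2') - [T(xi) * pi_inv(P1)]_Z for pixels selected from the first image."""
    T = T_of_xi(xi)
    res = []
    for (u, v), z1 in zip(pixels1, depths1):
        p_cam2 = transform_point(T, pi_inv((u, v), z1))   # point expressed in the second frame
        u2, v2 = np.round(pi(p_cam2)).astype(int)         # project into the second depth image
        h, w = depth_image2.shape
        if 0 <= v2 < h and 0 <= u2 < w and np.isfinite(depth_image2[v2, u2]):
            res.append(depth_image2[v2, u2] - p_cam2[2])  # measured depth minus predicted depth
        else:
            res.append(0.0)                               # ignore points that project outside
    return np.asarray(res)

# Example use (pixels1, depths1 and depth_image2 are assumed to be prepared beforehand):
# xi0 = np.zeros(6)
# result = least_squares(depth_residuals, xi0, args=(pixels1, depths1, depth_image2))
# xi_star = result.x   # estimate of the pose transformation parameter
```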
  • The pose estimation method provided by the above embodiment of the present application uses the depth residual between depth images to solve for the pose transformation parameter, which avoids the complicated prior-art process of extracting feature points and building descriptors, saves computing resources, and ensures the real-time performance of the calculation.
  • the pose estimation method of the embodiment may further include the following steps after obtaining the pose transformation parameter of the depth sensor:
  • Step 601 Obtain angular velocity and acceleration from an inertial measurement device physically bound to the depth sensor.
  • an inertial measurement unit may be physically bound to the depth sensor.
  • The above physical binding can be understood as the inertial measurement device coinciding with the center of the depth sensor and being fixed to it.
  • the inertial measurement device measures the angular velocity and acceleration of the object's movement.
  • the server performing the pose estimation method can acquire the angular velocity and the acceleration from the inertial measurement device by wire or wirelessly.
  • Step 602 Determine a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration.
  • the server may determine the pose transformation parameters of the inertial measurement device after acquiring the angular velocity and acceleration described above.
  • step 602 may be implemented by the following steps not shown in FIG. 6:
  • Determining the first pose transformation parameter from the angular velocity and determining the second pose transformation parameter from the acceleration are well known to those skilled in the art, and details are not described here again.
  • After the first pose transformation parameter and the second pose transformation parameter are obtained, the two can be fused to determine the pose transformation parameter of the inertial measurement device.
  • Step 603 Fuse the pose transformation parameter of the depth sensor and the pose transformation parameter of the inertial measurement device to determine the integrated pose transformation parameter.
  • Coupling methods can be used to fuse the two parameters and determine the integrated pose transformation parameter.
  • the acceleration and angular velocity may be first filtered prior to step 602.
  • A complementary filter can be used to remove noise from the acceleration and angular velocity and improve the accuracy of the pose transformation parameters.
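  • As a simple hedged illustration of the complementary filtering mentioned above (the blend factor alpha and the accelerometer-angle formula are common textbook choices, not values given in the patent), one orientation axis could be fused as follows:

```python
import numpy as np

def complementary_filter(prev_angle: float,
                         gyro_rate: float,     # angular velocity about the axis, rad/s
                         accel_angle: float,   # angle inferred from the accelerometer, rad
                         dt: float,
                         alpha: float = 0.98) -> float:
    """Fuse gyro integration (smooth but drifting) with the accelerometer angle (noisy but drift-free)."""
    gyro_angle = prev_angle + gyro_rate * dt
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle

# Example: roll angle from accelerometer readings ay, az (gravity-dominated):
# accel_roll = np.arctan2(ay, az)
# roll = complementary_filter(roll, gyro_x, accel_roll, dt)
```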
  • the pose estimation method provided by the above embodiment of the present application can improve the accuracy of the pose estimation parameter.
  • The present application provides an embodiment of a pose estimation apparatus, which corresponds to the method embodiment shown in FIG. 2; the apparatus can be applied to a variety of electronic devices.
  • the pose estimating apparatus 700 of the present embodiment includes a first acquiring unit 701, an image selecting unit 702, a pixel point set determining unit 703, and a first parameter determining unit 704.
  • the first obtaining unit 701 is configured to acquire a depth image video from the depth sensor.
  • the image selecting unit 702 is configured to select the first depth image and the second depth image from each frame depth image of the depth image video.
  • At least one pixel point indicating the same object is shared in the first depth image and the second depth image.
  • the pixel point set determining unit 703 is configured to determine a first pixel point set in the first depth image and a second pixel point set in the second depth image.
  • the pixel points in the first pixel point set are in one-to-one correspondence with the pixel points in the second pixel point set, and the corresponding two pixel points indicate the same object.
  • the first parameter determining unit 704 is configured to: for any pixel in the first set of pixel points, based on the first two-dimensional coordinates of the pixel in the first depth image and the corresponding pixel corresponding to the pixel A first depth value in the two depth images determines a pose transformation parameter of the depth sensor.
  • the foregoing apparatus 700 may further include a pre-processing unit not shown in FIG. 7.
  • the pre-processing unit includes a pixel point deletion module and a smoothing module.
  • The pixel point deletion module is configured to delete, for each frame of depth image, the pixel points in that frame that meet the preset condition, before the image selection unit 702 selects the first depth image and the second depth image from the frames of depth image.
  • The smoothing module is configured to smooth the depth image after the deletion.
  • In some optional implementations, the pixel point deletion module may be further configured to: detect the depth value of each pixel point; and delete the pixel points whose depth value is greater than the first preset value, as well as the pixel points whose depth value is less than the second preset value.
  • In some optional implementations, the pixel point deletion module may be further configured to: determine the first partial derivative of the frame depth image in the horizontal direction and the second partial derivative in the vertical direction; determine the geometric edge pixel points in the frame depth image from the first partial derivative and the second partial derivative; and delete the geometric edge pixel points.
  • In some optional implementations, the pixel point deletion module may be further configured to: determine the invalid pixel points in the frame depth image for which no depth value exists; and delete the invalid pixel points together with the pixel points adjacent to them.
  • In some optional implementations, the first parameter determination unit 704 may further include a first mapping module, a transformation module, a second mapping module, a depth value determination module, and a first parameter determination module.
  • the first mapping module is configured to map the first two-dimensional coordinates of the pixel in the first depth image to the first three-dimensional coordinate of the coordinate system to which the first depth image belongs.
  • a transform module configured to transform the first three-dimensional space coordinate to a coordinate system to which the second depth image belongs, to obtain a second three-dimensional space coordinate.
  • the second mapping module is configured to map the second three-dimensional space coordinates into the second depth image to obtain the second two-dimensional coordinates.
  • a depth value determining module configured to determine a second depth value of the second two-dimensional coordinate in the second depth image.
  • the first parameter determining module is configured to determine a pose transformation parameter of the depth sensor based on the first depth value and the second depth value.
  • In some optional implementations, the first parameter determination module may be further configured to: determine a depth difference between the first depth image and the second depth image from the first depth value and the second depth value; take the depth difference as a depth residual and, based on the depth residual, perform the following iterative steps: determine a pose estimation increment based on the depth residual; determine whether the depth residual is less than a preset threshold; in response to the depth residual being less than the preset threshold, accumulate the pose estimation increment and the first depth value to determine a pose estimation value, and determine the pose transformation parameter of the depth sensor from the pose estimation value; and, in response to the depth residual being greater than or equal to the preset threshold, determine the accumulated pose estimation increments as the new depth residual and continue the iterative steps.
  • the pose estimating apparatus 700 may further include a second acquiring unit, a second parameter determining unit, and a parameter fusion unit not shown in FIG. 7.
  • a second acquisition unit is configured to acquire angular velocity and acceleration from an inertial measurement device physically bound to the depth sensor.
  • the second parameter determining unit is configured to determine a pose transformation parameter of the inertial measurement device according to the angular velocity and the acceleration.
  • the parameter fusion unit is configured to combine the pose transformation parameter of the depth sensor and the pose transformation parameter of the inertial measurement device to determine the integrated pose transformation parameter.
  • the second parameter determining unit may further include a first sub-parameter determining module, a second sub-parameter determining module, and a converging module, which are not illustrated in FIG. 7 .
  • The first sub-parameter determination module is configured to determine a first pose transformation parameter of the inertial measurement device according to the angular velocity.
  • The second sub-parameter determination module is configured to determine a second pose transformation parameter of the inertial measurement device according to the acceleration.
  • the fusion module is configured to combine the first pose transformation parameter and the second pose transformation parameter to determine a pose transformation parameter of the inertial measurement device.
  • The pose estimation apparatus provided by the above embodiment acquires the depth image video collected by the depth sensor, selects from it two frames of depth image that share at least one pixel point indicating the same object, determines the mutually corresponding first pixel point set and second pixel point set in the two frames of depth image, and then, for each pixel point in the first pixel point set, determines the pose transformation parameter of the depth sensor from the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image. This reduces the consumption of computing resources, improves computational efficiency, and ensures the real-time performance of the pose estimation.
  • the units 701 to 704 described in the pose estimating apparatus 700 correspond to respective steps in the method described with reference to FIG. 2, respectively.
  • The operations and features described above for the pose estimation method are equally applicable to the apparatus 700 and the units contained therein, and are not described here again.
  • Corresponding units of device 700 may cooperate with units in the server to implement the solution of the embodiments of the present application.
  • Referring to FIG. 8, a block diagram of a computer system 800 suitable for implementing the server of an embodiment of the present application is shown.
  • the server shown in FIG. 8 is merely an example, and should not impose any limitation on the function and scope of use of the embodiments of the present application.
  • The computer system 800 includes a central processing unit (CPU) 801, which can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 802 or a program loaded from the storage portion 808 into random access memory (RAM) 803.
  • The RAM 803 also stores various programs and data required for the operation of the system 800.
  • the CPU 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
  • An input/output (I/O) interface 805 is also coupled to bus 804.
  • The following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output portion 807 including, for example, a cathode ray tube (CRT) or a liquid crystal display (LCD); a storage portion 808 including a hard disk or the like; and a communication portion 809 including a network interface card such as a LAN card or a modem. The communication portion 809 performs communication processing via a network such as the Internet.
  • A drive 810 is also coupled to the I/O interface 805 as needed.
  • a removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like, is mounted on the drive 810 as needed so that a computer program read therefrom is installed into the storage portion 808 as needed.
  • an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a machine readable medium, the computer program comprising program code for executing the method illustrated in the flowchart.
  • the computer program can be downloaded and installed from the network via communication portion 809, and/or installed from removable media 811.
  • When the computer program is executed by the central processing unit (CPU) 801, the above-described functions defined in the method of the present application are performed.
  • the computer readable medium described herein may be a computer readable signal medium or a computer readable storage medium or any combination of the two.
  • the computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer readable storage media may include, but are not limited to, electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable Programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the foregoing.
  • a computer readable storage medium may be any tangible medium that can contain or store a program, which can be used by or in connection with an instruction execution system, apparatus or device.
  • a computer readable signal medium may include a data signal that is propagated in the baseband or as part of a carrier, carrying computer readable program code. Such propagated data signals can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer readable signal medium can also be any computer readable medium other than a computer readable storage medium, which can transmit, propagate, or transport a program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
  • Each block of the flowcharts or block diagrams can represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logic functions.
  • The executable instructions can also be executed in a different order than that illustrated in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts can be implemented in a dedicated hardware-based system that performs the specified function or operation. Or it can be implemented by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present application may be implemented by software or by hardware.
  • The described units may also be provided in a processor, for example described as: a processor including a first acquisition unit, an image selection unit, a pixel point set determination unit, and a first parameter determination unit.
  • the names of these units do not constitute a limitation on the unit itself under certain circumstances.
  • the first acquisition unit may also be described as “a unit that acquires depth image video from the depth sensor”.
  • the present application also provides a computer readable medium, which may be included in the apparatus described in the above embodiments, or may be separately present and not incorporated into the apparatus.
  • The computer readable medium carries one or more programs that, when executed by the apparatus, cause the apparatus to: acquire a depth image video from a depth sensor; select a first depth image and a second depth image from the frames of the depth image video, where the first depth image and the second depth image share at least one pixel point indicating the same object; determine a first pixel point set in the first depth image and a second pixel point set in the second depth image, where the pixel points in the first pixel point set correspond one-to-one with the pixel points in the second pixel point set and each pair of corresponding pixel points indicates the same object; and, for any pixel point in the first pixel point set, determine the pose transformation parameter of the depth sensor based on the first two-dimensional coordinates of the pixel point in the first depth image and the first depth value of the corresponding pixel point in the second depth image.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a pose estimation method and apparatus. A specific embodiment of the method comprises: acquiring a depth image video from a depth sensor; selecting, from the individual frames of the depth image video, a first depth image and a second depth image, the first depth image and the second depth image sharing at least one pixel point indicating the same object; determining a first set of pixel points in the first depth image and a second set of pixel points in the second depth image, each pixel point of the first set having a one-to-one correspondence with a pixel point of the second set, two corresponding pixel points indicating the same object; and, for any pixel point of the first set, determining a pose transformation parameter based on the first two-dimensional coordinates of the pixel point in the first depth image and a first depth value of the corresponding pixel point in the second depth image. This embodiment reduces the consumption of computing resources, which improves computational efficiency and ensures real-time pose estimation.
PCT/CN2018/083376 2017-05-09 2018-04-17 Pose estimation method and apparatus WO2018205803A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710321322.X 2017-05-09
CN201710321322.XA CN107123142B (zh) 2017-05-09 2017-05-09 位姿估计方法和装置

Publications (1)

Publication Number Publication Date
WO2018205803A1 true WO2018205803A1 (fr) 2018-11-15

Family

ID=59726877

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/083376 WO2018205803A1 (fr) 2017-05-09 2018-04-17 Pose estimation method and apparatus

Country Status (2)

Country Link
CN (1) CN107123142B (fr)
WO (1) WO2018205803A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112146578A (zh) * 2019-06-28 2020-12-29 顺丰科技有限公司 尺度比计算方法、装置、设备、及存储介质

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107123142B (zh) * 2017-05-09 2020-05-01 北京京东尚科信息技术有限公司 位姿估计方法和装置
CN108399643A (zh) * 2018-03-15 2018-08-14 南京大学 一种激光雷达和相机间的外参标定系统和方法
CN110914867A (zh) * 2018-07-17 2020-03-24 深圳市大疆创新科技有限公司 位姿确定方法、设备、计算机可读存储介质
WO2020019175A1 (fr) * 2018-07-24 2020-01-30 深圳市大疆创新科技有限公司 Procédé et dispositif de traitement d'image et dispositif photographique et véhicule aérien sans pilote
CN109186596B (zh) * 2018-08-14 2020-11-10 深圳清华大学研究院 Imu测量数据生成方法、系统、计算机装置及可读存储介质
CN109544629B (zh) * 2018-11-29 2021-03-23 南京人工智能高等研究院有限公司 摄像头位姿确定方法和装置以及电子设备
CN109470149B (zh) * 2018-12-12 2020-09-29 北京理工大学 一种管路位姿的测量方法及装置
CN111435086B (zh) * 2019-01-13 2022-03-25 北京魔门塔科技有限公司 基于拼接图的导航方法和装置
CN109650292B (zh) * 2019-02-02 2019-11-05 北京极智嘉科技有限公司 一种智能叉车以及智能叉车的位置调整方法和介质
CN110054121B (zh) 2019-04-25 2021-04-20 北京极智嘉科技有限公司 一种智能叉车以及容器位姿偏移检测方法
CN112907164A (zh) * 2019-12-03 2021-06-04 北京京东乾石科技有限公司 物体定位方法和装置
CN112070052A (zh) * 2020-09-16 2020-12-11 青岛维感科技有限公司 一种间距监测方法、装置、系统及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609942A (zh) * 2011-01-31 2012-07-25 微软公司 使用深度图进行移动相机定位
CN104933755A (zh) * 2014-03-18 2015-09-23 华为技术有限公司 一种静态物体重建方法和系统
US20160171703A1 (en) * 2013-07-09 2016-06-16 Samsung Electronics Co., Ltd. Camera pose estimation apparatus and method
CN106403924A (zh) * 2016-08-24 2017-02-15 智能侠(北京)科技有限公司 基于深度摄像头的机器人快速定位与姿态估计方法
CN106529538A (zh) * 2016-11-24 2017-03-22 腾讯科技(深圳)有限公司 一种飞行器的定位方法和装置
CN107123142A (zh) * 2017-05-09 2017-09-01 北京京东尚科信息技术有限公司 位姿估计方法和装置

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102289809A (zh) * 2011-07-25 2011-12-21 清华大学 估计摄像机位姿的方法及装置
US10339389B2 (en) * 2014-09-03 2019-07-02 Sharp Laboratories Of America, Inc. Methods and systems for vision-based motion estimation
CN104361575B (zh) * 2014-10-20 2015-08-19 湖南戍融智能科技有限公司 深度图像中的自动地面检测及摄像机相对位姿估计方法
CN106157367B (zh) * 2015-03-23 2019-03-08 联想(北京)有限公司 三维场景重建方法和设备
CN105045263B (zh) * 2015-07-06 2016-05-18 杭州南江机器人股份有限公司 一种基于Kinect深度相机的机器人自定位方法
CN105698765B (zh) * 2016-02-22 2018-09-18 天津大学 双imu单目视觉组合测量非惯性系下目标物位姿方法
CN105976353B (zh) * 2016-04-14 2020-01-24 南京理工大学 基于模型和点云全局匹配的空间非合作目标位姿估计方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609942A (zh) * 2011-01-31 2012-07-25 微软公司 使用深度图进行移动相机定位
US20160171703A1 (en) * 2013-07-09 2016-06-16 Samsung Electronics Co., Ltd. Camera pose estimation apparatus and method
CN104933755A (zh) * 2014-03-18 2015-09-23 华为技术有限公司 一种静态物体重建方法和系统
CN106403924A (zh) * 2016-08-24 2017-02-15 智能侠(北京)科技有限公司 基于深度摄像头的机器人快速定位与姿态估计方法
CN106529538A (zh) * 2016-11-24 2017-03-22 腾讯科技(深圳)有限公司 一种飞行器的定位方法和装置
CN107123142A (zh) * 2017-05-09 2017-09-01 北京京东尚科信息技术有限公司 位姿估计方法和装置

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112146578A (zh) * 2019-06-28 2020-12-29 顺丰科技有限公司 尺度比计算方法、装置、设备、及存储介质

Also Published As

Publication number Publication date
CN107123142A (zh) 2017-09-01
CN107123142B (zh) 2020-05-01

Similar Documents

Publication Publication Date Title
WO2018205803A1 (fr) Pose estimation method and apparatus
CN108986161B (zh) 一种三维空间坐标估计方法、装置、终端和存储介质
JP7173772B2 (ja) 深度値推定を用いた映像処理方法及び装置
CN111325796B (zh) 用于确定视觉设备的位姿的方法和装置
JP7106665B2 (ja) 単眼深度推定方法およびその装置、機器ならびに記憶媒体
WO2019161813A1 (fr) Procédé, appareil et système de reconstruction tridimensionnelle de scène dynamique, serveur et support
KR20220009393A (ko) 이미지 기반 로컬화
US20230245391A1 (en) 3d model reconstruction and scale estimation
US11064178B2 (en) Deep virtual stereo odometry
US9129435B2 (en) Method for creating 3-D models by stitching multiple partial 3-D models
CN109461208B (zh) 三维地图处理方法、装置、介质和计算设备
JP2018534699A (ja) 誤りのある深度情報を補正するためのシステムおよび方法
KR20170053007A (ko) 자세 추정 방법 및 자세 추정 장치
Zhao et al. Real-time stereo on GPGPU using progressive multi-resolution adaptive windows
WO2015124066A1 (fr) Procédé et dispositif de navigation visuelle et robot
JP2023530545A (ja) 空間幾何情報推定モデルの生成方法及び装置
JP2023021994A (ja) 自動運転車両に対するデータ処理方法及び装置、電子機器、記憶媒体、コンピュータプログラム、ならびに自動運転車両
CN114998406A (zh) 一种自监督多视图深度估计方法、装置
CN113850859A (zh) 用于增强图像深度置信度图的方法、系统、制品和装置
CN113378605B (zh) 多源信息融合方法及装置、电子设备和存储介质
CN112288817B (zh) 基于图像的三维重建处理方法及装置
JP2017111209A (ja) 3dマップの作成
CN114066980A (zh) 对象检测方法、装置、电子设备及自动驾驶车辆
KR101668649B1 (ko) 주변 환경 모델링 방법 및 이를 수행하는 장치
CN113763468A (zh) 一种定位方法、装置、系统及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18798239

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17.03.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18798239

Country of ref document: EP

Kind code of ref document: A1