WO2021057739A1 - Positioning method and apparatus, device, and storage medium - Google Patents

Positioning method and apparatus, device, and storage medium

Info

Publication number
WO2021057739A1
Authority
WO
WIPO (PCT)
Prior art keywords
voxel
coordinates
sample image
target
world coordinates
Prior art date
Application number
PCT/CN2020/116920
Other languages
English (en)
French (fr)
Inventor
金珂
杨宇尘
陈岩
方攀
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority to EP20869168.3A (published as EP4016458A4)
Publication of WO2021057739A1
Priority to US17/686,091 (published as US12051223B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85 Stereo camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30244 Camera pose

Definitions

  • the embodiments of the present application relate to electronic technology, and relate to, but are not limited to, positioning methods and devices, equipment, and storage media.
  • In the related art, the location of a person is mainly determined by identifying the person and a fixed object in the image collected by a camera module.
  • This solution matches the fixed object in the image against a pre-built indoor map to determine the position of the fixed object indoors; the indoor position of the person is then determined according to the position of the fixed object.
  • The overall idea is to identify the fixed object in the image through image recognition, and to determine the position of the person according to the relative position relationship between the fixed object and the person in the image and the indoor location of the fixed object.
  • This positioning method relies mainly on the relative position relationship between persons and fixed objects in the image; to realize positioning, there must be recognizable persons and fixed objects in the image, otherwise positioning fails, so the robustness of this positioning method is poor.
  • The positioning method, apparatus, device, and storage medium provided by the embodiments of the present application do not depend on fixed objects or the object to be positioned in the image, and therefore have better robustness.
  • the technical solutions of the embodiments of the present application are implemented as follows:
  • The positioning method provided by the embodiments of the present application includes: converting the pixel coordinates of multiple pixels into camera coordinates according to the internal parameter matrix of the image acquisition module and the depth values of the multiple pixels in the depth image acquired by the image acquisition module; and matching the camera coordinates of each pixel with the target world coordinates of multiple voxels in a pre-built point cloud map to obtain the positioning result of the image acquisition module; wherein the target world coordinates of each voxel are obtained by updating the initial world coordinates of the voxel according to a plurality of sample image pairs, and each sample image pair includes a two-dimensional sample image and a depth sample image.
  • The positioning device provided by the embodiments of the present application includes: a coordinate conversion module configured to convert the pixel coordinates of multiple pixels into camera coordinates according to the internal parameter matrix of the image acquisition module and the depth values of the multiple pixels in the depth image acquired by the image acquisition module; and a positioning module configured to match the camera coordinates of each pixel with the target world coordinates of multiple voxels in a pre-built point cloud map to obtain the positioning result of the image acquisition module; wherein the target world coordinates of each voxel are obtained by updating the initial world coordinates of the voxel according to a plurality of sample image pairs, and each sample image pair includes a two-dimensional sample image and a depth sample image.
  • The electronic device provided by the embodiments of the present application includes a memory and a processor.
  • The memory stores a computer program that can run on the processor.
  • When the processor executes the program, the steps of the positioning method in the embodiments of the present application are implemented.
  • the computer-readable storage medium provided by the embodiment of the application has a computer program stored thereon, and the computer program implements the steps in the positioning method described in the embodiment of the application when the computer program is executed by a processor.
  • In the embodiments of the present application, the camera coordinates of multiple pixels in the depth image collected by the image acquisition module are matched with the target world coordinates of multiple voxels in the pre-built point cloud map to determine the positioning result of the image acquisition module; in this way, when positioning the object to be positioned that carries the image acquisition module, the positioning method does not depend on fixed objects or the object to be positioned in the depth image, so better robustness can be obtained.
  • FIG. 1 is a schematic diagram of an implementation process of a positioning method according to an embodiment of this application.
  • FIG. 2 is a schematic diagram of the construction process of a point cloud map according to an embodiment of the application.
  • FIG. 3 is a schematic diagram of quantizing a specific physical space according to an embodiment of the application.
  • FIG. 4A is a schematic diagram of the composition structure of a positioning device according to an embodiment of the application.
  • FIG. 4B is a schematic diagram of the composition structure of another positioning device according to an embodiment of the application.
  • FIG. 5 is a schematic diagram of a hardware entity of an electronic device according to an embodiment of the application.
  • The terms "first/second/third" involved in the embodiments of this application are used to distinguish different objects and do not represent a specific order of the objects. Understandably, where permitted, the specific order or sequence of "first/second/third" can be interchanged, so that the embodiments of the present application described herein can be implemented in a sequence other than that illustrated or described herein.
  • the embodiments of this application provide a positioning method, which can be applied to electronic devices, which can be mobile phones, tablets, laptops, desktop computers, servers, robots, drones, and other devices with information processing capabilities.
  • the functions implemented by the positioning method can be implemented by a processor in the electronic device calling program code.
  • the program code can be stored in a computer storage medium. It can be seen that the electronic device includes at least a processor and a storage medium.
  • FIG. 1 is a schematic diagram of the implementation process of the positioning method according to the embodiment of this application. As shown in FIG. 1, the method may include the following steps S101 to S102:
  • Step S101 According to the internal parameter matrix of the image acquisition module and the depth values of multiple pixels in the depth image acquired by the image acquisition module, the pixel coordinates of the multiple pixels are converted into camera coordinates.
  • The image acquisition module generally includes a first camera module and a second camera module. The first camera module is used to collect two-dimensional images of the scene, for example, red, green, blue (RGB) images collected by a monocular camera. The second camera module is used to collect depth images of the shooting scene, and may be a three-dimensional vision sensor such as a binocular camera, a structured-light camera, or a Time-of-Flight (TOF) camera.
  • the electronic device may include an image acquisition module, that is, the image acquisition module is installed in the electronic device.
  • For example, the electronic device is a smartphone with a first camera module and a second camera module. Of course, in some embodiments, the electronic device may not include the image acquisition module; in this case, the image acquisition module may send its own internal parameter matrix and the acquired depth image to the electronic device.
  • the internal parameter matrix F of the image acquisition module is:
  • The depth image is also called the range image, and refers to an image in which the distance between the image acquisition module and each point on the surface of an object in the scene is used as the pixel value. That is, the pixel value of a pixel in the depth image is the distance from a certain point on the surface of the object to the mirror surface of the sensor within the field of view of the three-dimensional vision sensor, and this pixel value is generally called the depth value. Therefore, when the depth value z_c of pixel point j is known, the pixel coordinates (u, v) of pixel point j can be converted to the camera coordinates (x_c, y_c, z_c) according to the following formula (2):
  • The depth value of pixel point j is the Z-axis coordinate value z_c of the camera coordinates of the pixel point.
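  • Formulas (1) and (2) appear only as images in the original publication. A plausible reconstruction from the surrounding definitions, assuming a standard pinhole camera model with focal lengths f_x, f_y and principal point (u_0, v_0) (these symbols are assumptions, not taken from the text), is:

```latex
F =
\begin{pmatrix}
f_x & 0   & u_0 \\
0   & f_y & v_0 \\
0   & 0   & 1
\end{pmatrix}
\qquad
x_c = \frac{(u - u_0)\, z_c}{f_x}, \quad
y_c = \frac{(v - v_0)\, z_c}{f_y}, \quad
z_c = \text{depth value of pixel } j
```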
  • Step S102 matching the camera coordinates of each pixel with the target world coordinates of multiple voxels in a pre-built point cloud map to obtain a positioning result of the image acquisition module.
  • the target world coordinate of the voxel is the world coordinate of a certain point (for example, the center point) of the voxel.
  • the target world coordinates of each voxel are obtained by the electronic device by updating the initial world coordinates of each voxel according to a plurality of sample image pairs including two-dimensional sample images and depth sample images.
  • the point cloud map construction process may include step S501 to step S503 in the following embodiment.
  • the positioning result includes the current world coordinates of the image acquisition module; or, the positioning result includes the current world coordinates and orientation of the image acquisition module. That is, the positioning result includes the position of the user (or device) carrying the image acquisition module, or includes the position and posture of the user (or device).
  • the world coordinates can be two-dimensional coordinates or three-dimensional coordinates.
  • the orientation of the image acquisition module is obtained through step S102, so that the positioning method can be applied to more application scenarios. For example, according to the current orientation of the robot, the robot is instructed to perform the next action. For another example, in a navigation application, knowing the user's orientation can more accurately guide the user in which direction to walk/drive. For another example, in an unmanned driving application, knowing the direction of the vehicle can more accurately control which direction the vehicle is traveling in.
  • In the embodiments of the present application, the camera coordinates of multiple pixels in the depth image collected by the image acquisition module are matched with the target world coordinates of multiple voxels in the pre-built point cloud map to determine the positioning result of the image acquisition module; in this way, when positioning the image acquisition module, the positioning method does not depend on fixed objects or the object to be positioned in the depth image, so better robustness can be obtained.
  • the embodiment of the present application further provides a positioning method, and the method may include the following steps S201 to S203:
  • Step S201 Convert the pixel coordinates of the multiple pixels into camera coordinates according to the internal parameter matrix of the image acquisition module and the depth values of multiple pixels in the depth image acquired by the image acquisition module;
  • Step S202 According to the iterative strategy, the camera coordinates of each pixel point are matched with the target world coordinates of multiple voxels in the pre-built point cloud map to obtain the target transformation relationship between the camera coordinate system and the world coordinate system.
  • The point cloud map includes the target world coordinates of the voxels but does not include the image features of the voxels; in this way, the data volume of the point cloud map can be greatly reduced, thereby saving the storage space occupied by the point cloud map in the electronic device.
  • Since the point cloud map does not include the image features of the voxels, on the premise that the camera coordinates of each pixel in the depth image collected by the image acquisition module and the target world coordinates of the multiple voxels in the point cloud map are known, an iterative strategy is used to find the target transformation relationship between the camera coordinate system and the world coordinate system, so as to realize the positioning of the image acquisition module; that is, the voxel closest to each pixel (that is, the best match) is found iteratively to obtain the target transformation relationship.
  • Step S203 Determine the positioning result of the image acquisition module according to the target transformation relationship.
  • In the embodiments of the present application, there is no need to use the image acquisition module to collect a two-dimensional image, and there is no need to extract the image feature of each pixel from the two-dimensional image. Instead, the pixel coordinates of each pixel in the depth image are converted into camera coordinates based on the internal parameter matrix; then, through an iterative strategy, the camera coordinates of each pixel are matched with the target world coordinates of the multiple voxels to achieve accurate positioning of the image acquisition module. In this way, the implementation complexity of the positioning method can be reduced, so positioning is achieved more efficiently and the real-time requirements of positioning are met.
  • the embodiment of the present application further provides a positioning method, and the method may include the following steps S301 to S307:
  • Step S301 According to the internal parameter matrix of the image acquisition module and the depth values of multiple pixels in the depth image acquired by the image acquisition module, the pixel coordinates of the multiple pixels are converted into camera coordinates.
  • Step S302 selecting an initial target voxel that matches each pixel point from a plurality of voxels in the pre-built point cloud map.
  • the electronic device may set the initial transformation relationship of the camera coordinate system with respect to the world coordinate system; then, according to the camera coordinates of the pixel point and the initial transformation relationship, the pixel point is matched with the plurality of voxels, thereby An initial target voxel matching the pixel point is selected from the plurality of voxels.
  • the initial target voxel may be selected through steps S402 to S404 in the following embodiments.
  • The purpose of step S302 is to select voxels that may match the pixel; that is, the selected initial target voxel may not be an object that actually matches the pixel. Therefore, the following steps S303 to S306 are required to further determine whether the initial target voxel is an object that really matches the pixel.
  • Step S303 Determine a first transformation relationship between the camera coordinate system and the world coordinate system according to the camera coordinates of each pixel and the target world coordinates of the corresponding initial target voxel.
  • the electronic device may construct an error function according to the camera coordinates of each pixel and the target world coordinates of the corresponding initial target voxel; then, the current optimal first transformation relationship is solved by the least square method.
  • For example, the camera coordinates of a pixel point are represented by p_i and the target world coordinates of the corresponding initial target voxel are represented by q_i; then the following formula (3) can be listed:
  • E(R, T) is the error function, and R and T are respectively the rotation matrix and translation vector in the first transformation relationship to be solved. Then, the optimal solution of R and T in formula (3) can be solved by the least squares method.
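  • As an illustration only (the patent does not prescribe a solver), a minimal sketch of the closed-form least-squares solution of formula (3) via SVD, assuming p_i are the pixel camera coordinates and q_i the matched voxel world coordinates stored as N×3 NumPy arrays:

```python
import numpy as np

def solve_rigid_transform(p, q):
    """Least-squares R, T minimizing E(R, T) = (1/n) * sum ||q_i - (R p_i + T)||^2.

    p: (N, 3) camera coordinates of the pixels.
    q: (N, 3) target world coordinates of the matched initial target voxels.
    Returns R (3x3 rotation matrix) and T (length-3 translation vector).
    """
    p_mean, q_mean = p.mean(axis=0), q.mean(axis=0)
    H = (p - p_mean).T @ (q - q_mean)      # cross-covariance of the centered sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:               # guard against a reflection
        Vt[2, :] *= -1
        R = Vt.T @ U.T
    T = q_mean - R @ p_mean
    return R, T
```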
  • Step S304 Determine a matching error according to the first transformation relationship, the camera coordinates of each pixel point and the target world coordinates of the corresponding initial target voxel.
  • the matching error refers to the overall matching error, that is, the matching error of all pixels.
  • The camera coordinates of each pixel can be transformed into the corresponding second world coordinates according to the first transformation relationship. If the initial target voxel and the pixel selected in step S302 represent the same location point or two similar location points in the actual physical space, then the second world coordinates of the pixel point should be the same as or similar to the target world coordinates of the corresponding initial target voxel.
  • the matching error can be determined through the following steps S406 and S407, and based on the matching error and the preset threshold, it is determined whether the initial target voxel is a point that actually matches the pixel point, and then the target transformation relationship is determined.
  • Step S305: If the matching error is greater than the preset threshold, return to step S302, reselect the initial target voxel, and re-determine the matching error.
  • If the matching error is greater than the preset threshold, it means that the currently selected initial target voxel is not a voxel that matches the pixel point; the two do not refer to the same location point or similar location points in the physical space.
  • If the matching error is less than or equal to the preset threshold, it is considered that the initial target voxel selected in the current iteration is a point that truly matches the pixel point.
  • In this case, the first transformation relationship obtained in the current iteration can be determined as the target transformation relationship, and the positioning result of the image acquisition module in the point cloud map is determined according to the target transformation relationship obtained in the current iteration.
  • Step S306 Determine the first transformation relationship when the re-determined matching error is less than or equal to the preset threshold value as the target transformation relationship;
  • Step S307 Determine the positioning result of the image acquisition module according to the target transformation relationship.
  • the embodiment of the present application further provides a positioning method, and the method may include the following steps S401 to S410:
  • Step S401 Convert the pixel coordinates of the multiple pixels into camera coordinates according to the internal parameter matrix of the image acquisition module and the depth values of multiple pixels in the depth image acquired by the image acquisition module;
  • Step S402 Obtain a second transformation relationship between the camera coordinate system and the world coordinate system.
  • the electronic device may set an initial value of the rotation matrix and the translation vector in the second transformation relationship, that is, set the initial value of the second transformation relationship.
  • Step S403 Determine the first world coordinate of the j-th pixel according to the second transformation relationship and the camera coordinate of the j-th pixel in the depth image, where j is an integer greater than 0;
  • Step S404 Match the first world coordinates of each pixel with the target world coordinates of the multiple voxels to obtain a corresponding initial target voxel.
  • For example, the electronic device may determine the distance between the first world coordinates of the pixel and the target world coordinates of each voxel, and then determine the voxel closest to the pixel as the initial target voxel, or determine a voxel whose distance is less than or equal to a distance threshold as the initial target voxel. In some embodiments, the electronic device may determine the Euclidean distance between the first world coordinates of the pixel and the target world coordinates of the voxel, and use the Euclidean distance as the distance between the pixel and the voxel.
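  • A minimal sketch of this nearest-voxel matching; the k-d tree (SciPy's cKDTree) and the function names are illustrative assumptions, since the patent does not prescribe a search structure:

```python
import numpy as np
from scipy.spatial import cKDTree

def match_to_voxels(first_world_coords, voxel_world_coords, dist_threshold=None):
    """Match each pixel's first world coordinates to the nearest map voxel.

    first_world_coords: (N, 3) first world coordinates of the pixels.
    voxel_world_coords: (M, 3) target world coordinates of the map voxels.
    Returns the matched voxel indices and Euclidean distances; when a distance
    threshold is given, pairs farther than the threshold are marked with -1.
    """
    tree = cKDTree(voxel_world_coords)
    dists, idx = tree.query(first_world_coords, k=1)
    if dist_threshold is not None:
        idx = np.where(dists <= dist_threshold, idx, -1)
    return idx, dists
```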
  • Step S405 Determine a first transformation relationship between the camera coordinate system and the world coordinate system according to the camera coordinates of each pixel and the target world coordinates of the corresponding initial target voxel;
  • Step S406 Determine the second world coordinate of the j-th pixel according to the first transformation relationship and the camera coordinate of the j-th pixel in the depth image, where j is an integer greater than 0;
  • Step S407 Determine the matching error according to the second world coordinates of each pixel and the target world coordinates of the corresponding initial target voxel.
  • For example, the electronic device may first determine the distance (for example, the Euclidean distance) between the second world coordinates of each pixel and the target world coordinates of the corresponding initial target voxel; then, the matching error is determined according to each of the distances.
  • For example, the average distance between the plurality of pixel points and the matched initial target voxels may be determined as the matching error.
  • If the second world coordinates of a pixel are represented by p'_i and the target world coordinates of the corresponding initial target voxel are represented by q_i, the matching error d can be obtained by the following formula (4):
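  • Formula (4) is shown only as an image in the original publication; based on the definitions above (the average distance over the n matched pairs), a plausible reconstruction is:

```latex
d = \frac{1}{n}\sum_{i=1}^{n}\bigl\lVert p'_i - q_i \bigr\rVert
```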
  • Step S408: If the matching error is greater than the preset threshold, use the first transformation relationship as the second transformation relationship, reselect the initial target voxel, and re-determine the matching error until the re-determined matching error is less than or equal to the preset threshold; then proceed to step S409.
  • If the matching error is greater than the preset threshold, it means that the acquired second transformation relationship does not conform to reality, and accordingly the obtained initial target voxel is not a voxel that really matches the pixel point.
  • In this case, the currently determined first transformation relationship can be used as the second transformation relationship, and then steps S403 to S407 are executed again.
  • After re-iteration, the initial target voxel selected may be the voxel that matches the pixel point, that is, the two correspond to the same position point in the physical space.
  • Step S409 Determine the first transformation relationship when the re-determined matching error is less than or equal to the preset threshold value as the target transformation relationship;
  • Step S410 Determine the positioning result of the image acquisition module according to the target transformation relationship.
  • the positioning method provided in the embodiments of the present application relies on a pre-built point cloud map.
  • the pre-built point cloud map is usually stored in an electronic device or in other electronic devices (such as a server).
  • When implementing the positioning method, the electronic device only needs to load the locally stored point cloud map or request the map from another electronic device. The construction process of the point cloud map is shown in FIG. 2 and may include the following steps S501 to S503:
  • Step S501 Perform quantization processing on the size of a specific physical space to obtain initial world coordinates of multiple voxels.
  • the specific physical space refers to the physical scene covered by the point cloud map, for example, the specific physical space is a certain building, a large airport, a shopping mall, a certain city, etc.
  • a voxel is actually the smallest unit in the specific physical space.
  • For example, the specific physical space is regarded as a cube 301 of a certain size, and the cube is then meshed in units of voxels 302 to obtain multiple voxels; taking the world coordinate system as the reference coordinate system, the initial world coordinates of each voxel are determined.
  • For example, if the size of the specific physical space is 512×512×512 m³ and the voxel size is 1×1×1 m³, then quantizing the 512×512×512 m³ physical space in units of 1×1×1 m³ voxels yields the initial world coordinates of 512×512×512 voxels.
  • the quantization process includes quantizing the size of a specific physical space and determining the initial world coordinates of each voxel.
  • The quantization unit (that is, the size of the voxel) is not limited; in practical applications, the size of the voxel can be designed according to engineering requirements.
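  • A minimal sketch of this quantization step, assuming the specific physical space is an axis-aligned cuboid anchored at the world origin and that a voxel's initial world coordinates are the coordinates of its center point (both assumptions; the patent leaves these choices open):

```python
import numpy as np

def quantize_space(space_size=(512.0, 512.0, 512.0), voxel_size=1.0):
    """Mesh a cuboid physical space into voxels and return their initial world
    coordinates (voxel centers) as an (N, 3) array."""
    axes = [np.arange(0.0, s, voxel_size) + voxel_size / 2.0 for s in space_size]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    return np.stack([gx.ravel(), gy.ravel(), gz.ravel()], axis=1)

# For a 512 x 512 x 512 m^3 space with 1 x 1 x 1 m^3 voxels this yields
# 512 * 512 * 512 voxel coordinates, as in the example above; a practical
# implementation would store such a grid sparsely rather than densely.
```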
  • Step S502: According to the multiple sample image pairs collected by the image acquisition module in the specific physical space, the initial world coordinates of each voxel are updated to obtain the target world coordinates of each voxel.
  • the sample image pair includes a two-dimensional sample image and a depth sample image.
  • the two-dimensional sample image may be a plane image that does not contain depth information.
  • the two-dimensional sample image is an RGB image.
  • The electronic device may collect the two-dimensional sample image through the first camera module in the image acquisition module.
  • the depth sample image refers to an image containing depth information.
  • the electronic device may collect the depth sample image through a second camera module (such as a binocular camera, etc.) in the image acquisition module.
  • the electronic device may implement step S502 through step S602 to step S604 in the following embodiment.
  • Step S503 Construct the point cloud map according to the target world coordinates of each voxel. That is, the point cloud map includes the target world coordinates of each voxel.
  • When the image acquisition module collects sample images at different times or at different locations, the shooting scenes have overlapping areas; that is, different sample images include part of the same image content. As a result, when a point cloud map is constructed based on these sample images, a large amount of redundant information is introduced: the same location in the physical space may be covered by multiple pixels and represented in the point cloud map by multiple identical or similar world coordinates, which greatly increases the data volume of the point cloud map and affects the positioning speed. Obviously, such a point cloud map with a large amount of redundant information is not conducive to obtaining high-precision positioning results in visual positioning.
  • the point cloud map is constructed in the form of voxels, that is, the initial world coordinates of each voxel are updated (that is, corrected, optimized) through multiple sample image pairs collected, Thus, a point cloud map including the target world coordinates of each voxel is obtained.
  • This method of constructing a point cloud map is equivalent to fusing the world coordinates of all the pixels covered by a voxel into one world coordinate value; in this way, the above-mentioned problem that the same position in the physical space is represented in the point cloud map by multiple pixels with identical or similar world coordinates is solved, and a large amount of redundant information is removed.
  • the positioning speed can be increased, so that the positioning service has better real-time performance, and on the other hand, the positioning accuracy of visual positioning can be improved.
  • a higher positioning accuracy can be obtained by reducing the size of the voxel.
  • the smaller the size of the voxel the higher the positioning accuracy obtained.
  • the embodiment of the present application further provides a point cloud map construction process, and the process may include the following steps S601 to S605:
  • Step S601 quantify the size of the specific physical space to obtain the initial world coordinates of multiple voxels
  • Step S602 controlling the image acquisition module to acquire a sample image pair according to a preset frame rate.
  • the image acquisition module can collect sample image pairs while moving.
  • the collection of sample image pairs can be realized by a robot with an image collection module.
  • data collection personnel can carry image collection modules to collect images while walking.
  • Step S603 Update the initial world coordinates of each voxel according to the first sample image pair collected by the image collection module at the current moment and the second sample image pair collected at the historical moment.
  • the electronic device may implement step S603 through step S703 to step S705 in the following embodiments.
  • Step S604: Continue to update the current world coordinates of each voxel according to the first sample image pair and the third sample image pair acquired by the image acquisition module at the next moment; when the sample image acquisition ends, use the current world coordinates of each voxel as the target world coordinates corresponding to the voxel.
  • In step S604, the current world coordinates of each voxel are continuously updated; what is updated is the world coordinates of each voxel obtained from the update in step S603.
  • In other words, the electronic device updates the current world coordinates of each voxel in real time according to the sample image pair collected by the image acquisition module at the current moment and the sample image pair collected at a historical moment, until the image acquisition module ends the image collection task; the most recently updated world coordinates of each voxel are then used as the target world coordinates corresponding to the voxel.
  • Step S605 Construct the point cloud map according to the target world coordinates of each voxel.
  • In the embodiments of the present application, while sample image pairs are being collected, the current world coordinates of each voxel are updated using the collected sample image pairs. That is, the electronic device continuously uses the sample image pair collected by the image acquisition module at the current moment and the sample image pair collected at a historical moment (for example, the previous moment) to update the current world coordinates of each voxel. Since two sample image pairs obtained at adjacent moments have more overlapping areas, the electronic device does not need to search the multiple sample image pairs for the two pairs with the largest overlapping area and then use those two pairs to update the current world coordinates of each voxel; in this way, the efficiency of map construction can be greatly improved.
  • the embodiment of the present application further provides a point cloud map construction process, and the process at least includes the following steps S701 to S707:
  • Step S701 quantize the size of the specific physical space to obtain the initial world coordinates of multiple voxels
  • Step S702 controlling the image acquisition module to acquire a sample image pair according to a preset frame rate
  • Step S703 Determine the current camera coordinates of each voxel according to the first sample image pair collected by the image collection module at the current moment and the second sample image pair collected at the historical moment.
  • For example, the electronic device may determine the current transformation relationship of the camera coordinate system with respect to the world coordinate system according to the first sample image pair and the second sample image pair; then, according to the current transformation relationship, the initial world coordinates of each voxel are converted into the current camera coordinates.
  • For example, the electronic device may align the image features of the pixels of the two-dimensional sample image and the depth values of the pixels of the depth sample image in the first sample image pair with the image features of the pixels of the two-dimensional sample image and the depth values of the pixels of the depth sample image in the second sample image pair, so as to determine the current transformation relationship. Based on this, the initial world coordinates of the voxel are converted into the current camera coordinates according to the following formula (5):
  • In formula (5), (x_c, y_c, z_c) represents the camera coordinates, the transformation relationship includes the rotation matrix R and the translation vector T, and (x_w, y_w, z_w) represents the world coordinates.
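  • Formula (5) is not reproduced in this text; from the definitions above it is presumably the standard rigid-body transform from world coordinates to camera coordinates:

```latex
\begin{pmatrix} x_c \\ y_c \\ z_c \end{pmatrix}
= R \begin{pmatrix} x_w \\ y_w \\ z_w \end{pmatrix} + T
```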
  • Step S704 Obtain a depth value corresponding to the current pixel coordinate of each voxel from the depth sample image of the first sample image pair.
  • For example, the electronic device can convert the current camera coordinates of each voxel into current pixel coordinates according to the internal parameter matrix of the image acquisition module, and then obtain, from the depth sample image of the first sample image pair, the depth value corresponding to the current pixel coordinates of each voxel.
  • Step S705 according to the current camera coordinates of each voxel and the depth value corresponding to the current pixel coordinates of each voxel, update the initial world coordinates corresponding to the voxel.
  • For example, the electronic device can obtain the historical distance from each voxel to the surface of the object; input the Z-axis coordinate value of the current camera coordinates of each voxel, the depth value corresponding to the current pixel coordinates of each voxel, and the historical distance from each voxel to the surface of the object into the distance model corresponding to the voxel, so as to update the historical distance and obtain the target distance; and update the target distance from each voxel to the surface of the object to the Z-axis coordinate value in the initial world coordinates corresponding to the voxel, thereby realizing the update of the initial world coordinates corresponding to the voxel.
  • the distance model corresponding to the voxel is as follows:
  • In the distance model, W_t represents the weight of the voxel at the current time t; W_{t-1} represents the weight of the voxel at the previous time t-1; maxweight is the maximum weight among all voxels at the previous time t-1; D_t(u, v) represents the depth value corresponding to the current pixel coordinates of the voxel; z_c represents the Z-axis coordinate value of the current camera coordinates of the voxel; max truncation and min truncation represent the maximum and minimum of the truncation range, respectively; D_{t-1} represents the distance from the voxel to the surface of the object determined at the previous time t-1 (that is, an example of the historical distance); and D_t is the target distance from the voxel to the surface of the object that is to be found.
  • That is, the Z-axis coordinate value z_c of the current camera coordinates of the voxel, the depth value D_t(u, v) corresponding to the current pixel coordinates of the voxel, and the historical distance from the voxel to the surface of the object are input into the distance model shown in formula (6), so that the target distance from the voxel to the surface of the object can be obtained. Because the historical distance from the voxel to the surface of the object is taken into account, the updated world coordinates of the voxel are smoother, so the final target world coordinates of the voxel are more accurate and the positioning accuracy can be improved in the positioning stage.
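  • A minimal sketch of this distance-model update for a single voxel, patterned on the variable definitions given for formula (6); the per-frame weight increment and the exact truncation handling are assumptions, since the formula itself is not reproduced in this text:

```python
import numpy as np

def update_voxel(D_prev, W_prev, depth, z_c,
                 min_trunc=-0.1, max_trunc=0.1, max_weight=1.0):
    """Update one voxel's distance to the object surface (TSDF-style fusion).

    D_prev: historical distance from the voxel to the surface (D_{t-1}).
    W_prev: historical weight of the voxel (W_{t-1}).
    depth:  depth value sampled at the voxel's current pixel coordinates.
    z_c:    Z-axis value of the voxel's current camera coordinates.
    Returns the target distance D_t and the updated weight W_t.
    """
    d_t = np.clip(depth - z_c, min_trunc, max_trunc)   # truncated signed distance
    w_t = 1.0                                           # assumed per-frame weight
    D_t = (W_prev * D_prev + w_t * d_t) / (W_prev + w_t)
    W_t = min(W_prev + w_t, max_weight)
    return D_t, W_t
```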
  • Step S706: Continue to update the current world coordinates of each voxel according to the first sample image pair and the third sample image pair acquired by the image acquisition module at the next moment; when the sample image acquisition ends, use the current world coordinates of each voxel as the target world coordinates.
  • That is, the electronic device continues to update the current world coordinates of each voxel by performing steps similar to steps S703 to S705.
  • Step S707 Construct the point cloud map according to the target world coordinates of each voxel.
  • the embodiment of the present application further provides a point cloud map construction process, which may include the following steps S801 to S811:
  • Step S801 quantize the size of the specific physical space to obtain the initial world coordinates of multiple voxels
  • Step S802 controlling the image acquisition module to acquire a sample image pair according to a preset frame rate
  • Step S803 Determine the current transformation relationship between the camera coordinate system and the world coordinate system according to the first sample image pair collected at the current moment by the image collection module and the second sample image pair collected at the historical moment;
  • Step S804 Convert the initial world coordinates of each voxel to current camera coordinates according to the current transformation relationship
  • Step S805 Convert the current camera coordinates of each voxel to current pixel coordinates according to the internal parameter matrix of the image acquisition module;
  • Step S806 Obtain a depth value corresponding to the current pixel coordinate of each voxel from the depth sample image of the first sample image pair;
  • Step S807 Obtain the historical distance from each voxel to the surface of the object
  • Step S808: Input the Z-axis coordinate value of the current camera coordinates of each voxel, the depth value corresponding to the current pixel coordinates of each voxel, and the historical distance from each voxel to the surface of the object into the distance model corresponding to the voxel, so as to update the historical distance and obtain the target distance;
  • Step S809 updating the target distance of each voxel to the Z-axis coordinate value in the initial world coordinates corresponding to the voxel, so as to realize the update of the initial world coordinates corresponding to the voxel;
  • Step S810: Continue to update the current world coordinates of each voxel according to the first sample image pair and the third sample image pair acquired by the image acquisition module at the next moment; when the sample image acquisition ends, use the current world coordinates of each voxel as the target world coordinates.
  • That is, the electronic device continues to update the current world coordinates of each voxel by performing steps similar to steps S803 to S809.
  • Step S811 construct the point cloud map according to the target world coordinates of each voxel.
  • an indoor environment map can be established to help users quickly locate their own location and surrounding environment.
  • This solution matches the background with a pre-determined indoor map of the building, determines the corresponding position of the background indoors, and then confirms the position of the person indoors according to the position of the background.
  • the image recognition method is used to identify fixed-position objects in the background of the image, and according to the relative position relationship of the fixed-position objects, the position of the person at a certain moment is determined.
  • The core technical points of this solution are: 1. building an indoor environment map through visual images; 2. image matching; 3. recognizing people and objects in the image.
  • However, this technology has the following problems: 1. It only considers the two-dimensional image features of the visual image, cannot obtain the posture information of the person after positioning, and the positioning accuracy is low. 2. It can locate persons or fixed objects in the image, but only on the premise that there are recognizable persons or fixed objects in the image; otherwise the positioning fails, so the positioning reliability is low. 3. It cannot cope with the positioning needs of scenes with changing light, such as positioning under different conditions during day and night, so the positioning robustness is poor.
  • the embodiments of the present application implement an indoor environment reconstruction and positioning technology based on dense point clouds, which can help users create indoor maps in the form of dense point clouds (ie, an example of point cloud maps), and locate the user's location in real time.
  • This solution can extract image features for visual tracking and motion estimation for indoor scenes, and build dense maps; the positioning process does not depend on external base station equipment, with high positioning accuracy and strong robustness.
  • the program consists of two main parts: building a map and visual positioning.
  • The map construction part mainly collects RGB image information through the monocular camera in the image acquisition module and extracts image features for visual tracking, and collects depth information through the three-dimensional vision sensor in the image acquisition module (such as a TOF or structured-light sensor) to construct a dense point cloud map (that is, an example of a point cloud map). The specific technical steps include at least the following steps S11 to S17:
  • Step S11 using a monocular camera to collect RGB images at a fixed frame rate
  • Step S12 using a three-dimensional vision sensor to collect depth images at a fixed frame rate
  • Step S13 aligning the RGB image and the depth image, including time stamp alignment and pixel alignment;
  • Step S14 extracting the feature information in the RGB image and the depth information in the depth image in real time during the acquisition process to perform visual tracking and motion estimation on the image acquisition module, and determine the current transformation relationship of the camera coordinate system with respect to the world coordinate system;
  • Step S15 Obtain a dense point cloud through the depth image and the internal parameter matrix of the camera, and the dense point cloud includes the camera coordinates of each pixel;
  • the so-called dense point cloud is relative to the sparse point cloud.
  • the number of sampling points in the dense point cloud is far greater than the number of sampling points in the sparse point cloud.
  • Step S16: using the Truncated Signed Distance Function (TSDF) algorithm to fuse the dense point cloud in the form of voxels;
  • step S17 the fused dense point cloud is stored, and serialized and stored locally as a dense point cloud map.
  • the depth image is also called the distance image, which refers to the image in which the distance from the image acquisition module to each point in the shooting scene is used as the pixel value.
  • the depth image intuitively reflects the geometric shape of the visible surface of the object.
  • each pixel represents the distance from the object at the specific coordinate to the camera plane in the field of view of the 3D vision sensor.
  • Therefore, the pixel point (u, v) can be converted to camera coordinates (x_c, y_c, z_c) through the following formula (7):
  • (u_0, v_0) are the pixel coordinates of the center point of the image, and z_c represents the Z-axis value of the camera coordinates, that is, the depth value corresponding to the pixel.
  • The Z axis of the camera coordinate system is the optical axis of the lens, and the depth value of the pixel point (u, v) is the Z-axis coordinate value z_c of the camera coordinates of the pixel point.
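  • A minimal sketch of computing a point cloud from a depth image and the camera's internal parameter matrix, as in steps S15 and S23, assuming the standard pinhole model of formula (7) with intrinsics f_x, f_y, u_0, v_0 (symbols assumed, not taken from the text):

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, u0, v0):
    """Back-project a depth image (H x W, in metres) into camera coordinates.

    Returns an (N, 3) array of (x_c, y_c, z_c) for all pixels with depth > 0.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z_c = depth
    x_c = (u - u0) * z_c / fx
    y_c = (v - v0) * z_c / fy
    pts = np.stack([x_c, y_c, z_c], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]
```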
  • In some embodiments, the foregoing step S16 may include the following steps S161 to S164. Step S161: First obtain the coordinates V_g(x, y, z) of the voxel in the global coordinate system (that is, the world coordinates of the voxel), and then convert them from global coordinates to camera coordinates V(x, y, z) through the transformation matrix obtained from motion tracking (that is, the current transformation relationship output by step S14);
  • Step S162 Convert camera coordinates V (x, y, z) into image coordinates according to the camera's internal parameter matrix to obtain an image coordinate (u, v);
  • Step S163: If the depth value D(u, v) of the depth image of the l-th frame at the image coordinates (u, v) is not 0, compare D(u, v) with the Z value z of the voxel camera coordinates V(x, y, z). If D(u, v) < z, the voxel is farther from the camera than the observed surface and lies inside the fusion surface; otherwise, the voxel is closer to the camera and lies outside the fusion surface;
  • Step S164: The distance D_l and the weight value W_l of this voxel are updated according to the result of step S163; the update formula is shown in the following equation (8):
  • In equation (8), W_l(x, y, z) is the weight of the voxel in the global data cube for the current frame; W_{l-1}(x, y, z) is the weight of the voxel in the global data cube for the previous frame; maxweight is the maximum weight among all voxels in the global data cube for the previous frame, and can be set to 1; D_l(x, y, z) is the distance from the voxel in the current global data cube to the surface of the object; D_{l-1}(x, y, z) is the distance from the voxel in the global data cube of the previous frame to the surface of the object; d_l(x, y, z) is the distance from the voxel in the global data cube to the surface of the object calculated according to the depth data of the current frame; z represents the Z-axis coordinate of the voxel in the camera coordinate system; D_l(u, v) represents the depth value of the current frame depth image at the pixel point (u, v); and [min truncation, max truncation] is the truncation range, which affects the fineness of the reconstruction result.
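  • Equation (8) is shown only as an image in the original document. Based on the variable definitions above and the usual TSDF weighted-average update, a plausible reconstruction (with w_l an assumed per-frame weight, often 1) is:

```latex
d_l(x,y,z) = \operatorname{clamp}\bigl(D_l(u,v) - z,\ \text{min truncation},\ \text{max truncation}\bigr)
D_l(x,y,z) = \frac{W_{l-1}(x,y,z)\, D_{l-1}(x,y,z) + w_l\, d_l(x,y,z)}{W_{l-1}(x,y,z) + w_l}
W_l(x,y,z) = \min\bigl(W_{l-1}(x,y,z) + w_l,\ \text{maxweight}\bigr)
```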
  • In this way, a dense point cloud map (that is, an example of a point cloud map) can be constructed based on the dense point cloud; the dense point cloud map is stored locally in a binary format, and it will be loaded and used during the visual positioning process.
  • The visual positioning part collects depth information with a three-dimensional vision sensor and converts it into a point cloud, and then matches it against the dense point cloud map through the Iterative Closest Point (ICP) algorithm to obtain the current pose of the camera in the map. The specific technical steps include at least the following steps S21 to S24:
  • Step S21 load the constructed dense point cloud map
  • Step S22 using a three-dimensional vision sensor to collect a depth image to obtain a depth image to be processed
  • Step S23 Obtain a current point cloud through the depth image and the internal parameter matrix of the camera, and the current point cloud includes the camera coordinates of the pixels of the depth image;
  • step S24 the current point cloud and the dense point cloud map are matched by the ICP algorithm to obtain the accurate pose of the current camera in the map.
  • For the method of obtaining the current point cloud through the depth image and the camera's internal parameter matrix in step S23, reference may be made to step S15.
  • The ICP algorithm is essentially an optimal registration method based on the least squares method. The algorithm repeatedly selects corresponding point pairs and calculates the optimal rigid-body transformation until the convergence accuracy requirements for correct registration are met.
  • The basic principle of the ICP algorithm is: for the target point cloud P and the source point cloud Q to be matched, find the nearest point pairs (p_i, q_i) according to certain constraints, and then calculate the optimal rotation R and translation T so that the error function is minimized.
  • The error function E(R, T) is given by formula (9):
  • In formula (9), n is the number of nearest point pairs, p_i is a point in the target point cloud P, q_i is the closest point in the source point cloud Q corresponding to p_i, R is the rotation matrix, and T is the translation vector.
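  • Formula (9) is not reproduced here; from the definitions above, the standard ICP error function is presumably:

```latex
E(R, T) = \frac{1}{n}\sum_{i=1}^{n}\bigl\lVert q_i - (R\,p_i + T) \bigr\rVert^{2}
```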
  • In some embodiments, step S24 may include the following steps S241 to S246. Step S241: Take a point set p_i ∈ P from the current point cloud P;
  • Step S242: Find the corresponding point set q_i ∈ Q in the dense point cloud map Q such that ||q_i - p_i|| is minimized;
  • Step S243: Calculate the rotation matrix R and the translation matrix T so that the error function is minimized;
  • Step S244: Apply the rotation matrix R and the translation matrix T obtained in step S243 to p_i to obtain the new point set p'_i;
  • Step S245: Calculate the average distance d between p'_i and the corresponding point set q_i;
  • Step S246: If d is less than the given threshold d_TH or the number of iterations is greater than the preset number of iterations, stop the iterative calculation and output the current rotation matrix R and translation matrix T; otherwise, return to step S242.
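  • A minimal sketch of the ICP loop of steps S241 to S246; solve_rigid_transform is the hypothetical helper sketched earlier for formula (3), and d_th and max_iters are assumed parameters:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(current_pts, map_pts, d_th=0.05, max_iters=50):
    """Register the current point cloud P against the dense point cloud map Q.

    current_pts: (N, 3) camera-coordinate points from the depth image.
    map_pts:     (M, 3) target world coordinates of the map voxels.
    Returns the accumulated rotation R and translation T mapping P into Q.
    """
    tree = cKDTree(map_pts)
    R_total, T_total = np.eye(3), np.zeros(3)
    p = current_pts.copy()
    for _ in range(max_iters):
        _, idx = tree.query(p, k=1)                  # S242: nearest points q_i
        q = map_pts[idx]
        R, T = solve_rigid_transform(p, q)           # S243: helper from earlier sketch
        p = p @ R.T + T                              # S244: apply R, T to p_i
        R_total, T_total = R @ R_total, R @ T_total + T
        d = np.mean(np.linalg.norm(p - q, axis=1))   # S245: average distance
        if d < d_th:                                 # S246: stop when converged
            break
    return R_total, T_total
```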
  • In this way, positioning is achieved in the predefined dense point cloud map, and the position and posture of the image acquisition module in the map coordinate system are obtained.
  • the positioning result has high accuracy, does not need to rely on external base station equipment, has strong resistance to environmental interference and strong robustness.
  • The positioning method provided by the embodiments of the present application can obtain the following technical effects: 1. Depth information can be obtained by using a three-dimensional vision sensor and used to realize map construction and positioning, so the positioning accuracy is not affected by illumination changes and the method is highly robust. 2. Both position and attitude can be provided in the positioning result, which improves the positioning accuracy compared with other indoor positioning methods. 3. The positioning method does not need to introduce algorithms with higher error rates such as object recognition, so the positioning success rate is high and the robustness is strong. 4. The constructed map is a dense point cloud map that does not need to store the RGB information of the environment, so the privacy of the map is better.
  • a three-dimensional vision sensor is used to collect depth information to construct a dense point cloud map, and a high-precision and high-robust point cloud matching algorithm is combined to perform indoor environment positioning.
  • For map construction, the embodiment of the present application collects depth image information by using a three-dimensional vision sensor and stores it in the form of a dense point cloud as a dense point cloud map.
  • For visual positioning, the embodiment of the present application uses the ICP algorithm to match the current point cloud with the dense point cloud of the map, and accurately calculates the current position and posture of the user. The combination of the two forms a set of high-precision and high-robustness indoor positioning methods.
  • Based on the foregoing embodiments, an embodiment of the present application provides a positioning device, which includes all the modules included and all the units included in each module; they can be implemented by a processor in an electronic device, and of course can also be implemented by specific logic circuits.
  • In implementation, the processor can be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or the like.
  • FIG. 4A is a schematic diagram of the composition structure of a positioning device according to an embodiment of the application.
  • the device 400 includes a coordinate conversion module 401 and a positioning module 402, where
  • the coordinate conversion module 401 is configured to convert the pixel coordinates of the multiple pixels into camera coordinates according to the internal parameter matrix of the image acquisition module and the depth values of multiple pixels in the depth image collected by the image acquisition module;
  • the positioning module 402 is configured to match the camera coordinates of each pixel with the target world coordinates of multiple voxels in a pre-built point cloud map to obtain a positioning result of the image acquisition module;
  • the target world coordinates of each voxel are obtained by updating the initial world coordinates of each voxel according to a plurality of sample image pairs, and the sample image pairs include a two-dimensional sample image and a depth sample image.
  • In some embodiments, the device 400 further includes a quantization processing module 403, a coordinate update module 404, and a map construction module 405. The quantization processing module 403 is configured to quantize the size of a specific physical space to obtain the initial world coordinates of a plurality of voxels; the coordinate update module 404 is configured to update the initial world coordinates of each voxel according to a plurality of sample image pairs collected by the image acquisition module in the specific physical space, to obtain the target world coordinates of each voxel, where the sample image pair includes a two-dimensional sample image and a depth sample image; and the map construction module 405 is configured to construct the point cloud map according to the target world coordinates of each voxel.
  • In some embodiments, the coordinate update module 404 includes: a control sub-module configured to control the image acquisition module to collect sample image pairs at a preset frame rate; and a coordinate update sub-module configured to update the initial world coordinates of each voxel according to the first sample image pair collected by the image acquisition module at the current moment and the second sample image pair collected at a historical moment, and to continue to update the current world coordinates of each voxel according to the first sample image pair and the third sample image pair collected by the image acquisition module at the next moment, until the end of the sample image collection, and use the current world coordinates of the voxel as the target world coordinates.
  • In some embodiments, the coordinate update sub-module includes: a camera coordinate determination unit configured to determine the current camera coordinates of each voxel according to the first sample image pair and the second sample image pair; a depth value acquisition unit configured to acquire, from the depth sample image of the first sample image pair, the depth value corresponding to the current pixel coordinates of each voxel; and a coordinate update unit configured to update the initial world coordinates corresponding to each voxel according to the current camera coordinates of the voxel and the depth value corresponding to the current pixel coordinates of the voxel, and to continue to update the current world coordinates of each voxel according to the first sample image pair and the third sample image pair acquired by the image acquisition module at the next moment, until the end of the sample image acquisition, and use the current world coordinates of the voxel as the target world coordinates.
  • the coordinate update unit is configured to: obtain the historical distance from each voxel to the object surface; input the Z-axis coordinate value of the current camera coordinates of each voxel, the depth value corresponding to the current pixel coordinates of the voxel, and the historical distance from the voxel to the object surface into the distance model corresponding to the voxel, so as to update the historical distance and obtain a target distance; update the Z-axis coordinate value in the initial world coordinates corresponding to each voxel to the target distance from the voxel to the object surface, thereby updating the initial world coordinates corresponding to the voxel; and continue to update the current world coordinates of each voxel according to the first sample image pair and the third sample image pair acquired by the image acquisition module at the next moment, until sample image acquisition ends, and then use the current world coordinates of the voxel as the target world coordinates.
  • the camera coordinate determination unit is configured to: determine the current transformation relationship of the camera coordinate system relative to the world coordinate system according to the first sample image pair and the second sample image pair; and convert the initial world coordinates of each voxel into current camera coordinates according to the current transformation relationship.
  • the depth value acquisition unit is configured to: convert the current camera coordinates of each voxel into current pixel coordinates according to the internal parameter matrix of the image acquisition module; and acquire, from the depth sample image of the first sample image pair, the depth value corresponding to the current pixel coordinates of each voxel.
  • the positioning module 402 includes: an iteration sub-module configured to match the camera coordinates of each pixel with the target world coordinates of the multiple voxels according to an iterative strategy, to obtain a target transformation relationship of the camera coordinate system relative to the world coordinate system; and a positioning sub-module configured to determine the positioning result of the image acquisition module according to the target transformation relationship.
  • the iteration sub-module includes: a selection unit configured to select, from the plurality of voxels, an initial target voxel matching each pixel; and a determination unit configured to: determine a first transformation relationship of the camera coordinate system with respect to the world coordinate system according to the camera coordinates of each pixel and the target world coordinates of the corresponding initial target voxel; determine a matching error according to the first transformation relationship, the camera coordinates of each pixel, and the target world coordinates of the corresponding initial target voxel; if the matching error is greater than a preset threshold, reselect the initial target voxel and re-determine the matching error; and determine, as the target transformation relationship, the first transformation relationship obtained when the re-determined matching error is less than or equal to the preset threshold.
  • the selection unit is configured to: obtain a second transformation relationship of the camera coordinate system with respect to the world coordinate system; determine the first world coordinates of the j-th pixel according to the second transformation relationship and the camera coordinates of the j-th pixel, where j is an integer greater than 0; and match the first world coordinates of each pixel with the target world coordinates of the multiple voxels to obtain the corresponding initial target voxel.
  • the determination unit is configured to: determine the second world coordinates of the j-th pixel according to the first transformation relationship and the camera coordinates of the j-th pixel, where j is an integer greater than 0; and determine the matching error according to the second world coordinates of each pixel and the target world coordinates of the corresponding initial target voxel.
  • the determination unit is configured to: determine the distance between the second world coordinates of each pixel and the target world coordinates of the corresponding initial target voxel; and determine the matching error according to each of the distances.
  • the selection unit is configured to: if the matching error is greater than a preset threshold, use the first transformation relationship as the second transformation relationship, and reselect an initial target voxel.
  • the technical solutions of the embodiments of the present application, in essence or the parts contributing to the related art, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions for enabling an electronic device (which may be a mobile phone, a tablet computer, a notebook computer, a desktop computer, a server, a robot, a drone, etc.) to execute all or part of the method described in each embodiment of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, a read only memory (Read Only Memory, ROM), a magnetic disk, an optical disk, and other media that can store program code. In this way, the embodiments of the present application are not limited to any specific combination of hardware and software.
  • FIG. 5 is a schematic diagram of a hardware entity of the electronic device according to an embodiment of the application.
  • the hardware entity of the electronic device 500 includes a memory 501 and a processor 502.
  • the memory 501 stores a computer program that can be run on the processor 502, and the processor 502 implements the steps of the positioning method provided in the foregoing embodiments when executing the program.
  • the memory 501 is configured to store instructions and applications executable by the processor 502, and may also cache data to be processed or already processed by the processor 502 and each module of the electronic device 500 (for example, image data, audio data, voice communication data, and video communication data); the memory 501 may be implemented by a flash memory (FLASH) or a random access memory (Random Access Memory, RAM).
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the positioning method provided in the foregoing embodiment are implemented.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
  • the units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • the functional units in the embodiments of the present application may all be integrated into one processing unit, each unit may serve as a separate unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the foregoing program can be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes various media that can store program code, such as a removable storage device, a read only memory (Read Only Memory, ROM), a magnetic disk, or an optical disk.
  • if the above-mentioned integrated unit of the present application is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions for enabling an electronic device (which may be a mobile phone, a tablet computer, a notebook computer, a desktop computer, a server, a robot, a drone, etc.) to execute all or part of the method described in each embodiment of the present application.
  • the aforementioned storage media include: removable storage devices, ROMs, magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Processing (AREA)

Abstract

本申请实施例公开了定位方法及装置、设备、存储介质,其中,所述方法包括:根据图像采集模组的内参矩阵和所述图像采集模组采集的深度图像中多个像素点的深度值,将所述多个像素点的像素坐标转换为相机坐标;将每一所述像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,得出所述图像采集模组的定位结果;其中,每一所述体素的目标世界坐标是根据多个样本图像对,更新每一所述体素的初始世界坐标而获得的,所述样本图像对包括二维样本图像和深度样本图像。

Description

定位方法及装置、设备、存储介质
相关申请的交叉引用
本申请基于申请号为201910921654.0、申请日为2019年09月27日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此以全文引入的方式引入本申请。
技术领域
本申请实施例涉及电子技术,涉及但不限于定位方法及装置、设备、存储介质。
背景技术
基于图像信息进行定位的相关技术中,目前主要是通过识别摄像头模组采集到的图像中的人物和固定物体,确定人物的位置。该方案将图像中该固定物体与预先构建的室内地图进行匹配,从而确定该固定物体在室内的对应位置;然后,根据该固定物体的位置确定该人物在室内的位置;其中,确定该人物的位置的整体思路为:通过图像识别的方法识别图像中的固定物体,并根据图像中该固定物体与人物之间的相对位置关系和该固定物体在室内的位置,确定该人物的位置。
然而,这种定位方法主要根据图像中的人物和固定物体之间的相对位置关系进行定位,这样,在实现定位时就要求图像中必须有可以识别出的人物和固定物体,否则定位就会失效,所以这种定位方法的鲁棒性较差。
发明内容
有鉴于此,本申请实施例提供的定位方法及装置、设备、存储介质,定位方法不依赖于图像中必须有固定物体和待定位对象,因此具有较好的鲁棒性。本申请实施例的技术方案是这样实现的:
本申请实施例提供的定位方法,包括:根据图像采集模组的内参矩阵和所述图像采集模组采集的深度图像中多个像素点的深度值,将所述多个像素点的像素坐标转换为相机坐标;将每一所述像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,得出所述图像采集模组的定位结果;其中,每一所述体素的目标世界坐标是根据多个样本图像对,更新每一所述体素的初始世界坐标而获得的,所述样本图像对包括二维样本图像和深度样本图像。
本申请实施例提供的定位装置,包括:坐标转换模块,配置为根据图像采集模组的内参矩阵和所述图像采集模组采集的深度图像中多个像素点的深度值,将所述多个像素点的像素坐标转换为相机坐标;定位模块,配置为将每一所述像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,得出所述图像采集模组的定位结果;其中,每一所述体素的目标世界坐标是根据多个样本图像对,更新每一所述体素的初始世界坐标而获得的,所述样本图像对包括二维样本图像和深度样本图像。
本申请实施例提供的电子设备,包括存储器和处理器,所述存储器存储有可在处理器上运行的计算机程序,所述处理器执行所述程序时实现本申请实施例所述定位方法中的步骤。
本申请实施例提供的计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现本申请实施例所述定位方法中的步骤。
在本申请实施例中,根据图像采集模组采集的深度图像中多个像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,即可确定图像采集模组的定位结果;如此,在对携带图像采集模组的待定位对象进行定位时,其定位方法不依赖于所述深度图像中必须有固定物体和待定位对象,所以能够获得较好的鲁棒性。
附图说明
此处的附图被并入说明书中并构成本说明书的一部分,这些附图示出了符合本申请的实施例,并于说明书一起用于说明本申请的技术方案。
图1为本申请实施例定位方法的实现流程示意图;
图2为本申请实施例点云地图的构建过程示意图;
图3为本申请实施例对特定物理空间进行量化的实现示意图;
图4A为本申请实施例定位装置的组成结构示意图;
图4B为本申请实施例另一定位装置的组成结构示意图;
图5为本申请实施例电子设备的一种硬件实体示意图。
具体实施方式
为使本申请实施例的目的、技术方案和优点更加清楚,下面将结合本申请实施例中的附图,对本申请的具体技术方案做进一步详细描述。以下实施例用于说明本申请,但不用来限制本申请的范围。
除非另有定义,本文所使用的所有的技术和科学术语与属于本申请的技术领域的技术人员通常理解的含义相同。本文中所使用的术语只是为了描述本申请实施例的目的,不是旨在限制本申请。
在以下的描述中,涉及到“一些实施例”,其描述了所有可能实施例的子集,但是可以理解,“一些实施例”可以是所有可能实施例的相同子集或不同子集,并且可以在不冲突的情况下相互结合。
需要指出,本申请实施例所涉及的术语“第一\第二\第三”用以区别不同的对象,不代表针对对象的特定排序,可以理解地,“第一\第二\第三”在允许的情况下可以互换特定的顺序或先后次序,以使这里描述的本申请实施例能够以除了在这里图示或描述的以外的顺序实施。
本申请实施例提供一种定位方法,所述方法可以应用于电子设备,所述电子设备可以是手机、平板电脑、笔记本电脑、台式计算机、服务器、机器人或无人机等具有信息处理能力的设备。所述定位方法所实现的功能可以通过所述电子设备中的处理器调用程序代码来实现,当然程序代码可以保存在计算机存储介质中,可见,所述电子设备至少包括处理器和存储介质。
图1为本申请实施例定位方法的实现流程示意图,如图1所示,所述方法可以包括以下步骤S101至步骤S102:
步骤S101,根据图像采集模组的内参矩阵和所述图像采集模组采集的深度图像中多个像素点的深度值,将所述多个像素点的像素坐标转换为相机坐标。
图像采集模组一般包括第一摄像头模组和第二摄像头模组;其中,第一摄像头模组用于采集场景的二维图像,例如通过单目摄像头采集场景的红、绿、蓝(Red、Green、Blue,RGB)图像;第二摄像头模组用于采集拍摄场景的深度图像,第二摄像头模组可以为双目相机、结构光相机或者飞行时间(Time-of-Flight,TOF)相机等三维视觉传感器。
需要说明的是，电子设备可以包括图像采集模组，也就是说，图像采集模组被安装在电子设备中。例如，电子设备为具有第一摄像头模组和第二摄像头模组的智能手机；当然，在一些实施例中，电子设备也可以不包括图像采集模组，图像采集模组可以将自身的内参矩阵和采集的深度图像发送给电子设备。
在一些实施例中，如果已知图像采集模组的焦距f和图像中心点的像素坐标(u_0, v_0)，如下式(1)所示，图像采集模组的内参矩阵F为：

$$F=\begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \tag{1}$$

式中，f_x表示焦距f在相机坐标系的x轴上的焦距分量；f_y表示焦距f在相机坐标系的y轴上的焦距分量。
深度图像又被称为距离图像，是指将图像采集模组到场景中物体表面各点的距离作为像素值的图像。也就是说，深度图像中像素点的像素值为在三维视觉传感器的视野中，物体表面上的某个点到该传感器镜面的距离，一般将该像素值称为深度值。因此，在已知像素点j的深度值z_c的情况下，可以根据如下公式(2)，将像素点j的像素坐标(u, v)转换为相机坐标(x_c, y_c, z_c)：

$$x_c=\frac{(u-u_0)\,z_c}{f_x},\qquad y_c=\frac{(v-v_0)\,z_c}{f_y} \tag{2}$$

需要说明的是，由于相机坐标系的Z轴为镜头的光轴，因此像素点j的深度值即为该像素点的相机坐标的Z轴坐标值z_c。
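下面给出一个按公式(2)将单个像素点的像素坐标与深度值转换为相机坐标的Python草图（函数与参数命名均为示例性假设，并非本申请限定的实现方式）：

```python
import numpy as np

def pixel_to_camera(u, v, depth, fx, fy, u0, v0):
    """按照公式(2)的思路，将像素坐标(u, v)和深度值转换为相机坐标(x_c, y_c, z_c)。"""
    z_c = depth                     # 深度值即为相机坐标的Z轴分量
    x_c = (u - u0) * z_c / fx       # X轴分量
    y_c = (v - v0) * z_c / fy       # Y轴分量
    return np.array([x_c, y_c, z_c])

# 使用示例（数值仅为假设）：
# p_cam = pixel_to_camera(320, 240, depth=1.5, fx=525.0, fy=525.0, u0=319.5, v0=239.5)
```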
步骤S102,将每一所述像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,得出所述图像采集模组的定位结果。
一般来说,体素的目标世界坐标为体素的某一个点(例如中心点)的世界坐标。在点云地图中,每一体素的目标世界坐标是电子设备根据多个包括二维样本图像和深度样本图像的样本图像对,更新每一体素的初始世界坐标而获得的。点云地图的构建过程可以包括如下实施例中的步骤S501至步骤S503。
在一些实施例中,定位结果包括图像采集模组当前的世界坐标;或者,定位结果包括图像采集模组当前的世界坐标和朝向。也就是,定位结果包括携带图像采集模组的用户(或设备)的位置,或者包括该用户(或设备)的位置和姿态。世界坐标可以是二维坐标,也可以是三维坐标。可以理解地,通过步骤S102得出图像采集模组的朝向,如此使得定位方法能够适用于更多的应用场景。例如,根据机器人的当前朝向,指示机器人执行下一个动作。再如,在导航应用中,已知用户的朝向,能够更加精确地指导用户向哪个方向行走/行驶。又如,在无人驾驶应用中,已知车辆的朝向,能够更准确地控制车辆向哪个方向行驶。
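作为一个示意，假设目标变换关系按后文式(5)的形式 p_c = R·p_w + T（世界坐标到相机坐标）给出，则可按如下Python草图从(R, T)中推出图像采集模组在世界坐标系下的位置与朝向（函数与变量命名均为假设）：

```python
import numpy as np

def pose_from_transform(R, T):
    """由世界->相机的变换(R, T)推出图像采集模组在世界坐标系下的位置和朝向（示意）。"""
    position = -R.T @ T                           # 相机光心在世界坐标系下的坐标
    R_wc = R.T                                    # 相机坐标轴在世界坐标系下的朝向（旋转矩阵）
    forward = R_wc @ np.array([0.0, 0.0, 1.0])    # 光轴(Z轴)在世界坐标系下的方向
    yaw = np.arctan2(forward[1], forward[0])      # 以弧度表示的水平朝向角
    return position, yaw
```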
在本申请实施例中,根据图像采集模组采集的深度图像中多个像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,即可确定图像采集模组的定位结果;如此,在对所述图像采集模组进行定位时,其定位方法不依赖于所述深度图像中必须有固定物体和待定位对象,所以能够获得较好的鲁棒性。
本申请实施例再提供一种定位方法,所述方法可以包括以下步骤S201至步骤S203:
步骤S201，根据图像采集模组的内参矩阵和所述图像采集模组采集的深度图像中多个像素点的深度值，将所述多个像素点的像素坐标转换为相机坐标；
步骤S202,根据迭代策略,将每一所述像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,得出相机坐标系相对于世界坐标系的目标变换关系。
在本申请实施例中,点云地图中包括体素的目标世界坐标,但不包括体素的图像特征;如此,可以大大降低点云地图的数据量,从而节约点云地图在电子设备中的存储空间。
在点云地图不包括体素的图像特征的情况下,即,在已知图像采集模组采集的深度图像中每一像素点的相机坐标和点云地图中多个体素的目标世界坐标的前提下,通过迭代策略,尝试寻找相机坐标系相对于世界坐标系的目标变换关系,即可实现对图像采集模组的定位。针对目标变换关系的寻找,例如,通过如下实施例步骤S302至步骤S306、或者步骤S402至步骤S409,迭代寻找与每一像素点最邻近(即最匹配)的体素,从而得到目标变换关系。
步骤S203,根据所述目标变换关系,确定所述图像采集模组的定位结果。
在本申请实施例提供的定位方法中，无需图像采集模组采集二维图像，也无需从二维图像中提取每一像素点的图像特征，而是根据内参矩阵和深度图像中每一像素点的深度值，将与所述像素点对应的像素坐标转换为相机坐标；然后，通过迭代策略，将每一像素点的相机坐标与多个体素的目标世界坐标进行匹配，即可实现对图像采集模组的精确定位；如此，可以降低定位方法的实现复杂度，从而更加高效地实现定位，满足定位的实时性需求。
本申请实施例再提供一种定位方法,所述方法可以包括以下步骤S301至步骤S307:
步骤S301,根据图像采集模组的内参矩阵和所述图像采集模组采集的深度图像中多个像素点的深度值,将所述多个像素点的像素坐标转换为相机坐标。
步骤S302,从预先构建的点云地图中的多个体素中选取与每一所述像素点匹配的初始目标体素。
在一些实施例中,电子设备可以设置相机坐标系相对于世界坐标系的初始变换关系;然后,根据像素点的相机坐标和初始变换关系,将该像素点与所述多个体素进行匹配,从而从所述多个体素中选取与该像素点匹配的初始目标体素。在一些实施例中,可以通过如下实施例中的步骤S402至步骤S404,选取初始目标体素。
实际上,通过步骤S302,目的是为了选取与像素点可能匹配的体素,也就是说,选取的初始目标体素可能不是与像素点真正匹配的对象;因此,需要通过如下步骤S303至步骤S306,进一步确定初始目标体素是否是与像素点真正匹配的对象。
步骤S303,根据每一所述像素点的相机坐标和对应的初始目标体素的目标世界坐标,确定所述相机坐标系相对于所述世界坐标系的第一变换关系。
在一些实施例中，电子设备可以根据每一像素点的相机坐标和对应的初始目标体素的目标世界坐标，构建误差函数；然后，通过最小二乘法求解当前最优的第一变换关系。例如，包括n个像素点的相机坐标的集合表示为P={p_1, p_2, …, p_i, …, p_n}，像素点的相机坐标用p_i来表示，与所述n个像素点匹配的初始目标体素的目标世界坐标的集合表示为Q={q_1, q_2, …, q_i, …, q_n}，初始目标体素的目标世界坐标用q_i来表示，那么，可以列出如下式(3)：

$$E(R,T)=\frac{1}{n}\sum_{i=1}^{n}\left\|q_i-\left(R\,p_i+T\right)\right\|^2 \tag{3}$$

式中，E(R,T)为误差函数，R和T分别为待求解的第一变换关系中的旋转矩阵和平移向量。那么，可以通过最小二乘法求解式(3)中R和T的最优解。
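最小化式(3)的一种常见做法是基于奇异值分解(SVD)的闭式解。下面给出一个仅作示意的Python草图（这只是求解该最小二乘问题的一种常见方式，并非本申请限定的求解方法；函数与变量命名均为示例）：

```python
import numpy as np

def solve_rigid_transform(P, Q):
    """给定对应点集P(相机坐标, n x 3)与Q(目标世界坐标, n x 3)，
    用SVD闭式解求使式(3)误差最小的旋转R和平移T，满足 Q ≈ R @ P + T。"""
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p_mean).T @ (Q - q_mean)      # 3x3 互协方差矩阵
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:               # 处理反射情形，保证R为旋转矩阵
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    T = q_mean - R @ p_mean
    return R, T
```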
步骤S304,根据所述第一变换关系、每一所述像素点的相机坐标和对应的初始目标体素的目标世界坐标,确定匹配误差。
在一些实施例中，匹配误差指的是整体匹配误差，即所有像素点的匹配误差。在获得最优解，即第一变换关系之后，即可根据第一变换关系，将每一像素点的相机坐标转换为对应的第二世界坐标。如果在步骤S302中选取的初始目标体素和像素点在实际物理空间中表示的是同一个位置点或者是两个相近的位置点，那么像素点的第二世界坐标应该是与对应的初始目标体素的目标世界坐标相同或相近的。反之，如果两者表示的不是同一位置点，也不是两个相近的位置点，那么，像素点的第二世界坐标与对应的初始目标体素的目标世界坐标不同，也不相近。基于此，可以通过如下步骤S406和步骤S407确定匹配误差，从而基于匹配误差和预设阈值，确定初始目标体素是否是与像素点真正匹配的点，进而确定目标变换关系。
步骤S305,如果所述匹配误差大于预设阈值,返回步骤S302,重新选取初始目标体素,并重新确定匹配误差。
可以理解地,如果匹配误差大于预设阈值,说明当前选取的初始目标体素并不是与像素点匹配的体素,两者指代的不是物理空间中同一位置点或者相近的位置点。此时,还需要返回步骤S302,重新选取初始目标体素,并基于重新选取的初始目标体素,重新执行步骤S303至步骤S304,以重新确定匹配误差,直至重新确定的匹配误差小于所述预设阈值时,认为当前迭代中选取的初始目标体素是与像素点真正匹配的点,此时可以将当前迭代获得的第一变换关系确定为目标变换关系。
反之,在一些实施例中,如果匹配误差小于或等于预设阈值,则根据当前迭代获得的第二变换关系确定图像采集模组在点云地图中的定位结果。
步骤S306,将重新确定的匹配误差小于等于所述预设阈值时的第一变换关系,确定为目标变换关系;
步骤S307,根据所述目标变换关系,确定所述图像采集模组的定位结果。
本申请实施例再提供一种定位方法,所述方法可以包括以下步骤S401至步骤S410:
步骤S401,根据图像采集模组的内参矩阵和所述图像采集模组采集的深度图像中多个像素点的深度值,将所述多个像素点的像素坐标转换为相机坐标;
步骤S402,获取相机坐标系相对于世界坐标系的第二变换关系。
在一些实施例中,电子设备可以将第二变换关系中的旋转矩阵和平移向量设置一个初始值,也就是,设置第二变换关系的初始值。
步骤S403,根据所述第二变换关系和所述深度图像中第j个像素点的相机坐标,确定所述第j个像素点的第一世界坐标,j为大于0的整数;
步骤S404,将每一所述像素点的第一世界坐标与所述多个体素的目标世界坐标进行匹配,得出对应的初始目标体素。
在一些实施例中,电子设备可以确定像素点的第一世界坐标与每一体素的目标世界坐标之间的距离,然后将距离像素点最近的体素确定为初始目标体素,或者将距离小于或等于距离阈值的体素确定为初始目标体素。在一些实施例中,电子设备可以确定像素点的第一世界坐标与体素的目标世界坐标之间的欧氏距离,将该欧式距离作为像素点与体素之间的距离。
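作为示意，下面给出一个从多个体素中为每个像素点选取最近邻初始目标体素的Python草图（其中KD树仅是用于加速最近邻查找的一种假设实现，函数与参数命名均为示例）：

```python
import numpy as np
from scipy.spatial import cKDTree

def select_initial_target_voxels(points_world, voxel_coords, dist_threshold=None):
    """为每个像素点的第一世界坐标(points_world, n x 3)在体素目标世界坐标
    (voxel_coords, m x 3)中查找欧氏距离最近的体素，作为初始目标体素（示意）。"""
    tree = cKDTree(voxel_coords)               # 用KD树加速最近邻查找（假设的实现方式）
    dists, idx = tree.query(points_world, k=1)
    if dist_threshold is not None:             # 可选：仅保留距离小于等于阈值的匹配
        keep = dists <= dist_threshold
        return idx[keep], dists[keep], keep
    return idx, dists, np.ones(len(idx), dtype=bool)
```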
步骤S405,根据每一所述像素点的相机坐标和对应的初始目标体素的目标世界坐标,确定所述相机坐标系相对于所述世界坐标系的第一变换关系;
步骤S406,根据所述第一变换关系和所述深度图像中第j个像素点的相机坐标,确定所述第j个像素点的第二世界坐标,j为大于0的整数;
步骤S407,根据每一所述像素点的第二世界坐标和对应的初始目标体素的目标世界坐标,确定所述匹配误差。
在一些实施例中,电子设备可以先确定每一像素点的第二世界坐标与对应的初始目标体素的目标世界坐标之间的距离(例如欧式距离);然后,根据每一所述距离,确定所述匹配误差。
在一些实施例中，可以将多个像素点与匹配的初始目标体素之间的平均距离，确定为匹配误差。例如，包括n个像素点的第二世界坐标的集合表示为P′={p′_1, p′_2, …, p′_i, …, p′_n}，像素点的第二世界坐标用p′_i表示，与所述n个像素点匹配的初始目标体素的目标世界坐标的集合表示为Q={q_1, q_2, …, q_i, …, q_n}，初始目标体素的目标世界坐标用q_i表示，那么，通过如下公式(4)，可以求取所述匹配误差d：

$$d=\frac{1}{n}\sum_{i=1}^{n}\left\|p'_i-q_i\right\|^2 \tag{4}$$

式中，||p′_i−q_i||^2表示像素点与匹配的初始目标体素之间的欧式距离。
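按式(4)计算匹配误差的一个最小化示意如下（假设各输入均为n×3的numpy数组，命名为示例）：

```python
import numpy as np

def matching_error(P2, Q):
    """按公式(4)计算匹配误差：P2为各像素点的第二世界坐标(n x 3)，
    Q为对应初始目标体素的目标世界坐标(n x 3)。"""
    return float(np.mean(np.sum((P2 - Q) ** 2, axis=1)))
```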
步骤S408,如果所述匹配误差大于预设阈值,将所述第一变换关系作为所述第二变换关系,重新选取初始目标体素,并重新确定匹配误差,直至重新确定的匹配误差小于所述预设阈值为止,进入步骤409。
可以理解地,如果匹配误差大于预设阈值,说明获取的第二变换关系是不符合实际的。换句话说,得出的初始目标体素不是真正与像素点匹配的体素,此时,可以将当前确定的第一变换关系作为所述第二变换关系,然后重新执行步骤S403至步骤S407,直至匹配误差小于所述阈值为止,此时选取的初始目标体素才可能是与像素点匹配的体素,即二者对应的是物理空间中的同一位置点。
步骤S409,将重新确定的匹配误差小于等于所述预设阈值时的第一变换关系,确定为目标变换关系;
步骤S410,根据所述目标变换关系,确定所述图像采集模组的定位结果。
本申请实施例提供的定位方法依赖于预先构建的点云地图,已经被预先构建好的点云地图,通常被存储在电子设备中或者存储在其他电子设备(例如服务器)中,电子设备在实施所述定位方法时只需加载本地存储的点云地图或者向其他电子设备请求获取该地图即可;其中,点云地图的构建过程如图2所示,可以包括以下步骤S501至步骤S503:
步骤S501,对特定物理空间的尺寸进行量化处理,得到多个体素的初始世界坐标。
可以理解地，特定物理空间指的是点云地图所覆盖的物理场景，例如，特定物理空间为某栋大楼、大型机场、商场、某一城市等。体素实际上是该特定物理空间中的最小单位。如图3所示，将特定物理空间看作一个具有一定尺寸的立方体301，然后以体素302为单位，对该立方体进行网格划分，得到多个体素；以世界坐标系为参考坐标系，确定每一体素的初始世界坐标。举例来说，特定物理空间的尺寸为512×512×512 m³，体素的尺寸为1×1×1 m³，那么以1×1×1 m³大小的体素为单位，对512×512×512 m³大小的物理空间进行量化处理，可以得到512×512×512个体素的初始世界坐标。在一些实施例中，量化处理包括量化特定物理空间的尺寸和确定每一体素的初始世界坐标。
当然,在本申请实施例中,对于量化单位,也就是体素的大小不做限定。在实际应用中,可以根据工程需求来设计体素的大小。
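下面给出一个对特定物理空间进行量化、得到各体素初始世界坐标的Python草图（以体素中心点坐标作为其初始世界坐标仅为一种假设的约定，网格尺寸为示例值；上文512×512×512的示例在实际工程中通常分块处理）：

```python
import numpy as np

def quantize_space(space_size=(16.0, 16.0, 16.0), voxel_size=1.0):
    """将特定物理空间按给定体素尺寸进行网格划分，返回每个体素(中心点)的初始世界坐标。"""
    nx, ny, nz = (int(s / voxel_size) for s in space_size)
    xs = (np.arange(nx) + 0.5) * voxel_size    # 以体素中心作为其初始世界坐标
    ys = (np.arange(ny) + 0.5) * voxel_size
    zs = (np.arange(nz) + 0.5) * voxel_size
    grid = np.stack(np.meshgrid(xs, ys, zs, indexing="ij"), axis=-1)
    return grid.reshape(-1, 3)                 # 形如 (nx*ny*nz, 3) 的初始世界坐标
```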
步骤S502,根据所述图像采集模组在所述特定物理空间中采集的多个样本图像对,对每一所述体素的初始世界坐标进行更新,得到每一所述体素的目标世界坐标,所述样本图像对包括二维样本图像和深度样本图像。
二维样本图像可以是不包含深度信息的平面图像,例如所述二维样本图像为RGB图像,在一些实施例中,电子设备可以通过图像采集模组中的第一摄像头模组采集二维样本图像。深度样本图像是指包含深度信息的图像,在一些实施例中,电子设备可以通过图像采集模组中的第二摄像头模组(例如双目摄像头等)采集深度样本图像。电子设备可以通过如下实施例的步骤S602至步骤S604实现步骤S502。
步骤S503,根据每一所述体素的目标世界坐标,构建所述点云地图。即,点云地图中包括每一所述体素的目标世界坐标。
可以理解地，图像采集模组在不同时刻或者不同位置采集样本图像时，其拍摄场景是有重叠区域的。也就是说，不同的样本图像中包括部分相同的图像内容，这使得在基于这些样本图像构建点云地图时，引入了大量的冗余信息，物理空间中的同一位置点可能被多个像素点以相同或相近的世界坐标表示在点云地图中，这样就大大增加了点云地图的数据量，影响了定位速度。显然这种还有大量冗余信息的点云地图，在视觉定位中，对于获得高精度的定位结果是不利的。
有鉴于此,在本申请实施例中,以体素的形式构建点云地图,即,通过采集的多个样本图像对,对每一体素的初始世界坐标进行更新(也即修正、优化),从而得到包括每一所述体素的目标世界坐标的点云地图。这种构建点云地图的方式,相当于将体素所涵盖的所有像素点的世界坐标融合为一个世界坐标值;如此,就解决了物理空间中的同一位置点被多个像素点以相同或相近的世界坐标表示在点云地图中所带来的上述问题,去除了大量的冗余信息。在视觉定位的应用中,基于通过本申请实施例获得的点云地图,一方面能够提高定位速度,使定位服务具有较好的实时性,另一方面,能够提升视觉定位的定位精度。
在一些实施例中,可以通过缩小体素的尺寸,来获得更高的定位精度。理论上来讲,体素的尺寸越小,获得的定位精度越高。
本申请实施例再提供一种点云地图的构建过程,该过程可以包括以下步骤S601至步骤S605:
步骤S601,对特定物理空间的尺寸进行量化处理,得到多个体素的初始世界坐标;
步骤S602,控制所述图像采集模组按照预设帧率采集样本图像对。
在一些实施例中,图像采集模组可以边移动边采集样本图像对。例如,可以通过具有图像采集模组的机器人实现样本图像对的采集。再如,数据采集人员可以携带图像采集模组,边行走边采集图像。
步骤S603,根据所述图像采集模组在当前时刻采集的第一样本图像对和在历史时刻采集的第二样本图像对,更新每一所述体素的初始世界坐标。
在一些实施例中,电子设备可以通过如下实施例的步骤S703至步骤S705实现步骤S603。
步骤S604,根据所述第一样本图像对和所述图像采集模组在下一时刻采集的第三样本图像对,继续更新每一所述体素的当前世界坐标,直到样本图像采集结束时,将每一所述体素的当前世界坐标作为与所述体素对应的目标世界坐标。
可以理解地，在步骤S604中，所述继续更新每一所述体素的当前世界坐标，更新的是通过步骤S603更新得到的每一所述体素的世界坐标。实际上，通过步骤S603和步骤S604，电子设备可以根据图像采集模组当前时刻采集的样本图像对和历史时刻采集的样本图像对，实时地更新每一体素的当前世界坐标，直到图像采集模组的图像采集任务结束为止，将当前更新得到的每一体素的当前世界坐标作为与体素对应的目标世界坐标。
步骤S605,根据每一所述体素的目标世界坐标,构建所述点云地图。
在本申请实施例中,边采集样本图像对,边利用采集的样本图像对更新每一所述体素的当前世界坐标。也就是,电子设备不断地利用图像采集模组在当前时刻采集的样本图像对和在历史时刻(例如前一时刻)采集的样本图像对,更新每一体素的当前世界坐标。由于前后时刻获得的两张样本图像,具有较多的重叠区域,这样,电子设备就无需从多个样本图像对中找出重叠区域最多的两个样本图像对,然后在基于这两个样本图像对更新每一体素的当前世界坐标;如此,能够大大提高地图构建的效率。
本申请实施例再提供一种点云地图的构建过程,该过程至少包括以下步骤S701至步骤S707:
步骤S701,对特定物理空间的尺寸进行量化处理,得到多个体素的初始世界坐标;
步骤S702,控制所述图像采集模组按照预设帧率采集样本图像对;
步骤S703,根据所述图像采集模组在当前时刻采集的第一样本图像对和在历史时刻采集的第二样本图像对,确定每一所述体素的当前相机坐标。
在一些实施例中,电子设备可以根据所述第一样本图像对和所述第二样本图像对,确定相机坐标系相对于世界坐标系的当前变换关系;然后根据所述当前变换关系,将每一体素的初始世界坐标转换为当前相机坐标。
在一些实施例中,电子设备可以根据第一样本图像对中二维样本图像的像素点的图像特征、第一样本图像对中深度样本图像的像素点的深度值、以及第二样本图像对中二维样本图像的像素点的图像特征和第二样本图像对中深度样本图像的像素点的深度值,确定所述当前变换关系。基于此,根据如下公式(5)将体素的初始世界坐标转换为当前相机坐标。
$$\begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix}=R\begin{bmatrix} x_w \\ y_w \\ z_w \end{bmatrix}+T \tag{5}$$

式中，(x_c, y_c, z_c)表示的是相机坐标，变换关系包括旋转矩阵R和平移向量T，(x_w, y_w, z_w)表示的是世界坐标。
步骤S704,从所述第一样本图像对的深度样本图像中,获取与每一所述体素的当前像素坐标对应的深度值。
在一些实施例中,电子设备可以根据图像采集模组的内参矩阵,将每一体素的当前相机坐标转换为当前像素坐标;从第一样本图像对的深度样本图像中,获取与每一体素的当前像素坐标对应的深度值。
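下面给出一个体现步骤S703至步骤S704思路的Python草图：将体素的世界坐标按式(5)转换为当前相机坐标，再投影为当前像素坐标，并从深度样本图像中取出对应的深度值（函数与变量命名均为示例性假设）：

```python
import numpy as np

def voxel_depth_lookup(voxel_world, R, T, fx, fy, u0, v0, depth_image):
    """体素世界坐标 -> 当前相机坐标 -> 当前像素坐标，并查找对应深度值（示意）。"""
    p_cam = R @ voxel_world + T                  # 世界坐标 -> 当前相机坐标，见式(5)
    x_c, y_c, z_c = p_cam
    if z_c <= 0:                                 # 位于相机后方，无对应像素
        return p_cam, None
    u = int(round(fx * x_c / z_c + u0))          # 相机坐标 -> 当前像素坐标
    v = int(round(fy * y_c / z_c + v0))
    h, w = depth_image.shape
    if not (0 <= u < w and 0 <= v < h):          # 超出图像范围
        return p_cam, None
    return p_cam, float(depth_image[v, u])       # 对应像素处的深度值
```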
步骤S705,根据每一所述体素的当前相机坐标和与每一所述体素的当前像素坐标对应的深度值,更新与所述体素对应的初始世界坐标。
在一些实施例中,电子设备可以获取每一体素到物体表面的历史距离;将每一体素的当前相机坐标的Z轴坐标值、与每一体素的当前像素坐标对应的深度值和每一体素到物体表面的历史距离,输入至与体素对应的距离模型中,以更新所述历史距离,得到目标距离;将每一体素到物体表面的目标距离,更新为与体素对应的初始世界坐标中的Z轴坐标值,以实现对与体素对应的初始世界坐标进行更新。在一些实施例中,体素对应的距离模型如下式(6):
$$D_t=\frac{W_{t-1}\,D_{t-1}+w_t\,d_t}{W_{t-1}+w_t},\quad d_t=\operatorname{clamp}\!\left(D_t(u,v)-z_c,\ \text{min truncation},\ \text{max truncation}\right),\quad W_t=\min\!\left(W_{t-1}+w_t,\ \text{maxweight}\right) \tag{6}$$

式中，W_t表示在当前时刻t体素的权重；W_{t-1}表示在前一时刻t-1体素的权重；maxweight为在前一时刻t-1所有体素中的最大权重；D_t(u,v)表示与体素的当前像素坐标(u, v)对应的深度值；z_c表示体素的当前相机坐标的Z轴坐标值；max truncation和min truncation分别表示截断范围的最大值和最小值；D_{t-1}表示在前一时刻t-1确定的体素到物体表面的距离（即所述历史距离的一种示例），而D_t则是当前待求的体素到物体表面的目标距离。
这样将体素的当前相机坐标的Z轴坐标值z_c、体素的当前像素坐标对应的深度值D_t(u,v)和体素到物体表面的历史距离输入至公式(6)所示的距离模型中，即可得到体素到物体表面的目标距离。
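下面给出一个按上述距离模型思路对单个体素进行更新的Python草图（截断与加权的具体方式按常见的TSDF融合方式假设实现，参数取值仅为示例，并非本申请限定的公式形式）：

```python
def update_voxel_distance(D_hist, W_hist, depth_at_pixel, z_c,
                          min_trunc=-0.1, max_trunc=0.1, w_new=1.0, max_weight=1.0):
    """输入体素的历史距离D_hist、历史权重W_hist、当前像素坐标处的深度值与
    当前相机坐标Z轴值z_c，输出更新后的目标距离与权重（示意实现）。"""
    d = depth_at_pixel - z_c                                    # 体素到物体表面的有符号距离
    d = max(min_trunc, min(max_trunc, d))                       # 按截断范围截断
    D_new = (W_hist * D_hist + w_new * d) / (W_hist + w_new)    # 加权融合历史距离
    W_new = min(W_hist + w_new, max_weight)                     # 更新权重并限制最大值
    return D_new, W_new
```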
可以理解地,由于在更新体素对应的初始世界坐标时,考虑了体素到物体表面的历史距离;如此,使得更新后的体素的初始世界坐标更加平滑,从而使得最终获得的体素的目标世界坐标更加精确,进而在定位阶段,能够提高定位精度。
步骤S706，根据所述第一样本图像对和所述图像采集模组在下一时刻采集的第三样本图像对，继续更新每一所述体素的当前世界坐标，直到样本图像采集结束时，将所述体素的当前世界坐标作为所述目标世界坐标。
可以理解地,电子设备通过执行类似于步骤S703至步骤S705,以继续更新每一体素的当前世界坐标。
步骤S707,根据每一所述体素的目标世界坐标,构建所述点云地图。
本申请实施例再提供一种点云地图的构建过程,该过程可以包括以下步骤S801至步骤S811:
步骤S801,对特定物理空间的尺寸进行量化处理,得到多个体素的初始世界坐标;
步骤S802,控制所述图像采集模组按照预设帧率采集样本图像对;
步骤S803,根据所述图像采集模组在当前时刻采集的第一样本图像对和在历史时刻采集的第二样本图像对,确定相机坐标系相对于世界坐标系的当前变换关系;
步骤S804,根据所述当前变换关系,将每一所述体素的初始世界坐标转换为当前相机坐标;
步骤S805,根据所述图像采集模组的内参矩阵,将每一所述体素的当前相机坐标转换为当前像素坐标;
步骤S806,从所述第一样本图像对的深度样本图像中,获取与每一所述体素的当前像素坐标对应的深度值;
步骤S807,获取每一所述体素到物体表面的历史距离;
步骤S808,将每一所述体素的当前相机坐标的Z轴坐标值、与每一所述体素的当前像素坐标对应的深度值和每一所述体素到物体表面的历史距离,输入至与所述体素对应的距离模型中,以更新所述历史距离,得到目标距离;
步骤S809,将每一所述体素的目标距离,更新为与所述体素对应的初始世界坐标中的Z轴坐标值,以实现对与所述体素对应的初始世界坐标进行更新;
步骤S810,根据所述第一样本图像对和所述图像采集模组在下一时刻采集的第三样本图像对,继续更新每一所述体素的当前世界坐标,直到样本图像采集结束时,将所述体素的当前世界坐标作为所述目标世界坐标。
可以理解地,电子设备通过执行类似于步骤S803至步骤S810,以继续更新每一所述体素的当前世界坐标。
步骤S811,根据每一所述体素的目标世界坐标,构建所述点云地图。
通过视觉信息可以建立室内环境地图，协助用户快速定位到自身位置和周边环境。针对于视觉技术，在相关技术中，存在通过识别摄像头采集到的图像中的人物和背景，以确定人物位置的方法。该方案将背景与预先测定的建筑物室内地图匹配，确定背景在室内的对应位置，然后根据背景的位置确认人物在室内的位置。在确定人物位置的具体算法流程上，该方案的整体思路如下：通过图像识别的方法识别图像背景中固定位置物体，并根据固定位置物体的相对位置关系，确定某一时刻人物的位置。该方案的核心技术点是：1、通过视觉图像进行室内环境建图；2、图像匹配；3、在图像中识别出人物和物体。
然而,上述相关技术存在以下缺陷:1、该技术仅考虑了视觉图像的二维图像特征,定位后无法获得人物的姿态信息,且在定位精度方面较低;2、该技术可以针对图像中的人物或固定物体进行定位,但是前提是图像中必须有可以识别出的人物或固定物体,否则定位就会失效,定位的可靠性较低;3、该技术无法应对光线变换场景的定位需求,比如白天黑夜不同情况下的定位,定位鲁棒性较差。
基于此,下面将说明本申请实施例在一个实际的应用场景中的示例性应用。
本申请实施例实现了一种基于稠密点云的室内环境重建与定位技术，可以帮助用户创建稠密点云形式的室内地图（即点云地图的一种示例），并且实时定位用户位置。该方案可以针对室内场景，提取图像特征进行视觉追踪和运动估计，并构建稠密地图；定位过程不依赖于外部基站设备，定位精度高，鲁棒性强。该方案包含两个主要部分：构建地图和视觉定位。
在本申请实施例中,构建地图部分主要是通过图像采集模组中的单目摄像头采集RGB图像信息,提取图像特征进行视觉追踪,同时利用图像采集模组中的三维视觉传感器(例如TOF、结构光等),采集深度信息构建稠密点云地图(即点云地图的一种示例),具体的技术步骤至少包括以下步骤S11至步骤S17:
步骤S11,利用单目摄像头,以固定帧率进行RGB图像采集;
步骤S12,利用三维视觉传感器,以固定帧率进行深度图像采集;
步骤S13,将RGB图像和深度图像进行对齐,包括时间戳对齐和像素对齐;
步骤S14,采集过程中实时提取RGB图像中的特征信息和深度图像中的深度信息,以对图像采集模组进行视觉追踪和运动估计,确定相机坐标系相对于世界坐标系的当前变换关系;
步骤S15,通过深度图像和相机的内参矩阵得到稠密点云,所述稠密点云中包括每一像素点的相机坐标;
需要说明的是,所谓稠密点云,是相对于稀疏点云而言的。稠密点云的采样点数量远远大于稀疏点云的采样点数量。
步骤S16,利用TSDF算法以体素的形式对稠密点云进行融合;
步骤S17,存储融合后的稠密点云,并序列化存储到本地作为稠密点云地图。
针对步骤S12中的利用三维视觉传感器进行深度图像采集,这里给出如下解释。深度图像又被称为距离图像,是指从图像采集模组到拍摄场景中各点的距离作为像素值的图像。深度图像直观反映了事物可见表面的几何形状。在深度数据流所提供的图像帧中,每一个像素点代表的是在三维视觉传感器的视野中,该特定的坐标处物体到摄像头平面的距离。
针对步骤S15中提到的通过深度图像和相机的内参矩阵得到稠密点云，这里给出如下解释。可以通过如下的公式(7)将像素点(u, v)转换到相机坐标(x_c, y_c, z_c)：

$$x_c=\frac{(u-u_0)\,z_c}{f_x},\qquad y_c=\frac{(v-v_0)\,z_c}{f_y} \tag{7}$$

式中，(u_0, v_0)是图像的中心点的像素坐标，z_c表示相机坐标的z轴值，也就是该像素点对应的深度值；f_x表示焦距f在相机坐标系的x轴上的焦距分量；f_y表示焦距f在相机坐标系的y轴上的焦距分量。需要说明的是，由于相机坐标系的Z轴为镜头的光轴，因此像素点(u, v)的深度值即为该像素点的相机坐标的Z轴坐标值z_c。相机坐标和世界坐标下的同一物体具有相同的深度，即z_c=z_w。
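下面给出一个按公式(7)的思路将整幅深度图像转换为稠密点云的Python草图（向量化写法仅为一种假设的实现方式，命名为示例）：

```python
import numpy as np

def depth_to_point_cloud(depth_image, fx, fy, u0, v0):
    """将整幅深度图像转换为相机坐标系下的稠密点云（示意）。"""
    h, w = depth_image.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z_c = depth_image.astype(np.float64)
    x_c = (u - u0) * z_c / fx
    y_c = (v - v0) * z_c / fy
    points = np.stack([x_c, y_c, z_c], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]              # 去除深度为0的无效点
```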
针对步骤S16中的利用TSDF算法以体素的形式对稠密点云进行融合,这里给出如下技术步骤S161至步骤S164:
步骤S161,首先获取体素在全局坐标系下的坐标V g(x,y,z)(即所述体素的目标世界坐标),然后根据运动追踪得到的变换矩阵(即步骤S14输出的当前变换关系)将其从全局坐标转换为相机坐标V(x,y,z);
步骤S162,根据相机的内参矩阵将相机坐标V(x,y,z)转换为图像坐标,得到一个图像坐标(u,v);
步骤S163,如果第l帧深度图像在图像坐标(u,v)处的深度值D(u,v)不为0,则比较D(u,v) 与体素的相机坐标V(x,y,z)中z的大小,如果D(u,v)<z,说明此体素距离相机更远,在融合表面的内部;否则,说明此体素距离相机更近,在融合表面的外部;
步骤S164，根据步骤S163的结果更新此体素中的距离D_l和权重值W_l，更新公式如下式(8)所示：

$$\begin{aligned} d_l(x,y,z) &= \operatorname{clamp}\!\left(D_l(u,v)-z,\ \text{min truncation},\ \text{max truncation}\right)\\ D_l(x,y,z) &= \frac{W_{l-1}(x,y,z)\,D_{l-1}(x,y,z)+w_l(x,y,z)\,d_l(x,y,z)}{W_{l-1}(x,y,z)+w_l(x,y,z)}\\ W_l(x,y,z) &= \min\!\left(W_{l-1}(x,y,z)+w_l(x,y,z),\ \text{maxweight}\right) \end{aligned} \tag{8}$$

式中，W_l(x,y,z)为当前帧全局数据立方体中体素的权重，W_{l-1}(x,y,z)为上一帧全局数据立方体中体素的权重，maxweight为上一帧全局数据立方体中所有体素的权重中的最大权重，可以设定为1，D_l(x,y,z)为当前全局数据立方体中体素到物体表面的距离，D_{l-1}(x,y,z)为上一帧全局数据立方体中体素到物体表面的距离，d_l(x,y,z)为根据当前帧深度数据计算得到的全局数据立方体中体素到物体表面的距离，z表示体素在相机坐标系下的z轴坐标，D_l(u,v)表示当前帧深度图像在像素点(u,v)处的深度值，[min truncation, max truncation]为截断范围，其会影响到重建结果的精细程度。
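下面给出一个体现步骤S161至步骤S164思路的Python草图（逐体素循环仅为便于说明的假设实现，实际工程中常以并行方式处理；参数与命名均为示例）：

```python
import numpy as np

def tsdf_fuse_frame(voxels_world, D, W, depth_image, R, T, fx, fy, u0, v0,
                    min_trunc=-0.1, max_trunc=0.1, max_weight=1.0):
    """对一帧深度图像执行步骤S161至S164的融合（示意）：
    voxels_world为各体素的全局坐标(m x 3)，D、W为各体素的距离与权重数组。"""
    h, w = depth_image.shape
    for i, V_g in enumerate(voxels_world):
        V = R @ V_g + T                               # S161: 全局坐标 -> 相机坐标
        if V[2] <= 0:
            continue
        u = int(round(fx * V[0] / V[2] + u0))         # S162: 相机坐标 -> 图像坐标
        v = int(round(fy * V[1] / V[2] + v0))
        if not (0 <= u < w and 0 <= v < h):
            continue
        d_uv = depth_image[v, u]
        if d_uv == 0:                                 # S163: 深度值为0则不更新
            continue
        d = float(np.clip(d_uv - V[2], min_trunc, max_trunc))   # 体素到表面的截断距离
        D[i] = (W[i] * D[i] + d) / (W[i] + 1.0)       # S164: 融合更新距离
        W[i] = min(W[i] + 1.0, max_weight)            # 更新权重
    return D, W
```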
基于步骤S11至步骤S17可以构建出一张基于稠密点云的稠密点云地图(即点云地图的一种示例),该地图以二进制格式存储该稠密点云地图到本地,在视觉定位过程中,该地图将被加载使用。
在本申请实施例中，视觉定位部分通过使用三维视觉传感器，采集深度信息并将其转换为点云，再通过迭代最近点算法（Iterative Closest Point，ICP）匹配稠密点云地图，得到当前相机在地图中的位姿，以达到定位目的，具体的技术步骤至少可以包括步骤S21至步骤S24：
步骤S21,加载构建好的稠密点云地图;
步骤S22,利用三维视觉传感器进行深度图像采集,得到待处理的深度图像;
步骤S23,通过深度图像和相机的内参矩阵得到当前点云,所述当前点云中包括所述深度图像的像素点的相机坐标;
步骤S24,通过ICP算法匹配当前点云和稠密点云地图,得到当前相机在地图中的精确位姿。
针对步骤S23中的通过深度图像和相机的内参矩阵得到当前点云的方法,可参考步骤S15。
针对步骤S24中的通过ICP算法匹配当前点云和稠密点云地图，得到当前相机在地图中的精确位姿，这里给出如下解释。ICP算法本质上是基于最小二乘法的最优配准方法。该算法重复进行选择对应关系点对、计算最优刚体变换的过程，直到满足正确配准的收敛精度要求。ICP算法的基本原理是：分别在待匹配的目标点云P和源点云Q中，按照一定的约束条件，找到最邻近的点(p_i, q_i)，然后计算出最优的旋转R和平移T，使得误差函数最小，误差函数E(R,T)如公式(9)：

$$E(R,T)=\frac{1}{n}\sum_{i=1}^{n}\left\|q_i-\left(R\,p_i+T\right)\right\|^2 \tag{9}$$

式中，n为邻近点对的数量，p_i为目标点云P中的一点，q_i为源点云Q中与p_i对应的最近点，R为旋转矩阵，T为平移向量。算法包括以下步骤S241至步骤S246：
步骤S241，在当前点云P中取点集p_i∈P；
步骤S242，找出稠密点云地图Q中的对应点集q_i∈Q，使得||q_i−p_i||=min；
步骤S243，计算旋转矩阵R和平移矩阵T，使得误差函数最小；
步骤S244，对p_i使用步骤S243求得的旋转矩阵R和平移矩阵T进行旋转和平移变换，得到新的对应点集p′_i={p′_i=Rp_i+T, p_i∈P}；
步骤S245，计算p′_i与对应点集q_i的平均距离

$$d=\frac{1}{n}\sum_{i=1}^{n}\left\|p'_i-q_i\right\|^2$$

步骤S246，若d小于给定阈值d_TH或者迭代次数大于预设的迭代次数，则停止迭代计算，算法输出当前的旋转矩阵R和平移矩阵T；否则跳回到步骤S242。
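下面给出一个按步骤S241至步骤S246思路实现的ICP迭代Python草图（假设可复用上文在式(3)处给出的solve_rigid_transform函数；KD树最近邻查找与各参数取值均为示例性假设，并非本申请限定的实现方式）：

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(P, map_points, max_iters=50, d_th=0.01):
    """P为当前点云(n x 3)，map_points为稠密点云地图中的点(m x 3)，返回累计的R、T。"""
    R_total, T_total = np.eye(3), np.zeros(3)
    tree = cKDTree(map_points)                        # 假设用KD树做最近邻查找
    P_cur = P.copy()
    for _ in range(max_iters):
        _, idx = tree.query(P_cur, k=1)               # S242: 找最近对应点q_i
        Q = map_points[idx]
        R, T = solve_rigid_transform(P_cur, Q)        # S243: 最小化误差函数（见上文草图）
        P_cur = P_cur @ R.T + T                       # S244: 对点集做旋转平移
        R_total, T_total = R @ R_total, R @ T_total + T
        d = np.mean(np.sum((P_cur - Q) ** 2, axis=1)) # S245: 计算平均距离
        if d < d_th:                                  # S246: 满足收敛条件则停止
            break
    return R_total, T_total
```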
在本申请实施例中,基于步骤S21至步骤S24可以通过三维视觉传感器提供的深度信息,在预定义的稠密点云地图中达成定位目的,得到图像采集模组在地图坐标系下的位置和姿态。该定位结果精度较高,不需要依赖外部基站设备,抗环境干扰性强,鲁棒性强。
本申请实施例提供的定位方法,能够获得以下技术效果:1、利用三维视觉传感器可以得到深度信息,从而利用深度信息实现地图构建和定位,如此定位精度不会受到光照变换情况下的影响,定位方法的鲁棒性较高;2、在定位结果上可以同时提供位置和姿态,相对于其他室内定位方法提高了定位准确度;3、定位方法不需要引入物体识别等错误率较高的算法,定位成功率高,鲁棒性强;4、构建的地图形式为稠密点云地图,不需要存储环境的RGB信息,因此地图的私密性较好。
在本申请实施例中,利用三维视觉传感器采集深度信息构建稠密点云地图,并结合高精度高鲁棒性的点云匹配算法的进行室内环境定位。在地图构建上,本申请实施例通过使用三维视觉传感器采集深度图像信息,以稠密点云的形式存储为稠密点云地图。在定位方法上,本申请实施例采用ICP算法匹配当前点云和地图稠密点云,精确地计算出当前自身位置和姿态。两者结合形成了一套高精度、高鲁棒性的室内定位方法。
基于前述的实施例,本申请实施例提供一种定位装置,该装置包括所包括的各模块、以及各模块所包括的各单元,可以通过电子设备中的处理器来实现;当然也可通过具体的逻辑电路实现;在实施的过程中,处理器可以为中央处理器(CPU)、微处理器(MPU)、数字信号处理器(DSP)或现场可编程门阵列(FPGA)等。
图4A为本申请实施例定位装置的组成结构示意图,如图4A所示,所述装置400包括坐标转换模块401和定位模块402,其中:
坐标转换模块401,配置为根据图像采集模组的内参矩阵和所述图像采集模组采集的深度图像中多个像素点的深度值,将所述多个像素点的像素坐标转换为相机坐标;
定位模块402,配置为将每一所述像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,得出所述图像采集模组的定位结果;
其中,每一所述体素的目标世界坐标是根据多个样本图像对,更新每一所述体素的初始世界坐标而获得的,所述样本图像对包括二维样本图像和深度样本图像。
在一些实施例中,如图4B所示,所述装置400还包括量化处理模块403、坐标更新模块404和地图构建模块405;其中,量化处理模块403,配置为对特定物理空间的尺寸进行量化处理,得到多个体素的初始世界坐标;坐标更新模块404,配置为根据所述图像采集模组在所述特定物理空间中采集的多个样本图像对,对每一所述体素的初始世界坐标进行更新,得到每一所述体素的目标世界坐标,所述样本图像对包括二维样本图像和深度样本图像;地图构建模块405,配置为根据每一所述体素的目标世界坐标,构建所述点云地图。
在一些实施例中,坐标更新模块404,包括:控制子模块,配置为控制所述图像采集模组按照预设帧率采集样本图像对;坐标更新子模块,配置为根据所述图像采集模组在当前时刻采集的第一样本图像对和在历史时刻采集的第二样本图像对,更新每一所述体素的初始世界坐标;根据所述第一样本图像对和所述图像采集模组在下一时刻采集的第三样本图像对, 继续更新每一所述体素的当前世界坐标,直到样本图像采集结束时,将所述体素的当前世界坐标作为所述目标世界坐标。
在一些实施例中,所述坐标更新子模块,包括:相机坐标确定单元,配置为根据所述第一样本图像对和所述第二样本图像对,确定每一所述体素的当前相机坐标;深度值获取单元,配置为从所述第一样本图像对的深度样本图像中,获取与每一所述体素的当前像素坐标对应的深度值;坐标更新单元,配置为:根据每一所述体素的当前相机坐标和与每一所述体素的当前像素坐标对应的深度值,更新与所述体素对应的初始世界坐标;根据所述第一样本图像对和所述图像采集模组在下一时刻采集的第三样本图像对,继续更新每一所述体素的当前世界坐标,直到样本图像采集结束时,将所述体素的当前世界坐标作为所述目标世界坐标。
在一些实施例中,所述坐标更新单元,配置为:获取每一所述体素到物体表面的历史距离;将每一所述体素的当前相机坐标的Z轴坐标值、与每一所述体素的当前像素坐标对应的深度值和每一所述体素到物体表面的历史距离,输入至与所述体素对应的距离模型中,以更新所述历史距离,得到目标距离;将每一所述体素到物体表面的目标距离,更新为与所述体素对应的初始世界坐标中的Z轴坐标值,以实现对与所述体素对应的初始世界坐标进行更新;根据所述第一样本图像对和所述图像采集模组在下一时刻采集的第三样本图像对,继续更新每一所述体素的当前世界坐标,直到样本图像采集结束时,将所述体素的当前世界坐标作为所述目标世界坐标。
在一些实施例中,所述相机坐标确定单元,配置为:根据所述第一样本图像对和所述第二样本图像对,确定相机坐标系相对于世界坐标系的当前变换关系;根据所述当前变换关系,将每一所述体素的初始世界坐标转换为当前相机坐标。
在一些实施例中,所述深度值获取单元,配置为:根据所述图像采集模组的内参矩阵,将每一所述体素的当前相机坐标转换为当前像素坐标;从所述第一样本图像对的深度样本图像中,获取与每一所述体素的当前像素坐标对应的深度值。
在一些实施例中,所述定位模块402,包括:迭代子模块,配置为根据迭代策略,将每一所述像素点的相机坐标与所述多个体素的目标世界坐标进行匹配,得出相机坐标系相对于世界坐标系的目标变换关系;定位子模块,配置为根据所述目标变换关系,确定所述图像采集模组的定位结果。
在一些实施例中,所述迭代子模块,包括:选取单元,配置为从所述多个体素中选取与每一所述像素点匹配的初始目标体素;确定单元,配置为:根据每一所述像素点的相机坐标和对应的初始目标体素的目标世界坐标,确定所述相机坐标系相对于所述世界坐标系的第一变换关系;根据所述第一变换关系、每一所述像素点的相机坐标和对应的初始目标体素的目标世界坐标,确定匹配误差;如果所述匹配误差大于预设阈值,重新选取初始目标体素,并重新确定匹配误差;将重新确定的匹配误差小于等于所述预设阈值时的第一变换关系确定为所述目标变换关系。
在一些实施例中,所述选取单元,配置为:获取所述相机坐标系相对于所述世界坐标系的第二变换关系;根据所述第二变换关系和第j个像素点的相机坐标,确定所述第j个像素点的第一世界坐标,j为大于0的整数;将每一所述像素点的第一世界坐标与所述多个体素的目标世界坐标进行匹配,得出对应的初始目标体素。
在一些实施例中,所述确定单元,配置为:根据所述第一变换关系和第j个像素点的相机坐标,确定所述第j个像素点的第二世界坐标,j为大于0的整数;根据每一所述像素点的第二世界坐标和对应的初始目标体素的目标世界坐标,确定所述匹配误差。
在一些实施例中,所述确定单元,配置为:确定每一所述像素点的第二世界坐标与对应的初始目标体素的目标世界坐标之间的距离;根据每一所述距离,确定所述匹配误差。
在一些实施例中,所述选取单元,配置为:如果所述匹配误差大于预设阈值,将所述第一变换关系作为所述第二变换关系,重新选取初始目标体素。
以上装置实施例的描述，与上述方法实施例的描述是类似的，具有同方法实施例相似的有益效果。对于本申请装置实施例中未披露的技术细节，请参照本申请方法实施例的描述而理解。
需要说明的是,本申请实施例中,如果以软件功能模块的形式实现上述的定位方法,并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得电子设备(可以是手机、平板电脑、笔记本电脑、台式计算机、服务器、机器人、无人机等)执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read Only Memory,ROM)、磁碟或者光盘等各种可以存储程序代码的介质。这样,本申请实施例不限制于任何特定的硬件和软件结合。
对应地,本申请实施例提供一种电子设备,图5为本申请实施例电子设备的一种硬件实体示意图,如图5所示,该电子设备500的硬件实体包括:包括存储器501和处理器502,所述存储器501存储有可在处理器502上运行的计算机程序,所述处理器502执行所述程序时实现上述实施例中提供的定位方法中的步骤。
存储器501配置为存储由处理器502可执行的指令和应用,还可以缓存待处理器502以及电子设备500中各模块待处理或已经处理的数据(例如,图像数据、音频数据、语音通信数据和视频通信数据),可以通过闪存(FLASH)或随机访问存储器(Random Access Memory,RAM)实现。
对应地,本申请实施例提供一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现上述实施例中提供的定位方法中的步骤。
这里需要指出的是:以上存储介质和设备实施例的描述,与上述方法实施例的描述是类似的,具有同方法实施例相似的有益效果。对于本申请存储介质和设备实施例中未披露的技术细节,请参照本申请方法实施例的描述而理解。
应理解,说明书通篇中提到的“一个实施例”或“一些实施例”意味着与实施例有关的特定特征、结构或特性包括在本申请的至少一个实施例中。因此,在整个说明书各处出现的“在一个实施例中”或“在一些实施例中”未必一定指相同的实施例。此外,这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
上文对各个实施例的描述倾向于强调各个实施例之间的不同之处,其相同或相似之处可以互相参考,为了简洁,本文不再赘述。
本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如对象A和/或对象B,可以表示:单独存在对象A,同时存在对象A和对象B,单独存在对象B这三种情况。
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。
在本申请所提供的几个实施例中,应该理解到,所揭露的设备和方法,可以通过其它的方式实现。以上所描述的设备实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,如:多个单元或组件可以结合,或可以集成到另一个系统,或一些特征可以忽略,或不执行。另外,所显示或讨论的各组成部分相互之间的耦合、或直接耦合、或通信连接可以是通过一些接口,设备或单元的间接耦合或通信连接,可以是电性的、机械的或其它形式的。
上述作为分离部件说明的单元可以是、或也可以不是物理上分开的,作为单元显示的部件可以是、或也可以不是物理单元;既可以位于一个地方,也可以分布到多个网络单元上;可以根据实际的需要选择其中的部分或全部单元来实现本实施例方案的目的。
另外,在本申请各实施例中的各功能单元可以全部集成在一个处理单元中,也可以是各单元分别单独作为一个单元,也可以两个或两个以上单元集成在一个单元中;上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:移动存储设备、只读存储器(Read Only Memory,ROM)、磁碟或者光盘等各种可以存储程序代码的介质。
或者,本申请上述集成的单元如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得电子设备(可以是手机、平板电脑、笔记本电脑、台式计算机、服务器、机器人、无人机等)执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括:移动存储设备、ROM、磁碟或者光盘等各种可以存储程序代码的介质。
本申请所提供的几个方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。
本申请所提供的几个产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。
本申请所提供的几个方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。
以上所述,仅为本申请的实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (20)

  1. 一种定位方法,所述方法包括:
    根据图像采集模组的内参矩阵和所述图像采集模组采集的深度图像中多个像素点的深度值,将所述多个像素点的像素坐标转换为相机坐标;
    将每一所述像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,得出所述图像采集模组的定位结果;
    其中,每一所述体素的目标世界坐标是根据多个样本图像对更新每一所述体素的初始世界坐标而获得的,所述样本图像对包括二维样本图像和深度样本图像。
  2. 根据权利要求1所述的方法,其中,所述点云地图的构建过程,包括:
    对特定物理空间的尺寸进行量化处理,得到多个体素的初始世界坐标;
    根据所述图像采集模组在所述特定物理空间中采集的多个样本图像对,对每一所述体素的初始世界坐标进行更新,得到每一所述体素的目标世界坐标;
    根据每一所述体素的目标世界坐标,构建所述点云地图。
  3. 根据权利要求2所述的方法,其中,所述根据所述图像采集模组在所述特定物理空间中采集的多个样本图像对,对每一所述体素的初始世界坐标进行更新,得到每一所述体素的目标世界坐标,包括:
    控制所述图像采集模组按照预设帧率采集样本图像对;
    根据所述图像采集模组在当前时刻采集的第一样本图像对和在历史时刻采集的第二样本图像对,更新每一所述体素的初始世界坐标;
    根据所述第一样本图像对和所述图像采集模组在下一时刻采集的第三样本图像对,继续更新每一所述体素的当前世界坐标,直到样本图像采集结束时,将所述体素的当前世界坐标作为所述目标世界坐标。
  4. 根据权利要求3所述的方法,其中,所述根据所述图像采集模组在当前时刻采集的第一样本图像对和在历史时刻采集的第二样本图像对,更新每一所述体素的初始世界坐标,包括:
    根据所述第一样本图像对和所述第二样本图像对,确定每一所述体素的当前相机坐标;
    从所述第一样本图像对的深度样本图像中,获取与每一所述体素的当前像素坐标对应的深度值;
    根据每一所述体素的当前相机坐标和与每一所述体素的当前像素坐标对应的深度值,更新与所述体素对应的初始世界坐标。
  5. 根据权利要求4所述的方法,其中,所述根据每一所述体素的当前相机坐标和与每一所述体素的当前像素坐标对应的深度值,更新与所述体素对应的初始世界坐标,包括:
    获取每一所述体素到物体表面的历史距离;
    将每一所述体素的当前相机坐标的Z轴坐标值、与每一所述体素的当前像素坐标对应的深度值和每一所述体素到物体表面的历史距离,输入至与所述体素对应的距离模型中,以更新所述历史距离,得到目标距离;
    将每一所述体素到物体表面的目标距离,更新为与所述体素对应的初始世界坐标中的Z轴坐标值,以实现对与所述体素对应的初始世界坐标进行更新。
  6. 根据权利要求4所述的方法,其中,所述根据所述第一样本图像对和所述第二样本图像对,确定每一所述体素的当前相机坐标,包括:
    根据所述第一样本图像对和所述第二样本图像对,确定相机坐标系相对于世界坐标系的当前变换关系;
    根据所述当前变换关系,将每一所述体素的初始世界坐标转换为当前相机坐标。
  7. 根据权利要求4所述的方法,其中,所述从所述第一样本图像对的深度样本图像中,获取与每一所述体素的当前像素坐标对应的深度值,包括:
    根据所述图像采集模组的内参矩阵,将每一所述体素的当前相机坐标转换为当前像素坐标;
    从所述第一样本图像对的深度样本图像中,获取与每一所述体素的当前像素坐标对应的深度值。
  8. 根据权利要求1至7任一项所述的方法,其中,所述将每一所述像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,得出所述图像采集模组的定位结果,包括:
    根据迭代策略,将每一所述像素点的相机坐标与所述多个体素的目标世界坐标进行匹配,得出相机坐标系相对于世界坐标系的目标变换关系;
    根据所述目标变换关系,确定所述图像采集模组的定位结果。
  9. 根据权利要求8所述的方法,其中,所述根据迭代策略,将每一所述像素点的相机坐标与所述多个体素的目标世界坐标进行匹配,得出相机坐标系相对于世界坐标系的目标变换关系,包括:
    从所述多个体素中选取与每一所述像素点匹配的初始目标体素;
    根据每一所述像素点的相机坐标和对应的初始目标体素的目标世界坐标,确定所述相机坐标系相对于所述世界坐标系的第一变换关系;
    根据所述第一变换关系、每一所述像素点的相机坐标和对应的初始目标体素的目标世界坐标,确定匹配误差;
    如果所述匹配误差大于预设阈值,重新选取初始目标体素,并重新确定匹配误差;
    将重新确定的匹配误差小于等于所述预设阈值时的第一变换关系确定为所述目标变换关系。
  10. 根据权利要求9所述的方法,其中,所述从所述多个体素中选取与每一所述像素点匹配的初始目标体素,包括:
    获取所述相机坐标系相对于所述世界坐标系的第二变换关系;
    根据所述第二变换关系和第j个像素点的相机坐标,确定所述第j个像素点的第一世界坐标,j为大于0的整数;
    将每一所述像素点的第一世界坐标与所述多个体素的目标世界坐标进行匹配,得出对应的初始目标体素。
  11. 根据权利要求9所述的方法,其中,所述根据所述第一变换关系、每一所述像素点的相机坐标和对应的初始目标体素的目标世界坐标,确定匹配误差,包括:
    根据所述第一变换关系和第j个像素点的相机坐标,确定所述第j个像素点的第二世界坐标,j为大于0的整数;
    根据每一所述像素点的第二世界坐标和对应的初始目标体素的目标世界坐标,确定所述匹配误差。
  12. 根据权利要求11所述的方法,其中,所述根据每一所述像素点的第二世界坐标和对应的初始目标体素的目标世界坐标,确定所述匹配误差,包括:
    确定每一所述像素点的第二世界坐标与对应的初始目标体素的目标世界坐标之间的距离;
    根据每一所述距离,确定所述匹配误差。
  13. 根据权利要求10所述的方法,其中,所述如果所述匹配误差大于预设阈值,重新选取初始目标体素,包括:
    如果所述匹配误差大于预设阈值,将所述第一变换关系作为所述第二变换关系,重新选取初始目标体素。
  14. 一种定位装置,包括:
    坐标转换模块,配置为根据图像采集模组的内参矩阵和所述图像采集模组采集的深度图像中多个像素点的深度值,将所述多个像素点的像素坐标转换为相机坐标;
    定位模块,配置为将每一所述像素点的相机坐标与预先构建的点云地图中多个体素的目标世界坐标进行匹配,得出所述图像采集模组的定位结果;
    其中,每一所述体素的目标世界坐标是根据多个样本图像对,更新每一所述体素的初始世界坐标而获得的,所述样本图像对包括二维样本图像和深度样本图像。
  15. 根据权利要求14所述的装置,其中,还包括:
    量化处理模块,配置为对特定物理空间的尺寸进行量化处理,得到多个体素的初始世界坐标;
    坐标更新模块,配置为根据所述图像采集模组在所述特定物理空间中采集的多个样本图像对,对每一所述体素的初始世界坐标进行更新,得到每一所述体素的目标世界坐标,所述样本图像对包括二维样本图像和深度样本图像;
    地图构建模块,配置为根据每一所述体素的目标世界坐标,构建所述点云地图。
  16. 根据权利要求15所述的装置,其中,所述坐标更新模块,包括:
    控制子模块,配置为控制所述图像采集模组按照预设帧率采集样本图像对;
    坐标更新子模块,配置为根据所述图像采集模组在当前时刻采集的第一样本图像对和在历史时刻采集的第二样本图像对,更新每一所述体素的初始世界坐标;根据所述第一样本图像对和所述图像采集模组在下一时刻采集的第三样本图像对,继续更新每一所述体素的当前世界坐标,直到样本图像采集结束时,将所述体素的当前世界坐标作为所述目标世界坐标。
  17. 根据权利要求16所述的装置,其中,所述坐标更新子模块,包括:
    相机坐标确定单元,配置为根据所述第一样本图像对和所述第二样本图像对,确定每一所述体素的当前相机坐标;
    深度值获取单元,配置为从所述第一样本图像对的深度样本图像中,获取与每一所述体素的当前像素坐标对应的深度值;
    坐标更新单元,配置为根据每一所述体素的当前相机坐标和与每一所述体素的当前像素坐标对应的深度值,更新与所述体素对应的初始世界坐标;根据所述第一样本图像对和所述图像采集模组在下一时刻采集的第三样本图像对,继续更新每一所述体素的当前世界坐标,直到样本图像采集结束时,将所述体素的当前世界坐标作为所述目标世界坐标。
  18. 根据权利要求17所述的装置,其中,所述坐标更新单元,配置为:
    获取每一所述体素到物体表面的历史距离;
    将每一所述体素的当前相机坐标的Z轴坐标值、与每一所述体素的当前像素坐标对应的深度值和每一所述体素到物体表面的历史距离,输入至与所述体素对应的距离模型中,以更新所述历史距离,得到目标距离;
    将每一所述体素到物体表面的目标距离,更新为与所述体素对应的初始世界坐标中的Z轴坐标值,以实现对与所述体素对应的初始世界坐标进行更新;
    根据所述第一样本图像对和所述图像采集模组在下一时刻采集的第三样本图像对,继续更新每一所述体素的当前世界坐标,直到样本图像采集结束时,将所述体素的当前世界坐标作为所述目标世界坐标。
  19. 一种电子设备,包括存储器和处理器,所述存储器存储有可在处理器上运行的计算机程序,所述处理器执行所述程序时实现权利要求1至13任一项所述定位方法中的步骤。
  20. 一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现权利要求1至13任一项所述定位方法中的步骤。
PCT/CN2020/116920 2019-09-27 2020-09-22 定位方法及装置、设备、存储介质 WO2021057739A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20869168.3A EP4016458A4 (en) 2019-09-27 2020-09-22 METHOD AND DEVICE FOR POSITIONING, APPARATUS AND STORAGE MEDIA
US17/686,091 US12051223B2 (en) 2019-09-27 2022-03-03 Positioning method, electronic device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910921654.0A CN110728717B (zh) 2019-09-27 2019-09-27 定位方法及装置、设备、存储介质
CN201910921654.0 2019-09-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/686,091 Continuation US12051223B2 (en) 2019-09-27 2022-03-03 Positioning method, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021057739A1 true WO2021057739A1 (zh) 2021-04-01

Family

ID=69218426

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/116920 WO2021057739A1 (zh) 2019-09-27 2020-09-22 定位方法及装置、设备、存储介质

Country Status (4)

Country Link
US (1) US12051223B2 (zh)
EP (1) EP4016458A4 (zh)
CN (1) CN110728717B (zh)
WO (1) WO2021057739A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113129339A (zh) * 2021-04-28 2021-07-16 北京市商汤科技开发有限公司 一种目标跟踪方法、装置、电子设备及存储介质
CN113327318A (zh) * 2021-05-18 2021-08-31 禾多科技(北京)有限公司 图像显示方法、装置、电子设备和计算机可读介质
CN113743206A (zh) * 2021-07-30 2021-12-03 洛伦兹(宁波)科技有限公司 矿车装料控制方法、装置、系统及计算机可读介质

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728717B (zh) * 2019-09-27 2022-07-15 Oppo广东移动通信有限公司 定位方法及装置、设备、存储介质
CN111400537B (zh) * 2020-03-19 2023-04-28 北京百度网讯科技有限公司 一种道路元素信息获取方法、装置和电子设备
CN111442722B (zh) * 2020-03-26 2022-05-17 达闼机器人股份有限公司 定位方法、装置、存储介质及电子设备
CN111504299B (zh) * 2020-04-03 2023-08-18 北京小狗吸尘器集团股份有限公司 一种地图建立方法、装置、可读介质及电子设备
CN111563138B (zh) * 2020-04-30 2024-01-05 浙江商汤科技开发有限公司 定位方法及装置、电子设备和存储介质
CN113721599B (zh) * 2020-05-25 2023-10-20 华为技术有限公司 定位方法和定位装置
CN111829535B (zh) * 2020-06-05 2022-05-03 阿波罗智能技术(北京)有限公司 生成离线地图的方法、装置、电子设备和存储介质
CN112816967B (zh) * 2021-02-03 2024-06-14 成都康烨科技有限公司 图像距离测量方法、装置、测距设备和可读存储介质
CN112927363B (zh) * 2021-04-07 2024-06-18 Oppo广东移动通信有限公司 体素地图构建方法及装置、计算机可读介质和电子设备
CN113487741B (zh) * 2021-06-01 2024-05-28 中国科学院自动化研究所 稠密三维地图更新方法及装置
CN113256721B (zh) * 2021-06-21 2021-12-03 浙江光珀智能科技有限公司 一种室内多人三维高精度定位方法
CN113643421B (zh) * 2021-07-06 2023-08-25 北京航空航天大学 图像的三维重建方法和三维重建装置
CN113689487A (zh) * 2021-08-09 2021-11-23 广东中星电子有限公司 在视频画面上进行平面测量的方法、装置、终端设备
CN113673603B (zh) * 2021-08-23 2024-07-02 北京搜狗科技发展有限公司 一种要素点匹配的方法及相关装置
CN113822932B (zh) * 2021-08-30 2023-08-18 亿咖通(湖北)技术有限公司 设备定位方法、装置、非易失性存储介质及处理器
CN113793255A (zh) * 2021-09-09 2021-12-14 百度在线网络技术(北京)有限公司 用于图像处理的方法、装置、设备、存储介质和程序产品
CN113902847B (zh) * 2021-10-11 2024-04-16 岱悟智能科技(上海)有限公司 基于三维特征约束的单目深度图像位姿优化方法
CN114937071B (zh) * 2022-07-26 2022-10-21 武汉市聚芯微电子有限责任公司 一种深度测量方法、装置、设备及存储介质
CN115372899B (zh) * 2022-08-24 2024-10-01 深圳市鼎飞技术有限公司 基于tof技术实现无信号区域定位的方法
CN116630442B (zh) * 2023-07-19 2023-09-22 绘见科技(深圳)有限公司 一种视觉slam位姿估计精度评估方法及装置
CN117061719B (zh) * 2023-08-11 2024-03-08 元橡科技(北京)有限公司 一种车载双目相机视差校正方法
CN117392571B (zh) * 2023-12-08 2024-02-13 中国电力科学研究院有限公司 一种基于无人机图像的架空输配电线路验收方法及系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140064602A1 (en) * 2012-09-05 2014-03-06 Industrial Technology Research Institute Method and apparatus for object positioning by using depth images
CN104517289A (zh) * 2014-12-12 2015-04-15 浙江大学 一种基于混合摄像机的室内场景定位方法
CN106384353A (zh) * 2016-09-12 2017-02-08 佛山市南海区广工大数控装备协同创新研究院 一种基于rgbd的目标定位方法
CN109579852A (zh) * 2019-01-22 2019-04-05 杭州蓝芯科技有限公司 基于深度相机的机器人自主定位方法及装置
CN110675457A (zh) * 2019-09-27 2020-01-10 Oppo广东移动通信有限公司 定位方法及装置、设备、存储介质
CN110728717A (zh) * 2019-09-27 2020-01-24 Oppo广东移动通信有限公司 定位方法及装置、设备、存储介质

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101310213B1 (ko) * 2009-01-28 2013-09-24 한국전자통신연구원 깊이 영상의 품질 개선 방법 및 장치
WO2014050830A1 (ja) * 2012-09-25 2014-04-03 日本電信電話株式会社 画像符号化方法、画像復号方法、画像符号化装置、画像復号装置、画像符号化プログラム、画像復号プログラム及び記録媒体
CN103716641B (zh) * 2012-09-29 2018-11-09 浙江大学 预测图像生成方法和装置
US9984498B2 (en) 2013-07-17 2018-05-29 Microsoft Technology Licensing, Llc Sparse GPU voxelization for 3D surface reconstruction
JP6452324B2 (ja) * 2014-06-02 2019-01-16 キヤノン株式会社 画像処理装置、画像処理方法及びプログラム
CA3005894A1 (en) * 2015-11-20 2017-05-26 Magic Leap, Inc. Methods and systems for large-scale determination of rgbd camera poses
US11232583B2 (en) * 2016-03-25 2022-01-25 Samsung Electronics Co., Ltd. Device for and method of determining a pose of a camera
EP3457682A1 (en) * 2016-05-13 2019-03-20 Olympus Corporation Calibration device, calibration method, optical device, imaging device, projection device, measurement system and measurement method
CN106056092B (zh) * 2016-06-08 2019-08-20 华南理工大学 基于虹膜与瞳孔的用于头戴式设备的视线估计方法
CN106780576B (zh) * 2016-11-23 2020-03-17 北京航空航天大学 一种面向rgbd数据流的相机位姿估计方法
CN107590832A (zh) * 2017-09-29 2018-01-16 西北工业大学 基于自然特征的物理对象追踪定位方法
CN108563989A (zh) 2018-03-08 2018-09-21 北京元心科技有限公司 室内定位方法及装置
CN108648240B (zh) * 2018-05-11 2022-09-23 东南大学 基于点云特征地图配准的无重叠视场相机姿态标定方法
CN108986161B (zh) * 2018-06-19 2020-11-10 亮风台(上海)信息科技有限公司 一种三维空间坐标估计方法、装置、终端和存储介质
CN110874179B (zh) * 2018-09-03 2021-09-14 京东方科技集团股份有限公司 指尖检测方法、指尖检测装置、指尖检测设备及介质
CN111694429B (zh) * 2020-06-08 2023-06-02 北京百度网讯科技有限公司 虚拟对象驱动方法、装置、电子设备及可读存储
US11893675B1 (en) * 2021-02-18 2024-02-06 Splunk Inc. Processing updated sensor data for remote collaboration
US11741631B2 (en) * 2021-07-15 2023-08-29 Vilnius Gediminas Technical University Real-time alignment of multiple point clouds to video capture
CN113808261B (zh) * 2021-09-30 2022-10-21 大连理工大学 一种基于全景图的自监督学习场景点云补全的数据集生成方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140064602A1 (en) * 2012-09-05 2014-03-06 Industrial Technology Research Institute Method and apparatus for object positioning by using depth images
CN104517289A (zh) * 2014-12-12 2015-04-15 浙江大学 一种基于混合摄像机的室内场景定位方法
CN106384353A (zh) * 2016-09-12 2017-02-08 佛山市南海区广工大数控装备协同创新研究院 一种基于rgbd的目标定位方法
CN109579852A (zh) * 2019-01-22 2019-04-05 杭州蓝芯科技有限公司 基于深度相机的机器人自主定位方法及装置
CN110675457A (zh) * 2019-09-27 2020-01-10 Oppo广东移动通信有限公司 定位方法及装置、设备、存储介质
CN110728717A (zh) * 2019-09-27 2020-01-24 Oppo广东移动通信有限公司 定位方法及装置、设备、存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4016458A4

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113129339A (zh) * 2021-04-28 2021-07-16 北京市商汤科技开发有限公司 一种目标跟踪方法、装置、电子设备及存储介质
CN113129339B (zh) * 2021-04-28 2023-03-10 北京市商汤科技开发有限公司 一种目标跟踪方法、装置、电子设备及存储介质
CN113327318A (zh) * 2021-05-18 2021-08-31 禾多科技(北京)有限公司 图像显示方法、装置、电子设备和计算机可读介质
CN113743206A (zh) * 2021-07-30 2021-12-03 洛伦兹(宁波)科技有限公司 矿车装料控制方法、装置、系统及计算机可读介质
CN113743206B (zh) * 2021-07-30 2024-04-23 洛伦兹(宁波)科技有限公司 矿车装料控制方法、装置、系统及计算机可读介质

Also Published As

Publication number Publication date
EP4016458A1 (en) 2022-06-22
US20220262039A1 (en) 2022-08-18
CN110728717A (zh) 2020-01-24
US12051223B2 (en) 2024-07-30
EP4016458A4 (en) 2022-10-19
CN110728717B (zh) 2022-07-15

Similar Documents

Publication Publication Date Title
WO2021057739A1 (zh) 定位方法及装置、设备、存储介质
WO2021057742A1 (zh) 定位方法及装置、设备、存储介质
WO2021057744A1 (zh) 定位方法及装置、设备、存储介质
WO2021057745A1 (zh) 地图融合方法及装置、设备、存储介质
JP6745328B2 (ja) 点群データを復旧するための方法及び装置
Walch et al. Image-based localization using lstms for structured feature correlation
TWI777538B (zh) 圖像處理方法、電子設備及電腦可讀儲存介質
Yang et al. Fast depth prediction and obstacle avoidance on a monocular drone using probabilistic convolutional neural network
WO2021057743A1 (zh) 地图融合方法及装置、设备、存储介质
CN110675457B (zh) 定位方法及装置、设备、存储介质
WO2019161813A1 (zh) 动态场景的三维重建方法以及装置和系统、服务器、介质
CN109559320B (zh) 基于空洞卷积深度神经网络实现视觉slam语义建图功能的方法及系统
WO2021017314A1 (zh) 信息处理方法、定位方法及装置、电子设备和存储介质
WO2020102944A1 (zh) 点云处理方法、设备及存储介质
US11788845B2 (en) Systems and methods for robust self-relocalization in a visual map
WO2015154008A1 (en) System and method for extracting dominant orientations from a scene
WO2023065657A1 (zh) 地图构建方法、装置、设备、存储介质及程序
Zhang et al. Vehicle global 6-DoF pose estimation under traffic surveillance camera
CN108801225B (zh) 一种无人机倾斜影像定位方法、系统、介质及设备
KR102249381B1 (ko) 3차원 영상 정보를 이용한 모바일 디바이스의 공간 정보 생성 시스템 및 방법
KR20220055072A (ko) 딥러닝을 이용한 실내 위치 측위 방법
WO2023088127A1 (zh) 室内导航方法、服务器、装置和终端
Sui et al. An accurate indoor localization approach using cellphone camera
WO2022252036A1 (zh) 障碍物信息获取方法、装置、可移动平台及存储介质
Dong et al. A rgb-d slam algorithm combining orb features and bow

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20869168

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020869168

Country of ref document: EP

Effective date: 20220317

NENP Non-entry into the national phase

Ref country code: DE