WO2022007367A1

WO2022007367A1 - Systems and methods for pose determination

Info

Publication number: WO2022007367A1
Application number: PCT/CN2020/141652
Authority: WO
Inventors: Lizhi Hu; Hui Lin; Jianhua KE; Wei Lu; Jun Yin
Original assignee: Zhejiang Dahua Technology Co., Ltd.
Priority date: 2020-07-09
Filing date: 2020-12-30
Publication date: 2022-01-13
Also published as: EP4153940A4; KR20230029981A; EP4153940A1

Abstract

Systems (100) and methods for pose determination. The system (100) may obtain an initial pose of a moving device (110)(510). The system (100) may also determine a plurality of candidate poses of the moving device(110) based on the initial pose of the moving device(110) according to at least one map associated with the moving device(110) (520). The system(100) may determine a target initial pose of the moving device(110) based on the plurality of candidate poses and laser data acquired by the moving device(110) (530).

Description

SYSTEMS AND METHODS FOR POSE DETERMINATION

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese Patent Application No. 202010655087.1 filed on July 9, 2020 and Chinese Patent Application No. 202010963179.6 filed on September 14, 2020, the entire contents of each of which are hereby incorporated by reference.

TECHNICAL FIELD

The present disclosure generally relates to automation technology, and in particular, to systems and methods for determining a pose of a moving device.

BACKGROUND

With the development of automation technology and computer technology, pose determination is becoming more and more important. Commonly, a pose of a moving device (e.g., a robot) is determined based on a single type of data (e.g., image data, laser data, odometry data) acquired by the moving device. For example, the pose of the moving device is determined based on laser data and a predetermined global map of a region where the moving device is located. However, in some cases, a single type of data can't ensure the accuracy of the pose determination. For example, an actual environment surrounding the moving device changes dynamically and the predetermined global map can't reflect a real-time situation of the actual environment, which may accordingly affect the accuracy of the pose determination based on the laser data and the predetermined global map. Therefore, it is desirable to provide systems and methods for accurately and efficiently determining a pose of a moving device.

SUMMARY

According to one aspect of the present disclosure, a system may be provided. The system may include: at least one storage device including a set of instructions; and at least one processor in communication with the at least one storage device, wherein when executing the set of instructions, the at least one processor may be configured to cause the system to: obtain an initial pose of a moving device; determine a plurality of candidate poses of the moving device based on the initial pose of the moving device according to at least one map associated with the moving device; and determine a target initial pose of the moving device based on the plurality of candidate poses and laser data acquired by the moving device.

In some embodiments, wherein to obtain the initial pose of the moving device, the at least one processor may be configured to cause the system to: determine feature information of an image acquired by the moving device; generate a matching result by matching the feature information of the image and reference feature information of a plurality of reference images, the reference feature information being stored in a feature database; and obtain the initial pose of the moving device based on the matching result.

In some embodiments, wherein to obtain the initial pose of the moving device based on the matching result, the at least one processor may be configured to cause the system to: determine a plurality of similarities between the feature information of the image and the reference feature information of the plurality of reference images; identify, from the plurality of similarities, a similarity exceeding a similarity threshold; and determine the initial pose of the moving device based on a reference pose or reference feature information corresponding to a reference image with the similarity exceeding the similarity threshold.

In some embodiments, wherein to obtain the initial pose of the moving device based on the matching result, the at least one processor may be configured to cause the system to: in response to determining that all the plurality of similarities are less than or equal to the similarity threshold, obtain a second image by moving the moving device; determine second feature information of the second image; determine a plurality of second similarities between the second feature information of the second image and the reference feature information of the plurality of reference images; identify, from the plurality of second similarities, a second similarity exceeding the similarity threshold; and determine the initial pose of the moving device based on a second reference pose or second reference feature information corresponding to a second reference image with the second similarity exceeding the similarity threshold.

In some embodiments, wherein to obtain the initial pose of the moving device based on the matching result, the at least one processor may be configured to cause the system to: in response to determining that all of the plurality of second similarities are less than or equal to the similarity threshold, determine the initial pose of the moving device based on one or more previous poses of the moving device and odometry data of the moving device.

In some embodiments, the feature database may be generated by: obtaining a reference map; obtaining a plurality of reference poses of a reference moving device based on the reference map, two adjacent reference poses of the plurality of reference poses satisfying a preset condition; determining a plurality of images acquired by the reference moving device from the plurality of reference poses as the plurality of reference images; and for each of the plurality of reference images, extracting and storing the reference feature information of the reference image, the reference feature information including at least one of a reference feature point, a reference representation of the reference feature point, or a reference coordinate of the reference feature point.

In some embodiments, the preset condition may include that a time difference between time points corresponding to the two adjacent poses respectively exceeds a time threshold or a difference between the two adjacent poses of the reference moving device exceeds a difference threshold.

In some embodiments, the at least one processor may be configured to cause the system further to: determine a matching result of the laser data acquired by the moving device and a map associated with the moving device; in response to determining that the matching result satisfies a preset condition, determine whether a highest similarity between the feature information and corresponding reference feature information in the feature database is smaller than a similarity threshold; and in response to determining that the similarity is smaller than the similarity threshold, update the feature database by replacing reference feature information corresponding to the reference image with the feature information of the image.

In some embodiments, wherein the at least one map includes at least two maps with different resolutions associated with the moving device, and to determine the plurality of candidate poses of the moving device based on the initial pose of the moving device according to the at least one map associated with the moving device, the at least one processor may be configured to cause the system to: determine the plurality of candidate poses of the moving device by rotating or translating the initial pose within a predetermined range on the at least two maps.

In some embodiments, wherein to determine the target initial pose of the moving device based on the plurality of candidate poses and the laser data acquired by the moving device, the at least one processor may be configured to cause the system to: determine, using a branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device.

In some embodiments, wherein to determine, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device, the at least one processor may be configured to cause the system to: for each of the at least two maps, determine one or more modified maps by down-sampling the map; and determine, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps, modified maps corresponding to the at least maps, and the plurality of candidate poses of the moving device.

In some embodiments, wherein the at least two maps include a first map with a first resolution and a second map with a second resolution; and to determine, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device, the at least one processor may be configured to cause the system to: determine, based on the first map with the first resolution and the initial pose of the moving device, a plurality of first candidate poses of the moving device among the plurality of candidate poses of the moving device; determine, using the branch and bound algorithm, a first target initial pose of the moving device based on the plurality of first candidate poses; determine, based on the second map with the second resolution and the first target initial pose of the moving device, a plurality of second candidate poses of the moving device among the plurality of candidate poses of the moving device; determine, using the branch and bound algorithm, a second target initial pose of the moving device based on the plurality of second candidate poses of the moving device; and designate the second target initial pose of the moving device as the target initial pose of the moving device.

According to another aspect of the present disclosure, a system may be provided. The system may include: at least one storage device including a set of instructions; and at least one processor in communication with the at least one storage device, wherein when executing the set of instructions, the at least one processor may be configured to cause the system to: determine a first pose of a moving device based at least in part on odometry data or laser data acquired by the moving device; obtain at least one local map associated with the moving device; determine a second pose of the moving device based on the at least one local map and the laser data acquired by the moving device; and determine a target pose of the moving device based on the first pose and the second pose using a pose adjustment algorithm.

In some embodiments, wherein to determine the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device, the at least one processor may be configured to cause the system to: determine whether a first difference between the odometry data corresponding to a current time point and previous odometry data corresponding to a previous time point adjacent to the current time point exceeds a first difference threshold or a second difference between the current time point and the previous time point exceeds a second difference threshold; and in response to determining that the first difference is smaller than or equal to the first difference threshold and the second difference is smaller than or equal to the second difference threshold, determine the first pose of the moving device based on the odometry data acquired by the moving device.

In some embodiments, wherein to determine the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device, the at least one processor may be configured to cause the system to: in response to determining that the first difference exceeds the first difference threshold or the second difference exceeds the second difference threshold, determine a first candidate pose of the moving device based on the odometry data acquired by the moving device; determine a second candidate pose of the moving device based on a portion of the laser data and a global map associated with the moving device, the portion of the laser data corresponding to long-term features in a region where the moving device is located; and determine the first pose of the moving device based on the first candidate pose and the second candidate pose.

In some embodiments, wherein to determine the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device, the at least one processor may be configured to cause the system to: determine whether at least one marker is detected based on the laser data; and in response to determining that the at least one marker is detected based on the laser data, determine the first pose of the moving device based on predetermined reference information associated with the detected at least one marker.

In some embodiments, wherein to determine the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device, the at least one processor may be configured to cause the system to: determine whether a matching result of the laser data and a global map associated with the moving device satisfies a preset condition; and in response to determining that the matching result satisfies the preset condition, determine the first pose of the moving device based on the laser data and the global map associated with the moving device.

In some embodiments, each of the at least one local map may include a plurality of grids and a plurality of occupancy rates corresponding to the plurality of grids respectively.

In some embodiments, the at least one processor may be configured to cause the system further to: update the at least one local map associated with the moving device based at least in part on the first pose of the moving device and the laser data acquired by the moving device by: for each of the at least one local map, projecting the laser data onto the local map based on the first pose of the moving device; and updating the local map by updating the plurality of occupancy rates corresponding to the plurality of grids based on the projected laser data.

In some embodiments, the at least one local map may be constructed or updated based on previous laser data acquired by the moving device at previous time points within a predetermined range from a current time point corresponding to the laser data. The at least one local map may be dynamically constructed or updated according to a matching result between the previous laser data and a global map associated with the moving device.

In some embodiments, the at least one local map may be dynamically constructed or released according to at least one of: a predetermined time interval, a count of data frames included in the at least one local map, or a matching result between a pose of the moving device determined based on the at least one local map and a global map associated with the moving device.

In some embodiments, the at least one local map may include a first local map, a second local map, and a third local map. The second local map may be constructed when a count of data frames in the first local map reaches a first predetermined count. The third local map may be constructed when a count of data frames in the second local map reaches the first predetermined count. The first local map may be released when the count of data frames in the first local map reaches a second predetermined count or a matching result between a pose of the moving device determined based on the second local map and a global map associated with the moving device satisfies a preset condition.

In some embodiments, the pose adjustment algorithm may include a sparse pose adjustment (SPA) algorithm.

In some embodiments, wherein to determine the first pose of the moving device based on the odometry data, the at least one processor may be configured to cause the system to: obtain a target initial pose of the moving device; and determine the first pose of the moving device based on the target initial pose of the moving device and the odometry data or the laser data acquired by the moving device. The target initial pose of the moving device may be determined by: obtaining an initial pose of the moving device; determining a plurality of candidate poses of the moving device based on the initial pose of the moving device; and determining the pose of the moving device at the second time point based on the plurality of candidate poses and laser data acquired by the moving device at the second time point.

According to another aspect of the present disclosure, a method may be provided. The method may be implemented on a computing device including at least one processor, at least one storage medium, and a communication platform connected to a network. The method may include: obtaining an initial pose of a moving device; determining a plurality of candidate poses of the moving device based on the initial pose of the moving device according to at least one map associated with the moving device; and determining a target initial pose of the moving device based on the plurality of candidate poses and laser data acquired by the moving device.

In some embodiments, the obtaining the initial pose of the moving device may include: determining feature information of an image acquired by the moving device; generating a matching result by matching the feature information of the image and reference feature information of a plurality of reference images, the reference feature information being stored in a feature database; and obtaining the initial pose of the moving device based on the matching result.

In some embodiments, the obtaining the initial pose of the moving device based on the matching result may include: determining a plurality of similarities between the feature information of the image and the reference feature information of the plurality of reference images; identifying, from the plurality of similarities, a similarity exceeding a similarity threshold; and determining the initial pose of the moving device based on a reference pose or reference feature information corresponding to a reference image with the similarity exceeding the similarity threshold.

In some embodiments, the obtaining the initial pose of the moving device based on the matching result may include: in response to determining that all the plurality of similarities are less than or equal to the similarity threshold, obtaining a second image by moving the moving device; determining second feature information of the second image; determining a plurality of second similarities between the second feature information of the second image and the reference feature information of the plurality of reference images; identifying, from the plurality of second similarities, a second similarity exceeding the similarity threshold; and determining the initial pose of the moving device based on a second reference pose or second reference feature information corresponding to a second reference image with the second similarity exceeding the similarity threshold.

In some embodiments, the obtaining the initial pose of the moving device based on the matching result may include: in response to determining that all of the plurality of second similarities are less than or equal to the similarity threshold, determining the initial pose of the moving device based on one or more previous poses of the moving device and odometry data of the moving device.

In some embodiments, the feature database may be generated by: obtaining a reference map; obtaining a plurality of reference poses of a reference moving device based on the reference map, two adjacent reference poses of the plurality of reference poses satisfying a preset condition; determining a plurality of images acquired by the reference moving device from the plurality of reference poses as the plurality of reference images; and for each of the plurality of reference images, extracting and storing the reference feature information of the reference image. The reference feature information may include at least one of a reference feature point, a reference representation of the reference feature point, or a reference coordinate of the reference feature point.

In some embodiments, the method may further include: determining a matching result of the laser data acquired by the moving device and a map associated with the moving device; in response to determining that the matching result satisfies a preset condition, determining whether a highest similarity between the feature information and corresponding reference feature information in the feature database is smaller than a similarity threshold; and in response to determining that the similarity is smaller than the similarity threshold, updating the feature database by replacing reference feature information corresponding to the reference image with the feature information of the image.

In some embodiments, wherein the at least one map includes at least two maps with different resolutions associated with the moving device, and the determining the plurality of candidate poses of the moving device based on the initial pose of the moving device according to the at least one map associated with the moving device may include: determining the plurality of candidate poses of the moving device by rotating or translating the initial pose within a predetermined range on the at least two maps.

In some embodiments, the determining the target initial pose of the moving device based on the plurality of candidate poses and the laser data acquired by the moving device may include: determining, using a branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device.

In some embodiments, the determining, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device may include: for each of the at least two maps, determining one or more modified maps by down-sampling the map; and determining, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps, modified maps corresponding to the at least maps, and the plurality of candidate poses of the moving device.

In some embodiments, wherein the at least two maps include a first map with a first resolution and a second map with a second resolution; and the determining, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device may include: determining, based on the first map with the first resolution and the initial pose of the moving device, a plurality of first candidate poses of the moving device among the plurality of candidate poses of the moving device; determining, using the branch and bound algorithm, a first target initial pose of the moving device based on the plurality of first candidate poses; determining, based on the second map with the second resolution and the first target initial pose of the moving device, a plurality of second candidate poses of the moving device among the plurality of candidate poses of the moving device; determining, using the branch and bound algorithm, a second target initial pose of the moving device based on the plurality of second candidate poses of the moving device; and designating the second target initial pose of the moving device as the target initial pose of the moving device.

According to another aspect of the present, a method may be provided. The method may be implemented on a computing device including at least one processor, at least one storage medium, and a communication platform connected to a network. The method may include: determining a first pose of a moving device based at least in part on odometry data or laser data acquired by the moving device; obtaining at least one local map associated with the moving device; determining a second pose of the moving device based on the at least one local map and the laser data acquired by the moving device; and determining a target pose of the moving device based on the first pose and the second pose using a pose adjustment algorithm.

In some embodiments, the determining the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device may include: determining whether a first difference between the odometry data corresponding to a current time point and previous odometry data corresponding to a previous time point adjacent to the current time point exceeds a first difference threshold or a second difference between the current time point and the previous time point exceeds a second difference threshold; and in response to determining that the first difference is smaller than or equal to the first difference threshold and the second difference is smaller than or equal to the second difference threshold, determining the first pose of the moving device based on the odometry data acquired by the moving device.

In some embodiments, wherein the determining the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device may include: in response to determining that the first difference exceeds the first difference threshold or the second difference exceeds the second difference threshold, determining a first candidate pose of the moving device based on the odometry data acquired by the moving device; determining a second candidate pose of the moving device based on a portion of the laser data and a global map associated with the moving device, the portion of the laser data corresponding to long-term features in a region where the moving device is located; and determining the first pose of the moving device based on the first candidate pose and the second candidate pose.

In some embodiments, the determining the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device may include: determining whether at least one marker is detected based on the laser data; and in response to determining that the at least one marker is detected based on the laser data, determining the first pose of the moving device based on predetermined reference information associated with the detected at least one marker.

In some embodiments, the determining the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device may include: determining whether a matching result of the laser data and a global map associated with the moving device satisfies a preset condition; and in response to determining that the matching result satisfies the preset condition, determining the first pose of the moving device based on the laser data and the global map associated with the moving device.

In some embodiments, the method may further include: updating the at least one local map associated with the moving device based at least in part on the first pose of the moving device and the laser data acquired by the moving device by: for each of the at least one local map, projecting the laser data onto the local map based on the first pose of the moving device; and updating the local map by updating the plurality of occupancy rates corresponding to the plurality of grids based on the projected laser data.

In some embodiments, the determining the first pose of the moving device based on the odometry data may include: obtaining a target initial pose of the moving device; and determining the first pose of the moving device based on the target initial pose of the moving device and the odometry data or the laser data acquired by the moving device. The target initial pose of the moving device may be determined by: obtaining an initial pose of the moving device; determining a plurality of candidate poses of the moving device based on the initial pose of the moving device; and determining the pose of the moving device at the second time point based on the plurality of candidate poses and laser data acquired by the moving device at the second time point.

According to another aspect of the present disclosure, a non-transitory computer readable medium may be provided. The non-transitory computer readable medium may include executable instructions that, when executed by at least one processor, direct the at least one processor to perform a method. The method may include: obtaining an initial pose of a moving device; determining a plurality of candidate poses of the moving device based on the initial pose of the moving device according to at least one map associated with the moving device; and determining a target initial pose of the moving device based on the plurality of candidate poses and laser data acquired by the moving device.

According to another aspect of the present disclosure, a non-transitory computer readable medium may be provided. The non-transitory computer readable medium may include executable instructions that, when executed by at least one processor, direct the at least one processor to perform a method. The method may include: determining a first pose of a moving device based at least in part on odometry data or laser data acquired by the moving device; obtaining at least one local map associated with the moving device; determining a second pose of the moving device based on the at least one local map and the laser data acquired by the moving device; and determining a target pose of the moving device based on the first pose and the second pose using a pose adjustment algorithm.

Additional features will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The features of the present disclosure may be realized and attained by practice or use of various aspects of the methodologies, instrumentalities and combinations set forth in the detailed examples discussed below.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. The drawings are not to scale. These embodiments are non-limiting schematic embodiments, in which like reference numerals represent similar structures throughout the several views of the drawings, and wherein:

FIG. 1 is a schematic diagram illustrating an exemplary pose determination system according to some embodiments of the present disclosure;

FIG. 2 is a schematic diagram illustrating exemplary hardware and/or software components of an exemplary computing device according to some embodiments of the present disclosure;

FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of an exemplary mobile device according to some embodiments of the present disclosure;

FIG. 4 is a block diagram illustrating an exemplary processing device according to some embodiments of the present disclosure;

FIG. 5 is a flowchart illustrating an exemplary process for determining a target initial pose of a moving device according to some embodiments of the present disclosure;

FIG. 6 is a flowchart illustrating an exemplary process for generating a feature database according to some embodiments of the present disclosure;

FIG. 7 is a flowchart illustrating an exemplary process for determining a target initial pose of a moving device according to some embodiments of the present disclosure;

FIGs. 8A-8C are schematic diagrams illustrating an exemplary process for down-sampling a map according to some embodiments of the present disclosure;

FIG. 9 is a schematic diagram illustrating a branch and bound algorithm according to some embodiments of the present disclosure;

FIG. 10 is a flowchart illustrating an exemplary process for determining a target initial pose of a moving device according to some embodiments of the present disclosure;

FIGs. 11A and 11B are a flowchart illustrating an exemplary process for determining a target initial pose of a moving device according to some embodiments of the present disclosure;

FIG. 12 is a block diagram illustrating an exemplary processing device according to some embodiments of the present disclosure;

FIG. 13 is a flowchart illustrating an exemplary process for determining a target pose of a moving device according to some embodiments of the present disclosure;

FIGs. 14A-14F are schematic diagrams illustrating an exemplary process for updating occupancy rates of grids of a local map according to some embodiments of the present disclosure;

FIG. 15 is a schematic diagram illustrating a principle of a sparse pose adjustment algorithm according to some embodiments of the present disclosure;

FIG. 16 is a flowchart illustrating an exemplary process for determining a target pose of a moving device according to some embodiments of the present disclosure; and

FIG. 17 is a flowchart illustrating an exemplary process for determining a target poe of a moving device according to some embodiments of the present disclosure.

DETAILED DESCRIPTION

In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant disclosure. However, it should be apparent to those skilled in the art that the present disclosure may be practiced without such details. In other instances, well-known methods, procedures, systems, components, and/or circuitry have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present disclosure. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments shown, but to be accorded the widest scope consistent with the claims.

The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms “a, ” “an, ” and “the” may be intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be understood that the term “speed” used herein not only includes magnitude information but also includes moving direction information, that is, the term “speed” can be used interchangeably with the term “velocity. ” It will be further understood that the terms “comprise, ” “comprises, ” and/or “comprising, ” “include, ” “includes, ” and/or “including, ” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It will be understood that the terms “system, ” “unit, ” “module, ” and/or “block” used herein are one method to distinguish different components, elements, parts, sections or assemblies of different levels in ascending order. However, the terms may be displaced by another expression if they achieve the same purpose.

The modules (or units, blocks, units) described in the present disclosure may be implemented as software and/or hardware modules and may be stored in any type of non-transitory computer-readable medium or other storage devices. In some embodiments, a software module may be compiled and linked into an executable program. It will be appreciated that software modules can be callable from other modules or from themselves, and/or can be invoked in response to detected events or interrupts. Software modules configured for execution on computing devices can be provided on a computer readable medium, such as a compact disc, a digital video disc, a flash drive, a magnetic disc, or any other tangible medium, or as a digital download (and can be originally stored in a compressed or installable format that requires installation, decompression, or decryption prior to execution) . Such software code can be stored, partially or fully, on a memory device of the executing computing device, for execution by the computing device. Software instructions can be embedded in a firmware, such as an EPROM. It will be further appreciated that hardware modules (e.g., circuits) can be included of connected or coupled logic units, such as gates and flip-flops, and/or can be included of programmable units, such as programmable gate arrays or processors. The modules or computing device functionality described herein are preferably implemented as hardware modules, but can be software modules as well. In general, the modules described herein refer to logical modules that can be combined with other modules or divided into units despite their physical organization or storage.

It will be understood that when a unit, engine, module or block is referred to as being “on, ” “connected to, ” or “coupled to, ” another unit, engine, module, or block, it may be directly on, connected or coupled to, or communicate with the other unit, engine, module, or block, or an intervening unit, engine, module, or block may be present, unless the context clearly indicates otherwise. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.

These and other features, and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, may become more apparent upon consideration of the following description with reference to the accompanying drawings, all of which form a part of this disclosure. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended to limit the scope of the present disclosure.

The flowcharts used in the present disclosure illustrate operations that systems implement according to some embodiments of the present disclosure. It is to be expressly understood, the operations of the flowcharts may be implemented not in order. Conversely, the operations may be implemented in inverted order, or simultaneously. Moreover, one or more other operations may be added to the flowcharts. One or more operations may be removed from the flowcharts.

An aspect of the present disclosure relates to systems and methods for determining a target initial pose of a moving device based on laser data acquired by the moving device (e.g., a robot) . The systems may determine a plurality of candidate poses of the moving device by rotating and/or translating an initial pose of the moving device on at least two maps with different resolutions. The systems may determine a target initial pose of the moving device based on the plurality of candidate poses and the laser data using a branch and bound (BB) algorithm and the at least two maps. In some embodiments, the at least two maps may include a first map with a first resolution and a second map with a second resolution finer than the first resolution. The systems may determine, based on the initial pose and the first map, a plurality of first candidate poses and determine a first target initial pose of the moving device based on the plurality of first candidate poses using the BB algorithm. The systems may determine, based on the first target initial pose and the second map, a plurality of second candidate poses and determine a second target initial pose of the moving device based on the plurality of second candidate poses using the BB algorithm. The systems may designate the second target initial pose as the target initial pose of the moving device.

In some embodiments, the systems may determine the initial pose based on image data (e.g., at least one image) acquired by the moving device. Thus, the target initial pose may be determined based on multiple types of environment data, thereby improving the accuracy of the target initial pose. Besides, by using the BB algorithm and the at least two maps with different resolutions, the speed and the accuracy for determining the target initial pose may be improved.

Another aspect of the present disclosure relates to systems and methods for determining a target pose of a moving device (e.g., a robot) . The systems may determine a first pose of the moving device based at least in part on odometry data or laser data acquired by the moving device and a second pose of the moving device based on at least one local map associated with the moving device and the laser data. The at least one local map may be dynamically constructed or updated based on the laser data and the first pose to allow the at least one map to reflect a rea-time situation of a region where the moving device is located, thereby improving the accuracy of the second pose. Further, the systems may determine the target pose by optimizing the first pose and/or the second pose using a pose adjustment algorithm (e.g., a sparse pose adjustment (SPA) algorithm) , thereby improving the accuracy of the target pose.

FIG. 1 is a schematic diagram illustrating an exemplary pose determination system according to some embodiments of the present disclosure. In some embodiments, the pose determination system 100 may include a moving device 110, a capture device 120, a server 130, a network 140, and a storage device 150. In some embodiments, the pose determination system 100 may be applied in various scenarios, for example, automatic freight transport, sightseeing in a predetermined region (e.g., a park) , automatic food delivery, etc.

The moving device 110 may be configured to move to execute a predetermined task (e.g., a transport task) . In some embodiments, during the task, the moving device 110 may acquire environment data (e.g., image data, odometry data, laser data) of a region where the moving device 110 is located. In some embodiments, the moving device 110 may include a plurality of sensors configured to acquire the environment data, for example, at least one light detection and ranging (LIDAR) 112 (e.g., a two-dimensional LIDAR) , at least one camera 114, at least one odometry 116, an inertial measurement unit (IMU) (not shown in FIG. 1) , etc. Specifically, the LIDAR 112 may be configured to acquire the laser data of the region, the camera 114 may be configured to acquire the image data of the region, and the odometry 116 may be configured to acquire the odometry data of the moving device 110. In some embodiments, the moving device 110 may include at least one component configured to facilitate the movement of the moving device 110, for example, a plurality of wheels 118, a battery, a motor, a computing unit (not shown in FIG. 1) , etc. In some embodiments, the moving device 110 may include a robot, an automated guided vehicle (AGV) , etc.

The capture device 120 may be configured to capture the environment data of the region where the moving device 110 is located. In some embodiments, the capture device 120 may include a camera, a video recorder, an image sensor, a smartphone, a tablet computer, a laptop computer, a wearable device, or the like, or any combination thereof. The camera may include a box camera, a gun camera, a dome camera, an integrated camera, a monocular camera, a binocular camera, a multi-sensor camera, a stereo camera, an RGB-D camera, or the like, or any combination thereof. The video recorder may include a PC Digital Video Recorder (DVR) , an embedded DVR, or the like, or any combination thereof. The image sensor may include a Charge Coupled Device (CCD) , a Complementary Metal Oxide Semiconductor (CMOS) , or the like, or any combination thereof. In some embodiments, the capture device 120 may be integrated into, mounted on, or connected to the moving device 110. In some embodiments, the capture device 120 may be omitted and the functions of the capture device 120 can be implemented by the sensors of the moving device 110.

The server 130 may be a single server or a server group. The server group may be centralized or distributed (e.g., the server 130 may be a distributed system) . In some embodiments, the server 130 may be local or remote. For example, the server 130 may access information and/or data stored in the moving device 110, the capture device 120, and/or the storage device 150 via the network 140. As another example, the server 130 may be directly connected to the moving device 110, the capture device 120, and/or the storage device 150 to access stored information and/or data. In some embodiments, the server 130 may be implemented on a cloud platform or an onboard computer. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof. In some embodiments, the server 130 may be implemented on a computing device 200 including one or more components illustrated in FIG. 2 of the present disclosure.

In some embodiments, the server 130 may include a processing device 132. The processing device 132 may process information and/or data associated with pose determination to perform one or more functions described in the present disclosure. For example, the processing device 132 may determine a plurality of candidate poses of the moving device 110 based on an initial pose (e.g., an initial pose determined based on image data) of the moving device and determine a target initial pose of the moving device 110 based on the plurality of candidate poses and laser data acquired by the moving device 110. As another example, the processing device 132 may determine a first pose of a moving device based at least in part on odometry data or laser data acquired by the moving device, obtain at least one local map associated with the moving device 110, determine a second pose of the moving device based on the at least one local map and the laser data acquired by the moving device, and determine a target pose of the moving device based on the first pose and the second pose, for example, using a pose adjustment algorithm.

In some embodiments, the processing device 132 may include one or more processing engines (e.g., single-core processing engine (s) or multi-core processor (s) ) . Merely by way of example, the processing device 132 may include a central processing unit (CPU) , an application-specific integrated circuit (ASIC) , an application-specific instruction-set processor (ASIP) , a graphics processing unit (GPU) , a physics processing unit (PPU) , a digital signal processor (DSP) , a field-programmable gate array (FPGA) , a programmable logic device (PLD) , a controller, a microcontroller unit, a reduced instruction-set computer (RISC) , a microprocessor, or the like, or any combination thereof.

In some embodiments, the server 130 may be connected to the network 140 to communicate with one or more components (e.g., the moving device 110, the capture device 120, the storage device 150) of the pose determination system 100. In some embodiments, the server 130 may be directly connected to or communicate with one or more components (e.g., the moving device 110, the capture device 120, the storage device 150) of the pose determination system 100. In some embodiment, the server 130 may be unnecessary and all or part of the functions of the server 130 may be implemented by other components (e.g., the moving device 110) of the pose determination system 100. For example, the processing device 132 may be integrated into the moving device 110 and the functions of the processing device 132 112 may be implemented by the moving device 110.

The network 140 may facilitate exchange of information and/or data. In some embodiments, one or more components (e.g., the moving device 110, the capture device 120, the server 130, the storage device 150) of the pose determination system 100 may transmit information and/or data to other component (s) of the pose determination system 100 via the network 140. For example, the server 130 may obtain a feature database from the storage device 150 via the network 140. As another example, the server 130 may obtain at least one local map from the moving device 110 or the storage device 150 via the network 140. In some embodiments, the network 140 may be any type of wired or wireless network, or combination thereof. Merely by way of example, the network 140 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, an Internet, a local area network (LAN) , a wide area network (WAN) , a wireless local area network (WLAN) , a metropolitan area network (MAN) , a public telephone switched network (PSTN) , a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof. In some embodiments, the network 140 may include one or more network access points. For example, the network 140 may include wired or wireless network access points (e.g., a point 140-1, a point 140-2) , through which one or more components of the pose determination system 100 may be connected to the network 140 to exchange data and/or information.

The storage device 150 may store data and/or instructions. In some embodiments, the storage device 150 may store data obtained from the moving device 110, the capture device 120, the server 130, or an external storage device. For example, the storage device 150 may store a target initial pose of the moving device 110 determined by the server 130. As another example, the storage device 150 may store a target pose of the moving device 110 determined by the server 130. As a further example, the storage device 150 may store a feature database which can be used to determine a pose of the moving device 110. As a still a further example, the storage device 150 may store at least one local map associated with the moving device 110. In some embodiments, the storage device 150 may store data and/or instructions that the processing device 132 may execute or use to perform exemplary methods described in the present disclosure. For example, the storage device 150 may store instructions that the processing device 132 may execute or use to determine a target initial pose of the moving device 110 based on a plurality of candidate poses of the moving device 110 and laser data acquired by the moving device 110. As another example, the storage device 150 may store instructions that the processing device 132 may execute or use to determine a first pose of a moving device based at least in part on odometry data or laser data acquired by the moving device, obtain at least one local map associated with the moving device 110, determine a second pose of the moving device 110 based on the at least one local map and the laser data acquired by the moving device, and determine a target pose of the moving device based on the first pose and the second pose, for example, using a pose adjustment algorithm.

In some embodiments, the storage device 150 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM) , or the like, or any combination thereof. Exemplary mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc. Exemplary removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc. Exemplary volatile read-and-write memory may include a random access memory (RAM) . Exemplary RAM may include a dynamic RAM (DRAM) , a double date rate synchronous dynamic RAM (DDR SDRAM) , a static RAM (SRAM) , a thyristor RAM (T-RAM) , and a zero-capacitor RAM (Z-RAM) , etc. Exemplary ROM may include a mask ROM (MROM) , a programmable ROM (PROM) , an erasable programmable ROM (EPROM) , an electrically-erasable programmable ROM (EEPROM) , a compact disk ROM (CD-ROM) , and a digital versatile disk ROM, etc. In some embodiments, the storage device 150 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.

In some embodiments, the storage device 150 may be connected to the network 140 to communicate with one or more components (e.g., the moving device 110, the capture device 120, the server 130) of the pose determination system 100. One or more components of the pose determination system 100 may access the data or instructions stored in the storage device 150 via the network 140. In some embodiments, the storage device 150 may be directly connected to or communicate with one or more components (e.g., the moving device 110, the capture device 120, the server 130) of the pose determination system 100. In some embodiments, the storage device 150 may be part of the server 130. For example, the storage device 150 may be integrated into the server 130.

It should be noted that the pose determination system 100 is merely provided for the purposes of illustration, and is not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, multiple variations or modifications may be made under the teachings of the present disclosure. However, those variations and modifications do not depart from the scope of the present disclosure. For example, the pose determination system 100 may also include a user device (not shown) configured to receive information and/or data from the moving device 110, the capture device 120, the server 130, and/or the storage device 150. The user device may provide a user interface via which a user may view information (image data) and/or input data (e.g., an initial pose, at least one local map) and/or instructions to the pose determination system 100.

FIG. 2 is a schematic diagram illustrating exemplary hardware and/or software components of an exemplary computing device according to some embodiments of the present disclosure. The computing device 200 may be used to implement any component of the pose determination system 100 as described herein. For example, the processing device 132 may be implemented on the computing device 200, via its hardware, software program, firmware, or a combination thereof. Although only one such computer is shown, for convenience, the computer functions relating to pose determination as described herein may be implemented in a distributed fashion on a number of similar platforms to distribute the processing load.

The computing device 200, for example, may include COM ports 250 connected to and from a network connected thereto to facilitate data communications. The computing device 200 may also include a processor (e.g., a processor 220) , in the form of one or more processors (e.g., logic circuits) , for executing program instructions. For example, the processor 220 may include interface circuits and processing circuits therein. The interface circuits may be configured to receive electronic signals from a bus 210, wherein the electronic signals encode structured data and/or instructions for the processing circuits to process. The processing circuits may conduct logic calculations, and then determine a conclusion, a result, and/or an instruction encoded as electronic signals. Then the interface circuits may send out the electronic signals from the processing circuits via the bus 210.

The computing device 200 may further include one or more storages configured to store various data files (e.g., program instructions) to be processed and/or transmitted by the computing device 200. In some embodiments, the one or more storages may include a high speed random access memory (not shown) , a non-volatile memory (e.g., a magnetic storage device, a flash memory, or other non-volatile solid state memories) (not shown) , a disk 270, a read-only memory (ROM) 230, a random-access memory (RAM) 240, or the like, or any combination thereof. In some embodiments, the one or more storages may further include a remote storage corresponding to the processor 220. The remote storage may connect to the computing device 200 via the network 140. The computing device 200 may also include program instructions stored in the one or more storages (e.g., the ROM 230, RAM 240, and/or another type of non-transitory storage medium) to be executed by the processor 220. The methods and/or processes of the present disclosure may be implemented as the program instructions. The computing device 200 may also include an I/O component 260, supporting input/output between the computing device 200 and other components. The computing device 200 may also receive programming and data via network communications.

Merely for illustration, only one processor is illustrated in FIG. 2. Multiple processors 220 are also contemplated; thus, operations and/or method steps performed by one processor 220 as described in the present disclosure may also be jointly or separately performed by the multiple processors. For example, if in the present disclosure the processor 220 of the computing device 200 executes both operation A and operation B, it should be understood that operation A and operation B may also be performed by two different processors 220 jointly or separately in the computing device 200 (e.g., a first processor executes operation A and a second processor executes operation B, or the first and second processors jointly execute operations A and B) .

FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of an exemplary mobile device according to some embodiments of the present disclosure. In some embodiments, the server 130 (e.g., the processing device 132) or the user device may be implemented on the mobile device 300.

As illustrated in FIG. 3, the mobile device 300 may include a communication platform 310, a display 320, a graphics processing unit (GPU) 330, a central processing unit (CPU) 340, an I/O 350, a memory 360, a mobile operating system (OS) 370, and a storage 390. In some embodiments, any other suitable components, including but not limited to a system bus or a controller (not shown) , may also be in the mobile device 300.

In some embodiments, the mobile operating system 370 (e.g., iOS ^TM, Android ^TM, Windows Phone ^TM) and one or more applications 380 may be loaded into the memory 360 from the storage 390 in order to be executed by the CPU 340. The applications 380 may include a browser or any other suitable mobile apps for receiving and rendering information relating to pose determination or other information from the pose determination system 100. User interactions with the information stream may be achieved via the I/O 350 and provided to the processing device 132 and/or other components of the pose determination system 100 via the network 140.

FIG. 4 is a block diagram illustrating an exemplary processing device 132 according to some embodiments of the present disclosure. The processing device 132 may include an initial pose obtaining module 410, a candidate pose determination module 420, and a target initial pose determination module 430.

The initial pose obtaining module 410 may be configured to obtain and/or determine an initial pose of a moving device. In some embodiments, the initial pose obtaining module 410 may determine feature information of an image acquired by the moving device and generate a matching result by matching the feature information of the image and reference feature information of a plurality of reference images. The reference feature information may be stored in a feature database. The initial pose obtaining module 410 may obtain the initial pose of the moving device based on the matching result.

In some embodiments, the initial pose obtaining module 410 may determine a plurality of similarities between the feature information of the image and the reference feature information of the plurality of reference images. The initial pose obtaining module 410 may identify, from the plurality of similarities, a similarity exceeding a similarity threshold. The initial pose obtaining module 410 may determine the initial pose of the moving device based on a reference pose or reference feature information corresponding to a reference image with the similarity exceeding the similarity threshold.

In some embodiments, in response to determining that all the plurality of similarities are less than or equal to the similarity threshold, the initial pose obtaining module 410 may obtain a second image by moving the moving device and determine second feature information of the second image. The initial pose obtaining module 410 may also determine a plurality of second similarities between the second feature information of the second image and the reference feature information of the plurality of reference images and identify, from the plurality of second similarities, a second similarity exceeding the similarity threshold. The initial pose obtaining module 410 may determine the initial pose of the moving device based on a second reference pose or second reference feature information corresponding to a second reference image with the second similarity exceeding the similarity threshold.

In some embodiments, in response to determining that all of the plurality of second similarities are less than or equal to the similarity threshold, the initial pose obtaining module 410 may determine the initial pose of the moving device based on one or more previous poses of the moving device and odometry data of the moving device. More descriptions regarding the initial pose may be found elsewhere in the present disclosure, for example, operation 510 and the descriptions thereof.

The candidate pose determination module 420 may be configured to determine a plurality of candidate poses of the moving device based on the initial pose of the moving device according to at least one map associated with the moving device. In some embodiments, the at least one map may include at least two maps with different resolutions associated with the moving device. The candidate pose determination module 420 may determine the plurality of candidate poses of the moving device by rotating or translating the initial pose within a predetermined range on the at least two maps. More descriptions regarding the candidate poses may be found elsewhere in the present disclosure, for example, operation 520, FIG. 7, and the descriptions thereof.

The target initial pose determination module 430 may be configured to determine a target initial pose of the moving device based on the plurality of candidate poses and laser data acquired by the moving device. In some embodiments, the target initial pose determination module 430 may determine, using a branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device.

In some embodiments, for each of the at least two maps, the target initial pose determination module 430may determine one or more modified maps by down-sampling the map. The target initial pose determination module 430 may also determine, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps, modified maps corresponding to the at least maps, and the plurality of candidate poses of the moving device.

In some embodiments, the at least two maps may include a first map with a first resolution and a second map with a second resolution. The target initial pose determination module 430 may determine, based on the first map with the first resolution and the initial pose of the moving device, a plurality of first candidate poses of the moving device among the plurality of candidate poses of the moving device. The target initial pose determination module 430 may determine, using the branch and bound algorithm, a first target initial pose of the moving device based on the plurality of first candidate poses. The target initial pose determination module 430 may determine, based on the second map with the second resolution and the first target initial pose of the moving device, a plurality of second candidate poses of the moving device among the plurality of candidate poses of the moving device. The target initial pose determination module 430 may determine, using the branch and bound algorithm, a second target initial pose of the moving device based on the plurality of second candidate poses of the moving device. The target initial pose determination module 430 may designate the second target initial pose of the moving device as the target initial pose of the moving device. More descriptions regarding the initial pose may be found elsewhere in the present disclosure, for example, operation 530, FIG. 7, and the descriptions thereof.

In some embodiments, the processing device 132 may also include a feature database generation module (non shown in FIG. 4) . The feature database generation module may be configured to generate the feature database. The feature database generation module may obtain a reference map and a plurality of reference poses of a reference moving device based on the reference map. Two adjacent reference poses of the plurality of reference poses may satisfy a preset condition. The feature database generation module may determine a plurality of images acquired by the reference moving device from the plurality of reference poses as the plurality of reference images. For each of the plurality of reference images, the feature database generation module may extract and store the reference feature information of the reference image. The reference feature information may include a reference feature point, a reference representation of the reference feature point, a reference coordinate of the reference feature point, or the like, or any combination thereof.

In some embodiments, the feature database generation module may determine a matching result of the laser data acquired by the moving device and a map associated with the moving device. In response to determining that the matching result satisfies a preset condition, the feature database generation module may determine whether a highest similarity between the feature information and corresponding reference feature information in the feature database is smaller than a similarity threshold. In response to determining that the similarity is smaller than the similarity threshold, the feature database generation module may update the feature database by replacing reference feature information corresponding to the reference image with the feature information of the image.

The modules in the processing device 132 may be connected to or communicated with each other via a wired connection or a wireless connection. The wired connection may include a metal cable, an optical cable, a hybrid cable, or the like, or any combination thereof. The wireless connection may include a Local Area Network (LAN) , a Wide Area Network (WAN) , a Bluetooth, a ZigBee, a Near Field Communication (NFC) , or the like, or any combination thereof. Two or more of the modules may be combined into a single module, and any one of the modules may be divided into two or more units. For example, the candidate pose determination module 420 and the target initial pose determination module 430 may be combined as a single module which may determine the plurality of candidate poses and determine the target initial pose of the moving device. As another example, the processing device 132 may include a storage module (not shown) which may be used to store data generated by the above-mentioned modules.

FIG. 5 is a flowchart illustrating an exemplary process for determining a target initial pose of a moving device according to some embodiments of the present disclosure. In some embodiments, the process 500 may be implemented as a set of instructions (e.g., an application) stored in the storage ROM 230 or RAM 240. The processor 220 and/or the modules in FIG. 4 may execute the set of instructions, and when executing the instructions, the processor 220 and/or the modules may be configured to perform the process 500. The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 500 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order in which the operations of the process illustrated in FIG. 5 and described below is not intended to be limiting.

In 510, the processing device 132 (e.g., the initial pose obtaining module 410) (e.g., the interface circuits of the processor 220) may obtain an initial pose of a moving device (e.g., a robot, an AGV) (e.g., the moving device 110 illustrated in FIG. 1) . As used herein, a pose of a moving device may indicate a position of the moving device and/or an orientation of the moving device.

In some embodiments, the initial pose of the moving device may be set manually. In some embodiments, the initial pose of the moving device may be a default value of the pose determination system 100. In some embodiments, the initial pose of the moving device may be any initial pose determined based on image data, laser data, odometry data, or any combination thereof.

In some embodiments, the processing device 132 may determine the initial pose of the moving device based on an image acquired by the moving device (e.g., a camera thereof) and a feature database. The feature database may include reference feature information of a plurality of reference images, a plurality of reference poses corresponding to the plurality of reference images, etc. For example, take a specific reference image as an example, reference feature information of the reference image may include a reference feature point, a reference representation (also referred to as a “reference feature descriptor” ) (e.g., a feature vector) of the reference feature point, a reference coordinate (e.g., a two-dimensional coordinate in a coordinate system of a reference camera of a reference moving device, a three-dimensional coordinate in a reference coordinate system (e.g., the world coordinate system) ) of the reference feature point, or the like, or any combination thereof. More descriptions regarding the feature database may be found elsewhere in the present disclosure (e.g., FIG. 6 and the descriptions thereof) .

In some embodiments, the processing device 132 may determine feature information of the image acquired by the moving device. For example, similarly, the feature information of the image may include a feature point of the image, a representation (also referred to as a “feature descriptor” ) (e.g., a feature vector) of the feature point, a coordinate (e.g., a coordinate in a coordinate system of the camera of the moving device) of the feature point, or the like, or any combination thereof.

Then the processing device 132 may generate a matching result associated with the image with respect to the feature database based on the feature information. Specifically, the processing device 132 may determine a plurality of similarities (e.g., similarities between the feature points (or corresponding representations) in the image and the reference feature points (or corresponding representations) in the reference images) between the image (or the feature information of the image) and the plurality of reference images (or the reference information of the plurality of reference images) , for example, using a loop closure detection (LCD) algorithm (e.g., a distributed bag of word3 (DBOW3) algorithm) , etc.

In some embodiments, the processing device 132 may generate the matching result by matching the feature information of the image and the reference feature information of the plurality of reference images. In this situation, the feature database only stores the reference feature information without the actual reference images, which can save storage capacity and improve processing efficiency. It should be noted that the processing device 132 also can generate the matching result by directly matching the image and the plurality of reference images based on the feature information of the image and the reference feature information of the plurality of reference images. That is, the feature database also can store the actual reference images.

Further, the processing device 132 may obtain the initial pose of the moving device based on the matching result. Specifically, the processing device 132 may identify a similarity exceeding a similarity threshold from the plurality of similarities. Furthermore, the processing device 132 may determine the initial pose of the moving device based on a reference pose or reference feature information corresponding to a reference image with the similarity exceeding the similarity threshold.

In some embodiments, the processing device 132 may directly designate the reference pose corresponding to the reference image with the similarity exceeding the similarity threshold as the initial pose of the moving device. In some embodiments, the processing device 132 may identify at least one reference feature point in the reference image (with the similarity exceeding the similarity threshold) that matches at least one feature point of the image. As used herein, “match” refers to that a feature point and a corresponding reference feature point corresponds to a same physical point in a region where the moving device is located. Further, the processing device 132 may determine the initial pose based on reference coordinate (s) of the at least one reference feature point. For example, the processing device 132 may determine the initial pose based on reference coordinate (s) of the at least one reference feature point and coordinate (s) of the at least one feature point according to a perspective-n-point (PNP) algorithm (see details in formula (2) later) . Merely by way of example, the reference coordinate (s) of the at least one reference feature point may be 3D coordinate (s) in the world coordinate system, the coordinate (s) of the at least one feature point may be 2D coordinate (s) in the camera coordinate system, since the at least one reference feature point and the at least one feature point correspond to same physical point (s) , the at least one feature point can be understood as 2D projection point (s) of the at least one reference feature point according to the initial pose of the moving device when the image including the at least one feature point is acquired. Accordingly, the initial pose can be determined based on the reference coordinate (s) (3D coordinate (s) ) of the at least one reference feature point and the coordinate (s) (2D coordinate (s) ) of the at least one feature point.

In some embodiments, in response to determining that all the plurality of similarities are less than or equal to the similarity threshold, the processing device 132 may obtain a second image by moving the moving device. For example, the moving device may rotate by an angle (e.g., 10 degrees, 20 degrees, 30 degrees, 45 degrees, 60 degrees, 90 degrees, 180 degrees, 360 degrees) or translate by a distance smaller than a distance threshold (e.g., 0.2 meter, 0.5 meters, 1 meter) without colliding with other objects (e.g., another moving device, an obstacle, a shelf, goods) . Similar to the first image, the processing device 132 may determine second feature information of the second image and determine a plurality of second similarities between the second feature information of the second image and the reference information of the plurality of reference images. In some embodiments, the processing device 132 may identify a second similarity exceeding the similarity threshold from the plurality of second similarities and determine the initial pose of the moving device based on a second reference pose or reference feature information corresponding to a second reference image with the second similarity exceeding the similarity threshold.

In some embodiments, in response to determining that all of the plurality of second similarities are less than or equal to the similarity threshold, the processing device 132 may determine the initial pose of the moving device based on one or more previous poses of the moving device and odometry data (which is described above in operation 510) of the moving device. For example, the processing device 132 may determine the initial pose of the moving device using a motion model represented by Formula (1) below:

where [x ₁ y ₁ θ ₁] ^T refers to the initial pose of the moving device, [x y θ] ^T refers to one of the one or more previous pose or an average of the one or more previous poses, [dx dy dθ] ^T refers to a difference between the odometry data at the current time point and odometry data at a previous time point corresponding to the previous pose or a difference between the odometry data at the current time point and odometry data at an average previous time point corresponding to the one or more previous poses.

In some embodiments, the feature database may be updated according to a predetermined time interval (e.g., per 1 minute, per 2 minutes, per 5 minutes, per month, per two months) , a predetermined distance interval (e.g., per 50 centimeters, per 1 meter, per 2 meters, per 5 meters) , or a matching result between laser data acquired by the moving device and a map (e.g., a predetermined occupancy grid map of a region where the moving device is located, see details in FIG. 13) associated with the moving device. Specifically, the processing device 132 may determine a matching result (e.g., a matching score) of the laser data and the map associated with the moving device. The larger the matching score is, the better the laser data may match the map. In response to determining that the matching result satisfies a preset condition (e.g., the matching score is smaller than a score threshold (e.g., 35, 40, 45, 50) ) , the processing device 132 may determine whether a highest similarity between the feature information and corresponding reference feature information in the feature database is smaller than a similarity threshold. In response to determining that the highest similarity is smaller than the similarity threshold, the processing device 132 may update the feature database by replacing reference feature information corresponding to the reference image with the feature information of the image. In some embodiments, the processing device 132 may determine at least one similarity between the image and at least one reference image with reference pose (s) close to the initial pose. In response to determining that the at least one similarity is smaller than the similarity threshold, the processing device 132 may update the feature database by replacing reference feature information corresponding to the at least one reference image with the feature information of the image.

In 520, the processing device 132 (e.g., the candidate pose determination module 420) (e.g., the processing circuits of the processor 220) may determine a plurality of candidate poses of the moving device based on the initial pose of the moving device according to at least one map associated with the moving device. In some embodiments, the at least one map may be determined using, for example, a simultaneous localization and mapping (SLAM) algorithm based on laser data of a region where the moving device is located) .

In some embodiments, the processing device 132 may determine the plurality of candidate poses of the moving device by rotating and/or translating the initial pose within a predetermined range on the at least one map. In some embodiments, the processing device 132 may rotate the initial pose by a preset angle (e.g., a range from -10 degrees to 10 degrees, a range from -5 degrees to 5 degrees, a range from -1 degree to 1 degree, 30 degrees, 45 degrees, 60 degrees) on the at least one map. In some embodiments, the processing device 132 may translate the initial pose by a preset step length within the predetermined range (e.g., a range with the specific candidate pose as a center) on the at least one map.

In some embodiments, the processing device 132 may first determine intermediate poses of the moving device by rotating the initial pose on the at least one map and then determine the plurality of candidate poses by translating the intermediate poses on the at least one map. In some embodiments, the processing device 132 may determine the plurality of candidate poses by only rotating the initial pose on the at least one map. In some embodiments, the processing device 132 may determine the plurality of candidate poses by only translating the initial pose on the at least one map.

In some embodiments, the at least one map may include at least two maps with different resolutions associated with the moving device. The different resolutions may be default settings of the pose determination system 100 or may be adjustable under different situations. For example, the at least two maps may include a map with a resolution of 8 centimeters and a map with a resolution of 2 centimeters.

In some embodiments, for each of the at least two maps, the processing device 132 may generate one or more modified maps with one or more modified resolutions by down-sampling the map. The processing device 132 may also determine candidate poses by rotating and/or translating the initial pose within a predetermined range on the modified maps. Take a specific modified map as an example, the processing device 132 may first determine intermediate poses of the moving device by rotating the initial pose on the modified map and then determine candidate poses by translating the intermediate poses on the modified map. More descriptions regarding the candidate poses may be found elsewhere in the present disclosure (e.g., FIG. 7 and the descriptions thereof) .

In 530, the processing device 132 (e.g., the target initial pose determination module 430) (e.g., the processing circuits of the processor 220) may determine a target initial pose of the moving device based on the plurality of candidate poses and laser data acquired by the moving device.

In some embodiments, as described above, the processing device 132 may determine the target initial pose of the moving device based on the at least two maps (and/or the modified maps) and the plurality of candidate poses of the moving device. In some embodiments, for each of at least a portion of the plurality of candidate poses, the processing device 132 may determine a score corresponding to the candidate pose based on a matching result of the laser data and a map (or a modified map) corresponding to the candidate pose. Take a specific candidate pose and a corresponding map (or a modified map) as an example, the processing device 132 may project the laser data onto the map (or the modified map) based on the candidate pose and determine a score of the candidate pose based on occupancy rates (which indicate probabilities of being occupied by obstacles) of grids where the laser data is projected on the map (or the modified map) . In some embodiments, the higher the score of the candidate pose is, the better the candidate pose matches the corresponding map (or the modified map) may be, accordingly, the more accurate the candidate pose may be. Further, the processing device 132 may determine the target initial pose of the moving device based on the scores of the candidate poses. For example, the processing device 132 may designate a candidate pose having a highest score as the target initial pose of the moving device.

In some embodiments, the identification of the candidate pose having the highest score can be understood as a “searching process, ” accordingly, in order to accelerate the searching process, the processing device 132 may determine, using a branch and bound (BB) algorithm, the target initial pose of the moving device based on the at least two maps with different resolutions and the plurality of candidate poses of the moving device. Take the at least two maps including a first map with a first resolution and a second map with a second resolution (which is different from the first resolution) as an example, the processing device 132 may determine a plurality of first candidate poses of the moving device among the plurality of candidate poses of the moving device based on the first map with the first resolution and the initial pose of the moving device. The processing device 132 may determine a first target initial pose of the moving device based on the plurality of first candidate poses using the BB algorithm. Then processing device 132 may determine a plurality of second candidate poses of the moving device among the plurality of candidate poses of the moving device based on the second map with the second resolution and the first target initial pose of the moving device. The processing device 132 may determine a second target initial pose of the moving device based on the plurality of second candidate poses of the moving device using the BB algorithm. Further, the processing device 132 may designate the second target initial pose of the moving device as the target initial pose of the moving device. More descriptions regarding the target initial pose may be found elsewhere in the present disclosure (e.g., FIG. 7 and the descriptions thereof) .

According to some embodiments of the present disclosure, at least two maps (e.g., a first map with a relatively Iow resolution and a second map with a relatively high resolution) with different resolutions are used to execute the searching process using the BB algorithm, for example, an intermediate searching process is executed using the first map with the relatively Iow resolution and then an intermediate searching process is executed using the second map with the relatively high resolution based on an intermediate search result of the first map, accordingly, a searching range in the second map with the relatively high resolution can be reduced, thereby improving searching speed and accordingly improving the accuracy and efficiency of the pose determination.

It should be noted that the above description is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, multiple variations or modifications may be made under the teachings of the present disclosure. However, those variations and modifications do not depart from the scope of the present disclosure. For example, one or more other optional operations (e.g., a storing operation) may be added elsewhere in the process 500. In the storing operation, the processing device 132 may store information and/or data (e.g., the initial pose of the moving device, the at least two maps with different resolutions) associated with the target initial pose of the moving device in a storage device (e.g., the storage device 150, the ROM 230, the RAM 240, and/or the storage 390) disclosed elsewhere in the present disclosure.

FIG. 6 is a flowchart illustrating an exemplary process for generating a feature database according to some embodiments of the present disclosure. In some embodiments, the process 600 may be implemented as a set of instructions (e.g., an application) stored in the storage ROM 230 or RAM 240. The processor 220 and/or the modules in FIG. 4 may execute the set of instructions, and when executing the instructions, the processor 220 and/or the modules may be configured to perform the process 600. The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 600 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order in which the operations of the process illustrated in FIG. 6 and described below is not intended to be limiting.

In 610, the processing device 132 (e.g., the feature database generation module) (e.g., the interface circuits of the processor 220) may obtain a reference map.

In some embodiments, the reference map may be an occupancy grid map (see details in FIG. 13) which includes a plurality of grids each of which corresponds to an occupancy rate. The occupancy rate may indicate a probability that a corresponding grid is occupied by an obstacle. In some embodiments, the reference map may be predetermined based on reference laser data of a reference region using, for example, a simultaneous localization and mapping (SLAM) algorithm. In some embodiments, the reference laser data may be acquired by the moving device or by other devices which can acquire laser data. In some embodiments, the processing device 132 may obtain the reference map from a storage device (e.g., the storage device 150, an external storage device) that stores the reference map.

In 620, the processing device 132 (e.g., the feature database generation module) (e.g., the processing circuits of the processor 220) may obtain a plurality of reference poses of a reference moving device (e.g., any moving device similar to or different from the moving device 110) based on the reference map.

In some embodiments, the processing device 132 may obtain reference laser data by moving the reference moving device along a plurality of preset routes in the reference region. Then the processing device 132 may determine the plurality of reference poses based on the reference laser data, for example, using a scan to map algorithm.

In some embodiments, two adjacent reference poses of the plurality of reference poses satisfy a preset condition. The preset condition may include that a time difference between time points corresponding to the two adjacent poses respectively exceeds a time threshold, a difference (e.g., a translation difference, a rotation difference) between the two adjacent poses of the reference moving device exceeds a difference threshold, etc. In some embodiments, the time threshold and/or the difference threshold may be default settings of the pose determination system 100 or may be adjustable under different situations.

In 630, the processing device 132 (e.g., the feature database generation module) (e.g., the processing circuits of the processor 220) may determine a plurality of images acquired by the reference moving device (e.g., a reference camera thereof) from the plurality of reference poses as the plurality of reference images.

In 640, for each of the plurality of reference images, the processing device 132 (e.g., the feature database generation module) (e.g., the processing circuits of the processor 220) may extract and store reference feature information of the reference image.

As described in connection with FIG. 5, the reference feature information of the reference image may include a reference feature point, a reference representation of the reference feature point, a reference coordinate (e.g., a 2D coordinate, a 3D coordinate) of the reference feature point, or the like, or any combination thereof. In some embodiments, the processing device 132 may extract the reference feature point and the corresponding reference representation using a Harris algorithm, a scale-invariant feature transform (SIFT) algorithm, a feature from accelerated segment test (FAST) algorithm, etc.

In some embodiments, the reference coordinate of the reference feature point may include a coordinate (called as “first coordinate” for convenience) (e.g., a 2D coordinate) in a coordinate system of the reference camera, a coordinate (called as “second coordinate” for convenience) (e.g., a 3D coordinate) in the world coordinate system, etc. The processing device 132 may determine the first coordinate of the reference feature point based on a reference position of the reference feature point in the reference image and the coordinate system of the reference camera. The processing device 132 may determine the second coordinate of the reference feature point based on the first coordinate using a camera perspective projection model. For example, the processing device 132 may determine the second coordinate according to formula (2) below:

where T refers to a reference pose corresponding to the reference image, p refers to the first coordinate of the reference feature point, P ^W refers to the second coordinate of the reference feature point, and K refers to an internal parameter of the reference camera.

In some embodiments, for two adjacent reference poses (or two adjacent reference images) , the processing device 132 may determine whether there are matched reference feature points ( “matched” refers to that the reference feature points correspond to a same physical point in real-world) . In response to determining that there are matched reference feature points, the processing device 132 may determine a coordinate (i.e., second coordinate) of a latter reference feature point (i.e., reference feature point in the latter of the two adjacent reference images) in the world coordinate system as a coordinate (i.e., second coordinate) of a former reference feature point (i.e., reference feature point in the former of the two adjacent reference images) in the world coordinate system. In response to determining that there are no matched reference feature points, the processing device 132 may determine a coordinate (i.e., the second coordinate) of a latter reference feature point (i.e., reference feature point in the latter of the two adjacent reference images) in the world coordinate system according to formula (2) above.

In some embodiments, the processing device 132 may identify the matched reference feature points based on reference representations of reference feature points in the two adjacent reference images. For example, the processing device 132 may identify the matched reference feature points using a polar geometric constraint according to formula (3) below:

x ₀Ex ₁ = 0 (3)

where x ₀ refers to a normalized coordinate of a reference feature point in the former of the two adjacent reference images, x ₁ refers to a normalized coordinate of a reference feature point in the latter of the two adjacent reference images, and E refers to an essential matrix.

It should be noted that the above description is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, multiple variations or modifications may be made under the teachings of the present disclosure. However, those variations and modifications do not depart from the scope of the present disclosure.

FIG. 7 is a flowchart illustrating an exemplary process for determining a target initial pose of a moving device according to some embodiments of the present disclosure. FIGs. 8A-8C are schematic diagrams illustrating an exemplary process for down-sampling a map according to some embodiments of the present disclosure. FIG. 9 is a schematic diagram illustrating a branch and bound algorithm according to some embodiments of the present disclosure. In some embodiments, the process 700 may be implemented as a set of instructions (e.g., an application) stored in the storage ROM 230 or RAM 240. The processor 220 and/or the modules in FIG. 4 may execute the set of instructions, and when executing the instructions, the processor 220 and/or the modules may be configured to perform the process 700. The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 700 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order in which the operations of the process illustrated in FIG. 7 and described below is not intended to be limiting.

In 710, for each of at least two maps (e.g., the at least two maps illustrated in FIG. 5) , the processing device 132 (e.g., the target initial determination module 430) (e.g., the processing circuits of the processor 220) may determine one or more modified maps by down-sampling the map.

In some embodiments, the processing device 132 may generate the one or more modified maps with one or more modified resolutions by down-sampling the map at one or more sampling rates. The larger the sampling rate is, the larger a ratio of a modified resolution to the resolution (i.e., the original resolution of the map) may be. For example, the sampling rate may be equal to the ratio of the modified resolution to the original resolution.

In some embodiments, the at least two maps may be occupancy grid maps. The occupancy grid map may include a plurality of grids each of which corresponds to an occupancy rate. The occupancy rate may indicate a probability that a corresponding grid is occupied by an obstacle. In some embodiments, for each of the at least two maps, a count of grids in the map is the same as a count of modified grids in the modified map (s) . In some embodiments, take a specific modified map as an example, modified occupancy rates corresponding to the modified grids may be determined by sliding a window (e.g., a window including multiple grids of which the count may be equal to the sampling rate) associated with the sampling rate on the map, for example, along a left-to-right direction, a top-to-bottom direction of the map, etc. Further, modified occupancy rates corresponding to the modified grids within the window on the modified map may be designated as a largest occupancy rate of grids within the window on the original map. For example, as shown in FIGs. 8A-8C, a modified map 820 is generated by down-sampling a map 810 along a left-to-right direction of the map 810 at a sampling rate of 2; a modified map 830 is generated by down-sampling a map 810 along a top-to-bottom direction of the map 810 at a sampling rate of 2.

In some embodiments, for each of the at least two maps, the count of the one or more modified maps and/or the sampling rates may be default settings of the pose determination system 100 or may be adjustable under different situations. In some embodiments, the count of the one or more modified maps and/or the sampling rates may be associated with a size of the original map. For example, the smaller the size of the original map is, the smaller the count of the one or more modified maps may be. In some embodiments, the one or more modified resolutions of the one or more modified maps may be smaller than or equal to a resolution threshold, for example, 1.6 meters, 2 meters, 2.5 meters, 3 meters, etc. For example, take a map with a resolution of 10 centimeters and the resolution threshold being of 1.6 meters as an example, one or more modified maps corresponding to the map with the resolution of 10 centimeters may include a modified map with a modified resolution of 20 centimeters, a modified map with a modified resolution of 40 centimeters, a modified map with a modified resolution of 80 centimeters, and a modified map with a modified resolution of 160 centimeters.

In some embodiments, for the at least two maps, the counts of modified map (s) may be the same or different. Take the at least two maps including a first map with a resolution of 8 centimeters and a second map with a resolution of 2 centimeters as an example, for the first map, four modified maps may be generated, for example, a modified map with a resolution of 16 centimeters, a modified map with a resolution of 32 centimeters, a modified map with a resolution of 64 centimeters, and a modified map with a resolution of 128 centimeters; for the second map, two modified maps may be generated, for example, a modified map with a resolution of 4 centimeters and a modified map with a resolution of 8 centimeters.

In 720, the processing device 132 (e.g., the target initial determination module 430) (e.g., the processing circuits of the processor 220) may determine, using a branch and bound (BB) algorithm, a target initial pose of the moving device based on the at least two maps, modified maps corresponding to the at least maps, and a plurality of candidate poses of the moving device.

As described in connection with

operations

520 and 530, the processing device 132 may determine the plurality of candidate poses of the moving device by rotating and/or translating an initial pose (e.g., the initial pose determined in operation 510) of the moving device within a predetermined range (e.g., a rectangle of 2 meters ^＊2 meters, a rectangle of 0.2 meters ^＊0.2 meters) on the at least two maps.

In some embodiments, take a specific map as an example, the processing device 132 may first determine intermediate poses of the moving device by rotating the initial pose on the map and then determine candidate poses by translating the intermediate poses on the map. In some embodiments, the processing device 132 may determine candidate poses by only rotating the initial pose on the map. In some embodiments, the processing device 132 may determine candidate poses by only translating the initial pose on the map.

In some embodiments, the predetermined range may be a default setting of the pose determination system 100 or may be adjustable under different situations (e.g., a speed requirement for determining the target initial pose, an accuracy requirement for determining the target initial pose) . In some embodiments, the predetermined range may be different for different resolutions of the at least two maps. Merely by way of example, a predetermined range of a map with a resolution of 8 centimeters may be a rectangle of 2 meters ^＊2 meters and a predetermined range of a map with a resolution of 2 centimeters may be a rectangle of 0.2 meters ^＊0.2 meters.

As described in connection with operation 530, the processing device 132 may determine a plurality of scores corresponding to at least a portion of the plurality of candidate poses and designate a candidate pose having a highest score as the target initial pose of the moving device.

In some embodiments, the at least two maps may include a first map with a first resolution, a second map with a second resolution, ..., a jth map with a jth resolution, ..., and an mth map with an mth resolution, wherein m is an integer larger than 1 and a jth resolution is lower than a (j+1) th resolution.

In some embodiments, the processing device 132 may determine a plurality of first candidate pose of the moving device among the plurality of candidate poses based on the first map (and corresponding modified map (s) ) and the initial pose of the moving device. The processing device 132 may determine a first target initial pose of the moving device based on the plurality of first candidate poses using the BB algorithm. Iteratively, the processing device 132 may determine a plurality of jth candidate poses of the moving device among the plurality of candidate poses based on the jth map (and corresponding modified map (s) ) and a (j-1) th target initial pose of the moving device. The processing device 132 may determine a jth target initial pose of the moving device based on the plurality of jth candidate poses using the BB algorithm. Further, similarly, the processing device 132 may determine an mth target initial pose of the moving device based on a plurality of mth candidate poses using the BB algorithm. Then the processing device 132 may designate the mth target initial pose of the moving device as the target initial pose of the moving device.

In some embodiments, take a specific map and one or more corresponding modified maps (we can assume that the corresponding modified maps are determined by down-sampling the specific map along a left-to-right direction and a top-to-bottom direction both at a sampling rate of 2) as an example, the map and the modified map (s) may be ordered as a first intermediate map (i.e, the specific map) , a second intermediate map, ..., an ith intermediate map, ..., an nth intermediate map, wherein n is an integer and a resolution of the ith intermediate map is higher than a resolution of a (i+1) th intermediate map. The processing device 132 may classify candidate poses determined based on the specific map and the modified map (s) into a plurality of groups, for example, a first group, a second group, ..., an ith group, ..., an nth group corresponding to the first intermediate map (i.e., the specific map) , the second intermediate map, ..., the ith intermediate map, ..., the nth intermediate map respectively. The first group may include candidate poses (e.g., pose A1, A2, A3, A4, A5, A6, A7, A8 illustrated in FIG. 9) determined based on the initial pose and the first intermediate map. The second group may include candidate poses (e.g., pose B1, B2, B3, B4, B5 illustrated in FIG. 9) determined based on the second intermediate map and the initial pose. The ith group may include candidate poses determined based on the initial pose and the ith intermediate map. The nth group may include candidate poses determined based on the initial pose and the nth intermediate map.

As described above, since the modified maps are determined by down-sampling the specific map both along a left-to-right direction and a top-to-bottom direction both at a sampling rate of 2, each pose in the ith group may correspond to four poses in the (i-1) th group. Accordingly, the pose in the ith group may be considered as a parent pose of the four poses in the (i-1) th group and the four poses in the (i-1) th group may be considered as child poses of the pose in the ith group. Take poses illustrated in FIG. 9 as an example, pose B1 in a second group is a parent pose of pose A1, A2, A3, and A4 in a first group; pose C1 in a third group is a parent pose of pose B1, B2, B3, and B4 in the second group; and pose D1 in a fourth group is a parent pose of pose C1, C2, C3, and C4 in the third group.

In some embodiments, the processing device 132 may select a pose in the nth intermediate map and execute an iteratively reciprocating process from the nth intermediate map to the first intermediate map until a pose with the highest score is identified.

For example, for a specific pose in the nth group, the processing device 132 may determine corresponding four child poses in the (n-1) group and select (e.g., randomly select) one from the four child poses in the (n-1) group. Then the processing device 132 may determine four child poses in the (n-1) group corresponding to the select one. Iteratively, the processing device 132 may determine corresponding four child poses in the (i-1) th group corresponding to a selected one in the ith group until reaches the first group (or the first intermediate map) , that is, the processing device 132 may determine corresponding four child poses in the first group corresponding to a selected one in the second group. The processing device 132 may determine four scores of the four child poses in the first group and designate a highest score thereof as a first score.

Further, the processing device 132 may return to the second group (or the second intermediate map) from the pose with the first score in the first group. For a specific pose in the second group, if a score of the pose is smaller than or equal to the first score, the processing device 132 may ignore the pose in the second group (corresponding child poses are also ignored in further process) . Take the pose D3 as an example, if the pose D3 is ignored, poses in a rectangle 910 are also ignored (i.e., corresponding scores will not be determined) . If a score of the pose is larger than the first score, the processing device 132 may further return to the first group (or the first intermediate map) and determine corresponding four child poses in the first group corresponding to the pose with the score larger than the first score. Similarly, the processing device 132 may determine four scores of the four child poses in the first group and designate a highest score thereof and repeat the above process. Finally, the processing device 132 may designate a highest score among scores corresponding to poses in the second group as a second score.

Iteratively, the processing device 132 may return to the ith group (or the ith intermediate map) from a pose with the (i-1) th score. For a specific pose in the ith group, if a score of the pose is smaller than or equal to an (i-1) th score, the processing device 132 may ignore the pose in the ith group (corresponding child poses are also ignored in further process) . If the score of the pose in the ith group is larger than the (i-1) th score, the processing device 132 may further return to the lower groups (or maps) (i.e., the first group (or the first intermediate map) , the second group (or the second intermediate map) , .. ) subsequently and determine corresponding child poses in the lower groups corresponding to the pose with the score larger than the (i-1) th score. The processing device 132 may determine scores of the child poses corresponding to the pose in the lower groups of the ith group and designate a highest score thereof and repeat the above process. Finally, the processing device 132 may designate a highest score among scores corresponding to poses in the ith group as an ith score.

Iteratively, the processing device 132 may return to the nth group (or the nth intermediate map) from the pose with an (n-1) th score in an (n-1) group. For a specific pose in the nth group, if the score of the pose is smaller than or equal to an (n-1) th score, the processing device 132 may ignore the pose in the nth group (corresponding child poses are also ignored in further process) . If a score of the pose in the nth group is larger than the (n-1) th score, the processing device 132 may further return to the lower groups (i.e., the first group (or the first intermediate map) , the second group (or the second intermediate map) , .., the (n-1) th group (or the (n-1) th intermediate map) ) subsequently and determine corresponding child poses in the lower groups corresponding to the pose with the score larger than the (n-1) th score. The processing device 132 may determine scores of the child poses corresponding to the poses in the lower groups of the nth group and designate a highest score thereof. Finally the processing device 132 may designate a pose with a highest score among scores corresponding to poses in the nth group as the first target initial pose of the moving device.

Similarly, the processing device 132 may perform the above process on the second map, .., the jth map, and the mth map to determine a second target initial pose, ..., a jth target initial pose, ..., and an mth target initial pose and designate the mth target initial pose as the final target initial pose of the moving device.

FIG. 10 is a flowchart illustrating an exemplary process for determining a target initial pose of a moving device according to some embodiments of the present disclosure. In some embodiments, the process 1000 may be implemented as a set of instructions (e.g., an application) stored in the storage ROM 230 or RAM 240. The processor 220 and/or the modules in FIG. 4 may execute the set of instructions, and when executing the instructions, the processor 220 and/or the modules may be configured to perform the process 1000. The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 1000 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order in which the operations of the process illustrated in FIG. 10 and described below is not intended to be limiting.

In 1010, the processing device 132 may obtain a first image (e.g., the image described in operation 510) acquired by a moving device (e.g., the moving device 110) when a pose of the moving device is lost.

In 1020, the processing device 132 may search reference feature information of a second image in a feature database (e.g., the feature database described in FIGs. 5 and 6) , wherein a similarity between the reference feature information of the second image and feature information of the first image is greater than or equal to a first preset threshold. Then the processing device 132 may determine an initial pose of the moving device based on a reference pose or reference feature information corresponding to the second image. More

descriptions regarding operations

1010 and 1020 may be found elsewhere in the present disclosure, for example, operation 510 and the descriptions thereof.

In 1030, the processing device 132 may rotate and/or translate the initial pose to obtain N first candidate poses (which include the initial pose) , wherein N may be an integer greater than or equal to 1. In some embodiments, the N first candidate poses may include the candidate poses described in FIGs. 5 and 7.

In 1040, the processing device 132 may determine a target initial pose of the moving device according to scores corresponding to the N first candidate poses, which may be determined based on laser data acquired by the moving device. More descriptions regarding operation 1030 may be found elsewhere in the present disclosure, for example, operation 530 or FIG. 7 and the descriptions thereof.

According to the process 1000, the target initial pose may be determined based on multiple types (e.g., the image and the laser data) of data acquired by the moving device, thereby improving the accuracy of the target initial pose.

FIGs. 11A and 11B are a flowchart illustrating a process for determining a target initial pose of a moving device according to some embodiments of the present disclosure. In some embodiments, the process 1100 may be implemented as a set of instructions (e.g., an application) stored in the storage ROM 230 or RAM 240. The processor 220 and/or the modules in FIG. 4 may execute the set of instructions, and when executing the instructions, the processor 220 and/or the modules may be configured to perform the process 1100. The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 1100 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order in which the operations of the process illustrated in FIG. 11 and described below is not intended to be limiting.

In 1101, the processing device 132 may determine a plurality of reference poses of a reference moving device (e.g., a reference camera thereof) using a SLAM algorithm. In some embodiments, the processing device 132 may determine the plurality of reference poses by matching, for example, using a scan to map algorithm, reference laser data acquired by the reference moving device and a reference map associated with a moving device (e.g., the moving device described in FIG. 10) determined based on the SLAM algorithm.

In 1102, the processing device 132 may generate a feature database by selecting a plurality of reference images acquired from the reference poses and extracting reference information (e.g., reference feature points and reference representations) of the plurality of reference images respectively. In some embodiments, reference feature information other than the reference images may be stored in the feature database, thereby saving a storage capacity of the pose determination system 100. More

descriptions regarding operations

1101 and 1102 may be found elsewhere in the present disclosure, for example, FIG. 6 and the descriptions thereof.

In 1103, when a pose of a moving device is lost, the processing device 132 may obtain an image corresponding to a current time point. The processing device 132 may also extract feature information (e.g., feature points and representations) of the image.

In 1104, the processing device 132 may determine a plurality of similarities between the feature points (or corresponding representations) in the image and the reference feature points (or corresponding representations) in the reference images using a loop closure detection (LCD) algorithm (e.g., a DBOW3 algorithm) .

In 1105, the processing device 132 may determine whether one of the plurality of similarities exceeds a similarity threshold.

In 1106, in response to determining that the similarity exceeds the similarity threshold, the processing device 132 may determine an initial pose of the moving device based on a reference pose or reference feature information corresponding to a reference image with the similarity exceeding the similarity threshold. More descriptions regarding operations 1103-1106 may be found elsewhere in the present disclosure, for example, operation 510 and the descriptions thereof.

In 1107, the processing device 132 may determine a target initial pose of the moving device based on the initial pose and laser data acquired by the moving device at the current time point. More descriptions regarding operation 1107 may be found elsewhere in the present disclosure, for example,

operations

520, 530, FIG. 7, and the descriptions thereof.

In 1108, in response to determining that all the plurality of similarities are less than or equal to the similarity threshold, the processing device 132 may move the moving device by a preset distance or rotate the moving device by a preset angle to obtain a second image. In some embodiments, the processing device 132 may determine second feature information of the second image and determine a plurality of second similarities between the second feature information of the second image and the reference feature information of the plurality of reference images. The processing device 132 may also identify, from the plurality of second similarities, a second similarity exceeding the similarity threshold. The processing device 132 may determine the initial pose of the moving device based on a second reference pose corresponding to a second reference image with the second similarity exceeding the similarity threshold.

In 1109, the processing device 132 may determine whether a matching result of the laser data acquired by the moving device and a map associated with the moving device satisfies a preset condition.

In 1110, in response to determining that the matching result satisfies the preset condition, the processing device 132 may determine whether a highest similarity between the feature information and corresponding reference feature information in the feature database is smaller than a similarity threshold.

In 1111, in response to determining that the similarity is smaller than the similarity threshold, the processing device 132 may update the feature database by replacing reference feature information corresponding to the reference image with the feature information of the image. More descriptions regarding operations 1108-1111 may be found elsewhere in the present disclosure, for example, operation 510 and the descriptions thereof.

FIG. 12 is a block diagram illustrating an exemplary processing device 132 according to some embodiments of the present disclosure. The processing device 132 may include a first pose determination module 1210, a local map obtaining module 1220, a second pose determination module 1230, and a target pose determination module 1240.

The first pose determination module 1210 may be configured to determine a first pose of a moving device based at least in part on odometry data or laser data acquired by the moving device. The first pose determination module 1210 may determine whether a first difference between the odometry data corresponding to a current time point and previous odometry data corresponding to a previous time point adjacent to the current time point exceeds a first difference threshold or a second difference between the current time point and the previous time point exceeds a second difference threshold. In response to determining that the first difference is smaller than or equal to the first difference threshold and the second difference is smaller than or equal to the second difference threshold, the first pose determination module 1210 may determine the first pose of the moving device based on the odometry data acquired by the moving device.

In some embodiments, in response to determining that the first difference exceeds the first difference threshold or the second difference exceeds the second difference threshold, the first pose determination module 1210 may determine a first candidate pose of the moving device based on the odometry data acquired by the moving device. The first pose determination module 1210 may also determine a second candidate pose of the moving device based on a portion of the laser data and a global map associated with the moving device. The portion of the laser data may correspond to long-term features in a region where the moving device is located. The first pose determination module 1210 may determine the first pose of the moving device based on the first candidate pose and the second candidate pose.

In some embodiments, the first pose determination module 1210 may determine whether at least one marker is detected based on the laser data. In response to determining that the at least one marker is detected based on the laser data, the first pose determination module 1210 may determine the first pose of the moving device based on predetermined reference information associated with the detected at least one marker.

In some embodiments, the first pose determination module 1210 may determine whether a matching result of the laser data and a global map associated with the moving device satisfies a preset condition. In response to determining that the matching result satisfies the preset condition, the first pose determination module 1210 may determine the first pose of the moving device based on the laser data and the global map associated with the moving device.

In some embodiments, the first pose determination module 1210 may obtain a target initial pose of the moving device and determine the first pose of the moving device based on the target initial pose of the moving device and the odometry data or the laser data acquired by the moving device. In some embodiments, the target initial pose of the moving device may be determined by: obtaining an initial pose of the moving device; determining a plurality of candidate poses of the moving device based on the initial pose of the moving device; and determining the pose of the moving device at the second time point based on the plurality of candidate poses and laser data acquired by the moving device at the second time point. More descriptions regarding the first pose may be found elsewhere in the present disclosure, for example, operation 1310 and the descriptions thereof.

The local map obtaining module 1220 may be configured to obtain or construct or update at least one local map associated with the moving device. In some embodiments, each of the at least one local map includes a plurality of grids and a plurality of occupancy rates corresponding to the plurality of grids respectively. The local map obtaining module 1220 may update the at least one local map associated with the moving device based at least in part on the first pose of the moving device and the laser data acquired by the moving device. For each of the at least one local map, the local map obtaining module 1220 may project the laser data onto the local map based on the first pose of the moving device and update the local map by updating the plurality of occupancy rates corresponding to the plurality of grids based on the projected laser data.

In some embodiments, the local map obtaining module 1220 may construct or update the at least one local map based on previous laser data acquired by the moving device at previous time points within a predetermined range from a current time point corresponding to the laser data. In some embodiments, the local map obtaining module 1220 may dynamically construct or update the at least one local map according to a matching result between the previous laser data and a global map associated with the moving device.

In some embodiments, the local map obtaining module 1220 may dynamically construct or release the at least one local map according to a predetermined time interval, a count of data frames included in the at least one local map, a matching result between a pose of the moving device determined based on the at least one local map and a global map associated with the moving device, or the like, or any combination thereof.

In some embodiments, the at least one local map may include a first local map, a second local map, and a third local map. The local map obtaining module 1220 may construct the second local map when a count of data frames in the first local map reaches a first predetermined count. The local map obtaining module 1220 may construct the third local map when a count of data frames in the second local map reaches the first predetermined count. The local map obtaining module 1220 may release the first local map when the count of data frames in the first local map reaches a second predetermined count or a matching result between a pose of the moving device determined based on the second local map and a global map associated with the moving device satisfies a preset condition. More descriptions regarding the at least one local map may be found elsewhere in the present disclosure, for example, operation 1320, FIGs. 14A-A4F, and the descriptions thereof.

The second pose determination module 1230 may be configured to determine a second pose of the moving device based on the at least one local map and the laser data acquired by the moving device. In some embodiments, the second pose determination module 1230 may generate a matching result by matching the laser data and the at least one local map and determine the second pose based on the matching result, for example, using a scan to map algorithm.

In some embodiments, as described above, it is assumed that at a specific time point, the at least one local map may include a first local map, a second local map, and a third local map, the second pose determination module 1230 may determine the second pose based on the laser data and the second local map using the scan to map algorithm. The second pose determination module 1230 may determine whether a matching result (e.g., a matching score) between the global map and a pose of the moving device determined based on the second local map and the laser data satisfies a preset condition (e.g., the matching score larger than a score threshold) . In response to determining that the matching result satisfies the preset condition, the second pose determination module 1230 may designate the pose determined based on the second local map and the laser data as the second pose. In response to determining that the matching result does not satisfy the preset condition, the second pose determination module 1230 may determine a pose of the moving device based on the laser data and the first local map and designate the pose of the moving device determined based on the laser data and the first local map as the second pose. More descriptions regarding the second pose may be found elsewhere in the present disclosure, for example, operation 1330 and the descriptions thereof.

The target pose determination module 1240 may be configured to determine a target pose of the moving device based on the first pose and the second pose using a pose adjustment algorithm (e.g., a sparse pose adjustment (SPA) algorithm) . In some embodiments, the second pose (which is determined based on laser data and the at least one local map) can be understood as an offset (can be referred to as a “measured offset” ) between the first pose (which is determined based on odometry data or laser data) and a pose (called as a “third pose” ) determined based on the laser data and a global map. Generally, there is a difference between the measured offset (i.e., the second pose) and an actual offset. Accordingly, the target pose determination module 1240 may determine an error function associated with the actual offset and the measured offset and minimize the error function (i.e., minimize a deviation between the actual offset and the measured offset) to optimize the first pose and the third pose. Further, the target pose determination module 1240 may designate the optimized third pose as the target pose.

In some embodiments, the target pose determination module 1240 may perform a cyclic process for optimizing the first pose and the third pose. For example, for an ith cycle, the target pose determination module 1240 may determine a value of the error function and determine a variable (e.g., an increment or decrement including a translation and/or a rotation angle) corresponding to the first pose and the third pose. More descriptions regarding the target pose may be found elsewhere in the present disclosure, for example, operation 1340, FIG. 15, and the descriptions thereof.

The modules in the processing device 132 may be connected to or communicated with each other via a wired connection or a wireless connection. The wired connection may include a metal cable, an optical cable, a hybrid cable, or the like, or any combination thereof. The wireless connection may include a Local Area Network (LAN) , a Wide Area Network (WAN) , a Bluetooth, a ZigBee, a Near Field Communication (NFC) , or the like, or any combination thereof. Two or more of the modules may be combined into a single module, and any one of the modules may be divided into two or more units. For example, the local map obtaining module 1220 and the second pose determination module 1230 may be combined as a single module which may obtain the at least one local map and determine the second pose of the moving device based on the at least one local map and the laser data acquired by the moving device. As another example, the processing device 132 may include a storage module (not shown) which may be used to store data generated by the above-mentioned modules.

FIG. 13 is a flowchart illustrating an exemplary process for determining a target pose of a moving device according to some embodiments of the present disclosure. In some embodiments, the process 1300 may be implemented as a set of instructions (e.g., an application) stored in the storage ROM 230 or RAM 240. The processor 220 and/or the modules in FIG. 12 may execute the set of instructions, and when executing the instructions, the processor 220 and/or the modules may be configured to perform the process 1300. The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 1300 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order in which the operations of the process illustrated in FIG. 13 and described below is not intended to be limiting.

In 1310, the processing device 132 (e.g., the first pose determination module 1210) (e.g., the processing circuits of the processor 220) may determine a first pose of a moving device (e.g., a robot, a UAV) (e.g., the moving device 110 illustrated in FIG. 1) based at least in part on odometry data or laser data acquired by the moving device. As used herein, a pose of a moving device may indicate a position of the moving device and/or an orientation of the moving device.

In some embodiments, the processing device 132 may determine whether a first difference between the odometry data corresponding to a current time point and previous odometry data corresponding to a previous time point adjacent to the current time point exceeds a first difference threshold (e.g., 10 centimeters, 30 centimeters, 50 centimeters, 80 centimeters) or a second difference between the current time point and the previous time point exceeds a second difference threshold (e.g., 0.05 seconds, 0.1 seconds, 0.2 seconds, 0.5 seconds, 1 second, 2 seconds) . In response to determining that the first difference is smaller than or equal to the first difference threshold and the second difference is smaller than or equal to the second difference threshold, the processing device 132 may determine the first pose of the moving device based on the odometry data acquired by the moving device. In some embodiments, the first difference threshold and/or the second difference threshold may be default settings of the pose determination system 100 or may be adjustable under different situations.

In some embodiments, the processing device 132 may determine the first pose of the moving device based on a previous pose of the moving device corresponding to the previous time point and the odometry data using a motion model represented by formula (4) below:

where [x ₂ y ₂ θ ₂] ^T refers to the first pose of the moving device, [x y θ] ^T refers to the previous pose of the moving device, and [dx dy dθ] ^T refers to a difference between the odometry data at the current time point and odometry data at a previous time point corresponding to the previous pose.

Further, in response to determining that the first difference exceeds the first difference threshold or the second difference exceeds the second difference threshold, the processing device 132 may determine the first pose of the moving device based on the odometry data, the laser data, and a global map (e.g., an occupancy grid map) associated with the moving device.

As used herein, the global map may include a plurality of grids corresponding to a first region where the moving device is located, a plurality of occupancy rates corresponding to the plurality of grids respectively, a plurality of positions of the plurality of grids, etc. The occupancy rate indicates a probability that a corresponding grid is occupied by an obstacle. For example, for a grid occupied by an obstacle, a corresponding occupancy rate may be designated as a first constant value, for example, 1; for a grid not occupied by an obstacle, a corresponding occupancy rate may be associated with (e.g., negatively correlated) a distance between the grid and a nearest obstacle. Merely by way of example, for a grid with a distance to a nearest obstacle exceeding a distance threshold (e.g., 0.3 meters, 0.4 meters, 0.5 meters) , a corresponding occupancy rate of the grid may be designated as a second constant value, for example, 0; for a grid with a distance to a nearest obstacle smaller than or equal to the distance threshold, the smaller the distance is, the larger the corresponding occupancy rate (e.g., a value in a range from 0 to 1) may be. In some embodiments, the plurality of occupancy rates may be determined using a likelihood field model and/or a Brushfire algorithm. For example, the plurality of occupancy rates may be expressed as a normal distribution with a mean of 0 and a covariance of σ.

In some embodiments, the processing device 132 may determine a first candidate pose of the moving device based on the odometry data, for example, according to formula (4) above. The processing device 132 may also determine, for example, using a scan to map algorithm, a second candidate pose of the moving device based on a portion of the laser data and the global map. In some embodiments, the laser data may correspond to long-term features in the first region and short-term features in the first region. The long-term features may correspond to features of original objects in the first region when the global map is constructed; the short-term features may correspond to features of newly added objects or newly disappeared objects in the first region at a time point when the laser data is acquired. The processing device 132 may determine the second candidate pose using laser data corresponding to the long-term features (i.e., the portion of laser data corresponds to the long-term features) .

In some embodiments, the processing device 132 may determine the portion of the laser data (i.e., the laser data corresponding to the long-term features) by projecting the laser data onto the global map. Generally, the laser data includes a plurality of groups of sub-laser data corresponding to different angles. Accordingly, after the laser data is projected onto the global map, each of the plurality of groups of sub-laser data corresponds to a projection point on the global map. For each of the plurality of groups of sub-laser data, the processing device 132 may determine a distance between a corresponding projection point and a nearest grid on the global map and determine whether the distance is less than a distance threshold. In response to determining that the distance is larger than or equal to the distance threshold, the processing device 132 may determine that the group of sub-laser data corresponds to the short-term features. In response to determining that the distance is smaller than the distance threshold, the processing device 132 may determine that the group of sub-laser data corresponds to the long-term features.

In some embodiments, take a specific group of sub-laser data as an example, the processing device 132 may determine a coordinate of a corresponding projection point in a coordinate system (e.g., the world coordinate system) of the global map and determine the distance between the corresponding projection point and a nearest grid on the global map accordingly. In some embodiments, the processing device 132 may determine the coordinate of the corresponding projection point in the coordinate system of the global map according to Formula (5) below:

where [x y θ] ^T refers to the first candidate pose of the moving device, [x _k, sens y _k, sens θ _k, sens] ^T refers to a coordinate corresponding to the group of sub-laser data in a coordinate system of the moving device, and

refers to the coordinate of the corresponding projection point in the coordinate system of the global map.

In some embodiments, after determining the first candidate pose and the second candidate pose, the processing device 132 may determine the first pose of the moving device based on the first candidate pose and the second candidate pose, for example, using a fusion algorithm. In some embodiments, the fusion algorithm may include a Kalman filter (KF) , an invariant KF (e.g., an extended KF, a multi-state constraint KF, an unscented KF) , a weighted average algorithm, a multiple Bayes estimation algorithm, a Dempster-Shafer (D-S) evidence theory, a production rule, a fuzzy logic, an artificial neural network model, or the like, or any combination thereof.

In some embodiments, the processing device 132 may determine whether at least one marker (e.g., a quick response (QR) code, a calibration object, a light pole) is detected based on the laser data (or based on image data acquired by a camera of the moving device) . In response to determining that the at least one marker is detected based on the laser data, the processing device 132 may determine the first pose of the moving device based on predetermined reference information (e.g., position information) associated with the detected at least one marker. In some embodiments, the processing device 132 may also determine a positional relationship between the at least one marker and the moving device and then determine the first pose based on the predetermined reference information and the positional relationship.

In some embodiments, the processing device 132 may determine whether a matching result (e.g., a matching score) of the laser data and the global map associated with the moving device satisfies a preset condition (e.g., the matching score larger than a score threshold (e.g., 60, 65, 70) ) . In response to determining that the matching result of the laser data and the global map satisfies the preset condition, the processing device 132 may determine the first pose of the moving device based on the laser data and the global map, for example, using the scan to map algorithm.

In some embodiments, the processing device 132 may obtain a target initial pose (e.g., determined based on the process 500 or 700) of the moving device and determine the first pose based on the target initial pose and the odometry data. The target initial pose may be an initial value of a pose of the moving device when the moving device begins to move. The processing device 132 may iteratively determine the first pose based on the target initial pose according to the process above.

In 1320, the processing device 132 (e.g., the local map obtaining module 1220) (e.g., the interface circuits of the processor 220) may obtain at least one local map associated with the moving device.

As used herein, as described above, similar to the global map, each of the at least one local map may include a plurality of grids corresponding to a second region where the moving device is located, a plurality of occupancy rates corresponding to the plurality of grids respectively, a plurality of positions of the plurality of grids, etc. The second region corresponding to the local map may be part of the first region corresponding to the global map. In other words, the global map indicates global and overall environmental information surrounding the moving device; the local map indicates a relatively local environmental information near the moving device.

In some embodiments, the processing device 132 may obtain the at least one local map from the moving device or a storage device (e.g., the storage device 150) disclosed elsewhere in the present disclosure.

In some embodiments, the at least one local map may be constructed or updated based on previous laser data acquired by the moving device at previous time points within a predetermined range from a current time point corresponding to the laser data. For example, for a firstly constructed local map, laser data acquired at a construction time point may be projected onto a plane (e.g., a plane corresponding to the global map) based on a pose (e.g., a first pose described above) of the moving device at the construction time point, then the local map may be constructed by determining (e.g., using a brushfire algorithm, a likelihood model) the plurality of occupancy rates corresponding to the plurality of grids based on the projected laser data. Further, take a specific previous time point after the construction time point as an example, laser data acquired at the specific previous time point may be projected onto a previously constructed local map based on a pose (e.g., a first pose described above) of the moving device corresponding to the previous time point, then the local map may be updated by updating the plurality of occupancy rates corresponding to the plurality of grids based on the projected laser data.

As another example, the processing device 132 may also update the at least one local map based at least in part on the first pose of the moving device and the laser data acquired by the moving device at the current time point. For each of the at least one local map, the processing device 132 may project the laser data onto the local map based on the first pose of the moving device and update the local map by updating the plurality of occupancy rates corresponding to the plurality of grids based on the projected laser data.

Take a specific grid in the local map as an example, the processing device 132 may update an intermediate occupancy rate of the grid according to formula (6) below:

where p refers to a preset parameter, x refers to a position of the grid in the local map, M _new (x) refers to the intermediate occupancy rate of the grid, and M _old (x) refers to the latest previous occupancy rate of the grid in the local map. In some embodiments, if the grid is hit by projection point (s) of the laser data on the local map, the preset parameter may be set as a value range from 0.5 to 1 (e.g., 0.65) , accordingly, the intermediate occupancy rate is determined to be larger than the latest previous occupancy rate; if the grid is not hit by projection point (s) of the laser data on the local map, the preset parameter may be a value range from 0 to 0.5 (e.g., 0.4) , accordingly, the intermediate occupancy rate is determined to be smaller than the latest previous occupancy rate.

In some embodiments, the larger the intermediate occupancy rate of a grid is, the larger a probability that the grid is occupied by an obstacle. The processing device 132 may determine occupied grids and unoccupied grids among the plurality of grids based on the intermediate occupancy rates of the plurality of grids. The processing device 132 may designate a grid with an intermediate occupancy rate larger than or equal to a first occupancy threshold (e.g., 0.65, 0.7, 0.8) as an occupied grid and designate a grid with an intermediate occupancy rate smaller than or equal to a second occupancy threshold (e.g., 0.1 0.2, 0.3) as an unoccupied grid.

In some embodiments, for a gird whose intermediate occupancy rate is smaller than the first occupancy threshold and larger than the second occupancy threshold, the processing device 132 may update a corresponding occupancy rate of the grid based on a distance between the grid to a corresponding nearest obstacle in the local map. For example, if the distance is smaller than or equal to a distance threshold, the processing device 132 may determine, based on the distance, the corresponding occupancy rate of the grid using a brushfire algorithm; if the distance is larger than the distance threshold, the processing device 132 may update the intermediate occupancy rate of the grid as the occupancy rate of the grid. More descriptions regarding the updating of the occupancy rate may be found elsewhere in the present disclosure, for example, FIGs. 14A-14F.

In some embodiments, the at least one local map may be dynamically constructed, updated, or released according to a predetermined time interval (e.g., per 0.5 minutes, per 1 minute, per 2 minutes, per 5 minutes) , a count of data frames (each of which including laser data and a corresponding pose of the moving device corresponding to a time point when the laser data is acquired) included in the at least one local map, a matching result between a pose of the moving device determined based on the at least one local map and a global map associated with the moving device, a matching result (e.g., a matching score) between the global map associated with the moving device and laser data, whether at least one marker (e.g., a light pole, a QR code) is detected based on the laser data and/or image data, or the like, or a combination thereof.

In some embodiments, the processing device 132 may determine whether the matching score between the global map and the laser data is larger than or equal to a score threshold. In response to determining that the matching score is larger than the score threshold, the processing device 132 may use the laser data to construct or update the at least one local map at a time point when the laser data is acquired. In some embodiments, if the at least one marker is detected based on the laser data and/or the image data, the processing device 132 may use the laser data to construct or update the at least one local map at a time point when the laser data and/or the image data is acquired. By doing this, the accuracy of a corresponding first pose determined based on the laser data can be ensured, accordingly, the accuracy of the at least one local map constructed or updated based on the laser data can be ensured.

In some embodiments, if a time difference between two adjacent time points when laser data is acquired exceeds a time threshold, the processing device 132 may use laser data corresponding to a later time point to construct or update the at least one local map, that is, the processing device 132 may construct or update the at least one local map according to a predetermined time interval (e.g., per 0.5 minutes, per 1 minute, per 2 minutes, per 5 minutes) . By doing this, the at least one local map can reflect an up-to-date environment surrounding the moving device.

In some embodiments, if a count of data frames included in a local map exceeds a count threshold (e.g., 40, 80) , the processing device 132 may use latest laser data and a corresponding pose of the moving device to construct another local map and release a previous local map. By doing so, a storage capacity can be saved and processing efficiency can be improved. In some embodiments, when the count of data frames included in a local map exceeds the count threshold, the processing device 132 may further determine whether a matching result (e.g., a matching score) between the global map and a pose of the moving device determined based on the local map satisfies a preset condition (e.g., the matching score larger than a score threshold) . In response to determining that the matching result satisfies the preset condition, the processing device 132 may release the local map. In response to determining that the matching result does no satisfy the preset condition, the processing device 132 may not release the local map.

Merely by way of example, it is assumed that at a specific time point, the at least one local map may include a first local map, a second local map, and a third map (it is assumed that the third map is just started to be constructed at the specific time point) . The first local map may be the earliest one of the three maps, that is, the first local map is constructed earliestly. The second local map may be constructed when a count of data frames in the first local map reaches a first predetermined count (e.g., 20, 30, 40, 50) . The third local map may be constructed when a count of data frames in the second local map reaches the first predetermined count (now the count of data frames in the first local map reaches double of the first predetermined count, that is, after the second local map is constructed, the first local map and the second local map are simultaneously updated) .

In some embodiments, the first local map may be released when the count of data frames in the first local map reaches a second predetermined count (e.g., 40, 60, 80, 100) (e.g., double of the first predetermined count, that is, when the third local map is just started to be constructed, the first local map is released) . In some embodiments, the first local map may be released when a matching result (e.g., a matching score) between a pose of the moving device determined based on the second local map and a global map associated with the moving device satisfies a preset condition (e.g., the matching score larger than a score threshold) , that is, when the third local map is just started to be constructed, the first local map is not immediately released and will be released until the accuracy of the second local map meets requirements.

In 1330, the processing device 132 (e.g., the second pose determination module 1210) (e.g., the processing circuits of the processor 220) may determine a second pose of the moving device based on the at least one local map and the laser data acquired by the moving device.

In some embodiments, the processing device 132 may generate a matching result by matching the laser data and the at least one local map and determine the second pose based on the matching result, for example, using a scan to map algorithm.

In some embodiments, as described above, it is assumed that at a specific time point, the at least one local map may include a first local map, a second local map, and a third local map, the processing device 132 may determine the second pose based on the laser data and the second local map using the scan to map algorithm. The processing device 132 may determine whether a matching result (e.g., a matching score) between the global map and a pose of the moving device determined based on the second local map and the laser data satisfies a preset condition (e.g., the matching score larger than a score threshold) . In response to determining that the matching result satisfies the preset condition, the processing device 132 may designate the pose determined based on the second local map and the laser data as the second pose. In response to determining that the matching result does not satisfy the preset condition, the processing device 132 may determine a pose of the moving device based on the laser data and the first local map and designate the pose of the moving device determined based on the laser data and the first local map as the second pose, that is, when the accuracy of the second local map does not meet requirements, the first local map will not be released and the second pose of the moving device may be determined based on the first local map.

In 1340, the processing device 132 (e.g., the target pose determination module 1210) (e.g., the processing circuits of the processor 220) may determine a target pose of the moving device based on the first pose and the second pose using a pose adjustment algorithm (e.g., a sparse pose adjustment (SPA) algorithm) .

In some embodiments, the second pose (which is determined based on laser data and the at least one local map) can be understood as an offset (can be referred to as a “measured offset” ) between the first pose (which is determined based on odometry data or laser data) and a pose (called as a “third pose” ) determined based on the laser data and a global map. Generally, there is a difference between the measured offset (i.e., the second pose) and an actual offset. Accordingly, the processing device 132 may determine an error function associated with the actual offset and the measured offset and minimize the error function (i.e., minimize a deviation between the actual offset and the measured offset) to optimize the first pose and the third pose. Further, the processing device 132 may designate the optimized third pose as the target pose.

In some embodiments, the processing device 132 may perform a cyclic process for optimizing the first pose and the third pose. For example, for an ith cycle, the processing device 132 may determine a value of the error function and determine a variable (e.g., an increment or decrement including a translation and/or a rotation angle) corresponding to the first pose and the third pose.

In some embodiments, the processing device 132 may determine the error function according to formula (7) below:

where e _ij (x) refers to the error function, t _i refers to a translation of the first pose, t _j refers to a translation of the third pose, t _ij refers to a translation of the second pose, R _i refers to a rotation matrix of the first pose, R _j refers to a rotation matrix of the third pose, R _ij refers to a rotation matrix of the second pose, θ _i refers to a rotation angle of the first pose, θ _j refers to a rotation angle of the third pose, and θ _ij refers to a rotation angle of the second pose.

In some embodiments, the processing device 132 may determine a relationship between a first pose (or a third pose) in the ith cycle and a variance matrix of the first pose (or the third pose) in the ith cycle according to formula (8) below:

where [t _x, t _y, θ] ^T refers to the first pose (or the third pose) in the ith cycle and T refers to the variance matrix of the first pose (or the third pose) in the ith cycle.

In some embodiments, the processing device 132 may determine a Jacobian matrix of the error function according to formulas (9) and (10) below:

where

refers to the Jacobian matrix.

Since the error function may be only associated with the first pose (or the third pose) in the ith cycle, the processing device 132 may transform the Jacobian matrix as formula (11) below:

where J _ij refers to the transformed Jacobian matrix.

In some embodiments, the processing device 132 may linearize the error function to determine a linearization equation associated with the error function, a value of the transformed Jacobian matrix, and a variable of the first pose (or the third pose) in the ith cycle according to formulas (12) - (14) below:

e _ij (x + Δx) = e _ij (x) +J _ij. Δx (12)

where Δx refers to the variable (e.g., an increment) of the first pose (or a third pose) in the ith cycle, H _ij refers to Hessian matrix of the first pose (or a third pose) in the ith cycle, and b _ij refers to a residual of the first pose (or a third pose) in the ith cycle.

Further, the processing device 132 may determine the variable of the first pose (or the third pose) in the ith cycle and a first pose (or a third pose) in an (i+1) cycle corresponding to the first pose (or the third pose) in the ith cycle according to formulas (15) and (16) below:

Δx = (∑ _ij H _ij) ^-1· (∑ _ij b _ij) (15)

x ₁ = x + Δx (16)

where x ₁ refers to the (i+1) th pose in the (i+1) cycle corresponding to the first pose (or the third pose) in the ith cycle.

In some embodiments, if the first pose is determined based on the at least one marker, the accuracy of the first pose may be relatively high, and it may be unnecessary to optimize the first pose. Accordingly, during the pose adjustment process, the processing device 132 may not adjust a value of the first pose and only adjust and/or optimize the third pose.

It should be noted that the above descriptions are for illustration purposes and are non-limiting. In some embodiments, the processing device 132 may determine the error function by taking at least one supplementary parameter into consideration. The at least one supplementary parameter may be determined based on laser data and/or local maps determined within a predetermined time period from a current time point to a previous time point. For example, the at least one supplement parameter may include one or more poses (e.g., a pose corresponding to a construction time point of a local map) corresponding to the at least one local map, one or more poses corresponding to one or more previous released local maps (e.g., 3, 4, 5, or 6) determined during the predetermined time period, previous third poses corresponding to the construction time points, or the like, or any combination thereof.

It should be noted that the above description is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, multiple variations or modifications may be made under the teachings of the present disclosure. However, those variations and modifications do not depart from the scope of the present disclosure. For example, in operation 1310, the processing device 132 may fuse the odometry data, the laser data, and the global map and then determine the first pose of the moving device based on the fused data.

FIGs. 14A-14F are schematic diagrams illustrating an exemplary process for updating occupancy rates of grids of a local map according to some embodiments of the present disclosure.

In some embodiments, as described in connection with operation 1320, a local map may include a plurality of grids (e.g., containing occupied grids, unoccupied grids) each of which may correspond to an occupancy rate. The processing device 132 may update the local map by updating the plurality of occupancy rates corresponding to the plurality of grids in the local map. In some embodiments, for a specific occupied grid, if an intermediate occupancy rate and a latest occupancy rate of the occupied grid are larger than the first occupancy threshold, it may be considered that a previously existed obstacle is maintained at the grid. If the intermediate occupancy rate of the grid is larger than the first occupancy threshold and the latest occupancy rate of the grid is smaller than or equal to the first occupancy threshold, it may be considered that a new obstacle is located in the gird, and the processing device 132 may designate the grid as a newly occupied grid.

For a specific unoccupied grid, if an intermediate occupancy rate and a latest occupancy rate of the unoccupied grid are smaller the second occupancy threshold, it may be considered that a previously existed obstacle is maintained at the grid. If the intermediate occupancy rate of the grid is smaller than the second occupancy threshold and the latest occupancy rate of the grid is larger than or equal to the second occupancy threshold, it may be considered that a previous obstacle has left from the grid, and the processing device 132 may designate the grid as a newly unoccupied grid.

As described in connection with operation 1320, the processing device 132 may determine updated occupancy rates of the newly occupied grids and newly unoccupied grids according to the formula (6) described above. In some embodiments, the processing device 132 may determine updated occupancy rates of girds whose distances to corresponding nearest obstacles are smaller than or equal to the distance threshold as illustrated below.

In some embodiments, the processing device 132 may obtain a distance map including a plurality of distances each of which corresponds to a distance between a grid in the local map and a nearest occupied grid (i.e., a grip occupied by an obstacle) to the grid and an obstacle reference map including a plurality of coordinates each of which corresponds to a coordinate of a nearest occupied grid. In an initialized state, all distances in the distance map and all coordinates in the obstacle reference map may be default settings (e.g., the distances are set as infinity) . After determining the newly occupied grids and newly unoccupied grids, the processing device 132 may determine distances corresponding to the newly unoccupied grids in the distance map as infinity respectively and distances corresponding to the newly occupied grid in the distance map as 0. The processing device 132 may determine coordinates of the newly occupied grids as coordinates corresponding to the newly occupied grids in the obstacle reference map respectively.

In some embodiments, the processing device 132 may order the plurality of grids according to their distances to corresponding nearest obstacles. The processing device 132 may optimize to process (e.g., update a corresponding occupancy rate) a grid with a smaller distance to a nearest obstacle. In some embodiments, it may be unnecessary to order the plurality of grids (that is the ordering operation may be omitted) during the updating of occupancy rates of the grids.

In some embodiments, the processing device 132 may perform a raise operation on each newly unoccupied grid. Take a specific newly unoccupied grid (e.g., a grid 1420 illustrated in FIGs. 14C-14F) as an example, the processing device 132 may determine one or more adjacent grids whose distance to the newly unoccupied grid is smaller than a distance threshold (e.g., 0.4 meters) . For example, the one or more adjacent grids of the newly unoccupied grid may include adjacent grids in four directions (e.g., the upper direction, the lower direction, the left direction, and the right direction) of the newly unoccupied grid. As another example, the one or more adjacent grids of the newly unoccupied grid may include adjacent nodes in eight directions (e.g., the upper direction, the upper-left direction, the upper-right direction, the lower direction, the lower-left direction, the lower-right direction, the left direction, and the right direction) of the newly unoccupied grid.

The processing device 132 may determine whether a distance between each adjacent grid and a nearest occupied grid of the adjacent grid is smaller than or equal to the distance threshold. In response to determining that the distance between the adjacent grid and the nearest occupied grid (e.g., a grid 1430-2 illustrated in FIG. 14F) is smaller than or equal to the distance threshold, the processing device 132 may mark the adjacent grid and the corresponding nearest occupied grid for further processing (e.g., performing a lower operation illustrated below) . In response to determining that the distance between the adjacent grid and the nearest occupied grid (e.g., a grid 1430-1 illustrated in FIG. 14E) is larger than the distance threshold, the processing device 132 may not mark the adjacent grid.

After performing the raise operation on all the newly unoccupied grids, the processing device 132 may perform a lower operation on each newly occupied grid and each of the marked occupied gird (s) determined in the raise operation. Take a specific newly occupied grid (e.g., a grid 1410 illustrated in FIGs. 14A and 14B) or a specific marked occupied grid (e.g., the grid 1430-2 illustrated in FIG. 14F) as an example, the processing device 132 may determine one or more adjacent grids whose distance to the newly occupied grid (or the added occupied grid) is smaller than the distance threshold (e.g., 0.4 meters) . The processing device 132 may determine a corresponding distance of the newly occupied grid in the distance map as 0 and determine a coordinate of the newly occupied grid as a corresponding coordinate of the newly occupied grid in the obstacle reference map.

Take a specific adjacent grid as an example, if a nearest occupied grid of the adjacent grid is the newly occupied grid, the processing device 132 may update the occupancy rate of the adjacent grids based on a distance between the adjacent grid and the newly occupied grid, for example, using a likelihood field model. The processing device 132 may determine a corresponding distance of the adjacent grid in the distance map as the distance between the adjacent grid and the newly occupied grid (or the added occupied grid) and determine a corresponding coordinate of the adjacent grid in the obstacle reference map as the coordinate of the newly occupied grid. If the nearest occupied grid of the adjacent grid is another occupied grid (that is, not the newly occupied grid) , the processing device 132 may update the occupancy rate of the adjacent grid based on a distance between the adjacent grid and the another occupied grid. The processing device 132 may determine a corresponding distance of the adjacent grid in the distance map as the distance between the adjacent grid and the another occupied grid and determine a corresponding coordinate of the adjacent grid in the obstacle reference map as the coordinate of the another occupied grid. If the another occupied grid is not marked, the processing device 132 may mark the another occupied grid and perform the lower operation on the another occupied grid.

FIG. 15 is a schematic diagram illustrating a principle of a SPA algorithm according to some embodiments of the present disclosure.

As shown in FIG. 15, each of dotted circles 1510 represents a first pose of a moving device. Each of hollow circles 1520 represents a third pose of the moving device. A rectangle 1530 represents a global map associated with the moving device. Line segments connecting the dotted circles 1510 and the hollow circles 1520 represent constraints between the first poses and the third poses. Line segments connecting the hollow circles 1520 and the global map represent constraints between the third poses and the global map. The SPA algorithm may be used to optimize the first pose and the second pose to minimize an error (e.g., represented by the error function illustrated in FIG. 13) introduced by the constraints.

FIG. 16 is a flowchart illustrating a process for determining a target pose of a moving device according to some embodiments of the present disclosure. In some embodiments, the process 1600 may be implemented as a set of instructions (e.g., an application) stored in the storage ROM 230 or RAM 240. The processor 220 and/or the modules in FIG. 12 may execute the set of instructions, and when executing the instructions, the processor 220 and/or the modules may be configured to perform the process 1600. The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 1600 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order in which the operations of the process illustrated in FIG. 16 and described below is not intended to be limiting.

In 1610, the processing device 132 may construct a local map of an environment surrounding a moving device (e.g., the moving device 110) based on a first pose of the moving device at a current time point.

In some embodiments, as described in connection with operation 1310, the processing device 132 may determine the first pose based at least in part on odometry data or laser data acquired by the moving device. For example, the processing device 132 may determine the first pose based on a global map associated with the moving device and laser data associated with the environment. More descriptions regarding the local map and/or the first pose may be found elsewhere in the present disclosure, for example, FIGs. 13 and 14 and the descriptions thereof.

In 1620, the processing device 132 may update the local map based on a change of an obstacle (e.g., whether a grid of the local map is occupied by the obstacle) associated with the environment surrounding the moving device. More descriptions regarding the updating of the local map may be found elsewhere in the present disclosure, for example, FIGs. 13 and 14 and the descriptions thereof.

In 1630, the processing device 132 may match the updated local map with the laser data associated with the environment to determine a second pose of the moving device. Operation 1630 may be similar to operation 1330, the descriptions of which may be not repeated.

In 1640, the processing device 132 may adjust the first pose based on the second pose to determine a target pose of the moving device. In some embodiments, the processing device 132 may adjust the first pose using a SPA algorithm. More descriptions of the adjustment of the first pose may be found elsewhere in the present disclosure, for example, operation 1340 and the descriptions thereof.

FIG. 17 is a flowchart illustrating an exemplary process for determining a target pose of a moving device according to some embodiments of the present disclosure. In some embodiments, the process 1700 may be implemented as a set of instructions (e.g., an application) stored in the storage ROM 230 or RAM 240. The processor 220 and/or the modules in FIG. 12 may execute the set of instructions, and when executing the instructions, the processor 220 and/or the modules may be configured to perform the process 1700. The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 1700 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order in which the operations of the process illustrated in FIG. 17 and described below is not intended to be limiting.

In 1701, the processing device 132 may obtain odometry data and determine a first candidate pose of the moving device at a current time point based on the odometry data and a previous pose of the moving device at a previous time point adjacent to the current time point. More descriptions regarding the first candidate pose may be found elsewhere in the present disclosure, for example, operation 1310 and the descriptions thereof.

In 1702, the processing device 132 may determine whether a difference (also referred to as a first difference) between the odometry data corresponding to the current time point and previous odometry data corresponding to the previous time point exceeds a first threshold or a time difference (also referred to as a second difference) between the current time point and the previous time point exceeds a second threshold.

In 1703, in response to determining that the difference exceeds the first threshold or the time difference exceeds the second threshold, the processing device 132 may down-sample laser data acquired at the current time point and divide the laser data (including a plurality of groups of sub-laser-data) into groups of sub-laser data corresponding to short-term features and groups of sub-laser-data corresponding to long-term features based on a distance between each group of sub-laser data and a corresponding nearest obstacle in a global map associated with the moving device.

In 1704, the processing device 132 may determine a second candidate pose of the moving device by matching the long-term features and the global map.

In 1705, the processing device 132 may determine a first pose of the moving device by fusing the first candidate pose and the second candidate pose using an EKF algorithm. More descriptions regarding the second candidate pose and/or the first pose may be found elsewhere in the present disclosure, for example, operation 1310 and the descriptions thereof.

In 1706, the processing device 132 may construct (or update) a local map (e.g., including a plurality of grids each of which corresponds to an occupancy rate) based on the laser data, wherein a pose of the local map corresponds to the first pose of the moving device. In some embodiments, the processing device 132 may update the local map according to operations 1707-1710 illustrated below.

In 1707, for each of the plurality of grids in the local map, the processing device 132 may determine whether the grid is a newly unoccupied grid or a newly occupied grid. The processing device 132 may designate a grid whose occupancy rate is larger than a first rate threshold (e.g., 0.65) as an occupied grid and designate a grid whose occupancy rate is smaller than a second rate threshold (e.g., 0.2) as an unoccupied grid. The processing device 132 may designate a grid that is occupied at a current time point and was unoccupied at a previous time point adjacent to the current time point as a newly occupied grid. The processing device 132 may designate a grid that is unoccupied at the current time point and was occupied at the previous time point as a newly unoccupied grid.

In 1708, in response to determining that the grid is the newly unoccupied grid, the processing device 132 may perform a raise operation on adjacent grids of the grid.

In 1709, in response to determining that the grid is the newly occupied grid, the processing device 132 may perform a lower operation on the adjacent grids of the grid.

After performing the operations 1707-1709 on all grids of the local map, in 1710, the processing device 132 may match the laser data and the local map to determine the second pose using a scan to local map algorithm. More descriptions regarding operations 1707-1710 may be found elsewhere in the present disclosure, for example, operation 1320 and FIG. 14 and the descriptions thereof.

In 1711, the processing device 132 may determine a target pose of the moving device based on the first pose and the second pose using a SPA algorithm. In some embodiments, the processing device 132 may optimize the first pose based on the second pose. Operation 1711 may be similar to operation 1340, the descriptions of which may be not repeated.

Having thus described the basic concepts, it may be rather apparent to those skilled in the art after reading this detailed disclosure that the foregoing detailed disclosure is intended to be presented by way of example only and is not limiting. Various alterations, improvements, and modifications may occur and are intended to those skilled in the art, though not expressly stated herein. These alterations, improvements, and modifications are intended to be suggested by this disclosure, and are within the spirit and scope of the exemplary embodiments of this disclosure.

Moreover, certain terminology has been used to describe embodiments of the present disclosure. For example, the terms “one embodiment, ” “an embodiment, ” and/or “some embodiments” mean that a particular feature, structure or characteristic described in connection with the embodiment is in at least one embodiment of the present disclosure. Therefore, it is emphasized and should be appreciated that two or more references to “an embodiment” or “one embodiment” or “an alternative embodiment” in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the present disclosure.

Further, it will be appreciated by one skilled in the art, aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or context including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely hardware, entirely software (including firmware, resident software, micro-code, etc. ) or combining software and hardware implementation that may all generally be referred to herein as a “unit, ” “module, ” or “system. ” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB. NET, Python or the like, conventional procedural programming languages, such as the ″C″ programming language, Visual Basic, Fortran 2003, Peri, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages. The program code may execute entirely on the user′s computer, partly on the user′s computer, as a stand-alone software package, partly on the user′s computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user′s computer through any type of network, including a local area network (LAN) or a wide area network (WAN) , or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a Software as a Service (SaaS) .

Furthermore, the recited order of processing elements or sequences, or the use of numbers, letters, or other designations therefore, is not intended to limit the claimed processes and methods to any order except as may be specified in the claims. Although the above disclosure discusses through various examples what is currently considered to be a variety of useful embodiments of the disclosure, it is to be understood that such detail is solely for that purpose, and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover modifications and equivalent arrangements that are within the spirit and scope of the disclosed embodiments. For example, although the implementation of various components described above may be embodied in a hardware device, it may also be implemented as a software only solution, e.g., an installation on an existing server or mobile device.

Similarly, it should be appreciated that in the foregoing description of embodiments of the present disclosure, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure aiding in the understanding of one or more of the various embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, claimed subject matter may lie in smaller than all features of a single foregoing disclosed embodiment.

Claims

A system, comprising:

at least one storage device including a set of instructions; and

at least one processor in communication with the at least one storage device, wherein when executing the set of instructions, the at least one processor is configured to cause the system to:

obtain an initial pose of a moving device;

determine a plurality of candidate poses of the moving device based on the initial pose of the moving device according to at least one map associated with the moving device; and

determine a target initial pose of the moving device based on the plurality of candidate poses and laser data acquired by the moving device.
The system of claim 1, wherein to obtain the initial pose of the moving device, the at least one processor is configured to cause the system to:

determine feature information of an image acquired by the moving device;

generate a matching result by matching the feature information of the image and reference feature information of a plurality of reference images, the reference feature information being stored in a feature database; and

obtain the initial pose of the moving device based on the matching result.
The system of claim 2, wherein to obtain the initial pose of the moving device based on the matching result, the at least one processor is configured to cause the system to:

determine a plurality of similarities between the feature information of the image and the reference feature information of the plurality of reference images;

identify, from the plurality of similarities, a similarity exceeding a similarity threshold; and

determine the initial pose of the moving device based on a reference pose or reference feature information corresponding to a reference image with the similarity exceeding the similarity threshold.
The system of claim 3, wherein to obtain the initial pose of the moving device based on the matching result, the at least one processor is configured to cause the system to:

in response to determining that all the plurality of similarities are less than or equal to the similarity threshold, obtain a second image by moving the moving device;

determine second feature information of the second image;

determine a plurality of second similarities between the second feature information of the second image and the reference feature information of the plurality of reference images;

identify, from the plurality of second similarities, a second similarity exceeding the similarity threshold; and

determine the initial pose of the moving device based on a second reference pose or second reference feature information corresponding to a second reference image with the second similarity exceeding the similarity threshold.
The system of claim 3, wherein to obtain the initial pose of the moving device based on the matching result, the at least one processor is configured to cause the system to:

in response to determining that all of the plurality of second similarities are less than or equal to the similarity threshold, determine the initial pose of the moving device based on one or more previous poses of the moving device and odometry data of the moving device.
The system of claim 2, wherein the feature database is generated by:

obtaining a reference map;

obtaining a plurality of reference poses of a reference moving device based on the reference map, two adjacent reference poses of the plurality of reference poses satisfying a preset condition;

determining a plurality of images acquired by the reference moving device from the plurality of reference poses as the plurality of reference images; and

for each of the plurality of reference images, extracting and storing the reference feature information of the reference image, the reference feature information including at least one of a reference feature point, a reference representation of the reference feature point, or a reference coordinate of the reference feature point.
The system of claim 6, wherein the preset condition includes that a time difference between time points corresponding to the two adjacent poses respectively exceeds a time threshold or a difference between the two adjacent poses of the reference moving device exceeds a difference threshold.
The system of claim 2, wherein the at least one processor is configured to cause the system further to:

determine a matching result of the laser data acquired by the moving device and a map associated with the moving device;

in response to determining that the matching result satisfies a preset condition, determine whether a highest similarity between the feature information and corresponding reference feature information in the feature database is smaller than a similarity threshold; and

in response to determining that the similarity is smaller than the similarity threshold, update the feature database by replacing reference feature information corresponding to the reference image with the feature information of the image.
The system of claim 1, wherein the at least one map includes at least two maps with different resolutions associated with the moving device, and to determine the plurality of candidate poses of the moving device based on the initial pose of the moving device according to the at least one map associated with the moving device, the at least one processor is configured to cause the system to:

determine the plurality of candidate poses of the moving device by rotating or translating the initial pose within a predetermined range on the at least two maps.
The system of claim 9, wherein to determine the target initial pose of the moving device based on the plurality of candidate poses and the laser data acquired by the moving device, the at least one processor is configured to cause the system to:

determine, using a branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device.
The system of claim 10, wherein to determine, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device, the at least one processor is configured to cause the system to:

for each of the at least two maps, determine one or more modified maps by down-sampling the map; and

determine, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps, modified maps corresponding to the at least maps, and the plurality of candidate poses of the moving device.
The system of claim 10, wherein

the at least two maps include a first map with a first resolution and a second map with a second resolution; and

to determine, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device, the at least one processor is configured to cause the system to:

determine, based on the first map with the first resolution and the initial pose of the moving device, a plurality of first candidate poses of the moving device among the plurality of candidate poses of the moving device;

determine, using the branch and bound algorithm, a first target initial pose of the moving device based on the plurality of first candidate poses;

determine, based on the second map with the second resolution and the first target initial pose of the moving device, a plurality of second candidate poses of the moving device among the plurality of candidate poses of the moving device;

determine, using the branch and bound algorithm, a second target initial pose of the moving device based on the plurality of second candidate poses of the moving device; and

designate the second target initial pose of the moving device as the target initial pose of the moving device.
A system, comprising:

at least one storage device including a set of instructions; and

at least one processor in communication with the at least one storage device, wherein when executing the set of instructions, the at least one processor is configured to cause the system to:

determine a first pose of a moving device based at least in part on odometry data or laser data acquired by the moving device;

obtain at least one local map associated with the moving device;

determine a second pose of the moving device based on the at least one local map and the laser data acquired by the moving device; and

determine a target pose of the moving device based on the first pose and the second pose using a pose adjustment algorithm.
The system of claim 13, wherein to determine the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device, the at least one processor is configured to cause the system to:

determine whether a first difference between the odometry data corresponding to a current time point and previous odometry data corresponding to a previous time point adjacent to the current time point exceeds a first difference threshold or a second difference between the current time point and the previous time point exceeds a second difference threshold; and

in response to determining that the first difference is smaller than or equal to the first difference threshold and the second difference is smaller than or equal to the second difference threshold, determine the first pose of the moving device based on the odometry data acquired by the moving device.
The system of claim 14, wherein to determine the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device, the at least one processor is configured to cause the system to:

in response to determining that the first difference exceeds the first difference threshold or the second difference exceeds the second difference threshold, determine a first candidate pose of the moving device based on the odometry data acquired by the moving device;

determine a second candidate pose of the moving device based on a portion of the laser data and a global map associated with the moving device, the portion of the laser data corresponding to long-term features in a region where the moving device is located; and

determine the first pose of the moving device based on the first candidate pose and the second candidate pose.
The system of claim 13, wherein to determine the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device, the at least one processor is configured to cause the system to:

determine whether at least one marker is detected based on the laser data; and

in response to determining that the at least one marker is detected based on the laser data, determine the first pose of the moving device based on predetermined reference information associated with the detected at least one marker.
The system of claim 13, wherein to determine the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device, the at least one processor is configured to cause the system to:

determine whether a matching result of the laser data and a global map associated with the moving device satisfies a preset condition; and

in response to determining that the matching result satisfies the preset condition, determine the first pose of the moving device based on the laser data and the global map associated with the moving device.
The system of claim 13, wherein each of the at least one local map includes a plurality of grids and a plurality of occupancy rates corresponding to the plurality of grids respectively.
The system of claim 18, wherein the at least one processor is configured to cause the system further to:

update the at least one local map associated with the moving device based at least in part on the first pose of the moving device and the laser data acquired by the moving device by:

for each of the at least one local map, projecting the laser data onto the local map based on the first pose of the moving device; and

updating the local map by updating the plurality of occupancy rates corresponding to the plurality of grids based on the projected laser data.
The system of claim 13, wherein

the at least one local map is constructed or updated based on previous laser data acquired by the moving device at previous time points within a predetermined range from a current time point corresponding to the laser data; and

the at least one local map is dynamically constructed or updated according to a matching result between the previous laser data and a global map associated with the moving device.
The system of claim 13, wherein the at least one local map is dynamically constructed or released according to at least one of:

a predetermined time interval,

a count of data frames included in the at least one local map, or

a matching result between a pose of the moving device determined based on the at least one local map and a global map associated with the moving device.
The system of claim 13, wherein

the at least one local map includes a first local map, a second local map, and a third local map;

the second local map is constructed when a count of data frames in the first local map reaches a first predetermined count;

the third local map is constructed when a count of data frames in the second local map reaches the first predetermined count; and

the first local map is released when the count of data frames in the first local map reaches a second predetermined count or a matching result between a pose of the moving device determined based on the second local map and a global map associated with the moving device satisfies a preset condition.
The system of claim 13, wherein the pose adjustment algorithm includes a sparse pose adjustment (SPA) algorithm.
The system of claim 13, wherein to determine the first pose of the moving device based on the odometry data, the at least one processor is configured to cause the system to:

obtain a target initial pose of the moving device; and

determine the first pose of the moving device based on the target initial pose of the moving device and the odometry data or the laser data acquired by the moving device, wherein the target initial pose of the moving device is determined by:

obtaining an initial pose of the moving device;

determining a plurality of candidate poses of the moving device based on the initial pose of the moving device; and

determining the pose of the moving device at the second time point based on the plurality of candidate poses and laser data acquired by the moving device at the second time point.
A method implemented on a computing device including at least one processor, at least one storage medium, and a communication platform connected to a network, the method comprising:

obtaining an initial pose of a moving device;

determining a plurality of candidate poses of the moving device based on the initial pose of the moving device according to at least one map associated with the moving device; and

determining a target initial pose of the moving device based on the plurality of candidate poses and laser data acquired by the moving device.
The method of claim 25, wherein the obtaining the initial pose of the moving device includes:

determining feature information of an image acquired by the moving device;

generating a matching result by matching the feature information of the image and reference feature information of a plurality of reference images, the reference feature information being stored in a feature database; and

obtaining the initial pose of the moving device based on the matching result.
The method of claim 26, wherein the obtaining the initial pose of the moving device based on the matching result includes:

determining a plurality of similarities between the feature information of the image and the reference feature information of the plurality of reference images;

identifying, from the plurality of similarities, a similarity exceeding a similarity threshold; and

determining the initial pose of the moving device based on a reference pose or reference feature information corresponding to a reference image with the similarity exceeding the similarity threshold.
The method of claim 27, wherein the obtaining the initial pose of the moving device based on the matching result includes:

in response to determining that all the plurality of similarities are less than or equal to the similarity threshold, obtaining a second image by moving the moving device;

determining second feature information of the second image;

determining a plurality of second similarities between the second feature information of the second image and the reference feature information of the plurality of reference images;

identifying, from the plurality of second similarities, a second similarity exceeding the similarity threshold; and

determining the initial pose of the moving device based on a second reference pose or second reference feature information corresponding to a second reference image with the second similarity exceeding the similarity threshold.
The method of claim 27, wherein the obtaining the initial pose of the moving device based on the matching result includes:

in response to determining that all of the plurality of second similarities are less than or equal to the similarity threshold, determining the initial pose of the moving device based on one or more previous poses of the moving device and odometry data of the moving device.
The method of claim 26, wherein the feature database is generated by:

obtaining a reference map;

obtaining a plurality of reference poses of a reference moving device based on the reference map, two adjacent reference poses of the plurality of reference poses satisfying a preset condition;

determining a plurality of images acquired by the reference moving device from the plurality of reference poses as the plurality of reference images; and

for each of the plurality of reference images, extracting and storing the reference feature information of the reference image, the reference feature information including at least one of a reference feature point, a reference representation of the reference feature point, or a reference coordinate of the reference feature point.
The method of claim 30, wherein the preset condition includes that a time difference between time points corresponding to the two adjacent poses respectively exceeds a time threshold or a difference between the two adjacent poses of the reference moving device exceeds a difference threshold.
The method of claim 26, further comprising:

determining a matching result of the laser data acquired by the moving device and a map associated with the moving device;

in response to determining that the matching result satisfies a preset condition, determining whether a highest similarity between the feature information and

corresponding reference feature information in the feature database is smaller than a similarity threshold; and

in response to determining that the similarity is smaller than the similarity threshold, updating the feature database by replacing reference feature information corresponding to the reference image with the feature information of the image.
The method of claim 25, wherein the at least one map includes at least two maps with different resolutions associated with the moving device, and the determining the plurality of candidate poses of the moving device based on the initial pose of the moving device according to the at least one map associated with the moving device includes:

determining the plurality of candidate poses of the moving device by rotating or translating the initial pose within a predetermined range on the at least two maps.
The method of claim 33, wherein the determining the target initial pose of the moving device based on the plurality of candidate poses and the laser data acquired by the moving device includes:

determining, using a branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device.
The method of claim 34, wherein the determining, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device includes:

for each of the at least two maps, determining one or more modified maps by down-sampling the map; and

determining, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps, modified maps corresponding to the at least maps, and the plurality of candidate poses of the moving device.
The method of claim 34, wherein

the at least two maps include a first map with a first resolution and a second map with a second resolution; and

the determining, using the branch and bound algorithm, the target initial pose of the moving device based on the at least two maps and the plurality of candidate poses of the moving device includes:

determining, based on the first map with the first resolution and the initial pose of the moving device, a plurality of first candidate poses of the moving device among the plurality of candidate poses of the moving device;

determining, using the branch and bound algorithm, a first target initial pose of the moving device based on the plurality of first candidate poses;

determining, based on the second map with the second resolution and the first target initial pose of the moving device, a plurality of second candidate poses of the moving device among the plurality of candidate poses of the moving device;

determining, using the branch and bound algorithm, a second target initial pose of the moving device based on the plurality of second candidate poses of the moving device; and

designating the second target initial pose of the moving device as the target initial pose of the moving device.
A method implemented on a computing device including at least one processor, at least one storage medium, and a communication platform connected to a network, the method comprising:

determining a first pose of a moving device based at least in part on odometry data or laser data acquired by the moving device;

obtaining at least one local map associated with the moving device;

determining a second pose of the moving device based on the at least one local map and the laser data acquired by the moving device; and

determining a target pose of the moving device based on the first pose and the second pose using a pose adjustment algorithm.
The method of claim 37, wherein the determining the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device includes:

determining whether a first difference between the odometry data corresponding to a current time point and previous odometry data corresponding to a previous time point adjacent to the current time point exceeds a first difference threshold or a second difference between the current time point and the previous time point exceeds a second difference threshold; and

in response to determining that the first difference is smaller than or equal to the first difference threshold and the second difference is smaller than or equal to the second difference threshold, determining the first pose of the moving device based on the odometry data acquired by the moving device.
The method of claim 38, wherein the determining the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device includes:

in response to determining that the first difference exceeds the first difference threshold or the second difference exceeds the second difference threshold, determining a first candidate pose of the moving device based on the odometry data acquired by the moving device;

determining a second candidate pose of the moving device based on a portion of the laser data and a global map associated with the moving device, the portion of the laser data corresponding to long-term features in a region where the moving device is located; and

determining the first pose of the moving device based on the first candidate pose and the second candidate pose.
The method of claim 37, wherein the determining the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device includes:

determining whether at least one marker is detected based on the laser data; and

in response to determining that the at least one marker is detected based on the laser data, determining the first pose of the moving device based on predetermined reference information associated with the detected at least one marker.
The method of claim 37, wherein the determining the first pose of the moving device based at least in part on the odometry data or the laser data acquired by the moving device includes:

determining whether a matching result of the laser data and a global map associated with the moving device satisfies a preset condition; and

in response to determining that the matching result satisfies the preset condition, determining the first pose of the moving device based on the laser data and the global map associated with the moving device.
The method of claim 37, wherein each of the at least one local map includes a plurality of grids and a plurality of occupancy rates corresponding to the plurality of grids respectively.
The method of claim 42, further comprising:

updating the at least one local map associated with the moving device based at least in part on the first pose of the moving device and the laser data acquired by the moving device by:

for each of the at least one local map, projecting the laser data onto the local map based on the first pose of the moving device; and

updating the local map by updating the plurality of occupancy rates corresponding to the plurality of grids based on the projected laser data.
The method of claim 37, wherein

the at least one local map is constructed or updated based on previous laser data acquired by the moving device at previous time points within a predetermined range from a current time point corresponding to the laser data; and

the at least one local map is dynamically constructed or updated according to a matching result between the previous laser data and a global map associated with the moving device.
The method of claim 37, wherein the at least one local map is dynamically constructed or released according to at least one of:

a predetermined time interval,

a count of data frames included in the at least one local map, or

a matching result between a pose of the moving device determined based on the at least one local map and a global map associated with the moving device.
The method of claim 37, wherein

the at least one local map includes a first local map, a second local map, and a third local map;

the second local map is constructed when a count of data frames in the first local map reaches a first predetermined count;

the third local map is constructed when a count of data frames in the second local map reaches the first predetermined count; and

the first local map is released when the count of data frames in the first local map reaches a second predetermined count or a matching result between a pose of the moving device determined based on the second local map and a global map associated with the moving device satisfies a preset condition.
The method of claim 37, wherein the pose adjustment algorithm includes a sparse pose adjustment (SPA) algorithm.
The method of claim 37, wherein the determining the first pose of the moving device based on the odometry data includes:

obtaining a target initial pose of the moving device; and

determining the first pose of the moving device based on the target initial pose of the moving device and the odometry data or the laser data acquired by the moving device, wherein the target initial pose of the moving device is determined by:

obtaining an initial pose of the moving device;

determining a plurality of candidate poses of the moving device based on the initial pose of the moving device; and

determining the pose of the moving device at the second time point based on the plurality of candidate poses and laser data acquired by the moving device at the second time point.
A non-transitory computer readable medium, comprising executable instructions that, when executed by at least one processor, direct the at least one processor to perform a method, the method comprising:

obtaining an initial pose of a moving device;

determining a plurality of candidate poses of the moving device based on the initial pose of the moving device according to at least one map associated with the moving device; and

determining a target initial pose of the moving device based on the plurality of candidate poses and laser data acquired by the moving device.
A non-transitory computer readable medium, comprising executable instructions that, when executed by at least one processor, direct the at least one processor to perform a method, the method comprising:

determining a first pose of a moving device based at least in part on odometry data or laser data acquired by the moving device;

obtaining at least one local map associated with the moving device;

determining a second pose of the moving device based on the at least one local map and the laser data acquired by the moving device; and

determining a target pose of the moving device based on the first pose and the second pose using a pose adjustment algorithm.