WO2023082985A1 - 用于生成电子设备的导航路径的方法和产品 - Google Patents
用于生成电子设备的导航路径的方法和产品 Download PDFInfo
- Publication number
- WO2023082985A1 WO2023082985A1 PCT/CN2022/127124 CN2022127124W WO2023082985A1 WO 2023082985 A1 WO2023082985 A1 WO 2023082985A1 CN 2022127124 W CN2022127124 W CN 2022127124W WO 2023082985 A1 WO2023082985 A1 WO 2023082985A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- map
- scene
- target object
- objects
- path
- Prior art date
Links
Images
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/20—Instruments for performing navigational calculations
- G01C21/206—Instruments for performing navigational calculations specially adapted for indoor navigation
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
- G01C21/34—Route searching; Route guidance
- G01C21/3446—Details of route searching algorithms, e.g. Dijkstra, A*, arc-flags, using precalculated routes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Definitions
- the embodiments of the present disclosure relate to the technical field of path planning, and more specifically, to a method, apparatus, device, medium and program product for planning a navigation path of an electronic device.
- this path planning method needs to perform a large number of search tasks and may have to traverse all points in the map, which incurs a large computational overhead.
- moreover, this method does not consider the relationships between objects in the environment, for example that tables and chairs are usually placed together, cups are usually placed on tables, and so on.
- Embodiments of the present disclosure provide a method, apparatus, device, medium and program product for generating a navigation path of an electronic device.
- a method for generating a navigation path for an electronic device includes: generating a second map based on a first map, where the first map describes the positions of a plurality of objects in a scene and the second map describes predicted distances from a plurality of positions in the scene to a target object among the plurality of objects; determining, based on the second map, a candidate path from a target position among the plurality of positions to the target object; and selecting, from the candidate paths, a navigation path from the target position to the target object.
- a method for training a neural network model comprises: acquiring a training data set including multiple scenes and multiple objects; acquiring training labels, the training labels including the positions of the multiple objects in the multiple scenes, the true distances from multiple positions in a scene to a target object among the multiple objects, and the categories of the objects; and training the neural network model based on the training data set and the training labels, wherein the neural network model outputs a map describing predicted distances from multiple positions in the scene to the target object among the multiple objects.
- an apparatus for generating a navigation path of an electronic device includes: a map generation module configured to generate a second map based on a first map, where the first map describes the positions of multiple objects in a scene and the second map describes predicted distances from multiple positions in the scene to a target object among the multiple objects; a candidate path determination module configured to determine, based on the second map, a candidate path from a target position among the multiple positions to the target object; and a navigation path selection module configured to select, from the candidate paths, a navigation path from the target position to the target object.
- an apparatus for training a neural network model includes: a training data acquisition module configured to acquire a training data set including multiple scenes and multiple objects; a training label acquisition module configured to acquire training labels, the training labels including the positions of the multiple objects in the multiple scenes, the true distances from multiple positions in a scene to a target object among the multiple objects, and the categories of the objects; and a training module configured to train the neural network model based on the training data set and the training labels, wherein the neural network model outputs a map describing predicted distances from multiple positions in the scene to the target object among the multiple objects.
- in a third aspect of the present disclosure, an electronic device includes: a processor; and a memory for storing one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, cause the electronic device to perform the method according to the first aspect.
- a computer readable storage medium is provided.
- One or more computer instructions are stored on the computer-readable storage medium, wherein the one or more computer instructions are executed by the processor to implement the method according to the first aspect.
- a computer program product comprises one or more computer instructions, wherein the one or more computer instructions are executed by a processor to implement the method according to the first aspect.
- FIG. 1 shows a schematic diagram of a usage environment of a method for generating a navigation path of an electronic device according to some embodiments of the present disclosure
- FIG. 2 shows a flowchart of a method for generating a navigation path of an electronic device according to some embodiments of the present disclosure
- Figure 3A shows a schematic diagram of a second map according to some embodiments of the present disclosure, and wherein predicted distances are shown;
- FIG. 3B shows a schematic diagram of a second map according to some embodiments of the present disclosure, and wherein specific objects are shown;
- Figure 4 shows a flow chart of a method for training a neural network model according to some embodiments of the present disclosure
- Fig. 5 shows a schematic diagram of a sub-scenario according to some embodiments of the present disclosure
- Fig. 6 shows a block diagram of an apparatus for generating a navigation path of an electronic device according to some embodiments of the present disclosure
- FIG. 7 shows a block diagram of an apparatus for training a neural network model according to some embodiments of the present disclosure.
- Figure 8 shows a block diagram of a computing system in which one or more embodiments of the present disclosure may be implemented.
- the term "map" used in this disclosure refers to the result of modeling an environment or scene, which is one of the key steps in path planning. Its purpose is to establish a model that a computer can use to perform path planning, that is, to abstract the actual physical space into an abstract space that algorithms can process, thereby realizing the mapping between the physical and the abstract.
- the term "path" refers to a route found, during the path search phase, by applying a corresponding algorithm on the basis of the environment model.
- the path is such that a predetermined function associated with the goal attains an optimal value, and the path does not necessarily lead directly to the target object; it may instead lead to an intermediate goal selected in order to reach the target object.
- the terms "training" or "learning" refer to the process of using experience or data to optimize system performance.
- for example, a neural network system can gradually optimize its distance-prediction performance through a training or learning process, such as improving the accuracy of the predicted distance.
- the terms "training" and "learning" are used interchangeably for convenience of discussion.
- the term "method or model for generating a navigation path of an electronic device" refers to a method or model built on prior knowledge associated with color information, depth information, object categories and the like of a specific environment or scene.
- the method or model can be used, in a navigation task of the electronic device, to find a target object and to bring the electronic device to the target object.
- the term “comprise” and its variants are open-ended, ie “including but not limited to”.
- the term “based on” is “based at least in part on”.
- the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”. Relevant definitions of other terms will be given in the description below.
- the existing map and navigation planning methods can no longer meet the growing demand for electronic devices to perform autonomous tasks. For example, when a domestic service robot performs the task of pouring water at home for the first time, the robot may not even know where the water glass is. Moreover, in traditional navigation tasks, the environment map is constructed in advance, and the navigation target is given in the form of coordinates on the map. In the aforementioned pouring task, however, there is no pre-built map, and the robot does not know the location of the target, only what the target is (for example, the target is a water glass, because the glass must be found before water can be poured).
- the robot has to set target objects for itself, which can include a final target object (for example, a drinking glass) and intermediate target objects (for example, the area near the table where the drinking glass is, or the chair next to the table), so that it can reasonably plan navigation paths, avoid obstacles, and so on.
- the inventors also found that traditional navigation planning methods do not use prior knowledge to provide a faster, more accurate and more concise navigation path planning process.
- in a specific environment, such as an indoor scene and especially a home scene, the spatial relationships and distances between objects follow certain rules: chairs are often placed near tables, and cups are usually on tables.
- therefore, when the robot needs to search for a cup, there is a high probability that it can first find an object that is easier to locate (for example, a visually more prominent table or chair).
- according to embodiments of the present disclosure, on the basis of a map describing the scene around the robot (hereinafter also referred to as the "first map"), a further map that incorporates prior knowledge about the spatial relationships between objects (hereinafter also referred to as the "second map") is generated.
- the second map provides predicted distances from multiple positions in the scene to a target object among the multiple objects. This makes it easier to find a shorter path when determining candidate paths from a target position among the plurality of positions to the target object. In other words, the embodiments described here beneficially exploit the spatial relationships of the objects in the scene and directly use the distance from each position to the target object, without having to perform search operations first. Compared with traditional schemes, this provides a better navigation path, so that the robot can move to the target object efficiently.
- Fig. 1 shows a schematic diagram of an environment 100 in which a method for generating a navigation path of an electronic device according to some embodiments of the present disclosure is used.
- at an electronic device 101 (such as a robot), color information (e.g., an RGB image) and depth information (e.g., a depth image) of the scene are acquired.
- ways to obtain such information include, but are not limited to, obtaining it from a camera mounted on the electronic device, such as the RGBD camera 102.
- the camera can capture the depth-of-field distance of the space within the camera's field of view, providing a three-dimensional image.
- the electronic device 101 is then guided to the target object along the navigation path. The electronic device 101 can then perform the operation required by the task, for example, the robot picks up the cup and fills it with water, and so on.
- the present disclosure does not limit the subsequent operations or actions to be performed by the electronic device.
- FIG. 2 shows a flowchart of a method 200 for generating a navigation path of an electronic device according to some embodiments of the present disclosure.
- the process of generating the navigation path of the electronic device implemented by the method 200 will be described by taking the robot moving from its current position to the table in an indoor home scene as an example.
- this is exemplary only and is not intended to limit the scope of the present disclosure in any way.
- the embodiment of the method 200 described herein can also be used in the navigation process of any other suitable electronic device.
- a second map is generated based on the first map.
- for example, in this embodiment, the first map describes the positions of multiple objects in the living room.
- the first map may be generated by using the acquired three-dimensional image and predetermined categories of multiple objects in the scene.
- An example of a first map is a semantic map, which uses a map as a carrier and maps semantics onto it. It can be understood that the semantics represent the category of each object in the scene. A category refers to the name of an object, such as "table" or "chair", and these names can also be encoded as codes such as numbers. The first map therefore provides a simplified model. The "semantics" can be learned and acquired from three-dimensional images through models such as classification, detection, and segmentation, but the "semantics" can also be defined by humans, as long as the definition is sufficiently universal and concise.
- the first map can be obtained by the methods discussed above. In some embodiments, the first map is a two-dimensional map obtained by projecting each object onto a plane based on a color image and a depth image of the scene. More specifically, such a first map can be obtained by projecting the scene within the robot's field of view into a bird's-eye-view two-dimensional map, combining information such as the position and posture of the robot, the intrinsic parameters of the camera, and the categories of the objects. This is therefore an efficient abstraction for representing the various kinds of information in the scene.
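- As an illustration of this kind of projection, the sketch below (a minimal example, not the implementation of this disclosure) back-projects a depth image with per-pixel semantic labels into 3D points using assumed pinhole intrinsics fx, fy, cx, cy and a known camera pose, and bins the points into a top-down semantic grid; the world frame is assumed to have its z axis pointing up:

```python
import numpy as np

def project_to_birdseye(depth, labels, fx, fy, cx, cy, pose, grid_size=200, cell=0.05):
    """Project a depth image with per-pixel semantic labels into a top-down
    semantic grid map (one possible way to build a 'first map').

    depth:  (H, W) depth in meters; labels: (H, W) integer category per pixel
    pose:   4x4 camera-to-world transform of the robot's camera
    cell:   grid resolution in meters (e.g., 0.05 m = 5 cm per cell)
    """
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.reshape(-1)
    valid = z > 0
    # Back-project pixels to 3D points in the camera frame (pinhole model).
    x = (us.reshape(-1) - cx) / fx * z
    y = (vs.reshape(-1) - cy) / fy * z
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=0)[:, valid]
    # Transform to the world frame using the camera pose.
    pts_world = pose @ pts_cam
    # Bin the horizontal world coordinates into grid cells (bird's-eye view).
    gx = np.floor(pts_world[0] / cell).astype(int) + grid_size // 2
    gy = np.floor(pts_world[1] / cell).astype(int) + grid_size // 2
    inside = (gx >= 0) & (gx < grid_size) & (gy >= 0) & (gy < grid_size)
    semantic_map = np.zeros((grid_size, grid_size), dtype=np.int32)
    semantic_map[gy[inside], gx[inside]] = labels.reshape(-1)[valid][inside]
    return semantic_map
```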
- a second map generated based on the first map describes predicted distances from the plurality of locations in the scene to a target object of the plurality of objects.
- FIG. 3A shows a schematic diagram of a second map according to some embodiments of the present disclosure, and wherein predicted distances are shown, wherein numbers in each grid represent predicted distances.
- Fig. 3B shows yet another example of the second map, and in which specific objects are shown. It can be seen that the second map formed in the form of a grid better reflects the state of the scene where the robot is located. Area 301 refers to the boundary between the explored area and the unknown area of the robot.
- when the robot discovers a specific object (for example, a door, i.e., the area between wall 302 and wall 303), it can preferentially determine the angle θ toward both sides of the door and the corresponding area, so as to better generate candidate paths and navigation paths.
- the second map may describe the minimum distances from multiple locations in the living room to the range of the table.
- in a broader sense, the multiple positions can be pixels of the map, and the distance is measured from one pixel of the map to a pixel at the target object.
- this distance is obtained by prediction.
- for example, a neural network model can be trained to learn the features of the objects in relevant scenes, and the neural network model can then predict the distances from multiple positions in the scene to a target object among the multiple objects. This distance is referred to herein as the predicted distance.
- from the second map describing these predicted distances, the robot's current scene can be known in general terms, as well as the positional relationship between each object in the scene and the target object. Moreover, this positional relationship takes the prior knowledge mentioned above into account, and is therefore more accurate.
- grid cells may be used, whose size can be correlated with the actual size of the scene. These grid cells facilitate data processing when generating candidate paths and navigation paths, and also help the robot balance real-time performance and economy when exploring the scene.
- for example, one grid cell corresponds to a 5 cm x 5 cm area of the living room. This can simplify calculations, save computing resources, and improve efficiency. Note that any specific numerical values described here, as well as elsewhere herein, are exemplary only and are not intended to limit the scope of the present disclosure.
- correspondingly, each grid cell of the second map stores the predicted path length from that cell to the target object.
- the predicted distance stored in cells within the extent of the target object and of its support or container (for example, if the target object is a cup on a table, the table is the support) is 0, and the predicted distance stored in cells within the extent of an obstacle is infinite.
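- For context, per-cell distances of this kind (for example, as ground-truth labels during training) could be computed on the grid with a breadth-first search from the target cells, treating obstacles as impassable; the following is a sketch under that assumption, not the procedure of this disclosure:

```python
import numpy as np
from collections import deque

def grid_distance_map(target_mask, obstacle_mask, cell=0.05):
    """Compute the shortest path length (in meters) from every free cell to the
    nearest target cell, treating obstacles as impassable. Cells inside the
    target/support area get 0; obstacle cells get infinity.
    Assumes 4-connected moves of uniform cost (one cell = `cell` meters)."""
    h, w = target_mask.shape
    dist = np.full((h, w), np.inf)
    queue = deque()
    for r, c in zip(*np.nonzero(target_mask)):
        dist[r, c] = 0.0
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w and not obstacle_mask[nr, nc]
                    and dist[nr, nc] > dist[r, c] + cell):
                dist[nr, nc] = dist[r, c] + cell
                queue.append((nr, nc))
    dist[obstacle_mask.astype(bool)] = np.inf
    return dist
```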
- the first map and the second map may be updated when at least one of the movement time, movement distance, and movement angle of the electronic device 101 exceeds a threshold.
- as can be understood from the description above, both the first map and the second map are built from the robot's viewpoint. This means that when the robot moves, the viewpoint also changes, and a previously planned path may change and no longer apply. Therefore, in order to strike a balance between real-time performance and computational efficiency, thresholds can be set, such as thresholds on movement time, movement distance, and movement angle; when a threshold is exceeded, the first map and the second map are updated. In this way, a balance between real-time performance and economy is achieved.
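- A minimal sketch of such a threshold-based update policy is shown below; the threshold values and field names are illustrative assumptions rather than values from this disclosure:

```python
import time

class MapUpdatePolicy:
    """Trigger a rebuild of the first/second maps only when the robot has moved
    or turned enough, or enough time has passed, to keep a balance between
    real-time accuracy and computational cost."""
    def __init__(self, max_time_s=2.0, max_dist_m=0.5, max_angle_rad=0.5):
        self.max_time_s = max_time_s
        self.max_dist_m = max_dist_m
        self.max_angle_rad = max_angle_rad
        self.last_update = (time.monotonic(), 0.0, 0.0)  # (time, distance, heading)

    def should_update(self, travelled_m, heading_rad):
        t0, d0, a0 = self.last_update
        if (time.monotonic() - t0 > self.max_time_s
                or travelled_m - d0 > self.max_dist_m
                or abs(heading_rad - a0) > self.max_angle_rad):
            self.last_update = (time.monotonic(), travelled_m, heading_rad)
            return True
        return False
```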
- the predicted distance may be represented as a continuous distance value or as a discrete distance value, where the discrete distance value corresponds to an interval in the continuous distance value.
- when the predicted distance is expressed as discrete values, with each discrete value corresponding to an interval of the continuous values, the prediction is easier to realize.
- the predicted distance may be represented by intervals numbered from 0 to 12, where 0 represents a predicted distance of 0 to 1 meter, 1 represents a predicted distance of 1 to 2 meters, and so on. This has advantages for computation, processing, and storage, allowing faster computation and reducing the amount of storage. It also brings another advantage, namely that prediction errors can be reduced, because predicting exact continuous values is difficult.
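- For example, a binning scheme matching the 1-meter intervals described above might look like this (the interval width and bin count follow the example; the helper names are assumptions):

```python
import numpy as np

NUM_BINS = 13          # bins 0..12, each covering a 1 m interval
BIN_WIDTH_M = 1.0

def distance_to_bin(dist_m):
    """Map a continuous distance (meters) to a discrete interval index."""
    dist_m = np.asarray(dist_m, dtype=float)
    bins = np.floor(dist_m / BIN_WIDTH_M).astype(int)
    return np.clip(bins, 0, NUM_BINS - 1)

def bin_to_distance(bin_idx):
    """Map a discrete interval index back to a representative distance
    (the center of the interval)."""
    return (np.asarray(bin_idx) + 0.5) * BIN_WIDTH_M
```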
- generating the second map may further include: dividing the scene in the first map into a plurality of sub-scenes; and generating the second map based on a plurality of maps that describe the positions of the objects within the sub-scenes (see the tiling sketch below).
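- A simple way to split a grid-based first map into fixed-size sub-scene tiles (the tile size here is an assumption) might look like the following sketch:

```python
import numpy as np

def split_into_subscenes(semantic_map, tile=64):
    """Split a 2D semantic grid map into square sub-scene tiles so that each
    tile can be processed (and later merged) independently. Border tiles are
    padded with zeros (treated here as the 'unknown' category)."""
    h, w = semantic_map.shape
    ph = (tile - h % tile) % tile
    pw = (tile - w % tile) % tile
    padded = np.pad(semantic_map, ((0, ph), (0, pw)), constant_values=0)
    tiles = []
    for r in range(0, padded.shape[0], tile):
        for c in range(0, padded.shape[1], tile):
            tiles.append(((r, c), padded[r:r + tile, c:c + tile]))
    return tiles
```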
- a candidate path to a target object from a target location of a plurality of locations is determined based on a second map.
- the target location may be the robot's current location.
- continuing with the living room example: in order to reach the target object, the robot can move directly toward it, but when it is blocked by an obstacle, such as a sofa, the robot faces the choice of going around it from the left or from the right.
- as another example, if the target object is not in the living room, the robot faces the choice of leaving the living room and entering another room.
- there can be multiple ways to determine the candidate paths. For example, in some embodiments, a path associated with the predicted distance from the boundary of the second map (for example, the exploration boundary, which represents the dividing line between the area for which the map has been built and the area for which it has not) to the target object can be selected as a candidate path.
- for convenience of description, such a candidate path is referred to as the "second path".
- if the scene described by the second map is the living room, this path is planned with the goal of minimizing the sum of the predicted distance from the boundary of the explored area of the living room to the target object and the distance from the robot to that boundary.
- in some embodiments, a formula of the following form can be used to determine the goal or intermediate goal to be selected and the subsequent candidate path:

  p_goal = argmin_{p ∈ B_exp} { d(p_agent, p) + L_Dis(p) }        (1)

  where p_goal represents the intermediate goal, d(p_agent, p) represents the distance from the robot's current position to the boundary of the second map, L_Dis(p) represents the predicted distance, and B_exp represents the explored range in the second map. The purpose of the formula is to minimize the sum of the two terms inside the braces on the right-hand side.
- the generated planned path is the theoretical shortest path.
- in some embodiments, a path associated with the predicted distance from the target position to the target object may be selected as a candidate path.
- for convenience of description, this candidate path is also referred to as the "first path". If the scene described by the second map is the living room, this path is planned by taking the predicted distance from the robot's position to a position on the map boundary (for example, the boundary of the already-explored area) as the goal or intermediate goal.
- in some embodiments, a formula of the following form can be used to determine the intermediate goal to be selected and the subsequent candidate path:

  p_goal = argmin_{p ∈ B_exp} L_Dis(p)        (2)

  The purpose of this formula is to minimize the value of L_Dis(p). It can be seen that, in this case, the selection of the intermediate goal does not take the robot's current position into account, and is therefore more efficient.
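- The two selection rules above can be sketched as follows; this simplified illustration assumes the frontier cells of the explored area and the predicted-distance map are already available, and uses straight-line distance as a stand-in for d(p_agent, p):

```python
import numpy as np

def select_intermediate_goal(pred_dist, frontier_mask, agent_rc=None, cell=0.05):
    """Select an intermediate goal among frontier (explored-boundary) cells.

    pred_dist:     (H, W) predicted distance from each cell to the target object
    frontier_mask: (H, W) boolean mask of explored-boundary cells
    agent_rc:      (row, col) of the robot; if given, cells are scored with
                   d(agent, p) + L_Dis(p) as in formula (1), otherwise with
                   L_Dis(p) alone as in formula (2)
    cell:          grid resolution in meters, to convert cell offsets to meters
    """
    rows, cols = np.nonzero(frontier_mask)
    if rows.size == 0:
        return None
    scores = pred_dist[rows, cols].astype(float)
    if agent_rc is not None:
        # Straight-line distance from the robot to each frontier cell is used
        # here as a simple stand-in for d(p_agent, p).
        scores = scores + np.hypot(rows - agent_rc[0], cols - agent_rc[1]) * cell
    best = int(np.argmin(scores))
    return int(rows[best]), int(cols[best])
```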
- paths associated with angles or boundaries of the target object to predetermined specific objects in the scene may be selected as candidate paths.
- this candidate path is also referred to as a "third path”.
- the scene described by the second map is a living room, and there is a specific object (for example, a door) on the boundary of the living room.
- the specific object may be predetermined.
- in this case, the position with the smallest predicted distance within the extent of the door will preferentially be selected as the intermediate goal.
- the position of the door may be a range of positions, so it will be appreciated that the intermediate goal may be associated with an angle θ_d (e.g., 120 degrees) from the target position to the specific object, or with the boundary of that range.
- in some embodiments, the following kind of formula can be used to determine the goal or intermediate goal to be selected and the corresponding candidate path, where B_d represents the correspondingly defined B_exp, and p_door represents the probability that a door (or another predetermined specific object, such as a corridor) exists; this probability is obtained from the neural network model.
- in some embodiments, a cross-entropy loss can be used to train the classification of doors (or other specific objects) so as to determine their probabilities more accurately. It can be seen that this path lets the robot temporarily skip objects unrelated to the target object (e.g., other rooms) and preferentially search objects related to the target object (e.g., the room where the target object is located).
- the second map may be generated by a neural network model based on the first map.
- the neural network model acquires the first map and the category of each object in the scene to generate the second map.
- the neural network used to generate the second map may be trained with a data set comprising first maps, which embody the spatial relationships of the objects in the scene, and the categories of those objects. An example embodiment in this regard will be described below with reference to FIG. 4.
- a navigation path from the target location to the target object is selected from the candidate paths.
- based on at least one of the first path, the second path, and the third path, and based on the target position and the target object, a path planning algorithm is used to generate the navigation path of the electronic device.
- a fast marching method (Fast Marching Method) or an A* path planning algorithm may be used to provide a navigation path.
- Other path planning algorithms may also be used to provide the navigation path, which is not limited in the present disclosure.
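- As a concrete illustration of one such option, a standard A* search over the 2D grid (a generic textbook formulation, not code from this disclosure) could look like the following:

```python
import heapq

def astar(grid, start, goal):
    """A* path planning on a 2D grid (4-connected, unit step cost).
    grid: 2D array-like where truthy values mark obstacles.
    start, goal: (row, col) tuples. Returns a list of cells or None."""
    def h(a, b):  # Manhattan distance heuristic (admissible on 4-connected grids)
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    rows, cols = len(grid), len(grid[0])
    open_set = [(h(start, goal), start)]
    came_from = {start: None}
    g_score = {start: 0}
    while open_set:
        _, cur = heapq.heappop(open_set)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and not grid[nxt[0]][nxt[1]]):
                tentative = g_score[cur] + 1
                if tentative < g_score.get(nxt, float("inf")):
                    g_score[nxt] = tentative
                    came_from[nxt] = cur
                    heapq.heappush(open_set, (tentative + h(nxt, goal), nxt))
    return None
```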
- the spatial positional relationships between the objects in the scene (i.e., the prior knowledge) are fully taken into account, so that a second map is obtained that better matches the actual situation and encodes this prior knowledge.
- in the second map, the predicted distance along paths the robot can actually travel can also be described in a simplified form (i.e., as predicted distances represented by discrete values), so that when generating candidate paths and navigation paths there is no need to search every point of the map, which saves a large amount of computing resources and improves efficiency.
- when facing a new scene or an unknown environment, the robot can explore in a local-to-global manner and navigate to the target object. Since the maps are updated when the movement time, movement distance, or movement angle exceeds a threshold, the robot can achieve a good balance between real-time performance and economy.
- the second map can be generated based on the first map according to a neural network.
- FIG. 4 shows a flowchart of a method 400 for training such a neural network model according to some embodiments of the present disclosure. It will be appreciated that the training and use of the neural network may occur at the same or different locations. That is, the method 200 and the method 400 may be performed by the same subject, or may be performed by different subjects.
- a training data set including a plurality of scenes and a plurality of objects is obtained.
- these scenarios may be pre-established standard environments, and each scenario is arranged as needed for the neural network model to learn specified features.
- the category of objects may include various items that can be placed in practical applications, such as beds, sofas, tables, and so on.
- a training label is obtained, the training label includes the positions of multiple objects in the multiple scenes, the real distance from the multiple positions in the scene to the target object among the multiple objects, and the category of the objects.
- the values of these positions, distances, and categories (referred to herein as ground-truth values or true distances) may be annotated in advance in each of the scenes from 401. These values are used as sample labels to train the neural network model. Because these sample labels are specially prepared, include the prior knowledge mentioned above, and capture the characteristics of the scene, the trained neural network model can generate a second map with those scene characteristics, which improves the accuracy of the candidate paths and the navigation path.
- training the neural network model may further include: dividing the scene into a plurality of sub-scenes; and training the neural network model based on training labels that include the positions of the objects in each sub-scene and the true distances from positions in the sub-scene to the target object among the objects.
- sub-scenes of a specific size can be used for training first, and the exploration can then be gradually extended to the entire scene to complete the training for the whole scene.
- a neural network model is trained based on the training data set and the training labels, wherein the neural network model outputs a map describing predicted distances from a plurality of locations in the scene to a target object in the plurality of objects.
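- A highly simplified training step consistent with these definitions might look like the sketch below; it assumes the model maps a first map of shape (B, C_in, H, W) to per-cell logits over n_b distance bins for a single target category, and supervises them with a per-cell cross-entropy loss (an illustration, not the training code of this disclosure):

```python
import torch.nn.functional as F

def train_step(model, optimizer, first_map, true_dist_bins):
    """One training step: predict a discretized distance map from the first map
    and supervise it against the ground-truth distance bins.

    first_map:      (B, C_in, H, W) float tensor encoding object positions/categories
    true_dist_bins: (B, H, W) long tensor of ground-truth distance-bin indices
    """
    model.train()
    optimizer.zero_grad()
    logits = model(first_map)                 # (B, n_b, H, W) logits over distance bins
    loss = F.cross_entropy(logits, true_dist_bins)
    loss.backward()
    optimizer.step()
    return loss.item()
```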
- the neural network model may be a fully convolutional network, for example with 3 downsampling ResBlock layers and 3 upsampling ResBlock layers, with low-level feature maps concatenated with the upsampled feature maps at each level.
- the output of the neural network is the predicted distance.
- the number of output channels can be set to n_b * n_T, where n_b is the side length of the area represented by the discrete predicted distance (for example, 5 cm), and n_T represents the number of target categories. In this way, each group of n_b channels is responsible for predicting the distance to one kind of object, so that many object categories and their predicted distances can be trained and predicted together, which improves efficiency.
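- A minimal sketch of such an architecture is given below; the layer widths, kernel sizes, and use of nearest-neighbor upsampling are assumptions, and only the 3-down/3-up ResBlock structure with skip concatenation and the n_b * n_T output channels follow the description above (input height and width are assumed divisible by 8):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(c_in, c_out, 3, stride=stride, padding=1)
        self.conv2 = nn.Conv2d(c_out, c_out, 3, padding=1)
        self.skip = (nn.Conv2d(c_in, c_out, 1, stride=stride)
                     if (stride != 1 or c_in != c_out) else nn.Identity())

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = self.conv2(out)
        return F.relu(out + self.skip(x))

class DistanceMapNet(nn.Module):
    """Fully convolutional net: 3 downsampling ResBlocks, 3 upsampling stages,
    concatenating low-level features with upsampled features at each level.
    Outputs n_b * n_T channels: n_b distance bins for each of n_T categories."""
    def __init__(self, c_in, n_b=13, n_t=10, width=32):
        super().__init__()
        self.down1 = ResBlock(c_in, width, stride=2)
        self.down2 = ResBlock(width, width * 2, stride=2)
        self.down3 = ResBlock(width * 2, width * 4, stride=2)
        self.up3 = ResBlock(width * 4 + width * 2, width * 2)
        self.up2 = ResBlock(width * 2 + width, width)
        self.up1 = ResBlock(width + c_in, width)
        self.head = nn.Conv2d(width, n_b * n_t, 1)

    def forward(self, x):
        d1 = self.down1(x)            # 1/2 resolution
        d2 = self.down2(d1)           # 1/4
        d3 = self.down3(d2)           # 1/8
        u3 = self.up3(torch.cat([F.interpolate(d3, scale_factor=2), d2], dim=1))
        u2 = self.up2(torch.cat([F.interpolate(u3, scale_factor=2), d1], dim=1))
        u1 = self.up1(torch.cat([F.interpolate(u2, scale_factor=2), x], dim=1))
        return self.head(u1)          # (B, n_b * n_T, H, W)
```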
- training the neural network model further includes: when the position of the target object is not in the scene, training the neural network model using the true distance from the target position among the plurality of positions to the boundary of the scene; and/or, when the position of the target object is not in the sub-scene, training the neural network model using the true distance from the target position among the plurality of positions to the boundary of the sub-scene.
- the neural network model trained by the method 400 described above can accurately classify each object in the scene, and its prediction of the distance between the target position and the target object is not only accurate but, thanks to the discrete representation, also avoids the errors caused by inaccurate continuous values, which increases the robustness of the robot in real application environments. Given the real-time requirements of robot movement, the higher computational efficiency of the neural network model also reduces the computational overhead of updating the maps.
- Fig. 5 shows a schematic diagram of a sub-scenario according to some embodiments of the present disclosure.
- Fig. 6 shows a block diagram of an apparatus 600 for generating a navigation path of an electronic device according to some embodiments of the present disclosure.
- the apparatus includes: a map generation module 601 configured to generate, at the electronic device 101, a second map based on a first map, where the first map describes the positions of multiple objects in the scene and the second map describes predicted distances from multiple positions in the scene to a target object among the multiple objects; a candidate path determination module 602 configured to determine, based on the second map, a candidate path from a target position among the multiple positions to the target object; and a navigation path selection module 603 configured to select, from the candidate paths, a navigation path from the target position to the target object.
- determining the candidate path may include determining at least one of the following paths: a first path, associated with a predicted distance from the target position to the target object; a second path, associated with a predicted distance from the boundary of the second map to the target object; and a third path, associated with an angle or boundary from the target object to a predetermined specific object in the scene.
- formula (1) can be used to determine the target or intermediate target related to the first path; formula (2) can be used to determine the target or intermediate target related to the second path; formula (3) can be used to determine A goal or intermediate goal related to the third path.
- selecting the navigation path may include: based on at least one path among the first path, the second path, and the third path, and based on the target location and the target object, using a path planning algorithm to generate the navigation path of the electronic device.
- the first map is a two-dimensional map obtained by projecting each object onto a plane based on a color image and a depth image of the scene.
- the apparatus may further include a map updating module 604 configured to update the first map and the second map based on at least one of the moving time, moving distance, and moving angle of the electronic device 101 exceeding a threshold.
- the predicted distance may be expressed as a continuous distance value or a discrete distance value, wherein the discrete distance value corresponds to an interval in the continuous distance value.
- the second map generation module is further configured to: divide the scene in the first map into multiple sub-scenes; and generate the second map based on multiple maps describing the positions of multiple objects in the sub-scenes.
- the second map is generated by a neural network model.
- the neural network model acquires the first map and the categories of each object in the scene to generate the second map.
- the second map in the apparatus 600 can be generated by a neural network model trained by means of the apparatus 700.
- the neural network model can be trained using the apparatus 700 in FIG. 7 .
- FIG. 7 shows a block diagram of an apparatus 700 for training a neural network model according to some embodiments of the present disclosure.
- the apparatus 700 includes a training data acquisition module 701 configured to acquire a training data set including multiple scenes and multiple objects.
- the apparatus also includes a training label acquisition module 702 configured to acquire training labels, the training labels including the positions of the multiple objects in the multiple scenes, the true distances from multiple positions in the scene to the target object among the multiple objects, and the categories of the objects.
- the apparatus also includes a training module 703 configured to train the neural network model based on the training data set and the training labels, wherein the neural network model outputs a map describing predicted distances from multiple positions in the scene to the target object among the multiple objects.
- the training data obtaining module 701 is further configured to: divide the scene into multiple sub-scenes, and obtain a training data set including multiple sub-scenes and multiple objects.
- the training label obtaining module 702 is further configured to: obtain a training label including the positions of the multiple objects in the sub-scene and the true distance from the multiple positions in the sub-scene to the target object in the multiple objects.
- the training module 703 is further configured to: when the position of the target object is not in the scene, train the neural network model using the true distance from the target position among the multiple positions to the boundary of the scene; and/or, when the position of the target object is not in the sub-scene, train the neural network model using the true distance from the target position among the multiple positions to the boundary of the sub-scene.
- the neural network model trained by the apparatus 700 described above can not only solve the problem of navigation path planning when the robot performs tasks, but can also provide a good route for the robot to explore the scene, enabling it to quickly grasp the overall layout of the scene it is in. Accordingly, at least some of the advantages described above for the method 400 may likewise be obtained.
- FIG. 8 shows a block diagram of a computing system 800 in which one or more embodiments of the present disclosure may be implemented.
- the method 200 and the method 400 shown in FIG. 2 and FIG. 4 can be implemented by the computing system 800.
- the computing system 800 shown in FIG. 8 is an example only, and should not be construed as limiting the functionality and scope of use of the implementations described herein.
- computing system 800 is in the form of a general-purpose computing device.
- Components of the computing system 800 may include, but are not limited to, one or more processors or processing units 810, a memory 820, one or more input devices 830, one or more output devices 840, storage 850, and one or more communication units 860.
- the processing unit 810 may be an actual or virtual processor and is capable of performing various processes according to programs persistently stored in the memory 820. In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power.
- Computing system 800 typically includes a plurality of computer media. Such media can be any available media that is accessible to computing system 800, including but not limited to, volatile and nonvolatile media, removable and non-removable media.
- Memory 820 can be volatile memory (e.g., registers, cache, random access memory (RAM)), non-volatile memory (e.g., read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof.
- Storage 850 may be removable or non-removable, and may include machine-readable media, such as flash drives, magnetic disks, or any other media that may be capable of storing information and that may be accessed within computing system 800 .
- Computing system 800 may further include additional removable/non-removable, volatile/nonvolatile computer system storage media.
- Although not shown, a disk drive for reading from or writing to a removable, non-volatile magnetic disk (such as a "floppy disk") and an optical disk drive for reading from or writing to a removable, non-volatile optical disk (such as a CD-ROM) may be provided. Each drive may be connected to a bus by one or more data media interfaces.
- Memory 820 may include at least one program product having (eg, at least one) set of program modules configured to perform the functions of the various embodiments described herein.
- a program/utility tool 822 having a set of one or more execution modules 824 may be stored in memory 820, for example.
- Execution module 824 may include, but is not limited to, an operating system, one or more application programs, other program modules, and operational data. Each of these examples, or certain combinations, can include the implementation of a networked environment. Execution module 824 generally performs the functions and/or methodologies of embodiments of the subject matter described herein, such as method 200.
- the input unit 830 may be one or more of various input devices.
- for example, the input unit 830 may include user devices such as a mouse, keyboard, trackball, and the like.
- Communications unit 860 enables communications to other computing entities over a communications medium.
- the functionality of the components of computing system 800 may be implemented in a single computing cluster or as a plurality of computing machines capable of communicating through communication links. Accordingly, computing system 800 may operate in a networked environment using logical connections to one or more other servers, a network personal computer (PC), or another general network node.
- communication media includes wired or wireless networking technologies.
- Computing system 800 can also communicate, as needed, with one or more external devices (not shown), such as storage devices and display devices, with one or more devices that allow users to interact with computing system 800, or with any device (e.g., a network card, a modem, etc.) that enables computing system 800 to communicate with one or more other computing devices. Such communication may be performed via an input/output (I/O) interface (not shown).
- Illustrative types of hardware logic components that may be used include, for example and without limitation, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), and Complex Programmable Logic Devices (CPLDs).
- Program code for implementing the methods of the subject matter described herein can be written in any combination of one or more programming languages. Such program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
- the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
- a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
- a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
- a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
- machine-readable storage media would include one or more wire-based electrical connections, portable computer discs, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
- a method for generating a navigation path for an electronic device includes: generating a second map based on a first map, where the first map describes the positions of a plurality of objects in a scene and the second map describes predicted distances from a plurality of positions in the scene to a target object among the plurality of objects; determining, based on the second map, a candidate path from a target position among the plurality of positions to the target object; and selecting, from the candidate paths, a navigation path from the target position to the target object.
- determining the candidate path comprises determining at least one of the following paths: a first path, associated with a predicted distance from the target position to the target object; a second path, associated with a predicted distance from the boundary of the second map to the target object; and a third path, associated with an angle or boundary from the target object to a predetermined specific object in the scene.
- selecting a navigation path includes: using a path planning algorithm to generate a navigation path of the electronic device based on at least one path among the first path, the second path, and the third path, and based on the target location and the target object.
- the first map is a two-dimensional map obtained by projecting each object onto a plane based on a color image and a depth image of the scene.
- the method further includes: updating the first map and the second map based on at least one of the moving time, moving distance, and moving angle of the electronic device exceeding a threshold.
- the predicted distance is represented as a continuous distance value or a discrete distance value, wherein the discrete distance value corresponds to an interval in the continuous distance value.
- generating the second map further includes: dividing the scene in the first map into a plurality of sub-scenes; and generating the second map based on a plurality of maps describing the positions of a plurality of objects in the sub-scenes.
- the second map is generated by a neural network model.
- the neural network model is trained by the following method.
- the method comprises: acquiring a training data set including multiple scenes and multiple objects; acquiring training labels, the training labels including the positions of the multiple objects in the multiple scenes, the true distances from multiple positions in a scene to a target object among the multiple objects, and the categories of the objects; and training the neural network model based on the training data set and the training labels, wherein the neural network model outputs a map describing predicted distances from multiple positions in the scene to the target object among the multiple objects.
- training the neural network model further includes: dividing the scene in the first map into a plurality of sub-scenes; and training the neural network model with training labels that include the positions of the objects in the sub-scenes and the true distances from multiple positions in the sub-scenes to the target object among the multiple objects.
- training the neural network model further includes: when the position of the target object is not in the scene, training the neural network model using the true distance from the target position among the plurality of positions to the boundary of the scene; and/or, when the position of the target object is not in the sub-scene, training the neural network model using the true distance from the target position among the plurality of positions to the boundary of the sub-scene.
- an apparatus for generating a navigation path for an electronic device includes: a map generation module configured to generate a second map based on a first map, where the first map describes the positions of multiple objects in a scene and the second map describes predicted distances from multiple positions in the scene to a target object among the multiple objects; a candidate path determination module configured to determine, based on the second map, a candidate path from a target position among the plurality of positions to the target object; and a navigation path selection module configured to select, from the candidate paths, a navigation path from the target position to the target object.
- determining the candidate path comprises determining at least one of the following paths: a first path, associated with a predicted distance from the target position to the target object; a second path, associated with a predicted distance from the boundary of the second map to the target object; and a third path, associated with an angle or boundary from the target object to a predetermined specific object in the scene.
- selecting a navigation path includes: using a path planning algorithm to generate a navigation path of the electronic device based on at least one path among the first path, the second path, and the third path, and based on the target location and the target object.
- the first map is a two-dimensional map obtained by projecting each object onto a plane based on a color image and a depth image of the scene.
- the apparatus further includes: a map updating module configured to update the first map and the second map based on at least one of the electronic device's moving time, moving distance, and moving angle exceeding a threshold.
- the predicted distance is represented as a continuous distance value or a discrete distance value, wherein the discrete distance value corresponds to an interval in the continuous distance value.
- the second map generation module is further configured to: divide the scene in the first map into multiple sub-scenes; based on multiple maps describing the positions of multiple objects in the sub-scenes to generate the second map.
- the second map is generated by a neural network model.
- an apparatus for training the neural network model includes: a training data acquisition module configured to acquire a training data set including a plurality of scenes and a plurality of objects; a training label acquisition module configured to acquire training labels, the training labels including the positions of the plurality of objects in the plurality of scenes, the true distances from a plurality of positions in the scene to a target object among the plurality of objects, and the categories of the objects; and a training module configured to train the neural network model based on the training data set and the training labels, wherein the neural network model outputs a map describing predicted distances from a plurality of positions in the scene to the target object among the plurality of objects.
- the training data acquisition module is further configured to: divide the scene into multiple sub-scenes, and acquire a training data set comprising the multiple sub-scenes and multiple objects;
- the training label acquisition module is further configured to: acquire training labels that include the positions of the multiple objects in the sub-scene and the ground-truth distances from multiple positions in the sub-scene to the target object among the multiple objects.
- the training module is further configured to: when the position of the target object is not in the scene, train the neural network model using the true distance from the target position among the plurality of positions to the boundary of the scene; and/or, when the position of the target object is not in the sub-scene, train the neural network model using the true distance from the target position among the plurality of positions to the boundary of the sub-scene.
- in an embodiment of the third aspect, an electronic device includes: a processor and a memory, the memory being used to store one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, cause the electronic device to perform the method according to the first aspect.
- a computer readable storage medium is provided.
- One or more computer instructions are stored on the computer-readable storage medium, wherein the one or more computer instructions are executed by the processor to implement the method according to the first aspect.
- a computer program product comprises one or more computer instructions, wherein the one or more computer instructions, when executed by a processor, implement the method according to the first aspect.
Landscapes
- Engineering & Computer Science (AREA)
- Remote Sensing (AREA)
- Radar, Positioning & Navigation (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Automation & Control Theory (AREA)
- Navigation (AREA)
- Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)
Abstract
A method for generating a navigation path of an electronic device, the method comprising: generating a second map based on a first map (201), where the first map describes the positions of multiple objects in a scene and the second map describes predicted distances from multiple positions in the scene to a target object among the multiple objects; determining, based on the second map, a candidate path from a target position among the multiple positions to the target object (202); and selecting, from the candidate paths, a navigation path from the target position to the target object (203). By using this method, prior knowledge about the scene and the spatial relationships between objects can be fully exploited to help the electronic device find and reach the target object more efficiently. Also provided are an apparatus for generating a navigation path of an electronic device, an electronic device, and a computer-readable storage medium.
Description
This application claims priority to the Chinese invention patent application No. CN202111327724.3, entitled "Method and product for generating a navigation path of an electronic device", filed on November 10, 2021.
The embodiments of the present disclosure relate to the technical field of path planning, and more specifically, to a method, apparatus, device, medium and program product for planning a navigation path of an electronic device.
With the development of technology, many electronic devices (for example, robots) are capable of performing tasks automatically. For example, after receiving a given task (for example, adding water to a cup on a table), a robot will automatically plan a path, avoid obstacles, move along the planned path to the vicinity of the table, and then perform the subsequent action of adding water. Tasks of this type can be challenging, because the environment in which the robot is located (for example, the room the robot is in) may be completely new to the robot, with no map that can be used directly. Moreover, even if a map describing the environment exists, it may no longer apply because the positions of items in the environment have changed.
To address these problems, one idea is to first build a map describing the environment and the positions of the objects in it, and then perform path planning on that map. However, this path planning approach requires a large number of search operations and may have to traverse every point in the map, which incurs a large computational overhead. Meanwhile, this approach does not consider the relationships between objects in the environment, for example that tables and chairs are usually placed together, cups are usually placed on tables, and so on.
SUMMARY OF THE INVENTION
Embodiments of the present disclosure provide a method, apparatus, device, medium and program product for generating a navigation path of an electronic device.
In a first aspect of the present disclosure, a method for generating a navigation path of an electronic device is provided. The method includes: generating a second map based on a first map, where the first map describes the positions of multiple objects in a scene and the second map describes predicted distances from multiple positions in the scene to a target object among the multiple objects; determining, based on the second map, a candidate path from a target position among the multiple positions to the target object; and selecting, from the candidate paths, a navigation path from the target position to the target object.
In the first aspect of the present disclosure, a method for training a neural network model is also provided. The method includes: acquiring a training data set including multiple scenes and multiple objects; acquiring training labels, the training labels including the positions of the multiple objects in the multiple scenes, the true distances from multiple positions in a scene to a target object among the multiple objects, and the categories of the objects; and training the neural network model based on the training data set and the training labels, wherein the neural network model outputs a map describing predicted distances from multiple positions in the scene to the target object among the multiple objects.
In a second aspect of the present disclosure, an apparatus for generating a navigation path of an electronic device is provided. The apparatus includes: a map generation module configured to generate a second map based on a first map, where the first map describes the positions of multiple objects in a scene and the second map describes predicted distances from multiple positions in the scene to a target object among the multiple objects; a candidate path determination module configured to determine, based on the second map, a candidate path from a target position among the multiple positions to the target object; and a navigation path selection module configured to select, from the candidate paths, a navigation path from the target position to the target object.
In the second aspect of the present disclosure, an apparatus for training a neural network model is also provided. The apparatus includes: a training data acquisition module configured to acquire a training data set including multiple scenes and multiple objects; a training label acquisition module configured to acquire training labels, the training labels including the positions of the multiple objects in the multiple scenes, the true distances from multiple positions in a scene to a target object among the multiple objects, and the categories of the objects; and a training module configured to train the neural network model based on the training data set and the training labels, wherein the neural network model outputs a map describing predicted distances from multiple positions in the scene to the target object among the multiple objects.
In a third aspect of the present disclosure, an electronic device is provided. The electronic device includes: a processor; and a memory for storing one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, cause the electronic device to perform the method according to the first aspect.
In a fourth aspect of the present disclosure, a computer-readable storage medium is provided. One or more computer instructions are stored on the computer-readable storage medium, and the one or more computer instructions are executed by a processor to implement the method according to the first aspect.
In a fifth aspect of the present disclosure, a computer program product is provided. The computer program product includes one or more computer instructions, and the one or more computer instructions are executed by a processor to implement the method according to the first aspect.
This Summary is provided to introduce a selection of concepts in a simplified form, which are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to limit the scope of the claimed subject matter.
The above and other features, advantages and aspects of the embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. In the drawings, identical or similar reference numerals denote identical or similar elements, in which:
FIG. 1 shows a schematic diagram of a usage environment of a method for generating a navigation path of an electronic device according to some embodiments of the present disclosure;
FIG. 2 shows a flowchart of a method for generating a navigation path of an electronic device according to some embodiments of the present disclosure;
FIG. 3A shows a schematic diagram of a second map according to some embodiments of the present disclosure, in which predicted distances are shown;
FIG. 3B shows a schematic diagram of a second map according to some embodiments of the present disclosure, in which specific objects are shown;
FIG. 4 shows a flowchart of a method for training a neural network model according to some embodiments of the present disclosure;
FIG. 5 shows a schematic diagram of a sub-scene according to some embodiments of the present disclosure;
FIG. 6 shows a block diagram of an apparatus for generating a navigation path of an electronic device according to some embodiments of the present disclosure;
FIG. 7 shows a block diagram of an apparatus for training a neural network model according to some embodiments of the present disclosure; and
FIG. 8 shows a block diagram of a computing system in which one or more embodiments of the present disclosure may be implemented.
Throughout the drawings, identical or similar reference numerals denote identical or similar elements.
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and should not be construed as being limited to the embodiments set forth here; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
The term "map" as used in the present disclosure refers to the result of modeling an environment or scene, which is one of the key steps in path planning. Its purpose is to establish a model that a computer can use to perform path planning, that is, to abstract the actual physical space into an abstract space that algorithms can process, thereby realizing the mapping between the physical and the abstract.
The term "path" as used in the present disclosure refers to a route found, during the path search phase, by applying a corresponding algorithm on the basis of the environment model. The path is such that a predetermined function associated with the goal attains an optimal value, and the path does not necessarily lead directly to the target object; it may instead lead to an intermediate goal selected in order to reach the target object.
The terms "training" or "learning" as used herein refer to the process of using experience or data to optimize system performance. For example, a neural network system can gradually optimize its distance-prediction performance through a training or learning process, such as improving the accuracy of the predicted distance. In the context of the present disclosure, the terms "training" and "learning" are used interchangeably for convenience of discussion.
The term "method or model for generating a navigation path of an electronic device" as used herein refers to a method or model built on prior knowledge associated with the color information, depth information, object categories and the like of a specific environment or scene. The method or model can be used, in a navigation task of the electronic device, to find a target object and to bring the electronic device to the target object.
As used herein, the term "comprise" and its variants are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment". Relevant definitions of other terms will be given in the description below.
The inventors have noted that existing map and navigation planning methods can no longer meet the growing demand for electronic devices to perform autonomous tasks. For example, when a domestic service robot performs the task of pouring water at home for the first time, the robot may not even know where the water glass is. Moreover, in traditional navigation tasks, the environment map is constructed in advance, and the navigation target is given as coordinates on the map. In the aforementioned pouring task, however, there is no pre-built map, and the robot does not know the location of the target, only what the target is (for example, the target is a water glass, because the glass must be found before water can be poured). Therefore, the robot has to set target objects for itself, which may include a final target object (for example, the water glass) and intermediate target objects (for example, the area near the table where the glass is, or the chair next to the table), so that it can reasonably plan navigation paths, avoid obstacles, and so on.
The inventors have also found that traditional navigation planning methods do not use prior knowledge to provide a faster, more accurate and more concise navigation path planning process. In a specific environment, such as an indoor scene and especially a home scene, the spatial relationships and distances between objects follow certain rules: chairs are often placed near tables, and cups are usually on tables. When the robot needs to search for a cup, there is a high probability that it can first find an object that is easier to locate (for example, a visually more prominent table or chair).
According to embodiments of the present disclosure, on the basis of a map describing the scene around the robot (hereinafter also referred to as the "first map"), a further map that incorporates prior knowledge about the spatial relationships between objects (hereinafter also referred to as the "second map") is generated to provide predicted distances from multiple positions in the scene to a target object among the multiple objects. In this way, it is easier to find a shorter path when determining candidate paths from a target position among the multiple positions to the target object. In other words, the embodiments described here beneficially exploit the spatial relationships of the objects in the scene and directly use the distance from each position to the target object, without having to perform search operations first. Compared with traditional schemes, this provides a better navigation path, so that the robot can move to the target object efficiently.
In the following description, certain embodiments will be discussed with reference to the working process of a robot, for example, a robot providing domestic services. It should be understood, however, that this is only to better explain the principles and ideas of the embodiments of the present disclosure, and is not intended to limit the scope of the present disclosure in any way. The embodiments described here are also applicable in other scenarios.
FIG. 1 shows a schematic diagram of an environment 100 in which a method for generating a navigation path of an electronic device according to some embodiments of the present disclosure is used. As shown, at an electronic device 101 (such as a robot), color information (e.g., an RGB image) and depth information (e.g., a depth image) of the scene are acquired. Ways to obtain such information include, but are not limited to, obtaining it from a camera mounted on the electronic device, such as the RGBD camera 102. The camera can capture the depth of the space within its field of view and provide a three-dimensional image.
The electronic device 101 is guided to the target object along the navigation path. The electronic device 101 can then perform the operation required by the task, for example, the robot picks up the cup and fills it with water, and so on. The present disclosure does not limit the subsequent operations or actions to be performed by the electronic device.
FIG. 2 shows a flowchart of a method 200 for generating a navigation path of an electronic device according to some embodiments of the present disclosure. For ease of description, the process of generating a navigation path of an electronic device implemented by the method 200 will be described using, as an example, a robot moving from its current position to a table in an indoor home scene. As stated above, however, this is merely exemplary and is not intended to limit the scope of the present disclosure in any way. The embodiments of the method 200 described herein can likewise be used in the navigation process of any other suitable electronic device.
At 201, a second map is generated based on a first map. For example, in this embodiment, the first map describes the positions of multiple objects in the living room.
At the electronic device 101, the first map can be generated using the acquired three-dimensional images and the predetermined categories of the multiple objects in the scene. One example of a first map is a semantic map, which uses a map as a carrier and maps semantics onto it. It can be understood that the semantics represent the categories of the objects in the scene. A category refers to the name of an object, such as "table" or "chair", and these names may also be encoded as codes such as numbers. The first map therefore provides a simplified model. The "semantics" can be learned and acquired from three-dimensional images through models such as classification, detection, and segmentation, but the "semantics" can also be defined by humans, as long as the definition is sufficiently universal and concise.
The first map can be obtained by the methods discussed above. In some embodiments, the first map is a two-dimensional map obtained by projecting the objects onto a plane based on a color image and a depth image of the scene. More specifically, such a first map can be obtained by projecting the scene within the robot's field of view into a bird's-eye-view two-dimensional map, combining information such as the position and posture of the robot, the intrinsic parameters of the camera, and the categories of the objects. This is therefore an efficient abstraction for representing the various kinds of information in the scene.
The second map generated based on the first map describes predicted distances from multiple positions in the scene to a target object among the multiple objects. As an example, FIG. 3A shows a schematic diagram of a second map according to some embodiments of the present disclosure in which predicted distances are shown, where the number in each grid cell represents a predicted distance. FIG. 3B shows yet another example of the second map, in which specific objects are shown. It can be seen that the second map, organized as a grid, reflects the state of the scene in which the robot is located fairly well. Area 301 indicates the boundary between the area the robot has explored and the unknown area. When the robot discovers a specific object (for example, a door, i.e., the area between wall 302 and wall 303), it can preferentially determine the angle θ toward both sides of the door and the corresponding area, so as to better generate candidate paths and navigation paths.
Continuing with the living room example: here, the second map can describe the minimum distances from multiple positions in the living room to the extent of the table, for example, the distance from the sofa to the table, the distance from the television to the table, and so on. In a broader sense, the multiple positions can refer to pixels of the map, with the distance measured from one pixel of the map to a pixel at the target object. This distance is obtained by prediction. For example, a neural network model can be trained to learn the features of the objects in relevant scenes, and the neural network model can then predict the distances from multiple positions in the scene to a target object among the multiple objects. This distance is referred to herein as the predicted distance.
From the second map describing these predicted distances, the robot's current scene can be known in general terms, as well as the positional relationships between the objects in the scene and the target object. Moreover, these positional relationships take the prior knowledge mentioned above into account, and are therefore more accurate.
The generation of the second map is discussed further below with continued reference to FIG. 2. In some embodiments, grid cells may be used, whose size can be correlated with the actual size of the scene. These grid cells facilitate data processing when generating candidate paths and navigation paths, and also help the robot balance real-time performance and economy when exploring the scene.
For example, one grid cell corresponds to a 5 cm x 5 cm area of the living room. This can simplify calculations, save computing resources, and improve efficiency. Note that any specific numerical values described here, as well as elsewhere herein, are merely exemplary and are not intended to limit the scope of the present disclosure.
Correspondingly, each grid cell of the second map stores the predicted path length from that cell to the target object. The predicted distance stored in cells within the extent of the target object and of its support or container (for example, if the target object is a cup on a table, the table is the support) is 0, and the predicted distance stored in cells within the extent of an obstacle is infinite.
In some embodiments, the first map and the second map can be updated based on at least one of the movement time, movement distance, or movement angle of the electronic device 101 exceeding a threshold.
As can be understood from the description above, both the first map and the second map take the robot's point of view, which means that when the robot moves, the viewpoint also changes, and a previously planned path may change and no longer apply. Therefore, to find a balance between real-time performance and computational efficiency, thresholds can be defined, such as thresholds on movement time, movement distance, and movement angle, and the first and second maps are updated only when a threshold is exceeded. This achieves a balance between real-time performance and economy.
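The update trigger can be expressed compactly; the threshold values below are illustrative assumptions, not values taken from the disclosure:

```python
def should_update_maps(moved_time_s, moved_dist_m, turned_deg,
                       t_thresh=2.0, d_thresh=0.25, a_thresh=30.0):
    """Return True when at least one motion quantity exceeds its threshold,
    at which point the first and second maps are rebuilt."""
    return (moved_time_s > t_thresh or
            moved_dist_m > d_thresh or
            abs(turned_deg) > a_thresh)
```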
In some embodiments, the predicted distance can be represented as a continuous distance value or a discrete distance value, where a discrete distance value corresponds to an interval of the continuous distance values.
Representing the predicted distance as discrete values, with each discrete value corresponding to an interval of the continuous values, makes the prediction easier to implement. In some embodiments, the predicted distance can be represented by intervals numbered 0 to 12, where 0 represents a predicted distance of 0 m to 1 m, 1 represents a predicted distance of 1 m to 2 m, and so on. This has advantages in computation, processing, and storage: it allows faster computation and reduces the amount of storage. It also brings a further advantage, namely that it can absorb prediction error, since predicting an exact continuous value is difficult.
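A sketch of the binning just described (intervals numbered 0 to 12, each covering one metre), with the bin width and count exposed as parameters:

```python
import numpy as np

def discretize_distance(dist_m, bin_width_m=1.0, n_bins=13):
    """Map a continuous distance in metres to a discrete bin index:
    bin 0 covers [0, 1) m, bin 1 covers [1, 2) m, and so on; the last bin
    collects everything beyond the covered range."""
    idx = np.floor(np.asarray(dist_m) / bin_width_m).astype(int)
    return np.clip(idx, 0, n_bins - 1)
```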
In some embodiments, generating the second map can further include: splitting the scene in the first map into multiple sub-scenes; and generating the second map based on multiple maps describing the positions of multiple objects in the sub-scenes.
When the scene is large, the whole scene can be explored by first exploring local scenes (that is, exploring within the field of view and planning intermediate goals and paths there) and then composing the local scenes into the global scene. This allows the robot to perform its task even when the new, unknown area is large and the target object is not in the explored area.
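One possible way to realize this local-to-global idea is to tile the global grid into overlapping windows and treat each crop as a sub-scene; the window and stride sizes below are illustrative assumptions:

```python
def split_into_subscenes(global_map, window=120, stride=60):
    """Yield (row, col, crop) tuples that tile a (C, H, W) global grid map
    into overlapping local windows; each crop can be processed as a
    sub-scene and the resulting local second maps merged back by position."""
    C, H, W = global_map.shape
    for r in range(0, max(H - window, 0) + 1, stride):
        for c in range(0, max(W - window, 0) + 1, stride):
            yield r, c, global_map[:, r:r + window, c:c + window]
```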
At 202, candidate paths from a target position among the multiple positions to the target object are determined based on the second map.
In some embodiments, the target position can be the robot's current position. Continuing with the living-room scene: to reach the target object, the robot can move directly toward it, but when it is blocked by an obstacle, for example a sofa, the robot faces the choice of going around it on the left or on the right. As another example, if the target object is not in the living room, the robot faces the choice of leaving the living room and entering another room. Each of these choices corresponds to a candidate path. In particular, because of the limits on computing resources and on the robot's field of view, the robot may not be able to find the target object directly; it may need to explore, or first select an intermediate goal and then reach the target object via that intermediate goal. To this end, embodiments of the present disclosure make use of candidate paths.
There can be multiple ways of determining candidate paths. For example, in some embodiments, a path associated with the predicted distance from the boundary of the second map (for example, the exploration boundary, that is, the dividing line between the area for which a map has been built and the area for which no map has been built yet) to the target object can be selected as a candidate path. For convenience of description, such a candidate path is called the "second path".
If the scene described by the second map is the living room, the second path is planned by taking as its objective the minimum of the sum of the predicted distance from the boundary of the already-explored area of the living room to the target object and the distance from the robot to that boundary. In some embodiments, the following formula, referred to herein as formula (2), can be used to determine the goal or intermediate goal to be selected and the subsequent candidate path:
p_goal = argmin_{p ∈ B_exp} { d(p_agent, p) + L_Dis(p) }    (2)
where p_goal denotes the intermediate goal, d(p_agent, p) denotes the distance from the robot's current position to the boundary of the second map, L_Dis(p) denotes the predicted distance, and B_exp denotes the already-explored range of the second map. The purpose of this formula is to minimize the sum of the two terms inside the braces on the right-hand side.
In this way, the planned path generated is theoretically the shortest path.
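A sketch of this selection rule on the grid representation: frontier cells of the explored region are extracted, and the cell minimizing the sum of the distance from the robot and the predicted distance is returned. The Euclidean cell distance stands in for d(p_agent, p), which is an assumption rather than the disclosure's exact definition.

```python
import numpy as np

def select_frontier_goal(pred_dist, explored, agent_rc):
    """Pick the intermediate goal on the frontier of the explored region that
    minimizes d(p_agent, p) + L_Dis(p). `pred_dist` is the second map,
    `explored` a boolean mask of B_exp, `agent_rc` the robot cell (row, col)."""
    # Frontier: explored cells adjacent to at least one unexplored cell.
    pad = np.pad(explored, 1, constant_values=False)
    all_nbrs_explored = (pad[:-2, 1:-1] & pad[2:, 1:-1] &
                         pad[1:-1, :-2] & pad[1:-1, 2:])
    frontier = explored & ~all_nbrs_explored
    rr, cc = np.nonzero(frontier)
    if rr.size == 0:
        return None                                   # nothing left to explore
    d_agent = np.hypot(rr - agent_rc[0], cc - agent_rc[1])
    score = d_agent + pred_dist[rr, cc]
    k = int(np.argmin(score))
    return rr[k], cc[k]
```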
In some embodiments, a path associated with the predicted distance from the target position to the target object can be selected as a candidate path. For convenience of description, such a candidate path is also called the "first path". If the scene described by the second map is the living room, the first path is planned by taking as the goal or intermediate goal the position, within the map boundary (for example, the boundary of the already-explored area), with the smallest predicted distance to the target object. In some embodiments, the following formula, referred to herein as formula (1), can be used to determine the intermediate goal to be selected and the subsequent candidate path:
p_goal = argmin_{p} { L_Dis(p) }    (1)
The purpose of this formula is to minimize the value of L_Dis(p). It can be seen that in this case the selection of the intermediate goal does not take the robot's current position into account and is therefore more efficient.
In some embodiments, a path associated with the angle or boundary from the target object to a predetermined specific object in the scene can be selected as a candidate path. For convenience of description, such a candidate path is also called the "third path".
Suppose the scene described by the second map is the living room and there is a specific object (for example, a door) at the boundary of the living room; note that this specific object can be predetermined. In this case, the position with the smallest predicted distance within the extent of the door is preferentially selected as the intermediate goal. The position of the door can be a position within a range, so it can be understood that this intermediate goal can be associated with the angle θ_d from the target position to the specific object (for example, 120 degrees) or with the boundary of that range. In some embodiments, a formula, referred to herein as formula (3), can be used to determine the goal or intermediate goal to be selected and the corresponding candidate path by selecting, within B_d, the position with the smallest predicted distance L_Dis(p), where B_d denotes the region defined from the explored range B_exp based on the specific object, and p_door denotes the probability that a door (or another predetermined specific object, such as a corridor) is present. This probability is obtained from the neural network model. In some embodiments, a cross-entropy loss can be used to train the category of the door (or other specific object) so as to determine its probability more accurately. It can be seen that this path lets the robot temporarily skip objects unrelated to the target object (for example, other rooms) and preferentially search objects related to the target object (for example, the room in which the target object is located).
Because the candidate paths are generated based on different strategies, these strategies provide mechanisms for the robot to determine a path when faced with such choices. These mechanisms provide intermediate goals on the way to the target object and paths to reach those intermediate goals. By continually updating the intermediate goal, the target object is eventually reached. This allows the robot to retain the ability to provide a navigation path even in a new scene (for example, a scene it has never explored).
In particular, in some embodiments the second map can be generated by a neural network model based on the first map. The neural network model takes the first map and the categories of the objects in the scene and generates the second map.
In some embodiments, the neural network used to generate the second map can be trained with a dataset of first maps that embody the spatial relationships of the objects in scenes, together with the categories of those objects. Example embodiments of this aspect are described below with reference to FIG. 4.
Continuing with FIG. 2, at 203, a navigation path from the target position to the target object is selected from the candidate paths.
In some embodiments, based on at least one of the first path, the second path, and the third path, and based on the target position and the target object, a path planning algorithm is used to generate the navigation path of the electronic device. It should be understood, however, that the scope of the present disclosure is not limited to the examples of determining candidate paths described above; other suitable approaches can also be used.
In some embodiments, based on one of the first path, the second path, and the third path, the fast marching method or the A* path planning algorithm can be used to provide the navigation path. Other path planning algorithms can also be used to provide the navigation path, and the present disclosure places no restriction on this.
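For illustration, a standard A* planner over a 4-connected occupancy grid is sketched below; the fast marching method or any other planner could be substituted without changing the rest of the pipeline:

```python
import heapq

def astar(occupancy, start, goal):
    """Plan a 4-connected grid path from `start` to `goal` that avoids
    occupied cells, using A* with a Manhattan heuristic."""
    H, W = occupancy.shape
    heuristic = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    counter = 0                       # tie-breaker so heap entries stay comparable
    open_set = [(heuristic(start), counter, start)]
    parent = {start: None}
    g_cost = {start: 0}
    while open_set:
        _, _, cur = heapq.heappop(open_set)
        if cur == goal:               # reconstruct the path back to the start
            path = [cur]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if not (0 <= nxt[0] < H and 0 <= nxt[1] < W) or occupancy[nxt]:
                continue
            ng = g_cost[cur] + 1
            if ng < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = ng
                parent[nxt] = cur
                counter += 1
                heapq.heappush(open_set, (ng + heuristic(nxt), counter, nxt))
    return None                       # goal unreachable
```

The returned list of grid cells can then be handed to the robot's motion controller as the navigation path from the target position to the selected goal.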
It can be seen that, according to embodiments of the present disclosure, the spatial relationships between the objects in the scene (that is, prior knowledge) are fully taken into account, yielding a second map that better matches reality and encodes this prior knowledge. In the second map, the predicted distance of the paths the robot can actually travel can also be described in a simplified form (that is, predicted distances represented as discrete values), so that when generating candidate paths and the navigation path there is no need to search every point of the map, which saves a large amount of computing resources and improves efficiency. When facing a new scene or an unknown environment, exploration can proceed from local to global, achieving navigation to the target object. Because the maps are updated when the movement time, movement distance, or movement angle exceeds a threshold, the robot can strike a good balance between real-time performance and economy.
As described above, in some embodiments the second map can be generated from the first map by a neural network. FIG. 4 shows a flowchart of a method 400 for training such a neural network model according to some embodiments of the present disclosure. It will be appreciated that the training and the use of the neural network can take place at the same or at different locations; that is, method 200 and method 400 can be performed by the same entity or by different entities.
At 401, a training dataset including multiple scenes and multiple objects is acquired.
In some embodiments, these scenes can be pre-built standard environments, each arranged as needed so that the neural network model learns specified features. The object categories can include various items that may be placed in practical applications, such as beds, sofas, tables, and so on.
At 402, training labels are acquired. The training labels include the positions of the multiple objects in the multiple scenes, the true distances from multiple positions in the scenes to a target object among the multiple objects, and the categories of the objects.
In some embodiments, the values of these positions, distances, and categories (referred to herein as true values or true distances) can be annotated in advance in the scenes of 401. These values are taken as sample labels to train the neural network model. Because these sample labels are specifically prepared, include the prior knowledge mentioned above, and embody the characteristics of the scenes, the trained neural network model can generate second maps having those scene characteristics, promoting the accuracy of candidate paths and navigation paths.
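One common way to produce such true-distance labels on a grid is a breadth-first search from every target cell over traversable cells; this is offered as an illustrative assumption about how the annotation could be computed, not as the disclosure's own procedure:

```python
from collections import deque
import numpy as np

def true_distance_labels(target_mask, obstacle_mask):
    """Breadth-first search from every target cell over traversable cells,
    giving the shortest grid distance used as the 'true distance' label;
    unreachable and obstacle cells remain infinity."""
    H, W = target_mask.shape
    dist = np.full((H, W), np.inf)
    q = deque()
    for r, c in zip(*np.nonzero(target_mask)):
        dist[r, c] = 0.0
        q.append((r, c))
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < H and 0 <= nc < W and not obstacle_mask[nr, nc]
                    and dist[nr, nc] > dist[r, c] + 1):
                dist[nr, nc] = dist[r, c] + 1
                q.append((nr, nc))
    return dist
```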
In some embodiments, training the neural network model can further include: splitting the scene into multiple sub-scenes; and training the neural network model based on training labels that include the positions of the multiple objects in the sub-scenes and the true distances from multiple positions in the sub-scenes to a target object among the multiple objects.
In some embodiments, training can first be performed with sub-scenes of a particular size and then progressively extended to the whole scene, completing the training for the entire scene.
At 403, the neural network model is trained based on the training dataset and the training labels, where the neural network model outputs a map describing the predicted distances from multiple positions in the scene to a target object among the multiple objects.
In some embodiments, the neural network model can be a fully convolutional neural network with three downsampling ResBlock layers and three upsampling ResBlock layers, with low-level feature maps concatenated with the upsampled feature maps at each layer. The output of the neural network is the predicted distance. The number of output channels can be set to n_b * n_T, where n_b relates to the discrete representation of the predicted distance (each grid cell representing an area with a side length of, for example, 5 cm) and n_T denotes the number of target categories. In this way, every n_b channels form a group responsible for predicting the predicted distance for one target, so many groups of object types and output predicted distances can be trained and predicted, improving efficiency.
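The following PyTorch sketch shows one plausible shape of such a network: three downsampling and three upsampling residual stages with skip concatenations, and an output of n_b * n_T channels grouped as n_b distance bins per target category. Channel widths, normalization, and upsampling choices are assumptions, and the input height and width are assumed divisible by 8.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, stride, 1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 3, 1, 1), nn.BatchNorm2d(c_out))
        self.skip = (nn.Conv2d(c_in, c_out, 1, stride)
                     if (stride != 1 or c_in != c_out) else nn.Identity())
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.skip(x))

class DistanceMapNet(nn.Module):
    """Fully convolutional encoder-decoder: three downsampling and three
    upsampling residual stages, with low-level features concatenated into
    the decoder; the head predicts n_b distance bins for each of n_T targets."""
    def __init__(self, in_ch, n_b=13, n_T=10, base=32):
        super().__init__()
        self.n_b, self.n_T = n_b, n_T
        self.d1 = ResBlock(in_ch, base, stride=2)
        self.d2 = ResBlock(base, base * 2, stride=2)
        self.d3 = ResBlock(base * 2, base * 4, stride=2)
        self.u3 = ResBlock(base * 4, base * 2)
        self.u2 = ResBlock(base * 2 + base * 2, base)
        self.u1 = ResBlock(base + base, base)
        self.head = nn.Conv2d(base + in_ch, n_b * n_T, 1)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)

    def forward(self, x):
        f1 = self.d1(x)                                   # H/2
        f2 = self.d2(f1)                                  # H/4
        f3 = self.d3(f2)                                  # H/8
        y = self.up(self.u3(f3))                          # H/4
        y = self.up(self.u2(torch.cat([y, f2], dim=1)))   # H/2
        y = self.up(self.u1(torch.cat([y, f1], dim=1)))   # H
        logits = self.head(torch.cat([y, x], dim=1))      # (B, n_b*n_T, H, W)
        B, _, H, W = logits.shape
        return logits.view(B, self.n_T, self.n_b, H, W)   # per-target bin logits
```

Each group of n_b channels can then be trained with a per-cell classification loss against the discretized true-distance labels for the corresponding target category.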
In some embodiments, training the neural network model further includes: when the position of the target object is not in the scene, training the neural network model using the true distance from a target position among the multiple positions to the boundary of the scene; and/or, when the position of the target object is not in the sub-scene, training the neural network model using the true distance from a target position among the multiple positions to the boundary of the sub-scene.
It can be seen that a neural network model trained by the method 400 described above can accurately classify the objects in a scene, and its prediction of the distance between the target position and the target object is not only accurate but also free of the error that possibly inaccurate continuous values would introduce, increasing the robot's robustness in practical application environments. Given the real-time requirements of robot motion, the computational overhead of updating the maps is also reduced thanks to the high computational efficiency of this neural network model.
FIG. 5 shows a schematic diagram of a sub-scene according to some embodiments of the present disclosure.
It can be seen that when the scene is large, for example when the target object is initially outside the robot's field of view or the whole scene has not yet been explored, a robot searching for a navigation path to a chair can preferentially move to the vicinity of the table, precisely because of the prior knowledge provided by the second map. In practical applications, this implementation therefore exhibits the aforementioned advantages.
FIG. 6 shows a block diagram of an apparatus 600 for generating a navigation path for an electronic device according to some embodiments of the present disclosure. The apparatus includes: a map generation module 601 configured to, at the electronic device 101, generate a second map based on a first map, where the first map describes the positions of multiple objects in a scene and the second map describes the predicted distances from multiple positions in the scene to a target object among the multiple objects; a candidate path determination module 602 configured to determine, based on the second map, candidate paths from a target position among the multiple positions to the target object; and a navigation path selection module 603 configured to select, from the candidate paths, a navigation path from the target position to the target object.
In some embodiments, determining the candidate paths can include determining at least one of the following paths: a first path associated with the predicted distance from the target position to the target object; a second path associated with the predicted distance from the boundary of the second map to the target object; and a third path associated with the angle or boundary from the target object to a predetermined specific object in the scene.
In some embodiments, formula (1) can be used to determine the goal or intermediate goal related to the first path; formula (2) can be used to determine the goal or intermediate goal related to the second path; and formula (3) can be used to determine the goal or intermediate goal related to the third path. For a detailed description of these formulas, refer to the description of method 200.
In some embodiments, selecting the navigation path can include: based on at least one of the first path, the second path, and the third path, and based on the target position and the target object, generating the navigation path of the electronic device using a path planning algorithm.
In some embodiments, the first map is a two-dimensional map obtained by projecting the objects onto a plane based on a color image and a depth image of the scene.
In some embodiments, the apparatus can further include a map update module 604 configured to update the first map and the second map based on at least one of the movement time, movement distance, or movement angle of the electronic device 101 exceeding a threshold.
In some embodiments, the predicted distance can be represented as a continuous distance value or a discrete distance value, where a discrete distance value corresponds to an interval of the continuous distance values.
In some embodiments, the second map generation module is further configured to: split the scene in the first map into multiple sub-scenes; and generate the second map based on multiple maps describing the positions of multiple objects in the sub-scenes.
In some embodiments, the second map is generated by a neural network model. The neural network model takes the first map and the categories of the objects in the scene and generates the second map.
For the specific implementation of the apparatus 600, refer to the description of method 200, which is not repeated here. It can be understood that the apparatus 600 of the present disclosure can achieve the same technical effects as method 200, and thus at least one of the advantages of the method 200 for generating a navigation path for an electronic device described above.
In some embodiments, the second map in the apparatus 600 can be generated by a neural network model trained on the basis of an apparatus 700. The neural network model can be trained using the apparatus 700 of FIG. 7. FIG. 7 shows a block diagram of an apparatus 700 for training a neural network model according to some embodiments of the present disclosure. The apparatus 700 includes a training data acquisition module 701 configured to acquire a training dataset including multiple scenes and multiple objects. The apparatus further includes a training label acquisition module 702 configured to acquire training labels, the training labels including the positions of the multiple objects in the multiple scenes, the true distances from multiple positions in the scenes to a target object among the multiple objects, and the categories of the objects. The apparatus further includes a training module 703 configured to train the neural network model based on the training dataset and the training labels, where the neural network model outputs a map describing the predicted distances from multiple positions in the scene to a target object among the multiple objects.
In some embodiments, the training data acquisition module 701 is further configured to split the scene into multiple sub-scenes and acquire a training dataset including the multiple sub-scenes and multiple objects. The training label acquisition module 702 is further configured to acquire training labels that include the positions of the multiple objects in the sub-scenes and the true distances from multiple positions in the sub-scenes to a target object among the multiple objects.
In some embodiments, the training module 703 is further configured to: when the position of the target object is not in the scene, train the neural network model using the true distance from a target position among the multiple positions to the boundary of the scene; and/or, when the position of the target object is not in the sub-scene, train the neural network model using the true distance from a target position among the multiple positions to the boundary of the sub-scene.
It can be understood that a neural network model trained by the apparatus 700 described above can not only solve the navigation path planning problem when the robot performs tasks, but can also provide the best route for the robot to explore a scene, enabling it to quickly grasp an overview of the scene it is in. Therefore, at least one of the advantages of method 400 and of the other advantages described above can be provided.
FIG. 8 shows a block diagram of a computing system 800 in which one or more embodiments of the present disclosure can be implemented. The method 200 of FIG. 2 and the method 400 of FIG. 4 can be implemented by the computing system 800. The computing system 800 shown in FIG. 8 is only an example and should not impose any limitation on the functionality and scope of use of the implementations described herein.
As shown in FIG. 8, the computing system 800 takes the form of a general-purpose computing device. The components of the computing system 800 can include, but are not limited to, one or more processors or processing units 800, a memory 820, one or more input devices 830, one or more output devices 840, a storage device 850, and one or more communication units 860. The processing unit 800 can be a real or virtual processor and is capable of performing various processes according to instructions stored in the memory 820. In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power.
The computing system 800 typically includes a plurality of computer media. Such media can be any available media accessible to the computing system 800, including, but not limited to, volatile and non-volatile media and removable and non-removable media. The memory 820 can be volatile memory (for example, registers, cache, random access memory (RAM)), non-volatile memory (for example, read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof. The storage device 850 can be removable or non-removable and can include machine-readable media, such as a flash drive, a magnetic disk, or any other medium that can be used to store information and that can be accessed within the computing system 800.
The computing system 800 can further include additional removable/non-removable, volatile/non-volatile computer system storage media. Although not shown in FIG. 8, a magnetic disk drive for reading from or writing to a removable, non-volatile magnetic disk (for example, a "floppy disk") and an optical disk drive for reading from or writing to a removable, non-volatile optical disk can be provided. In these cases, each drive can be connected to the bus by one or more data-media interfaces. The memory 820 can include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the various embodiments described herein.
A program/utility tool 822 having a set of one or more execution modules 824 can be stored, for example, in the memory 820. The execution modules 824 can include, but are not limited to, an operating system, one or more application programs, other program modules, and operating data. Each of these examples, or a particular combination thereof, can include an implementation of a networked environment. The execution modules 824 generally carry out the functions and/or methods of the embodiments of the subject matter described herein, such as method 200.
The input unit 830 can be one or more of various input devices. For example, the input unit 830 can include a user device such as a mouse, keyboard, trackball, and the like. The communication unit 860 enables communication to further computing entities over a communication medium. Additionally, the functions of the components of the computing system 800 can be implemented as a single computing cluster or as multiple computing machines capable of communicating over communication connections. Thus, the computing system 800 can operate in a networked environment using logical connections to one or more other servers, networked personal computers (PCs), or another general network node. By way of example and not limitation, communication media include wired or wireless networking technologies.
The computing system 800 can also, as needed, communicate with one or more external devices (not shown), such as storage devices, display devices, and the like; with one or more devices that enable users to interact with the computing system 800; or with any device (for example, a network card, modem, and the like) that enables the computing system 800 to communicate with one or more other computing devices. Such communication can be performed via input/output (I/O) interfaces (not shown).
The functions described herein can be performed, at least in part, by one or more hardware logic components. By way of example and not limitation, illustrative types of hardware logic components that can be used include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), and so on.
Program code for implementing the methods of the subject matter described herein can be written in any combination of one or more programming languages. The program code can be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code can execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present disclosure, a machine-readable medium can be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Furthermore, although the operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are contained in the discussion above, these should not be construed as limitations on the scope of the subject matter described herein. Certain features that are described in the context of separate implementations can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable sub-combination.
Some example implementations of the present disclosure are listed below.
In some embodiments of a first aspect, a method for generating a navigation path for an electronic device is provided. The method includes: generating a second map based on a first map, where the first map describes the positions of multiple objects in a scene and the second map describes the predicted distances from multiple positions in the scene to a target object among the multiple objects; determining, based on the second map, candidate paths from a target position among the multiple positions to the target object; and selecting, from the candidate paths, a navigation path from the target position to the target object.
In some embodiments, determining the candidate paths includes determining at least one of the following paths: a first path associated with the predicted distance from the target position to the target object; a second path associated with the predicted distance from the boundary of the second map to the target object; and a third path associated with the angle or boundary from the target object to a predetermined specific object in the scene.
In some embodiments, selecting the navigation path includes: based on at least one of the first path, the second path, and the third path, and based on the target position and the target object, generating the navigation path of the electronic device using a path planning algorithm.
In some embodiments, the first map is a two-dimensional map obtained by projecting the objects onto a plane based on a color image and a depth image of the scene.
In some embodiments, the method further includes: updating the first map and the second map based on at least one of the movement time, movement distance, or movement angle of the electronic device exceeding a threshold.
In some embodiments, the predicted distance is represented as a continuous distance value or a discrete distance value, where a discrete distance value corresponds to an interval of the continuous distance values.
In some embodiments, generating the second map further includes: splitting the scene in the first map into multiple sub-scenes; and generating the second map based on multiple maps describing the positions of multiple objects in the sub-scenes.
In some embodiments, the second map is generated by a neural network model.
In some embodiments, the neural network model is trained by the following method. The method includes: acquiring a training dataset including multiple scenes and multiple objects; acquiring training labels, the training labels including the positions of the multiple objects in the multiple scenes, the true distances from multiple positions in the scenes to a target object among the multiple objects, and the categories of the objects; and training the neural network model based on the training dataset and the training labels, where the neural network model outputs a map describing the predicted distances from multiple positions in the scene to a target object among the multiple objects.
In some embodiments, training the neural network model further includes: splitting the scene in the first map into multiple sub-scenes; and training the neural network model based on training labels that include the positions of the multiple objects in the sub-scenes and the true distances from multiple positions in the sub-scenes to a target object among the multiple objects.
In some embodiments, training the neural network model further includes: when the position of the target object is not in the scene, training the neural network model using the true distance from a target position among the multiple positions to the boundary of the scene; and/or, when the position of the target object is not in the sub-scene, training the neural network model using the true distance from a target position among the multiple positions to the boundary of the sub-scene.
In embodiments of a second aspect, an apparatus for generating a navigation path for an electronic device is provided. The apparatus includes: a map generation module configured to generate a second map based on a first map, where the first map describes the positions of multiple objects in a scene and the second map describes the predicted distances from multiple positions in the scene to a target object among the multiple objects; a candidate path determination module configured to determine, based on the second map, candidate paths from a target position among the multiple positions to the target object; and a navigation path selection module configured to select, from the candidate paths, a navigation path from the target position to the target object.
In some embodiments, determining the candidate paths includes determining at least one of the following paths: a first path associated with the predicted distance from the target position to the target object; a second path associated with the predicted distance from the boundary of the second map to the target object; and a third path associated with the angle or boundary from the target object to a predetermined specific object in the scene.
In some embodiments, selecting the navigation path includes: based on at least one of the first path, the second path, and the third path, and based on the target position and the target object, generating the navigation path of the electronic device using a path planning algorithm.
In some embodiments, the first map is a two-dimensional map obtained by projecting the objects onto a plane based on a color image and a depth image of the scene.
In some embodiments, the apparatus further includes: a map update module configured to update the first map and the second map based on at least one of the movement time, movement distance, or movement angle of the electronic device exceeding a threshold.
In some embodiments, the predicted distance is represented as a continuous distance value or a discrete distance value, where a discrete distance value corresponds to an interval of the continuous distance values.
In some embodiments, the second map generation module is further configured to: split the scene in the first map into multiple sub-scenes; and generate the second map based on multiple maps describing the positions of multiple objects in the sub-scenes.
In some embodiments, the second map is generated by a neural network model.
In some embodiments, the neural network model is trained by a neural network apparatus, which includes: a training data acquisition module configured to acquire a training dataset including multiple scenes and multiple objects; a training label acquisition module configured to acquire training labels, the training labels including the positions of the multiple objects in the multiple scenes, the true distances from multiple positions in the scenes to a target object among the multiple objects, and the categories of the objects; and a training module configured to train the neural network model based on the training dataset and the training labels, where the neural network model outputs a map describing the predicted distances from multiple positions in the scene to a target object among the multiple objects.
In some embodiments, the training data acquisition module is further configured to split the scene into multiple sub-scenes and acquire a training dataset including the multiple sub-scenes and multiple objects; and the training label acquisition module is further configured to acquire training labels that include the positions of the multiple objects in the sub-scenes and the true distances from multiple positions in the sub-scenes to a target object among the multiple objects.
In some embodiments, the training module is further configured to: when the position of the target object is not in the scene, train the neural network model using the true distance from a target position among the multiple positions to the boundary of the scene; and/or, when the position of the target object is not in the sub-scene, train the neural network model using the true distance from a target position among the multiple positions to the boundary of the sub-scene.
In embodiments of a third aspect, an electronic device is provided. The electronic device includes a processor and a memory; the memory is configured to store one or more computer instructions, where the one or more computer instructions, when executed by the processor, cause the electronic device to perform the method according to the first aspect.
In embodiments of a fourth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores one or more computer instructions, where the one or more computer instructions are executed by a processor to implement the method according to the first aspect.
In embodiments of a fifth aspect, a computer program product is provided. The computer program product includes one or more computer instructions, where the one or more computer instructions, when executed by a processor, implement the method according to the first aspect.
Although the present disclosure has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.
Claims (20)
- A method of generating a navigation path for an electronic device, comprising: generating a second map based on a first map, the first map describing positions of a plurality of objects in a scene, the second map describing predicted distances from a plurality of positions in the scene to a target object among the plurality of objects; determining, based on the second map, candidate paths from a target position among the plurality of positions to the target object; and selecting, from the candidate paths, a navigation path from the target position to the target object.
- The method according to claim 1, wherein determining the candidate paths comprises determining at least one of the following paths: a first path associated with the predicted distance from the target position to the target object; a second path associated with the predicted distance from a boundary of the second map to the target object; and a third path associated with an angle or boundary from the target object to a predetermined specific object in the scene.
- The method according to claim 2, wherein selecting the navigation path comprises: generating the navigation path using a path planning algorithm based on at least one of the first path, the second path, and the third path, and based on the target position and the target object.
- The method according to claim 1, wherein the first map is a two-dimensional map obtained by projecting the respective objects onto a plane based on a color image and a depth image of the scene.
- The method according to claim 1, further comprising: updating the first map and the second map based on at least one of a movement time, a movement distance, or a movement angle of the electronic device exceeding a threshold.
- The method according to claim 1, wherein the predicted distance is represented as a continuous distance value or a discrete distance value, wherein the discrete distance value corresponds to an interval of the continuous distance values.
- The method according to claim 1, wherein generating the second map further comprises: splitting the scene in the first map into a plurality of sub-scenes; and generating the second map based on a plurality of maps describing positions of a plurality of objects in the sub-scenes.
- The method according to claim 1, wherein the second map is generated by a neural network model.
- The method according to claim 8, wherein the neural network model is trained by: acquiring a training dataset comprising a plurality of scenes and a plurality of objects; acquiring training labels, the training labels comprising positions of the plurality of objects in the plurality of scenes, true distances from a plurality of positions in the scenes to a target object among the plurality of objects, and categories of the objects; and training the neural network model based on the training dataset and the training labels.
- The method according to claim 9, wherein the neural network model is further trained by: splitting the scene in the first map into a plurality of sub-scenes; and training the neural network model based on training labels comprising positions of a plurality of objects in the sub-scenes and the true distances from a plurality of positions in the sub-scenes to a target object among the plurality of objects.
- The method according to claim 9, wherein the neural network model is further trained by: when the position of the target object is not in the scene, training the neural network model using the true distance from a target position among the plurality of positions to a boundary of the scene; and/or, when the position of the target object is not in the sub-scene, training the neural network model using the true distance from a target position among the plurality of positions to a boundary of the sub-scene.
- An apparatus for generating a navigation path for an electronic device, comprising: a map generation module configured to generate a second map based on a first map, the first map describing positions of a plurality of objects in a scene, the second map describing predicted distances from a plurality of positions in the scene to a target object among the plurality of objects; a candidate path determination module configured to determine, based on the second map, candidate paths from a target position among the plurality of positions to the target object; and a navigation path selection module configured to select, from the candidate paths, a navigation path from the target position to the target object.
- The apparatus according to claim 12, wherein determining the candidate paths comprises determining at least one of the following paths: a first path associated with the predicted distance from the target position to the target object; a second path associated with the predicted distance from a boundary of the second map to the target object; and a third path associated with an angle or boundary from the target object to a predetermined specific object in the scene.
- The apparatus according to claim 13, wherein selecting the navigation path comprises: generating the navigation path using a path planning algorithm based on at least one of the first path, the second path, and the third path, and based on the target position and the target object.
- The apparatus according to claim 12, wherein the first map is a two-dimensional map obtained by projecting the respective objects onto a plane based on a color image and a depth image of the scene.
- The apparatus according to claim 12, further comprising: a map update module configured to update the first map and the second map based on at least one of a movement time, a movement distance, or a movement angle of the electronic device exceeding a threshold.
- The apparatus according to claim 12, wherein the second map is generated by a neural network model, the neural network model being trained by a neural network apparatus, the neural network apparatus comprising: a training data acquisition module configured to acquire a training dataset comprising a plurality of scenes and a plurality of objects; a training label acquisition module configured to acquire training labels, the training labels comprising positions of the plurality of objects in the plurality of scenes, true distances from a plurality of positions in the scenes to a target object among the plurality of objects, and categories of the objects; and a training module configured to train the neural network model based on the training dataset and the training labels.
- An electronic device, comprising: a processor; and a memory configured to store one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, cause the electronic device to perform the method according to any one of claims 1 to 11.
- A computer-readable storage medium having stored thereon one or more computer instructions, wherein the one or more computer instructions are executed by a processor to implement the method according to any one of claims 1 to 11.
- A computer program product comprising one or more computer instructions, wherein the one or more computer instructions, when executed by a processor, implement the method according to any one of claims 1 to 11.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111327724.3 | 2021-11-10 | ||
CN202111327724.3A CN114061586B (zh) | 2021-11-10 | 2021-11-10 | 用于生成电子设备的导航路径的方法和产品 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023082985A1 true WO2023082985A1 (zh) | 2023-05-19 |
Family
ID=80274651
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/127124 WO2023082985A1 (zh) | 2021-11-10 | 2022-10-24 | 用于生成电子设备的导航路径的方法和产品 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114061586B (zh) |
WO (1) | WO2023082985A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116798030A (zh) * | 2023-08-28 | 2023-09-22 | 中国建筑第六工程局有限公司 | 曲面观光雷达高塔验收方法、系统、装置及存储介质 |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114061586B (zh) * | 2021-11-10 | 2024-08-16 | 北京有竹居网络技术有限公司 | 用于生成电子设备的导航路径的方法和产品 |
CN115046559A (zh) * | 2022-06-13 | 2022-09-13 | 阿波罗智联(北京)科技有限公司 | 信息处理方法和装置 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106289285A (zh) * | 2016-08-20 | 2017-01-04 | 南京理工大学 | 一种关联场景的机器人侦察地图及构建方法 |
CN109931942A (zh) * | 2019-03-13 | 2019-06-25 | 浙江大华技术股份有限公司 | 机器人路径生成方法、装置、机器人和存储介质 |
US20190354781A1 (en) * | 2018-05-17 | 2019-11-21 | GM Global Technology Operations LLC | Method and system for determining an object location by using map information |
CN111340766A (zh) * | 2020-02-21 | 2020-06-26 | 北京市商汤科技开发有限公司 | 目标对象的检测方法、装置、设备和存储介质 |
CN111982094A (zh) * | 2020-08-25 | 2020-11-24 | 北京京东乾石科技有限公司 | 导航方法及其装置和系统以及可移动设备 |
US20210281977A1 (en) * | 2020-03-05 | 2021-09-09 | Xerox Corporation | Indoor positioning system for a mobile electronic device |
CN114061586A (zh) * | 2021-11-10 | 2022-02-18 | 北京有竹居网络技术有限公司 | 用于生成电子设备的导航路径的方法和产品 |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8554464B2 (en) * | 2008-04-30 | 2013-10-08 | K-Nfb Reading Technology, Inc. | Navigation using portable reading machine |
WO2019236588A1 (en) * | 2018-06-04 | 2019-12-12 | The Research Foundation For The State University Of New York | System and method associated with expedient determination of location of one or more object(s) within a bounded perimeter of 3d space based on mapping and navigation to a precise poi destination using a smart laser pointer device |
CN113039563B (zh) * | 2018-11-16 | 2024-03-12 | 辉达公司 | 学习生成用于训练神经网络的合成数据集 |
US11454978B2 (en) * | 2019-11-07 | 2022-09-27 | Naver Corporation | Systems and methods for improving generalization in visual navigation |
US11244470B2 (en) * | 2020-03-05 | 2022-02-08 | Xerox Corporation | Methods and systems for sensing obstacles in an indoor environment |
CN113048980B (zh) * | 2021-03-11 | 2023-03-14 | 浙江商汤科技开发有限公司 | 位姿优化方法、装置、电子设备及存储介质 |
CN113284240B (zh) * | 2021-06-18 | 2022-05-31 | 深圳市商汤科技有限公司 | 地图构建方法及装置、电子设备和存储介质 |
CN113570664B (zh) * | 2021-07-22 | 2023-03-24 | 北京百度网讯科技有限公司 | 增强现实导航显示方法和装置、电子设备、计算机介质 |
- 2021-11-10: CN application CN202111327724.3A — patent CN114061586B (zh), status Active
- 2022-10-24: WO application PCT/CN2022/127124 — publication WO2023082985A1 (zh), status unknown
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106289285A (zh) * | 2016-08-20 | 2017-01-04 | 南京理工大学 | 一种关联场景的机器人侦察地图及构建方法 |
US20190354781A1 (en) * | 2018-05-17 | 2019-11-21 | GM Global Technology Operations LLC | Method and system for determining an object location by using map information |
CN109931942A (zh) * | 2019-03-13 | 2019-06-25 | 浙江大华技术股份有限公司 | 机器人路径生成方法、装置、机器人和存储介质 |
CN111340766A (zh) * | 2020-02-21 | 2020-06-26 | 北京市商汤科技开发有限公司 | 目标对象的检测方法、装置、设备和存储介质 |
US20210281977A1 (en) * | 2020-03-05 | 2021-09-09 | Xerox Corporation | Indoor positioning system for a mobile electronic device |
CN111982094A (zh) * | 2020-08-25 | 2020-11-24 | 北京京东乾石科技有限公司 | 导航方法及其装置和系统以及可移动设备 |
CN114061586A (zh) * | 2021-11-10 | 2022-02-18 | 北京有竹居网络技术有限公司 | 用于生成电子设备的导航路径的方法和产品 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116798030A (zh) * | 2023-08-28 | 2023-09-22 | 中国建筑第六工程局有限公司 | 曲面观光雷达高塔验收方法、系统、装置及存储介质 |
CN116798030B (zh) * | 2023-08-28 | 2023-11-14 | 中国建筑第六工程局有限公司 | 曲面观光雷达高塔验收方法、系统、装置及存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN114061586A (zh) | 2022-02-18 |
CN114061586B (zh) | 2024-08-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2023082985A1 (zh) | 用于生成电子设备的导航路径的方法和产品 | |
US10297070B1 (en) | 3D scene synthesis techniques using neural network architectures | |
US20220237885A1 (en) | Systems and methods for extracting information about objects from scene information | |
US10705525B2 (en) | Performing autonomous path navigation using deep neural networks | |
Stachniss et al. | Simultaneous localization and mapping | |
JP2021509215A (ja) | 地面テクスチャ画像に基づくナビゲーション方法、装置、デバイス、および記憶媒体 | |
US11281221B2 (en) | Performing autonomous path navigation using deep neural networks | |
Chaplot et al. | Differentiable spatial planning using transformers | |
Wang et al. | Autonomous 3-d reconstruction, mapping, and exploration of indoor environments with a robotic arm | |
Mortari et al. | " Improved geometric network model"(IGNM): A novel approach for deriving connectivity graphs for indoor navigation | |
Liu et al. | Bird's-Eye-View Scene Graph for Vision-Language Navigation | |
CN110806211A (zh) | 机器人自主探索建图的方法、设备及存储介质 | |
Guizilini et al. | Dynamic hilbert maps: Real-time occupancy predictions in changing environments | |
Kaufman et al. | Bayesian occupancy grid mapping via an exact inverse sensor model | |
Li et al. | Stereovoxelnet: Real-time obstacle detection based on occupancy voxels from a stereo camera using deep neural networks | |
Belavadi et al. | Frontier exploration technique for 3d autonomous slam using k-means based divisive clustering | |
Warburg et al. | Sparseformer: Attention-based depth completion network | |
CN109389677B (zh) | 房屋三维实景地图的实时构建方法、系统、装置及存储介质 | |
Loo et al. | Scene Action Maps: Behavioural Maps for Navigation without Metric Information | |
Qiu et al. | 3D scene graph prediction on point clouds using knowledge graphs | |
CN114943785A (zh) | 地图构建方法、装置、设备及存储介质 | |
Steenbeek | CNN based dense monocular visual SLAM for indoor mapping and autonomous exploration | |
Xie et al. | A survey of filtering based active localization methods | |
Felicioni et al. | Goln: Graph object-based localization network | |
WO2024187403A1 (zh) | 寻路和寻路图神经网络的训练方法、设备、介质和计算机程序产品 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22891782; Country of ref document: EP; Kind code of ref document: A1 |