CN111582352A - Object-based sensing method and device, robot and storage medium

Object-based sensing method and device, robot and storage medium

Info

Publication number
CN111582352A
CN111582352A (application number CN202010363102.5A)
Authority
CN
China
Prior art keywords
robot
point cloud
cloud data
distance
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010363102.5A
Other languages
Chinese (zh)
Other versions
CN111582352B (en)
Inventor
Zhang Weiyi
Shen Xiaotong
Qin Baoxing
Cheng Haotian
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Gaussian Automation Technology Development Co Ltd
Original Assignee
Shanghai Gaussian Automation Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Gaussian Automation Technology Development Co Ltd filed Critical Shanghai Gaussian Automation Technology Development Co Ltd
Priority to CN202010363102.5A priority Critical patent/CN111582352B/en
Publication of CN111582352A publication Critical patent/CN111582352A/en
Application granted granted Critical
Publication of CN111582352B publication Critical patent/CN111582352B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Abstract

The invention discloses an object-based sensing method and device, a robot, and a storage medium. The method comprises the following steps: establishing a graph structure from at least one piece of collected point cloud data; clustering the point cloud data according to the weights of the edges in the graph structure to obtain at least one clustered class labeled with a semantic tag; determining, according to the at least one clustered class labeled with a semantic tag, the number of objects whose distance from the robot is smaller than a preset first distance threshold; and controlling the motion of the robot according to that number. When sensing objects, the method clusters the collected point cloud data according to a graph structure built from the data, realizing semantic segmentation and thereby controlling the robot's motion. The clustering process is simple and efficient, which improves object-sensing efficiency and ensures both the safety of the robot and the safety of its surroundings while the robot moves.

Description

Object-based sensing method and device, robot and storage medium
Technical Field
The embodiment of the invention relates to the technical field of robot perception, in particular to a perception method and device based on an object, a robot and a storage medium.
Background
With the continuous development of computer technology, the maturing of robot sensor equipment, and the popularization of robot operating systems, robots are increasingly taking over work from people. Perception during robot motion means analyzing and computing over a series of sensor data to obtain the robot's understanding of its surroundings. Perception technology has been successfully applied in fields such as autonomous driving, ports and docks, and household services. Robust and accurate perception is one of the core technologies of a robot and a basic requirement for realizing autonomous navigation. When executing a task, the robot must react to its surroundings in time according to sensor information to meet safety requirements.
At present, when a robot performs perception while moving, it can cluster the collected point cloud data according to their characteristic attributes and determine from the clustering result whether objects affecting its operation exist. The characteristic attributes of point cloud data may include texture, normal vector, color information, Euclidean distance, and the like. The specific process can be: determine the characteristic attributes of the point cloud data, cluster according to these attributes, and segment the point cloud data with different attributes.
However, because the characteristic attributes of the point cloud data must be determined first, the above process has low sensing efficiency, and the safety of the robot while moving is correspondingly low.
Disclosure of Invention
The invention provides an object-based sensing method and device, a robot, and a storage medium, aiming to solve the technical problem that existing sensing methods have low object-sensing efficiency.
In a first aspect, an embodiment of the present invention provides an object-based sensing method, including:
establishing a graph structure according to the collected at least one point cloud data;
clustering the point cloud data according to the weights of the edges in the graph structure, and obtaining at least one clustered class labeled with a semantic tag;
determining, according to the at least one clustered class labeled with a semantic tag, the number of objects whose distance from the robot is smaller than a preset first distance threshold;
and controlling the robot according to the number of objects whose distance from the robot is smaller than the preset first distance threshold.
In the method as shown above, after controlling the robot according to the number of objects whose distance from the robot is less than the preset first distance threshold, the method further includes:
determining, according to the point cloud data, the ratio of the number of point cloud data whose distance from the robot is smaller than a preset second distance threshold to the total number of point cloud data in one frame;
and controlling the motion of the robot according to the ratio.
In this implementation, combining the two judgment conditions reduces the probability of misjudgment and improves the accuracy of object sensing, so the robot's motion can be controlled accurately.
In the method, the set of vertices in the graph structure is a set of coordinate values of the point cloud data, the set of edges in the graph structure is a set of edges connecting adjacent point cloud data, and the weight of the edges of the adjacent point cloud data is determined according to the distance between the adjacent point cloud data;
clustering the point cloud data according to the weights of the edges in the graph structure includes:
sorting the edges in ascending order of weight to obtain a sorted edge set;
if the dissimilarity between two edges in the sorted edge set that do not share a common vertex is smaller than a preset dissimilarity threshold, clustering the point cloud data connected by the two edges into one class, to obtain each class after at least one primary clustering;
and clustering each primarily clustered class again, according to the farthest distance between the point cloud data it includes, to form classes labeled with semantic tags.
In this implementation, each class after at least one primary clustering is obtained from the graph structure, these classes are then clustered again, and the final result is classes labeled with semantic tags. On one hand, the clustering process is simple and efficient, which improves object-sensing efficiency; on the other hand, re-clustering the primary clusters according to the farthest distance between the point cloud data they include yields labeled classes from which the robot can readily sense objects, improving the accuracy of object sensing and allowing the robot's motion to be controlled better.
In the method shown above, clustering each primarily clustered class again according to the farthest distance between the point cloud data it includes, to form classes labeled with semantic tags, includes:
determining the farthest distance between the point cloud data included in each primarily clustered class;
determining each primarily clustered class whose farthest distance falls within a preset distance range as an effective class;
and clustering the effective classes again, based on semantic segmentation, to form classes labeled with semantic tags.
In this implementation, screening out effective classes before clustering them into labeled classes improves the accuracy of the resulting semantic-tagged classes and, in turn, the accuracy of the robot's perception.
In the above method, clustering the effective classes again based on semantic segmentation to form classes labeled with semantic tags includes:
taking the smallest serial number among the point cloud data included in the effective class as the serial number of the effective class, where the serial numbers of the point cloud data are formed by numbering the collected point cloud data in order from left to right;
sorting the effective classes in ascending order of their serial numbers to obtain a sorted effective class set;
determining the distance between adjacent effective classes in the sorted effective class set;
if the distance between adjacent effective classes is smaller than a preset third distance threshold, merging the adjacent effective classes into one class labeled with a semantic tag;
and if the distance between adjacent effective classes is greater than or equal to the preset third distance threshold, treating the adjacent effective classes as two classes each labeled with a semantic tag.
In this implementation, each primarily clustered class is clustered again to form classes labeled with semantic tags. On one hand, the clustering process is simple and efficient, further improving object-sensing efficiency; on the other hand, the resulting labeled classes let the robot sense objects readily, improving the accuracy of object sensing.
In the method, the object is a person, the class labeled with a semantic tag is a person class, the effective class is a human-leg class, the preset distance range is a preset width range of a human leg, and the preset third distance threshold is a preset width between two legs.
This implementation enables the robot to perceive when it is surrounded by a crowd of onlookers and to stop moving in time when that is detected, protecting both the people around it and itself.
In the method shown above, determining, according to the at least one clustered class labeled with a semantic tag, the number of objects whose distance from the robot is less than a preset first distance threshold includes:
determining the number of classes labeled with semantic tags whose distance from the robot is smaller than the preset first distance threshold as the number of objects whose distance from the robot is smaller than the preset first distance threshold.
This implementation determines the robot's surroundings from the number of labeled classes closer to the robot than the preset first distance threshold; the process is simple and efficient and improves the robot's perception efficiency.
In the method as shown above, controlling the motion of the robot according to the number of objects whose distance from the robot is less than the preset first distance threshold includes:
if the number of objects whose distance from the robot is smaller than the preset first distance threshold is greater than or equal to a preset object-number threshold, controlling the robot to stop moving.
In this implementation, the robot is controlled to stop moving once the number of objects closer than the preset first distance threshold reaches the preset object-number threshold, protecting the safety of the robot and of its surroundings.
In the method as shown above, if the number of objects whose distance from the robot is smaller than the preset first distance threshold is greater than or equal to the preset object-number threshold, controlling the robot to stop moving includes:
if the number of objects whose distance from the robot is smaller than the preset first distance threshold is greater than or equal to the preset object-number threshold, setting the object flag bit of the frame corresponding to the at least one point cloud data to a valid value;
and if the number of consecutive frames whose object flag bits are valid values is greater than a preset frame threshold, controlling the robot to stop moving.
This implementation controls the robot's motion by combining the sensing results of multiple frames, improving the accuracy of object sensing and thus allowing the robot's motion to be controlled accurately.
In the method as shown above, before building the graph structure from the at least one point cloud data collected by the robot, the method further includes:
acquiring at least one piece of collected original point cloud data;
and deleting, from the at least one original point cloud data, those whose distance from the robot is not within a preset robot detection distance range, to form the filtered at least one point cloud data.
This implementation deletes invalid point cloud data, improving the accuracy of object sensing.
In a second aspect, an embodiment of the present invention provides an object-based sensing apparatus, including:
the establishing module is used for establishing a graph structure according to the collected at least one point cloud data;
the clustering module is used for clustering the point cloud data according to the weights of the edges in the graph structure to obtain at least one clustered class labeled with a semantic tag;
the determining module is used for determining, according to the at least one clustered class labeled with a semantic tag, the number of objects whose distance from the robot is smaller than a preset first distance threshold;
and the control module is used for controlling the motion of the robot according to the number of objects whose distance from the robot is smaller than the preset first distance threshold.
In the apparatus as described above, the apparatus further comprises:
the determining control module is used for determining the ratio of the number of the point cloud data with the distance from the robot being smaller than a preset second distance threshold value to the total number of the point cloud data in one frame according to the point cloud data; and controlling the motion of the robot according to the ratio.
In the apparatus as described above, the set of vertices in the graph structure is a set of coordinate values of the point cloud data, the set of edges in the graph structure is a set of edges connecting adjacent point cloud data, weights of the edges of adjacent point cloud data are determined according to distances between the adjacent point cloud data, and the clustering module specifically includes:
the obtaining submodule is used for sorting the edges in ascending order of weight and obtaining a sorted edge set;
the first clustering submodule is used for clustering the point cloud data connected by two edges in the sorted edge set that do not share a common vertex into one class, to obtain each class after at least one primary clustering, if the dissimilarity between the two edges is smaller than a preset dissimilarity threshold;
and the second clustering submodule is used for clustering each primarily clustered class again, according to the farthest distance between the point cloud data it includes, to form classes labeled with semantic tags.
In the apparatus as shown above, the second clustering submodule is specifically configured to:
determining the farthest distance between the point cloud data included in each class after the primary clustering;
determining each primarily clustered class whose farthest distance falls within a preset distance range as an effective class;
and clustering the effective classes again, based on semantic segmentation, to form classes labeled with semantic tags.
In the apparatus as described above, in terms of clustering the effective classes again based on semantic segmentation to form classes labeled with semantic tags, the second clustering submodule is specifically configured to:
taking the smallest serial number among the point cloud data included in the effective class as the serial number of the effective class, where the serial numbers of the point cloud data are formed by numbering the collected point cloud data in order from left to right;
sorting the effective classes in ascending order of their serial numbers to obtain a sorted effective class set;
determining the distance between adjacent effective classes in the sorted effective class set;
if the distance between adjacent effective classes is smaller than a preset third distance threshold, merging the adjacent effective classes into one class labeled with a semantic tag;
and if the distance between adjacent effective classes is greater than or equal to the preset third distance threshold, treating the adjacent effective classes as two classes each labeled with a semantic tag.
In the apparatus as shown above, the class labeled with a semantic tag is a person class, the effective class is a human-leg class, the preset distance range is a preset width range of a human leg, and the preset third distance threshold is a preset width between two legs.
In the apparatus as shown above, the determining module is specifically configured to determine the number of classes labeled with semantic tags whose distance from the robot is smaller than a preset first distance threshold as the number of objects whose distance from the robot is smaller than the preset first distance threshold.
In the apparatus as described above, the control module includes:
and the control submodule is used for controlling the robot to stop moving if the number of the objects with the distance from the robot smaller than a preset first distance threshold value is larger than or equal to a preset object number threshold value.
In the above-described apparatus, the control sub-module is specifically configured to:
if the number of objects whose distance from the robot is smaller than the preset first distance threshold is greater than or equal to the preset object-number threshold, setting the object flag bit of the frame corresponding to the at least one point cloud data to a valid value;
and if the number of consecutive frames whose object flag bits are valid values is greater than a preset frame threshold, controlling the robot to stop moving.
In the above apparatus, before the graph structure is built from the at least one point cloud data collected by the robot, the apparatus further includes:
the acquisition module, used for acquiring the at least one piece of collected original point cloud data;
and the deleting module, used for deleting, from the at least one original point cloud data, those whose distance from the robot is not within a preset robot detection distance range, so as to form the filtered at least one point cloud data.
In a third aspect, an embodiment of the present invention further provides a robot, including:
one or more processors;
a memory for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the object-based sensing method provided in the first aspect.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the object-based sensing method as provided in the first aspect.
The embodiments of the invention provide an object-based sensing method and device, a robot, and a storage medium. The method comprises: establishing a graph structure from at least one piece of collected point cloud data; clustering the point cloud data according to the weights of the edges in the graph structure to obtain at least one clustered class labeled with a semantic tag; determining, according to the at least one clustered class labeled with a semantic tag, the number of objects whose distance from the robot is smaller than a preset first distance threshold; and controlling the robot's motion according to that number. When sensing objects, the method clusters the collected point cloud data according to a graph structure built from the data, realizes semantic segmentation, determines the number of objects closer than the preset first distance threshold, and controls the robot's motion accordingly. The object-based sensing method of this embodiment therefore ensures the safety of the robot while it moves and, at the same time, the safety of its surroundings.
Drawings
Fig. 1 is a schematic diagram of an application scenario of an object-based sensing method according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of an object-based sensing method according to an embodiment of the present invention;
fig. 3 is a schematic flow chart of primary clustering of point cloud data in the object-based sensing method according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a process of primary clustering of point cloud data in the object-based sensing method according to an embodiment of the present invention;
Fig. 5 is a schematic flow chart illustrating the determination of classes labeled with semantic tags in the object-based sensing method according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an object-based sensing apparatus according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a robot according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Fig. 1 is a schematic diagram of an application scenario of an object-based sensing method according to an embodiment of the present invention. As shown in fig. 1, when the robot 11 is moving, for example navigating autonomously in a known map, it must sense the objects in the surrounding environment that affect its movement, so that its motion can be controlled according to the sensing result. The object in this embodiment may be a stationary object in the environment, such as the table 12, a moving object, such as the person 13, or another creature, such as a dog. The object-based sensing method of this embodiment clusters the point cloud data collected by the robot, realizes semantic segmentation, and determines the number of objects whose distance from the robot is smaller than a preset first distance threshold, thereby controlling the robot's motion.
Fig. 2 is a schematic flowchart of an object-based sensing method according to an embodiment of the present invention. The embodiment is suitable for a scene in which the robot senses objects in the surrounding environment in the moving process. The object-based perceiving method may be performed by an object-based perceiving apparatus, which may be implemented by software and/or hardware, and may be integrated in a robot. As shown in fig. 2, the object-based sensing method provided in this embodiment includes the following steps:
step 201: and establishing a graph structure according to the acquired at least one point cloud data.
Specifically, in this embodiment, the robot may collect point cloud data through a laser sensor disposed thereon. The point cloud data in this embodiment may be two-dimensional or three-dimensional point cloud data.
Optionally, the vertex set in the graph structure is a set of coordinate values of the point cloud data, the edge set in the graph structure is a set of edges connecting adjacent point cloud data, and the weight of the edges of the adjacent point cloud data is determined according to the distance between the adjacent point cloud data.
The coordinate values of the point cloud data may be the coordinate values of the collected points in the robot coordinate system. For two-dimensional point cloud data, the robot coordinate system is constructed by taking a point on the robot as the origin, the robot's forward direction as the X axis, and the robot's left direction as the Y axis. For three-dimensional point cloud data, the direction perpendicular to the X-Y plane is additionally taken as the Z axis. The point on the robot may be any point, for example the midpoint of the line connecting the driving wheels.
In one implementation, to improve the accuracy of object sensing, before step 201 the at least one piece of collected original point cloud data may be obtained, and the original point cloud data whose distance from the robot is not within a preset robot detection distance range may be deleted, forming the filtered at least one point cloud data used in step 201. That is, original point cloud data beyond the robot's detection range are deleted; the deleted original point cloud data are invalid point cloud data.
The robot detection distance range referred to here may be (range_min, range_max), where range_min is the robot's nearest detection distance and range_max is its farthest detection distance. Both can be set according to the parameters of the laser sensor mounted on the robot. Illustratively, range_min may be a value between 3 cm and 5 cm.
Original point cloud data not within the preset robot detection distance range are those whose distance from the robot is less than range_min or greater than range_max.
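As a point of reference, a minimal Python sketch of this pre-filtering might look as follows; the function name and the numeric defaults are illustrative assumptions rather than values from the patent, and two-dimensional point cloud data expressed in the robot coordinate system are assumed:

```python
import math

def filter_by_range(raw_points, range_min=0.04, range_max=25.0):
    # Keep only raw point cloud data whose Euclidean distance to the robot
    # (the origin of the robot coordinate system) lies strictly inside
    # (range_min, range_max); everything outside is invalid and dropped.
    # Defaults are illustrative: range_min of 3-5 cm matches the example
    # above, range_max depends on the laser sensor.
    return [(x, y) for (x, y) in raw_points
            if range_min < math.hypot(x, y) < range_max]
```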
In this embodiment, the distance between point cloud data and the robot refers to their Euclidean distance. Assuming the coordinate value of point cloud data 1 is (x₁, y₁), the distance between point cloud data 1 and the robot is √(x₁² + y₁²).
The distance between adjacent point cloud data in the graph structure likewise refers to their Euclidean distance. Assuming the adjacent point cloud data are point cloud data 2 and point cloud data 3, with coordinate values (x₂, y₂) and (x₃, y₃) respectively, the distance between them is √((x₂ − x₃)² + (y₂ − y₃)²).
When the point cloud data are collected, there is an order among them based on the scanning sequence of the laser sensor. In this embodiment, adjacent point cloud data are connected, and these connecting lines are the edges of the graph structure. Assuming the robot collects n point cloud data, the vertex set of the graph structure contains n vertices and the edge set contains n−1 edges. Each edge has a weight, namely the distance between the two point cloud data it connects; the weight represents the dissimilarity between those two point cloud data.
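A minimal sketch of this graph construction, under the same assumptions as the filtering sketch above (two-dimensional points in the robot coordinate system; all names are illustrative):

```python
import math

def build_graph(points):
    # Vertices are the coordinate values of the point cloud data. Edges
    # connect scan-adjacent points, and an edge's weight is the Euclidean
    # distance between the two points it connects, i.e. their dissimilarity.
    # n points yield n vertices and n - 1 edges.
    vertices = list(points)
    edges = [(math.hypot(points[i][0] - points[i + 1][0],
                         points[i][1] - points[i + 1][1]),
              i, i + 1)                      # (weight, index, index)
             for i in range(len(points) - 1)]
    return vertices, edges
```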
Step 202: clustering the point cloud data according to the weights of the edges in the graph structure, and obtaining at least one clustered class labeled with a semantic tag.
Specifically, after the graph structure is determined, the point cloud data are clustered according to the weights of its edges, realizing semantic segmentation and yielding at least one clustered class labeled with a semantic tag. Illustratively, a class labeled with a semantic tag may be a person class.
In one implementation, point cloud data connected by edges whose weights fall within the same threshold range may be grouped into one class.
In another implementation, the point cloud data may be clustered according to the edge weights using a graph-based segmentation (Graphseg) algorithm. An important feature of the Graphseg algorithm is that it preserves detail in low-variability regions while ignoring detail in high-variability regions.
In another implementation, clustering may be performed twice based on the Graphseg algorithm. The specific process can be as follows: sort the edges of the graph structure in ascending order of weight to obtain a sorted edge set; if the dissimilarity between two edges in the sorted edge set that do not share a common vertex is smaller than a preset dissimilarity threshold, cluster the point cloud data connected by the two edges into one class, obtaining each class after at least one primary clustering; and cluster each primarily clustered class again, according to the farthest distance between the point cloud data it includes, to form classes labeled with semantic tags.
More specifically, the second clustering can proceed as follows: determine the farthest distance between the point cloud data included in each primarily clustered class; determine the primarily clustered classes whose farthest distance falls within a preset distance range as effective classes; and cluster the effective classes again, based on semantic segmentation, to form classes labeled with semantic tags. Screening out effective classes before clustering them into labeled classes improves the accuracy of the resulting semantic-tagged classes and thus the accuracy of the robot's perception.
The farthest distance can be found by enumeration: compute the distance between each point cloud datum and every other point cloud datum in the primarily clustered class, and select the largest of these distances. In one specific scenario, when the object is a person, the effective class may be a human-leg class, and the preset distance range may be the preset width range of a human leg, namely leg_width ± thres, where leg_width may be 0.2 meters and thres may be 0.05 meters.
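A minimal sketch of this validity test, using the illustrative leg_width and thres values above; the enumeration of pairwise distances follows the description, and the function name is an assumption:

```python
import math

def is_effective_class(cls, points, leg_width=0.2, thres=0.05):
    # cls: a set of point indices from one primarily clustered class.
    # The farthest pairwise distance is found by enumeration, as described
    # above; the class is effective when it falls within leg_width ± thres.
    farthest = max(math.hypot(points[a][0] - points[b][0],
                              points[a][1] - points[b][1])
                   for a in cls for b in cls)
    return leg_width - thres <= farthest <= leg_width + thres
```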
In this implementation, each class after at least one primary clustering is obtained from the graph structure, those classes are clustered again, and the final result is classes labeled with semantic tags. On one hand, the clustering process is simple and efficient, which improves object-sensing efficiency; on the other hand, re-clustering the primary clusters according to the farthest distance between the point cloud data they include yields labeled classes from which the robot can readily sense objects, improving the accuracy of object sensing and allowing the robot's motion to be controlled better. The Graphseg-based clustering and the process of forming classes labeled with semantic tags are described in detail later.
Each primarily clustered class includes at least one point cloud datum, and different classes may include the same or different numbers of point cloud data; the same holds for each class labeled with a semantic tag.
Step 203: determining, according to the at least one clustered class labeled with a semantic tag, the number of objects whose distance from the robot is smaller than a preset first distance threshold.
Specifically, after clustering, the number of objects whose distance from the robot is less than the preset first distance threshold may be determined based on the classes labeled with semantic tags. More specifically, the number of labeled classes whose distance from the robot is less than the preset first distance threshold may be taken as the number of such objects.
The distance between a class labeled with a semantic tag and the robot may be the distance between any point cloud datum in the class and the robot, or the maximum or minimum of the distances between all point cloud data in the class and the robot, and so on; this embodiment does not limit the choice.
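A minimal sketch of this counting step; taking the minimum point-to-robot distance as the class distance is only one of the options the text allows, and the names are illustrative:

```python
import math

def count_near_objects(labeled_classes, points, first_distance_threshold):
    # labeled_classes: sets of point indices, one set per class labeled
    # with a semantic tag. The robot sits at the origin of its own frame.
    def class_distance(cls):
        return min(math.hypot(*points[i]) for i in cls)
    return sum(class_distance(cls) < first_distance_threshold
               for cls in labeled_classes)
```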
Step 204: controlling the motion of the robot according to the number of objects whose distance from the robot is less than the preset first distance threshold.
In one implementation, if the number of objects whose distance from the robot is less than the preset first distance threshold is greater than or equal to the preset object number threshold, the robot is controlled to stop moving.
In another implementation, to improve the accuracy of object sensing and avoid false perception, an object flag bit may be set in the robot, one per frame. If the number of objects whose distance from the robot is smaller than the preset first distance threshold is greater than or equal to the preset object-number threshold, the object flag bit of the frame corresponding to the at least one point cloud data is set to a valid value; if the number of consecutive frames whose object flag bits are valid values is greater than a preset frame threshold, the robot is controlled to stop moving. Illustratively, the preset frame threshold may be 10; in other words, if the object flag bits of 10 consecutive frames are all valid values, the robot is controlled to stop moving.
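A minimal sketch of this multi-frame check; the class and method names are illustrative, and the comparison reads the example above as "at least 10 consecutive valid frames":

```python
class ObjectFlagFilter:
    # One object flag bit per frame; stop only after the flag has been a
    # valid value for frame_threshold consecutive frames.
    def __init__(self, frame_threshold=10):
        self.frame_threshold = frame_threshold
        self.consecutive_valid = 0

    def should_stop(self, near_object_count, object_count_threshold):
        if near_object_count >= object_count_threshold:
            self.consecutive_valid += 1   # flag bit set to a valid value
        else:
            self.consecutive_valid = 0    # streak broken, reset the count
        return self.consecutive_valid >= self.frame_threshold
```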
This implementation controls the robot's motion by combining the sensing results of multiple frames, improving the accuracy of object sensing and thus allowing the motion to be controlled accurately.
In another implementation, after step 204, the object-based sensing method of this embodiment may further determine, from the point cloud data, the ratio of the number of point cloud data whose distance from the robot is smaller than a preset second distance threshold to the total number of point cloud data in the frame, and control the robot's motion according to that ratio. Note that the total number in one frame refers to the total number of point cloud data included in that frame.
More specifically, when the ratio of the number of point cloud data whose distance from the robot is smaller than the preset second distance threshold to the total number of point cloud data in one frame is greater than a preset ratio threshold, the robot is controlled to stop moving.
Optionally, the second distance threshold may be less than the first distance threshold.
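A minimal sketch of this ratio check (names are illustrative; a non-empty two-dimensional frame in the robot coordinate system is assumed):

```python
import math

def occlusion_exceeded(frame_points, second_distance_threshold, ratio_threshold):
    # Fraction of the frame's points closer to the robot than the second
    # distance threshold; a large fraction means the laser sensor is
    # heavily occluded.
    near = sum(1 for (x, y) in frame_points
               if math.hypot(x, y) < second_distance_threshold)
    return near / len(frame_points) > ratio_threshold
```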
In this implementation, when the ratio of the number of point cloud data whose distance from the robot is smaller than the preset second distance threshold to the total number of point cloud data in one frame exceeds the preset ratio threshold, the laser sensor is occluded to a large extent. Relying on only one condition, either the occlusion of the laser sensor or the number of objects closer than the preset first distance threshold, makes misjudgment likely. For example, relying only on occlusion: the sensor may be only slightly occluded (the near-point ratio at or below the ratio threshold) while the number of objects closer than the preset first distance threshold already reaches the preset object-number threshold; the robot would keep moving among those objects, reducing the safety of the surroundings and of itself. Conversely, relying only on the object count: the number of nearby objects may be below the object-number threshold while the near-point ratio exceeds the ratio threshold; the robot would keep moving although the laser sensor is severely occluded, again reducing safety.
Combining the two judgment conditions therefore reduces the probability of misjudgment, improves the accuracy of object sensing, and allows the robot's motion to be controlled accurately.
In a more specific scenario, the object in this embodiment may be a person. The embodiment then enables the robot to perceive that it is surrounded by a crowd and to stop moving in time when this is detected, protecting the people around it and the robot itself.
The object-based sensing method of this embodiment comprises: establishing a graph structure from at least one piece of collected point cloud data; clustering the point cloud data according to the weights of the edges in the graph structure to obtain at least one clustered class labeled with a semantic tag; determining, according to the labeled classes, the number of objects whose distance from the robot is smaller than a preset first distance threshold; and controlling the robot's motion according to that number. Because the collected point cloud data are clustered according to a graph structure built from the data, semantic segmentation is realized, the number of nearby objects is determined, and the robot's motion is controlled accordingly; the method thus ensures the safety of the robot while it moves and, at the same time, the safety of its surroundings.
This embodiment also provides another object-based sensing method, describing in detail the first (primary) clustering of the point cloud data, building on the embodiment shown in fig. 1 and its alternatives. Fig. 3 is a schematic flow chart of the primary clustering of point cloud data in the object-based sensing method according to an embodiment of the present invention. As shown in fig. 3, if the dissimilarity between two edges in the sorted edge set that do not share a common vertex is smaller than the preset dissimilarity threshold, the point cloud data connected by the two edges are grouped into one class, and each class after at least one primary clustering is obtained through the following steps:
step 301: and acquiring the first edge of the connected point cloud data which is not clustered and arranged at the first position in the sorted edge set.
Step 302: and acquiring a second edge which is arranged at a second position and is not clustered by the connected point cloud data in the sorted edge set, wherein the connected point cloud data is different from the point cloud data connected with the first edge.
Step 303: a dissimilarity of the first edge with the second edge is determined.
Step 304: and if the dissimilarity degree of the first edge and the second edge is smaller than a preset dissimilarity degree threshold value, the point cloud data connected with the second edge and the point cloud data connected with the first edge are gathered into one type.
Step 305: and in the sorted edge set, the edge which is arranged behind the second edge and is not clustered by the connected point cloud data, is different from the point cloud data connected with the first edge, is used as a new second edge, the step of determining the dissimilarity degree of the first edge and the second edge is returned to be executed until the edge which is arranged behind the second edge and is not clustered by the connected point cloud data and is different from the point cloud data connected with the first edge does not exist in the sorted edge set, and the return execution is stopped.
Step 306: and taking the edge which is not clustered by the connected point cloud data and is arranged behind the first edge in the sorted edge set as a new first edge, returning to execute the step of acquiring a second edge which is not clustered by the connected point cloud data and is arranged at the second position, wherein the connected point cloud data is not clustered, and the connected point cloud data is different from the point cloud data connected with the first edge, until the first edge which is not clustered by the connected point cloud data and is arranged at the first position does not exist in the sorted edge set, or the number of the edges which are not clustered by the connected point cloud data in the sorted edge set is 1, or the second edge which is not clustered by the connected point cloud data and is arranged at the second position and is different from the point cloud data connected with the first edge does not exist in the sorted edge set, stopping returning to execute, and acquiring each class after primary clustering.
Fig. 4 is a schematic diagram of the process of primarily clustering point cloud data in the object-based sensing method according to an embodiment of the present invention. The detailed implementation of steps 301 to 306 is described below with reference to fig. 4. As shown in fig. 4, assume 7 point cloud data are collected, denoted P1-P7 from left to right. The edge between P1 and P2 is denoted e12, the edge between P2 and P3 is e23, the edge between P3 and P4 is e34, the edge between P4 and P5 is e45, the edge between P5 and P6 is e56, and the edge between P6 and P7 is e67. That is, in the graph structure established from the point cloud data, the vertex set is (P1, P2, P3, P4, P5, P6, P7) and the edge set is (e12, e23, e34, e45, e56, e67).
Assume the sorted edge set is (e45, e34, e23, e12, e56, e67). The sorted edges may also be renumbered, for example as (e1, e2, e3, e4, e5, e6), where e45 corresponds to e1, e34 to e2, e23 to e3, e12 to e4, e56 to e5, and e67 to e6.
Step 301 is performed first. At the start of clustering, the first-ranked edge whose connected point cloud data are not clustered is e1, i.e., e45.
Step 302 is then performed. At the start of clustering, the second-ranked edge whose connected point cloud data are not clustered and differ from those of the first edge is e3, i.e., e23 (e2, i.e., e34, ranked after e1 in the sorted edge set, shares the vertex P4 with e1).
Step 303 is performed next. Illustratively, the dissimilarity of the first edge and the second edge may be determined as follows: determine the distances between each of the two point cloud data connected by the first edge and each of the point cloud data connected by the second edge, and take the minimum of these distances as the dissimilarity. Using this minimum improves the accuracy and efficiency of the subsequent clustering. Specifically, in this example, the two point cloud data connected by the first edge e45 are P4 and P5, and the two connected by the second edge e23 are P2 and P3. The distances between P4 and P2, P4 and P3, P5 and P2, and P5 and P3 are determined, and the smallest of the four is taken as the dissimilarity of e45 and e23.
Step 304 is then performed. If the dissimilarity of the first edge and the second edge is smaller than the preset dissimilarity threshold, the point cloud data connected by the second edge and those connected by the first edge are clustered into one class. The preset dissimilarity threshold may be determined by the performance of the laser sensor: the better the sensor, the smaller the threshold. Assuming the dissimilarity of e45 and e23 is smaller than the threshold, P2, P3, P4 and P5 are clustered into one class.
In another case, if the dissimilarity between the first edge and the second edge is greater than or equal to the preset dissimilarity threshold, the point cloud data connected with the second edge and the point cloud data connected with the first edge are not clustered into one class.
Thereafter, step 305 is performed: the edge ranked after the second edge whose connected point cloud data are not clustered and differ from those of the first edge becomes the new second edge. After step 304, P2, P3, P4 and P5 have been clustered; P2, connected by e12 (ranked after e23 in the sorted edge set), is already clustered, so e12 cannot be the new second edge, and e56 shares a vertex with the first edge e45. Therefore e67 becomes the new second edge. Returning to step 303, the dissimilarity of e45 and e67 is determined, and step 304 is performed again. Assuming this dissimilarity is greater than or equal to the preset dissimilarity threshold, e67 and e45 are not clustered into one class. Step 305 is then executed again; since no edge ranked after the second edge has unclustered connected point cloud data differing from those of the first edge, the returning stops.
Step 306 is then performed: the edge ranked after the first edge whose connected point cloud data are not clustered becomes the new first edge. In the sorted edge set, P3 and P4 (connected by e34), P2 (connected by e12) and P5 (connected by e56) have already been clustered, so e67 becomes the new first edge. However, only one edge with unclustered connected point cloud data remains, so the stop condition of step 306 is met, the returning stops, and the classes after primary clustering are obtained: (P2, P3, P4, P5), (P1), (P6) and (P7), i.e., 4 classes.
Further, in a specific application, if the stop condition of step 306 is not met when it is executed, the process returns to step 302.
In step 304, if the dissimilarity of the first edge and the second edge is smaller than the preset dissimilarity threshold, the number of the second edge may be updated to the number of the first edge; when the classes after primary clustering are finally collected, the point cloud data connected by edges with the same number form one class.
More specifically, a flag bit indicating whether clustering has been performed may be set for each point cloud datum; once a datum is clustered, its flag bit is set to a valid value. When acquiring the first and second edges, the flag bit of each point cloud datum can be queried first to determine whether that datum has already been clustered.
In this embodiment, an edge whose connected point cloud data are not clustered refers to an edge neither of whose connected point cloud data has been clustered.
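Putting steps 301 to 306 together, a minimal Python sketch of the primary clustering might read as follows. It is one possible reading of the steps, not the authoritative implementation; on the P1-P7 example above it reproduces the classes (P2, P3, P4, P5), (P1), (P6) and (P7):

```python
import math

def primary_cluster(points, dissim_threshold):
    # points: scan-ordered list of (x, y) tuples; returns point-index sets.
    def dist(a, b):
        return math.hypot(points[a][0] - points[b][0],
                          points[a][1] - points[b][1])

    # Edges connect scan-adjacent points; sort by weight, ascending.
    edges = sorted((dist(i, i + 1), i, i + 1)
                   for i in range(len(points) - 1))

    clustered = [False] * len(points)   # per-point clustering flag bit
    classes = []

    def unclustered(edge):
        _, a, b = edge
        return not clustered[a] and not clustered[b]

    def dissim(e1, e2):
        # Step 303: minimum of the four distances between an endpoint
        # of e1 and an endpoint of e2.
        return min(dist(p, q) for p in e1[1:] for q in e2[1:])

    fi = 0
    while True:
        remaining = [k for k in range(fi, len(edges)) if unclustered(edges[k])]
        if len(remaining) <= 1:          # stop conditions of step 306
            break
        first = edges[remaining[0]]
        group = {first[1], first[2]}
        merged = False
        for k in remaining[1:]:
            second = edges[k]
            pts = {second[1], second[2]}
            if pts & group:              # shares a vertex / already merged
                continue
            if dissim(first, second) < dissim_threshold:   # step 304
                group |= pts
                merged = True
        if merged:
            for p in group:
                clustered[p] = True      # set the flag bit to a valid value
            classes.append(group)
        fi = remaining[0] + 1            # step 306: move past the first edge

    # Points never merged remain singleton classes (P1, P6, P7 above).
    classes.extend({p} for p in range(len(points)) if not clustered[p])
    return classes
```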
In the object-based sensing method of this embodiment, the point cloud data are primarily clustered by the Graphseg algorithm. Owing to the characteristics of Graphseg, the clustering preserves regions of low variability, realizing accurate clustering and further improving the accuracy of object sensing.
This embodiment provides yet another object-based sensing method, describing in detail how the classes labeled with semantic tags are formed, building on the embodiments shown in fig. 1 and fig. 3 and their alternatives. Fig. 5 is a schematic flowchart of determining classes labeled with semantic tags in the object-based sensing method according to an embodiment of the present invention. As shown in fig. 5, clustering the effective classes again based on semantic segmentation to form classes labeled with semantic tags includes the following steps:
step 501: and taking the minimum number of the point cloud data included in the effective class as the number of the effective class.
The serial number of the point cloud data is formed by numbering at least one point cloud data acquired by the robot according to a left-to-right sequence.
Step 502: sort the effective classes in ascending order of their serial numbers to obtain a sorted effective class set.
Step 503: determine the distance between adjacent effective classes in the sorted effective class set.
Step 504: if the distance between adjacent effective classes is smaller than a preset third distance threshold, merge the adjacent effective classes into one class labeled with a semantic tag.
Step 505: if the distance between adjacent effective classes is greater than or equal to the preset third distance threshold, treat the adjacent effective classes as two classes each labeled with a semantic tag.
Specifically, in step 503, the distance between the intermediate point cloud data of adjacent effective classes may be used as the distance between the adjacent effective classes. The intermediate point cloud datum of an effective class is the point cloud datum in the middle position after the class's point cloud data are sorted by serial number.
When the object is a person, the preset third distance threshold may be a preset width between the legs.
Secondary clustering of the effective classes is realized through steps 501 to 505, yielding at least one class labeled with a semantic tag.
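By way of non-limiting illustration, steps 501 to 505 may be sketched in Python as follows, with the distance of step 503 taken between the middle point cloud data of adjacent classes as explained above; all names are illustrative.

import math

def secondary_clustering(effective_classes, points, third_dist_threshold):
    """Steps 501-505: number each effective class by the smallest point
    number it contains, sort the classes by that number, then merge
    adjacent classes closer than the third distance threshold."""
    if not effective_classes:
        return []

    def middle_point(cls):
        # point at the middle position after sorting by point number
        return points[sorted(cls)[len(cls) // 2]]

    ordered = sorted(effective_classes, key=min)     # steps 501-502
    labeled = [list(ordered[0])]
    for nxt in ordered[1:]:
        (x1, y1), (x2, y2) = middle_point(labeled[-1]), middle_point(nxt)
        if math.hypot(x2 - x1, y2 - y1) < third_dist_threshold:  # step 504
            labeled[-1].extend(nxt)     # merged class is used for the next comparison
        else:                           # step 505
            labeled.append(list(nxt))
    return labeled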
Thereafter, the distance between each class labeled with a semantic tag and the robot is determined. The number of labeled classes whose distance from the robot is smaller than the preset first distance threshold is taken as the number of objects whose distance from the robot is smaller than the preset first distance threshold, and the motion of the robot is controlled according to that number.
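A minimal sketch of this counting step follows, assuming the robot sits at the origin of the point cloud coordinate frame and taking the distance of a class as the minimum range of its member points, since this disclosure does not fix an exact definition of the class-to-robot distance.

import math

def count_near_objects(labeled_classes, points, first_dist_threshold):
    """Number of labeled classes (objects) closer to the robot than the
    first distance threshold; robot assumed at the origin."""
    return sum(
        1 for cls in labeled_classes
        if min(math.hypot(*points[i]) for i in cls) < first_dist_threshold
    )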
According to the object-based sensing method provided by this embodiment, each class after initial clustering is clustered again to form the classes labeled with semantic tags. On the one hand, the clustering process is simple and efficient, further improving the efficiency of object perception; on the other hand, the resulting labeled classes make it easier for the robot to perceive objects, improving the accuracy of object perception.
Fig. 6 is a schematic structural diagram of an object-based sensing device according to an embodiment of the present invention. As shown in fig. 6, the object-based sensing apparatus provided in this embodiment includes the following modules: an establishing module 61, a clustering module 62, a determining module 63, and a control module 64.
The establishing module 61 is used for establishing a graph structure according to the at least one point cloud data acquired by the robot.
Optionally, the vertex set in the graph structure is a set of coordinate values of the point cloud data, the edge set in the graph structure is a set of edges connecting adjacent point cloud data, and the weight of the edges of the adjacent point cloud data is determined according to the distance between the adjacent point cloud data.
The clustering module 62 is configured to cluster the point cloud data according to the weights of the edges in the graph structure, and to obtain at least one clustered class labeled with a semantic tag.
And the determining module 63 is configured to determine, according to the clustered class of the at least one labeled semantic tag, the number of objects whose distance from the robot is smaller than a preset first distance threshold.
Optionally, the determining module 63 is specifically configured to: and determining the number of the classes of the marked semantic tags with the distance to the robot smaller than a preset first distance threshold value as the number of the objects with the distance to the robot smaller than the preset first distance threshold value.
The control module 64 is used for controlling the movement of the robot according to the number of objects whose distance from the robot is smaller than the preset first distance threshold.
Optionally, in one implementation, the control module 64 includes a control submodule, configured to control the robot to stop moving if the number of objects whose distance from the robot is smaller than the preset first distance threshold is greater than or equal to a preset object number threshold.
More specifically, the control submodule is specifically configured to: set the object flag bit of the frame corresponding to the at least one point cloud data to a valid value if the number of objects whose distance from the robot is smaller than the preset first distance threshold is greater than or equal to the preset object number threshold; and control the robot to stop moving if the number of consecutive frames whose object flag bits are valid values is greater than a preset frame threshold.
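This per-frame debouncing logic may be sketched as follows; the class and attribute names are illustrative assumptions. A control loop would call update() once per frame and halt the robot when it returns True.

class StopDebouncer:
    """Stops the robot only after the object flag bit has been a valid
    value for more than frame_threshold consecutive frames (a sketch;
    object_threshold is the preset object number threshold)."""

    def __init__(self, object_threshold, frame_threshold):
        self.object_threshold = object_threshold
        self.frame_threshold = frame_threshold
        self.valid_frames = 0   # consecutive frames whose flag bit is valid

    def update(self, near_object_count):
        """Feed the per-frame object count; True means stop moving."""
        if near_object_count >= self.object_threshold:
            self.valid_frames += 1   # object flag bit of this frame is valid
        else:
            self.valid_frames = 0    # flag bit invalid, restart the count
        return self.valid_frames > self.frame_threshold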
In another implementation, the apparatus further includes a determining control module, configured to determine, according to the point cloud data, the ratio of the number of point cloud data whose distance from the robot is smaller than a preset second distance threshold to the total number of point cloud data in one frame, and to control the motion of the robot according to the ratio.
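A sketch of the ratio computation, again assuming the robot sits at the origin of the point cloud coordinate frame:

import math

def near_point_ratio(frame_points, second_dist_threshold):
    """Ratio of the points in one frame closer to the robot than the
    second distance threshold to the total number of points in the frame."""
    if not frame_points:
        return 0.0
    near = sum(1 for x, y in frame_points
               if math.hypot(x, y) < second_dist_threshold)
    return near / len(frame_points)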
Optionally, the apparatus may further include an acquisition module and a deletion module.
The acquisition module is used for acquiring the collected at least one original point cloud data.
The deleting module is used for deleting, from the at least one original point cloud data, the original point cloud data whose distance from the robot is not within the preset detection distance range of the robot, to form the filtered at least one point cloud data.
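The range filtering performed by the deleting module may be sketched as follows, with the robot assumed at the origin and the detection range given as a closed interval:

import math

def filter_by_detection_range(raw_points, min_range, max_range):
    """Keep only raw points whose distance to the robot lies within the
    preset detection distance range [min_range, max_range]."""
    return [(x, y) for x, y in raw_points
            if min_range <= math.hypot(x, y) <= max_range]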
The object-based sensing device provided by the embodiment of the invention can execute the object-based sensing method provided by any shown embodiment and various optional modes of the invention, and has corresponding functional modules and beneficial effects of the execution method.
The invention also provides another object-based sensing device. This embodiment describes the specific structure of the clustering module 62 in detail, on the basis of the embodiment shown in fig. 6 and its alternatives. Referring to fig. 6, in the object-based sensing apparatus provided by this embodiment, the clustering module 62 specifically includes: an acquisition sub-module 621, a first clustering sub-module 622, and a second clustering sub-module 623.
The obtaining sub-module 621 is configured to sort the edges in ascending order of weight to obtain a sorted edge set.
The first clustering submodule 622 is configured to, if the dissimilarity between two edges that do not share a common vertex in the sorted edge set is smaller than a preset dissimilarity threshold, cluster the point cloud data connected by the two edges into one class, and obtain each class after at least one initial clustering.
Optionally, the first clustering submodule 622 is specifically configured to:
acquiring, from the sorted edge set, a first edge that is ranked first among the edges whose connected point cloud data are not clustered;
acquiring, from the sorted edge set, a second edge that is ranked second among such edges and whose connected point cloud data differ from the point cloud data connected by the first edge;
determining the dissimilarity between the first edge and the second edge;
if the dissimilarity between the first edge and the second edge is smaller than the preset dissimilarity threshold, clustering the point cloud data connected by the second edge and the point cloud data connected by the first edge into the same class;
taking, as a new second edge, the next edge in the sorted edge set that is ranked after the current second edge, whose connected point cloud data are not clustered and differ from the point cloud data connected by the first edge, and returning to the step of determining the dissimilarity between the first edge and the second edge; this inner loop stops when no such edge remains in the sorted edge set;
taking, as a new first edge, the next edge in the sorted edge set that is ranked after the current first edge and whose connected point cloud data are not clustered, and returning to the step of acquiring a second edge; this outer loop stops, and each class after the initial clustering is obtained, when the sorted edge set no longer contains a first-ranked edge whose connected point cloud data are not clustered, or contains only one edge whose connected point cloud data are not clustered, or contains no second-ranked edge whose connected point cloud data are not clustered and differ from the point cloud data connected by the first edge.
The second clustering submodule 623 is configured to cluster each class after the at least one initial clustering again, according to the farthest distance between the point cloud data included in each such class, so as to form the classes labeled with semantic tags.
Optionally, the second clustering submodule 623 is specifically configured to: determine the farthest distance between the point cloud data included in each class after the initial clustering; determine each class after the initial clustering whose farthest distance lies within the preset distance range as an effective class; and cluster the effective classes again based on semantic segmentation to form the classes labeled with semantic tags.
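The determination of effective classes by the farthest intra-class distance may be sketched as follows; the names are illustrative.

import math
from itertools import combinations

def select_effective_classes(classes, points, min_extent, max_extent):
    """A class after initial clustering is effective when the farthest
    distance between its member points lies within the preset distance
    range (for the human-leg case, the preset width range of one leg)."""
    effective = []
    for cls in classes:
        if len(cls) < 2:
            continue   # a single point has no extent
        farthest = max(
            math.hypot(points[i][0] - points[j][0],
                       points[i][1] - points[j][1])
            for i, j in combinations(cls, 2))
        if min_extent <= farthest <= max_extent:
            effective.append(cls)
    return effective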
More specifically, in clustering the effective classes again based on semantic segmentation to form the classes labeled with semantic tags, the second clustering submodule 623 is specifically configured to:
take the minimum number among the point cloud data included in an effective class as the number of that effective class, wherein the numbers of the point cloud data are assigned by numbering the collected at least one point cloud data in order from left to right;
sort the effective classes in ascending order of their numbers to obtain a sorted effective class set;
determine the distance between adjacent effective classes in the sorted effective class set;
if the distance between adjacent effective classes is smaller than the preset third distance threshold, merge the adjacent effective classes into one class labeled with a semantic tag;
and if the distance between adjacent effective classes is greater than or equal to the preset third distance threshold, take the adjacent effective classes as two classes labeled with semantic tags.
Optionally, the class labeled with a semantic tag is a human class, the effective class is a human-leg class, the preset distance range is a preset width range of a human leg, and the preset third distance threshold is a preset width between the two legs.
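Tying the human-leg case together with the sketches given above, a short usage fragment follows. The numeric parameter values are illustrative assumptions and are not taken from this disclosure, and points denotes the (x, y) coordinates of one frame of point cloud data.

# Illustrative parameters for the human-leg case (assumed values):
LEG_WIDTH_RANGE = (0.05, 0.25)   # preset width range of one human leg, metres
BETWEEN_LEGS_WIDTH = 0.45        # preset width between the two legs, metres

# points: list of (x, y) coordinates of one frame of point cloud data
classes = initial_clustering(points, diss_threshold=0.08)   # assumed threshold
legs = select_effective_classes(classes, points, *LEG_WIDTH_RANGE)
people = secondary_clustering(legs, points, BETWEEN_LEGS_WIDTH)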
The object-based sensing device provided by the embodiment of the invention can execute the object-based sensing method provided by any shown embodiment and various optional modes of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Fig. 7 is a schematic structural diagram of a robot according to an embodiment of the present invention. As shown in fig. 7, the robot comprises a processor 70 and a memory 71. The number of the processors 70 in the robot can be one or more, and one processor 70 is taken as an example in fig. 7; the processor 70 and the memory 71 of the robot may be connected by a bus or other means, as exemplified by the bus connection in fig. 7.
The memory 71 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and modules, such as program instructions and modules corresponding to the object-based sensing method in the embodiment of the present invention (for example, the establishing module 61, the clustering module 62, the determining module 63, and the control module 64 in the object-based sensing apparatus). The processor 70 executes various functional applications and data processing of the robot, i.e., implements the above-described object-based sensing method, by running software programs, instructions, and modules stored in the memory 71.
The memory 71 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the robot, and the like. Further, the memory 71 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 71 may further include memory remotely located from the processor 70, which may be connected to the robot through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Optionally, the robot may further include: a power component 72, an audio component 73, a communication component 74, and a sensor component 75. The power component 72, audio component 73, communication component 74, and sensor component 75 may all be connected to the processor 70 via a bus.
The power supply assembly 72 provides power to the various components of the robot. The power components 72 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the robot.
The audio component 73 is configured to output and/or input audio signals. For example, the audio component 73 comprises a microphone configured to receive external audio signals when the robot is in an operation mode, such as a recording mode and a speech recognition mode. The received audio signal may further be stored in the memory 71 or transmitted via the communication component 74. In some embodiments, audio assembly 73 also includes a speaker for outputting audio signals.
The communication component 74 is configured to facilitate wired or wireless communication between the robot and other devices. The robot may access a wireless network based on a communication standard. In an exemplary embodiment, the communication component 74 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the Communication component 74 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association technology, ultra wideband technology, bluetooth technology, and other technologies.
The sensor assembly 75 includes one or more sensors for providing various aspects of status assessment for the robot. The sensor assembly 75 may include a laser sensor for collecting point cloud data. In some embodiments, the sensor assembly 75 may also include an acceleration sensor, a magnetic sensor, a pressure sensor, a temperature sensor, or the like.
Fig. 8 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention. As shown in fig. 8, the present invention also provides a computer-readable storage medium 82 containing computer-executable instructions 81 which, when executed by a processor 83, perform an object-based perception method, the method comprising:
establishing a graph structure according to the collected at least one point cloud data;
clustering the point cloud data according to the weights of the edges in the graph structure, and acquiring at least one clustered class labeled with a semantic tag;
determining, according to the at least one clustered class labeled with a semantic tag, the number of objects whose distance from the robot is smaller than a preset first distance threshold;
and controlling the robot according to the number of the objects with the distance from the robot being smaller than a preset first distance threshold.
Of course, the storage medium containing the computer-executable instructions provided by the embodiments of the present invention is not limited to the method operations described above, and may also perform related operations in the object-based sensing method provided by any embodiment of the present invention.
From the above description of the embodiments, it will be clear to those skilled in the art that the present invention can be implemented by software plus the necessary general-purpose hardware, or alternatively by hardware alone, although the former is the better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a read-only memory (ROM), a random access memory (RAM), a flash memory (FLASH), a hard disk, or an optical disk of a computer, and which includes instructions for enabling a robot (which may be a personal computer, a vehicle, or a network device) to execute the object-based sensing method according to the embodiments of the present invention.
It should be noted that, in the embodiment of the object-based sensing apparatus, the included units and modules are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (13)

1. An object-based perception method, comprising:
establishing a graph structure according to the collected at least one point cloud data;
clustering the point cloud data according to the weights of the edges in the graph structure, and acquiring at least one clustered class labeled with a semantic tag;
determining, according to the at least one clustered class labeled with a semantic tag, the number of objects whose distance from the robot is smaller than a preset first distance threshold;
and controlling the robot according to the number of objects whose distance from the robot is smaller than the preset first distance threshold.
2. The method of claim 1, wherein after controlling the robot according to the number of objects whose distance from the robot is less than a preset first distance threshold, further comprising:
determining, according to the point cloud data, the ratio of the number of point cloud data whose distance from the robot is smaller than a preset second distance threshold to the total number of point cloud data in one frame;
and controlling the motion of the robot according to the ratio.
3. The method of claim 1, wherein the set of vertices in the graph structure is a set of coordinate values of the point cloud data, the set of edges in the graph structure is a set of edges connecting adjacent point cloud data, and the weights of the edges of adjacent point cloud data are determined according to the distances between the adjacent point cloud data;
the clustering the point cloud data according to the weight values of the edges in the graph structure comprises:
sorting the edges in ascending order of weight to obtain a sorted edge set;
if the dissimilarity between two edges that do not share a common vertex in the sorted edge set is smaller than a preset dissimilarity threshold, clustering the point cloud data connected by the two edges into one class to obtain each class after at least one initial clustering;
and clustering each class after the at least one initial clustering again, according to the farthest distance between the point cloud data included in each class after the at least one initial clustering, to form a class labeled with a semantic tag.
4. The method of claim 3, wherein the clustering each class after the at least one initial clustering again, according to the farthest distance between the point cloud data included in each class after the at least one initial clustering, to form a class labeled with a semantic tag comprises:
determining the farthest distance between the point cloud data included in each class after the initial clustering;
determining each class after the initial clustering whose farthest distance lies within a preset distance range as an effective class;
and clustering the effective classes again based on semantic segmentation to form classes labeled with semantic tags.
5. The method of claim 4, wherein the clustering the effective classes again based on semantic segmentation to form classes labeled with semantic tags comprises:
taking the minimum number among the point cloud data included in an effective class as the number of that effective class; the numbers of the point cloud data are assigned by numbering the collected at least one point cloud data in order from left to right;
sorting the effective classes in ascending order of their numbers to obtain a sorted effective class set;
determining the distance between adjacent effective classes in the sorted effective class set;
if the distance between adjacent effective classes is smaller than a preset third distance threshold, merging the adjacent effective classes into one class labeled with a semantic tag;
and if the distance between adjacent effective classes is greater than or equal to the preset third distance threshold, taking the adjacent effective classes as two classes labeled with semantic tags.
6. The method of claim 5, wherein the class labeled with a semantic tag is a human class, the effective class is a human-leg class, the preset distance range is a preset width range of a human leg, and the preset third distance threshold is a preset width between the two legs.
7. The method according to claim 1, wherein the determining, according to the at least one clustered class labeled with a semantic tag, the number of objects whose distance from the robot is smaller than a preset first distance threshold comprises:
determining the number of classes labeled with semantic tags whose distance from the robot is smaller than the preset first distance threshold as the number of objects whose distance from the robot is smaller than the preset first distance threshold.
8. The method according to any one of claims 1-7, wherein the controlling the robot according to the number of objects whose distance from the robot is smaller than a preset first distance threshold comprises:
if the number of objects whose distance from the robot is smaller than the preset first distance threshold is greater than or equal to a preset object number threshold, controlling the robot to stop moving.
9. The method of claim 8, wherein the controlling the robot to stop moving if the number of objects whose distance from the robot is smaller than the preset first distance threshold is greater than or equal to the preset object number threshold comprises:
if the number of objects whose distance from the robot is smaller than the preset first distance threshold is greater than or equal to the preset object number threshold, setting an object flag bit of a frame corresponding to the at least one point cloud data to a valid value;
and if the number of consecutive frames whose object flag bits are valid values is greater than a preset frame threshold, controlling the robot to stop moving.
10. The method of any one of claims 1-7, wherein prior to establishing a graph structure from the acquired at least one point cloud data, the method further comprises:
acquiring at least one piece of collected original point cloud data;
and deleting, from the at least one original point cloud data, the original point cloud data whose distance from the robot is not within a preset robot detection distance range, to form the filtered at least one point cloud data.
11. An object-based perception apparatus, comprising:
the establishing module is used for establishing a graph structure according to the collected at least one point cloud data;
the clustering module is used for clustering the point cloud data according to the weights of the edges in the graph structure to obtain at least one clustered class labeled with a semantic tag;
the determining module is used for determining, according to the at least one clustered class labeled with a semantic tag, the number of objects whose distance from the robot is smaller than a preset first distance threshold;
and the control module is used for controlling the motion of the robot according to the number of the objects with the distance from the robot being smaller than a preset first distance threshold value.
12. A robot, characterized in that the robot comprises:
one or more processors;
a memory for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the object-based perception method according to any one of claims 1-10.
13. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the object-based perception method according to any one of claims 1 to 10.
CN202010363102.5A 2020-04-30 2020-04-30 Object-based perception method, object-based perception device, robot and storage medium Active CN111582352B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010363102.5A CN111582352B (en) 2020-04-30 2020-04-30 Object-based perception method, object-based perception device, robot and storage medium

Publications (2)

Publication Number Publication Date
CN111582352A true CN111582352A (en) 2020-08-25
CN111582352B CN111582352B (en) 2023-06-27

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104330090A (en) * 2014-10-23 2015-02-04 北京化工大学 Robot distributed type representation intelligent semantic map establishment method
CN105488809A (en) * 2016-01-14 2016-04-13 电子科技大学 Indoor scene meaning segmentation method based on RGBD descriptor
US20190005328A1 (en) * 2017-06-29 2019-01-03 Accenture Global Solutions Limited Natural language unification based robotic agent control
CN107833230A (en) * 2017-11-09 2018-03-23 北京进化者机器人科技有限公司 The generation method and device of indoor environment map
JP2019126866A (en) * 2018-01-23 2019-08-01 トヨタ自動車株式会社 Motion trajectory generation apparatus
CN110533055A (en) * 2018-05-25 2019-12-03 北京京东尚科信息技术有限公司 A kind for the treatment of method and apparatus of point cloud data
CN109035305A (en) * 2018-08-10 2018-12-18 中北大学 Indoor human body detection and tracking in the case of a kind of low visual angle based on RGB-D
CN110046661A (en) * 2019-04-10 2019-07-23 武汉大学 A kind of vehicle-mounted cloud clustering method cutting algorithm based on contextual feature and figure
CN110244322A (en) * 2019-06-28 2019-09-17 东南大学 Pavement construction robot environment sensory perceptual system and method based on Multiple Source Sensor
CN111055292A (en) * 2019-11-18 2020-04-24 华中科技大学 Human-computer interaction security guarantee method and device and computer readable storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Dong-Qing Zhang et al.: "Semantic video clustering across sources using bipartite spectral clustering", 2004 IEEE International Conference on Multimedia and Expo *
Shen Xiaotong: "Trajectory simulation of a wheeled mobile robot based on SimMechanics", Mechanical & Electrical Engineering Technology *
Jiang Dan et al.: "Research on text clustering algorithm based on semantics and graph", Journal of Chinese Information Processing *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114442101A (en) * 2022-01-28 2022-05-06 南京慧尔视智能科技有限公司 Vehicle navigation method, device, equipment and medium based on imaging millimeter wave radar
CN114442101B (en) * 2022-01-28 2023-11-14 南京慧尔视智能科技有限公司 Vehicle navigation method, device, equipment and medium based on imaging millimeter wave radar

Similar Documents

Publication Publication Date Title
CN111665842B (en) Indoor SLAM mapping method and system based on semantic information fusion
US20200110173A1 (en) Obstacle detection method and device
US11269336B2 (en) Method and system for free space detection in a cluttered environment
CN110378218A (en) A kind of image processing method, device and terminal device
US10769808B2 (en) Apparatus and methods of automated tracking and counting of objects on a resource-constrained device
Rong et al. Image object extraction based on semantic detection and improved K-means algorithm
WO2023273344A1 (en) Vehicle line crossing recognition method and apparatus, electronic device, and storage medium
CN112336342A (en) Hand key point detection method and device and terminal equipment
CN112015181A (en) Obstacle avoidance method, device, equipment and computer readable storage medium
Joo et al. Real-time depth-based hand detection and tracking
CN110470308A (en) A kind of obstacle avoidance system and method
CN111582352B (en) Object-based perception method, object-based perception device, robot and storage medium
US10776631B2 (en) Monitoring
CN114091515A (en) Obstacle detection method, obstacle detection device, electronic apparatus, and storage medium
CN111986232A (en) Target object detection method, target object detection device, robot and storage medium
Fröhlich et al. As time goes by—anytime semantic segmentation with iterative context forests
CN114049383B (en) Multi-target tracking method and device and readable storage medium
CN113780532B (en) Training method, device, equipment and storage medium of semantic segmentation network
CN111765892B (en) Positioning method, positioning device, electronic equipment and computer readable storage medium
Wang et al. Pillar-Based Cooperative Perception from Point Clouds for 6G-Enabled Cooperative Autonomous Vehicles
CN103514434A (en) Method and device for identifying image
CN117553808B (en) Deep learning-based robot positioning navigation method, device, equipment and medium
CN113780176B (en) Local occlusion object identification method, device, equipment and storage medium
Tomita et al. Consensus-making algorithms for cognitive sharing of object in multi-robot systems
Richmond et al. Robust trajectory-based density estimation for geometric structure recovery

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant