CN116337068A - Robot synchronous semantic positioning and mapping method and system based on humanoid thought - Google Patents


Info

Publication number
CN116337068A
Authority
CN
China
Prior art keywords
positioning
observation
road sign
semantic
cluster
Prior art date
Legal status
Pending
Application number
CN202310207141.XA
Other languages
Chinese (zh)
Inventor
许庆阳 (Xu Qingyang)
丁凯旋 (Ding Kaixuan)
宋勇 (Song Yong)
袁宪锋 (Yuan Xianfeng)
庞豹 (Pang Bao)
李贻斌 (Li Yibin)
Current Assignee
Shandong University
Original Assignee
Shandong University
Priority date
Filing date
Publication date
Application filed by Shandong University
Priority to CN202310207141.XA
Publication of CN116337068A

Classifications

    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00 - Navigation; Navigational instruments not provided for in groups G01C 1/00-G01C 19/00
    • G01C 21/20 - Instruments for performing navigational calculations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/50 - Context or environment of the image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/60 - Type of objects
    • G06V 20/64 - Three-dimensional objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention discloses a robot synchronous semantic positioning and mapping method and system based on humanoid thought, comprising the following steps: acquiring an environment picture for each frame and spatial-structure point cloud data of the environment; identifying the environment picture to obtain observation lines of the positioning road signs; clustering all observation lines so that the observation lines belonging to the same positioning road sign fall into one cluster; analyzing the observation lines in each cluster by the least square method to determine the spatial coordinates of each positioning road sign; optimizing the spatial coordinates of the positioning road signs to obtain their optimized coordinates; obtaining a positioning road sign graph from the optimized coordinates of all positioning road signs; projecting each frame of point cloud data on the local path onto the horizontal plane to generate a line segment graph; associating all line segment graphs to obtain an environment structure graph; and combining the positioning road sign graph and the environment structure graph into a semantic map. Synchronous semantic positioning and mapping of the robot are thereby realized.

Description

Robot synchronous semantic positioning and mapping method and system based on humanoid thought
Technical Field
The invention relates to the technical field of map construction, in particular to a robot synchronous semantic positioning and map construction method and system based on a humanoid thought.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Current robot synchronous localization and mapping (SLAM) schemes focus on low-level features of the environment and generate a large amount of point cloud data for robot localization and mapping. Different mapping schemes adopt different environment-sensing devices and thus correspond to different low-level features. Devices that sense spatial structure, such as laser sensors and depth cameras, yield point clouds representing the spatial structure, and the organizational relations among the points form the low-level features. A camera can capture an environment image, from which various pixel-level low-level features such as ORB and SIFT can be extracted. On the basis of such low-level features, synchronous localization and mapping of the robot is realized using a corresponding mapping algorithm combined with methods such as back-end optimization and loop detection. Since each point of a point cloud formed from low-level features carries little information, a large number of points is required to represent the environment. The denser the point cloud, the finer the environment structure it can characterize, but excessive pursuit of modeling refinement brings enormous computational and storage costs. Some schemes build the map only for localization without considering other uses of the map: the environment structure is not modeled, and the point cloud is sparse. However, these robot localization and mapping methods differ from the way humans localize themselves and describe space. When localizing themselves and describing space, humans pay attention to the more salient objects in the environment and the organizational structure among them, take objects as semantic symbols of their own location, and focus on places and the passable relations between places while weakening the complex environment structure, thereby forming an abstract semantic map.
Early SLAM techniques could not quickly and accurately identify specific objects in the environment, so scene semantic information was difficult to obtain. With the development and application of deep learning, extracting semantic information from the environment is no longer difficult, and more and more visual semantic SLAM methods have been proposed. Semantic information is typically obtained using object detection or semantic segmentation. With object detection, the spatial location of a semantic road sign of interest can be obtained and represented using ellipsoids, cuboids, or known models. Semantic segmentation is generally combined with dense point clouds: semantic information is added to the dense point cloud to form a labeled semantic point cloud. Most of these methods follow the design thought of traditional SLAM and add semantic information on top of low-level features to improve mapping precision, enhance loop detection performance, extend the original map functions, and so on; the unique advantages of semantic information are not exploited, so the mapping is inaccurate.
Disclosure of Invention
In order to solve the above problems, the invention provides a robot synchronous semantic positioning and mapping method and system based on humanoid thought, realizing synchronous semantic positioning and spatial mapping of a robot.
In order to achieve the above purpose, the invention adopts the following technical scheme:
in a first aspect, a robot synchronous semantic locating and mapping method based on a humanoid thought is provided, including:
acquiring an environment picture of each frame and space structure point cloud data of the environment;
identifying each frame of environment picture to obtain an observation straight line of a corresponding positioning road sign;
carrying out cluster division on all the observation straight lines, and dividing the observation straight lines belonging to the same positioning road sign into a cluster;
analyzing the observation straight lines in each cluster by a least square method, and determining the intersection point of the observation straight lines in each cluster as the space coordinate of each corresponding positioning road sign;
optimizing the space coordinates of the positioning road sign to obtain the optimized coordinates of the positioning road sign;
obtaining a positioning road sign graph according to the optimized coordinates of all the positioning road signs;
projecting each frame of point cloud data on the local path to the horizontal direction to generate a line segment graph;
correlating all the line segment graphs to obtain an environment structure diagram;
combining the positioning road map and the environment structure map to form a semantic map.
In a second aspect, a robot synchronous semantic locating and mapping system based on a humanoid thought is provided, comprising:
the data acquisition module is used for acquiring the environmental picture of each frame and the space structure point cloud data of the environment;
the positioning road sign graph acquisition module is used for identifying the environment pictures of each frame to obtain the corresponding observation straight line of the positioning road sign; carrying out cluster division on all the observation straight lines, and dividing the observation straight lines belonging to the same positioning road sign into a cluster; analyzing the observation straight lines in each cluster by a least square method, and determining the intersection point of the observation straight lines in each cluster as the space coordinate of each corresponding positioning road sign; optimizing the space coordinates of the positioning road sign to obtain the optimized coordinates of the positioning road sign; obtaining a positioning road sign graph according to the optimized coordinates of all the positioning road signs;
the environment structure diagram acquisition module is used for projecting each frame of point cloud data on the local path to the horizontal direction to generate a line segment diagram; correlating all the line segment graphs to obtain an environment structure diagram;
the semantic map acquisition module is used for combining the positioning road map and the environment structure map to form a semantic map.
In a third aspect, an electronic device is provided, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the computer instructions, when executed by the processor, perform the steps described in the robot synchronous semantic positioning and mapping method based on humanoid thought.
In a fourth aspect, a computer readable storage medium is provided for storing computer instructions that, when executed by a processor, perform the steps described in a robot synchronous semantic localization and mapping method based on humanoid ideas.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention imitates the way humans localize themselves and draw spatial maps: targets with high-level semantics are acquired from the environment and used as positioning road signs, and the spatial-structure point cloud data of the environment is acquired, abstracted, and simplified to construct a semantic map with dense information. The positioning road signs are the basis on which the robot determines its own pose, and their properties directly influence the positioning effect. The invention uses a camera to capture semantic objects in the environment and synthesizes multi-view observations to estimate the spatial position of each semantic object, forming a positioning road sign. Semantic objects of different categories and at different positions together form a positioning road sign graph, and the robot can localize itself according to the categories and positions of the surrounding positioning road signs. Since a planar map is generally used for mobile robot navigation, in order to represent the environment structure more simply, the invention abstracts the acquired point cloud into line segments on a plane; the line segments represent obstacles the robot is not allowed to pass through, and together they form an environment structure graph. Finally, the positioning road sign graph and the environment structure graph form an abstract semantic map, realizing synchronous semantic positioning and spatial mapping of the robot, and the constructed semantic map is closer to a real environment map.
2. According to the invention, the observation lines are clustered according to the change in angle between observation lines in consecutive frames: observation lines whose angle change is smaller than a set value are grouped into one cluster and arranged in time order. Newly generated observation line sub-clusters that observe the same category of object are merged one by one with the historical observation line sub-clusters, the residual of the least-squares intersection is then computed, and the residual size determines whether they are associated into the same group. This solves the problem that, due to factors such as occlusion, illumination differences, and instability of the object detector, observation of the same positioning road sign cannot be completely continuous along the path, which makes consecutive observations discontinuous and cluster division inaccurate; redundant intersection points are eliminated, and the spatial coordinates of the positioning road signs are acquired accurately.
3. The invention also optimizes the spatial coordinates of the positioning road signs, so that with the optimized coordinates the observation error and the odometer error are, as a whole, the smallest.
4. When constructing the environment structure graph, the center of the local path corresponding to each line segment graph is used as a seed to construct a Voronoi polygon; line segments lying outside their own center's polygon are deleted to obtain the retained line segments, and the retained line segments are associated to obtain the environment structure graph. In this way, the environment structure information is retained more completely, and the line segments are more continuous and complete.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application.
FIG. 1 is a block diagram of the method disclosed in example 1;
FIG. 2 is a schematic diagram of the operation and observation of the robot disclosed in example 1;
FIG. 3 is a raw point cloud as disclosed in example 1;
FIG. 4 is a grid map disclosed in example 1;
FIG. 5 is a skeleton extraction diagram disclosed in example 1;
FIG. 6 is a line segment diagram disclosed in example 1;
FIG. 7 is a Voronoi polygon disclosed in example 1;
FIG. 8 is a laser grid map as disclosed in example 1;
FIG. 9 is a wheel odometer trajectory and laser odometer trajectory as disclosed in example 1;
FIG. 10 is the pose of the fisheye camera disclosed in example 1;
FIG. 11 is a circular view of the fisheye camera disclosed in example 1;
FIG. 12 is the circular view disclosed in example 1 expanded into a rectangle;
FIG. 13 is the road sign point result generated by the wheel odometer disclosed in example 1;
FIG. 14 is the wheel odometer road sign point result after loop detection and optimization as disclosed in example 1;
FIG. 15 is the road sign point result generated by the laser odometer disclosed in example 1;
FIG. 16 is the environment structure diagram generated based on the optimized trajectory as disclosed in example 1;
FIG. 17 is a semantic map generated based on optimized trajectories as disclosed in example 1;
FIG. 18 is the semantic map generated over the whole experimental site disclosed in example 1.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
Example 1
In this embodiment, a method for robot synchronous semantic localization and mapping based on humanoid thought is disclosed, as shown in fig. 1, including:
acquiring an environment picture of each frame and space structure point cloud data of the environment;
identifying each frame of environment picture to obtain an observation straight line of a corresponding positioning road sign;
carrying out cluster division on all the observation straight lines, and dividing the observation straight lines belonging to the same positioning road sign into a cluster;
analyzing the observation straight lines in each cluster by a least square method, and determining the intersection point of the observation straight lines in each cluster as the space coordinate of each corresponding positioning road sign;
optimizing the space coordinates of the positioning road sign to obtain the optimized coordinates of the positioning road sign;
obtaining a positioning road sign graph according to the optimized coordinates of all the positioning road signs;
projecting each frame of point cloud data on the local path to the horizontal direction to generate a line segment graph;
correlating all the line segment graphs to obtain an environment structure diagram;
combining the positioning road map and the environment structure map to form a semantic map.
The positioning road signs are derived from semantic objects in the environment; such objects need to be easily recognizable, spatially fixed, and present for a long time.
In this embodiment, the camera carried on the robot acquires the environment picture, and the process of obtaining the observation line of a positioning road sign from the environment picture is as follows:
S21: identifying the environment picture to obtain the pixel coordinates of the positioning road sign in the picture.
The environment picture is identified with a target detector to obtain, for each semantic object y_j, the pixel coordinates [u_I, v_I]^T of the center point of its detection box, i.e., the pixel coordinates of the positioning road sign in the picture, where u_I and v_I are the horizontal and vertical pixel coordinates of the detection-box center.
S22: the pixel coordinates are converted into an observation vector in the camera coordinate system.
Using the camera intrinsics, the pixel coordinates [u_I, v_I]^T of the positioning road sign in the picture are converted into an observation vector G = [u_c, v_c, w_c]^T in the camera coordinate system, where u_c, v_c, w_c are the components of the unit vector on the x, y, and z axes of the camera coordinate system. Different camera models obtain the observation vector of a semantic object in different ways; for a common pinhole-model camera, the intrinsic matrix is K:

K = [ f_x  0    c_x
      0    f_y  c_y
      0    0    1  ]

and the observation vector G is the normalization of K^(-1) [u_I, v_I, 1]^T.
S23: converting the observation vector from the camera coordinate system to the world coordinate system and translating it to the camera's world coordinates to obtain the observation line of the positioning road sign.
The observation vector G is transformed from the camera coordinate system into the world coordinate system and translated to the camera's world coordinates to obtain the observation line l_{i,j}. An observation line is represented, for example, in the form [x, y, z, u, v, w], where x, y, z are the three components of the spatial coordinates of the camera optical center at the moment the observation line was formed, and u, v, w are the three components of the direction vector of the line in the world coordinate system.
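As an illustration of steps S21-S23, the back-projection of a detection-box center to a world-frame observation line can be sketched as follows. A pinhole camera model is assumed (the embodiment's fisheye camera would need a different unprojection), and the function name, intrinsic values, and pose are hypothetical:

```python
import numpy as np

def observation_line(u_I, v_I, K, R_wc, t_wc):
    """Back-project a detection-box center to an observation line.

    pixel -> unit observation vector G in the camera frame -> line
    [x, y, z, u, v, w] in the world frame, where the first three
    components are the camera optical center and the last three the
    unit direction vector, as described in the text.
    """
    # Observation vector in the camera frame: normalize K^-1 [u_I, v_I, 1]^T.
    G = np.linalg.inv(K) @ np.array([u_I, v_I, 1.0])
    G /= np.linalg.norm(G)
    # Rotate into the world frame and anchor at the camera's world position.
    d = R_wc @ G
    return np.concatenate([t_wc, d / np.linalg.norm(d)])

# Hypothetical intrinsics and pose for illustration.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
line = observation_line(320.0, 240.0, K, np.eye(3), np.zeros(3))
# A detection at the principal point looks straight down the optical axis.
print(line)  # -> [0. 0. 0. 0. 0. 1.]
```

A real implementation would take R_wc and t_wc from the odometer pose at the frame's timestamp.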
Observation lines of the same semantic object y_j, taken by the camera from different poses, form an observation line cluster L_j. All observation lines are collected in one place, the lines belonging to the same positioning road sign are grouped into one cluster, and the lines in each cluster are analyzed by the least square method to determine their intersection point, giving the spatial coordinate of the corresponding positioning road sign:

y_j = ( Σ_{l_{i,j} ∈ L_j} (I − d_{i,j} d_{i,j}^T) )^(−1) Σ_{l_{i,j} ∈ L_j} (I − d_{i,j} d_{i,j}^T) p_{i,j}

where I is the identity matrix of dimension three, p_{i,j} = (l_{i,j})_{(1:3)} denotes the first 3 components of the vector l_{i,j} (the camera optical center), and d_{i,j} = (l_{i,j})_{(4:6)} denotes the last 3 components (the unit direction vector).
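The least-squares intersection of a cluster of observation lines can be sketched in numpy. The [x, y, z, u, v, w] line representation follows the text; the function name and the example lines are illustrative assumptions:

```python
import numpy as np

def least_squares_intersection(lines):
    """Least-squares intersection point of a cluster of observation lines.

    Each line is [x, y, z, u, v, w] (optical center p, unit direction d).
    Solves  y = (sum_i (I - d_i d_i^T))^-1  sum_i (I - d_i d_i^T) p_i.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for line in lines:
        p, d = line[:3], line[3:6] / np.linalg.norm(line[3:6])
        M = np.eye(3) - np.outer(d, d)   # projector orthogonal to d
        A += M
        b += M @ p
    return np.linalg.solve(A, b)

# Two observation lines that exactly intersect at the road sign (1, 2, 0).
l1 = np.array([0.0, 2.0, 0.0, 1.0, 0.0, 0.0])   # along +x through (0, 2, 0)
l2 = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])   # along +y through (1, 0, 0)
y = least_squares_intersection([l1, l2])
print(np.round(y, 6))  # -> [1. 2. 0.]
```

With noisy real observations the lines will not meet exactly, and the same formula returns the point minimizing the sum of squared distances to all lines.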
Each positioning road sign comprises position information, i.e., its spatial coordinates, and the category information of the corresponding semantic object; the positioning road signs scattered in space together form the positioning road sign graph.
The camera moves continuously, the observation angle of adjacent frames toward the same semantic object changes little, and semantic objects are sparsely distributed in the environment, so different observation lines in adjacent frames can be associated using the angle-change relation to form an observation line cluster, as shown in fig. 2. However, due to occlusion, poor illumination, and instability of the object detector, observations of the same positioning road sign may not be completely continuous along the path. In fig. 2, the observation line marked with an x is lost for some reason, so the observations before and after it are discontinuous. Thus, the observation line cluster L_j corresponding to the same semantic object may be split into several individually continuous sub-clusters L_{j,1}, L_{j,2}, ..., and each sub-cluster will also correspond to a least-squares intersection point. In order to obtain more accurate positioning road sign points while eliminating redundant intersection points, data association needs to be performed on the sub-clusters.
The observation lines are clustered according to the change in angle between the observation lines of consecutive frames, and observation lines whose angle change is smaller than a set value are grouped into an observation line sub-cluster. For sub-clusters obtained by observing targets of the same category, newly generated sub-clusters are merged one by one with the historical sub-clusters in time order, the residual of the least-squares intersection point is then calculated, and whether they are associated into the same group is determined by the residual size. Similar targets at the same place are recognized by the target detector and labeled as objects of the same category; such objects are designated in advance as positioning road signs, so when computing the spatial coordinates of the positioning road signs, the sub-clusters obtained by observing targets of the same category refer to the sub-clusters obtained by observing positioning road signs of the same category.
Specifically, in this embodiment, the observation lines are clustered according to the change in angle between the observation lines of consecutive frames; observation lines whose angle change is smaller than the set value are grouped into an observation line sub-cluster, and according to the acquisition time of the environment pictures, the sub-clusters are divided into historical sub-clusters and newly generated sub-clusters. The sub-cluster containing the observation line generated from the environment picture at the current moment is the newly generated sub-cluster; sub-clusters from earlier times are historical sub-clusters.
calculating the intersection point of each observation straight line in the historical observation straight line sub-cluster by a least square method;
combining the newly generated observation line sub-clusters obtained by observing targets of the same category with the historical observation line sub-clusters one by one, and calculating by the least square method the intersection point of the observation lines in each combined cluster;
calculating residual errors of intersection points of all the observation lines in the historical observation line sub-clusters and intersection points of all the observation lines in the combined clusters;
when the residual error is larger than or equal to a set value, the newly generated observation straight line sub-cluster and the history observation straight line sub-cluster cannot be divided into a group;
and when the residual error is smaller than the set value, dividing the newly generated observation straight line sub-cluster and the historical observation straight line sub-cluster into a group, and obtaining the observation straight line cluster of the same positioning road sign.
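One possible reading of the residual-based association test above can be sketched as follows: a new sub-cluster is merged with a historical sub-cluster of the same category only when the least-squares residual of the merged cluster stays under the set value. The residual definition (mean squared point-to-line distance) and the threshold are assumptions, not taken from the patent:

```python
import numpy as np

def ls_intersection_and_residual(lines):
    # Least-squares intersection of lines [x, y, z, u, v, w], plus the
    # mean squared distance of the lines from that intersection point.
    A, b = np.zeros((3, 3)), np.zeros(3)
    for l in lines:
        p, d = l[:3], l[3:6]
        M = np.eye(3) - np.outer(d, d)
        A += M
        b += M @ p
    y = np.linalg.solve(A, b)
    res = np.mean([np.sum(((np.eye(3) - np.outer(l[3:6], l[3:6])) @ (y - l[:3])) ** 2)
                   for l in lines])
    return y, res

def try_associate(history, new, threshold=0.05):
    """Merge sub-clusters when the merged least-squares residual is small."""
    _, res = ls_intersection_and_residual(history + new)
    return (history + new, True) if res < threshold else (history, False)

# Historical sub-cluster: two lines through the road sign at (1, 2, 0).
h = [np.array([0.0, 2.0, 0.0, 1.0, 0.0, 0.0]),
     np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])]
same = [np.array([1.0, 2.0, -1.0, 0.0, 0.0, 1.0])]   # same road sign, re-detected
other = [np.array([0.0, 5.0, 5.0, 1.0, 0.0, 0.0]),   # a different road sign
         np.array([5.0, 0.0, 5.0, 0.0, 1.0, 0.0])]
_, merged_ok = try_associate(h, same)
_, merged_bad = try_associate(h, other)
print(merged_ok, merged_bad)  # -> True False
```

This keeps one intersection point per road sign: sub-clusters whose merged residual is large stay separate and retain their own intersection points.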
In order to minimize the coordinate error of the obtained positioning road signs and build the map more accurately, the spatial coordinates of the positioning road signs are optimized through a positioning road sign optimization equation to obtain the optimized coordinates of the positioning road signs.
The positioning road sign optimization equation is constructed by a least-squares combination of the transformation-relation error between adjacent poses of the robot odometer and the error between the observation lines and the positioning road signs.
Specific: taking the robot poses as nodes, edges are generated between adjacent poses to construct a pose graph, in which each edge represents the transformation relation between the two poses it connects. The adjacency relation is represented by a set C of positive integers: for pose x_i and a positive integer c ∈ C, x_i and x_{i+c} are connected. The number of elements in the set C determines the density of edges in the pose graph, and the values of the elements determine the extent of the adjacency relation.
A transformation-relation error equation for adjacent poses of the robot odometer is constructed from the pose graph:

e_{od,i,c} = ( x̄_i^{-1} x̄_{i+c} )^{-1} ( x_i^{-1} x_{i+c} )

where e_{od,i,c} is the transformation-relation error between adjacent odometer poses, x̄_i denotes the initial pose, which is treated as constant in the optimization, and x_i^{-1} denotes the inverse transform of the pose corresponding to x_i.
The error equation between an observation line and a positioning road sign is:

e_{ob,i,j} = ( I − d_{i,j} d_{i,j}^T ) ( y_j − p_{i,j} )

where e_{ob,i,j} is the error between observation line l_{i,j} and positioning road sign y_j, i.e., the perpendicular offset of the road sign from the line.
Performing a least-squares combination of the transformation-relation error between adjacent odometer poses and the error between the observation lines and the positioning road signs yields the positioning road sign optimization equation over the positioning road signs y and the odometer poses x:

(x*, y*) = argmin_{x,y} ( α Σ_{i, c ∈ C} || e_{od,i,c} ||^2 + Σ_{i,j} || e_{ob,i,j} ||^2 )

where x*, y* denote the optimal solution of the optimization problem, i.e., the optimal robot poses and the spatial coordinates of the positioning road signs, respectively, and α is an odometer confidence factor: if the odometer in use is more accurate, this factor can be increased to raise the optimizer's attention to the odometer error and avoid over-adjusting the poses provided by the odometer.
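The trade-off controlled by α can be illustrated with a deliberately simplified sketch: poses are reduced to camera positions (translation only, no rotation), the odometer error is the offset from the initially provided positions, and plain gradient descent stands in for whatever solver the patent uses. Function names, the learning rate, and the iteration count are all assumptions:

```python
import numpy as np

def cost(positions, positions0, y, dirs, alpha):
    # alpha * odometer error + observation error (point-to-line residuals).
    e_od = np.sum((positions - positions0) ** 2)
    e_ob = sum(np.sum(((np.eye(3) - np.outer(d, d)) @ (y - p)) ** 2)
               for p, d in zip(positions, dirs))
    return alpha * e_od + e_ob

def optimize(positions0, y0, dirs, alpha=1.0, lr=0.05, iters=300):
    positions, y = positions0.copy(), y0.copy()
    for _ in range(iters):
        grad_y = np.zeros(3)
        grad_p = 2 * alpha * (positions - positions0)   # d e_od / d p_i
        for i, (p, d) in enumerate(zip(positions, dirs)):
            M = np.eye(3) - np.outer(d, d)
            r = M @ (y - p)          # residual of this observation
            grad_y += 2 * M @ r      # d e_ob / d y  (M is symmetric, idempotent)
            grad_p[i] -= 2 * M @ r   # d e_ob / d p_i
        y -= lr * grad_y
        positions -= lr * grad_p
    return positions, y

# Odometer-provided positions and slightly noisy observation directions.
positions0 = np.array([[0.0, 2.0, 0.0], [1.0, 0.0, 0.0], [1.0, 2.0, -1.0]])
dirs = np.array([[1.0, 0.05, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
y0 = np.zeros(3)                     # poor initial road sign estimate
before = cost(positions0, positions0, y0, dirs, alpha=1.0)
positions, y = optimize(positions0, y0, dirs)
after = cost(positions, positions0, y, dirs, alpha=1.0)
print(after < before)  # -> True
```

At convergence the two error terms balance: reducing the observation error further would grow the α-weighted odometer error by more, mirroring the equilibrium the text describes.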
The camera poses are provided by the odometer, and long-term error accumulation will make the map unusable. Optimization and loop detection can effectively eliminate accumulated error, so this embodiment also performs loop detection on the positioning road signs. In conventional SLAM, each frame carries a large number of low-level features, providing rich information for optimization and loop detection. In the present mapping mode, the amount of information per frame is very small and the positioning road sign points in the map are very sparse, so the strategies for optimization and loop detection need to be adjusted. Because the observations are so sparse, they do not constrain each other sufficiently; if the camera poses were adjusted merely to reduce the error between observation lines and road signs, the observation error eliminated by least squares would be transferred into the camera poses. The camera pose information comes from the odometer, and the change between poses provided by the odometer has small error over a local range, so it can provide sufficient constraint for the camera poses.
Before the spatial coordinates of the positioning road signs are optimized, the error of the observation part comes from the accumulated odometer error and from noise in the observation process, while the error of the odometer part is zero. Once optimization starts, the robot poses are readjusted to reduce the observation error, and the odometer error is no longer zero. The more the observation error is reduced, the more the robot poses are adjusted and the larger the odometer error becomes. When the optimization completes, the observation error and the odometer error reach a state of mutual balance in which their overall sum is minimal.
In the positioning road sign graph, road signs of different categories are scattered in space, and the category-and-position organization relations between road signs constitute unique information that provides matching cues for loop detection. This embodiment performs loop detection on the positioning road signs and divides the positioning road sign graph into an active map and a history map according to the order in which road signs were generated. The active map starts from the current camera position and adds the most recently generated road sign points, by spatial coordinates and in reverse time order, until the number of road sign points or the size of the active map reaches a specified limit; the history map is formed by the road sign points not included in the active map. Since the odometer error accumulates gradually along the path, over a local range of the path the error between the relative positions of the road sign points and their true relative positions in the environment is small. Therefore, according to the organization relations between road sign points in the active map, the corresponding road sign point set can be matched in the history map. The matched loop marks are data-associated and then optimized and corrected.
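How the category-and-position organization of road signs might provide matching cues can be sketched as follows. Matching pairs of road signs by category and mutual distance is an illustrative simplification (the embodiment does not spell out its matching algorithm), and the names and tolerance are assumptions:

```python
import numpy as np
from itertools import combinations

def match_constellation(active, history, tol=0.2):
    """Find history-map landmark pairs whose categories and mutual distance
    match a pair in the active map; a set of mutually consistent pair
    matches is a loop-closure candidate. Each landmark is (category, xy).
    Relative distances are used because local odometer drift leaves them
    nearly intact even when absolute coordinates have drifted."""
    matches = []
    for (ca, pa), (cb, pb) in combinations(active, 2):
        d_active = np.linalg.norm(np.asarray(pa) - np.asarray(pb))
        for (ch, ph), (ck, pk) in combinations(history, 2):
            d_hist = np.linalg.norm(np.asarray(ph) - np.asarray(pk))
            if {ca, cb} == {ch, ck} and abs(d_hist - d_active) < tol:
                matches.append((((ca, pa), (cb, pb)), ((ch, ph), (ck, pk))))
    return matches

history = [("door", (0.0, 0.0)), ("extinguisher", (3.0, 0.0)), ("sign", (0.0, 4.0))]
# Revisit of the same corridor, with accumulated odometer drift.
active = [("door", (10.0, 10.0)), ("extinguisher", (13.0, 10.1))]
matches = match_constellation(active, history)
print(len(matches))  # -> 1
```

After such candidates are found, the matched road signs would be data-associated and fed back into the optimization, as the text describes.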
In this embodiment, each frame of point cloud data on the local path is projected onto the horizontal plane to generate a line segment graph, and all line segment graphs are associated to obtain the environment structure diagram.
Specifically: the environment structure does not participate in positioning during mapping; it serves only as the obstacle information that delimits the passable area in the semantic map. Acquisition and processing of the point cloud can therefore be more flexible and open, and the environment structure can be represented better. This embodiment uses a line segment graph to represent the abstract environment structure, where each line segment represents an impassable obstacle in the environment and constrains the robot's range of motion. The line segment graph is an abstraction of the point cloud: the ground and the parts above the robot's height are removed from one frame of the point cloud, keeping only the parts that affect the robot's movement. The spatial path is divided into several local paths; the point cloud on each local path is projected onto the horizontal plane to generate a grid map, and line segmentation of the occupied grid cells yields one frame of the line segment graph. Figs. 3-6 illustrate the point-cloud-to-line-segment process.
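One plausible reading of the point-cloud-to-line-segment pipeline is sketched below; the height thresholds, cell size, and the row-wise merging of occupied cells are illustrative simplifications, since the patent does not spell out its line-segmentation step:

```python
import numpy as np

def cloud_to_segments(points, z_min=0.05, z_max=1.2, cell=0.1):
    """One frame of point cloud -> line segments:
    (1) drop the ground and everything above the robot's height,
    (2) project onto the horizontal plane and rasterize into an occupancy grid,
    (3) merge runs of occupied cells in each grid row into segments."""
    pts = points[(points[:, 2] > z_min) & (points[:, 2] < z_max)]
    cells = set(map(tuple, np.floor(pts[:, :2] / cell).astype(int)))
    rows = {}
    for i, j in cells:
        rows.setdefault(j, []).append(i)
    segments = []
    for j, cols in sorted(rows.items()):
        cols.sort()
        start = prev = cols[0]
        for i in cols[1:]:
            if i > prev + 1:                      # gap: close the current run
                segments.append(((start * cell, j * cell),
                                 ((prev + 1) * cell, j * cell)))
                start = i
            prev = i
        segments.append(((start * cell, j * cell),
                         ((prev + 1) * cell, j * cell)))
    return segments

# A wall at y = 0.5 plus ground/ceiling points that should be filtered out.
wall = np.array([[x, 0.5, 0.5] for x in np.arange(0.0, 1.0, 0.05)])
ground = np.array([[0.3, 0.5, 0.0], [0.6, 0.5, 0.01]])
ceiling = np.array([[0.3, 0.5, 2.0]])
segs = cloud_to_segments(np.vstack([wall, ground, ceiling]))
```

The wall collapses to a single segment, while the ground and above-height points leave no trace, matching the filtering described above.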
Because the environment structure does not participate in positioning, the line segment graphs of successive frames can only be associated through the poses provided by the odometer. With a non-ideal odometer, an observation of the current location made earlier from a distance and one made at the current moment do not coincide, and only the environment structure acquired at the current location can be trusted. Significant overlap and misalignment therefore often occur between line segment graphs, leaving them cluttered and unusable. For this reason the system keeps, at every position, only the line segments near the acquisition point; this guarantees that reliable observation data are available at every position on the traversed path while eliminating the overlap and misalignment of the line segments.
Since the acquisition points are distributed discretely on the plane, any point in the plane has one or more nearest acquisition points; conversely, every acquisition point has a region within which all points are no closer to any other acquisition point than to it. This region is defined using a Voronoi polygon, as in fig. 7. A Voronoi diagram is seeded with a series of points; each seed corresponds to a convex polygon, and all points inside a polygon are closest to that polygon's seed. Voronoi polygons are generated on the plane with the acquisition points as seeds, each polygon corresponds to the retention range of the observation data of its acquisition point, and line segments extending beyond their acquisition point's polygon are deleted.
Accordingly, when all the line segment graphs are associated in this embodiment, the environment structure diagram is obtained as follows:
firstly, each line segment graph is clipped by a Voronoi polygon to obtain the retained line segments;
and then the retained line segments are associated to obtain the environment structure diagram.
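The Voronoi-based retention rule can be sketched without a geometry library by testing sampled segment points against the nearest-seed condition that defines a Voronoi cell; the sampling step and the segment/seed layout below are illustrative:

```python
import numpy as np

def clip_to_voronoi_cell(segment, own_seed, all_seeds, step=0.01):
    """Keep the part of `segment` lying in the Voronoi cell of `own_seed`,
    i.e. the sampled points at least as close to own_seed as to every other
    seed. Voronoi cells are convex, so the kept samples form one contiguous
    run; its first and last points bound the clipped segment."""
    p0, p1 = (np.asarray(p, float) for p in segment)
    own_seed = np.asarray(own_seed, float)
    seeds = np.asarray(all_seeds, float)
    n = max(2, int(np.linalg.norm(p1 - p0) / step))
    ts = np.linspace(0.0, 1.0, n)
    pts = p0 + ts[:, None] * (p1 - p0)
    d_own = np.linalg.norm(pts - own_seed, axis=1)
    d_min = np.linalg.norm(pts[:, None, :] - seeds[None, :, :], axis=2).min(axis=1)
    keep = np.where(d_own <= d_min + 1e-12)[0]
    if keep.size == 0:
        return None                 # segment lies entirely in other cells
    return pts[keep[0]], pts[keep[-1]]

# Two acquisition points; their Voronoi boundary is the line x = 1.
seeds = [(0.0, 0.0), (2.0, 0.0)]
a, b = clip_to_voronoi_cell(((-1.0, 0.5), (3.0, 0.5)), (0.0, 0.0), seeds)
```

The segment observed from the first acquisition point is cut at the cell boundary near x = 1, which is exactly the "delete what extends beyond the polygon" rule stated above.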
A single-frame point cloud often loses part of the view due to occlusion, whereas consecutive multi-frame point clouds can complement each other's missing views. Although the odometer has error, the line segment graphs of closely spaced frames coincide well, so the multi-frame point clouds on a short path can be merged into one local line segment graph. The center of each short path is then used as the seed to generate the Voronoi polygons, and finally the line segments are clipped. In this way the environment structure information is preserved more completely, and the line segments are more continuous and complete.
Accordingly, the process of clipping a line segment graph with the Voronoi polygons in this embodiment is as follows:
the center point of the local path corresponding to the line segment graph is taken as a seed to construct the Voronoi polygons;
and the line segment parts extending beyond the polygon of their own center are deleted, yielding the retained line segments.
After the camera pose is optimized and adjusted, the pose at which each point cloud was acquired is also adjusted, but the historical point cloud information has been deleted and a new line segment graph cannot be regenerated. Because the change within a local path before and after camera pose optimization is very small, the line segment graphs change very little before and after optimization; the system therefore retains all line segment graphs. Each graph keeps only the line segment representation of the local environment structure, so the data volume is very small and places no burden on the system. When the camera pose is adjusted, only the pose of the corresponding line segment graph is adjusted, which updates the global line segment graph.
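The pose-only update can be sketched as follows: each line segment graph stores its segments in the local frame of its acquisition pose, and the global map is re-projected from the updated poses. The representation below is a hypothetical minimal one:

```python
import numpy as np

def to_global(local_segments, pose):
    """Re-project a frame's locally stored line segments into the world frame
    using its (possibly re-optimized) acquisition pose (x, y, theta). After
    pose-graph optimization only `pose` changes; the stored local segments
    are reused unchanged."""
    x, y, th = pose
    R = np.array([[np.cos(th), -np.sin(th)],
                  [np.sin(th),  np.cos(th)]])
    t = np.array([x, y])
    return [(tuple(R @ np.asarray(p0, float) + t),
             tuple(R @ np.asarray(p1, float) + t))
            for p0, p1 in local_segments]

# A wall segment along the local x axis, acquired at a pose rotated 90 deg.
segs = to_global([((1.0, 0.0), (2.0, 0.0))], (0.0, 0.0, np.pi / 2))
```

Updating the global map after optimization then amounts to calling `to_global` again with the new poses, with no point cloud reprocessing.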
Finally, the positioning road sign map and the environment structure diagram are combined to form the semantic map.
The method disclosed in this embodiment performs synchronous semantic positioning and mapping of the robot by imitating the way humans self-locate and draw spatial maps. Humans map and self-locate through semantic information in space; the semantic objects include landmark information used for positioning and environment structures that obstruct passage. The positioning road signs locate the person, and the environment structure restricts the person's range of action. Imitating this human mode of positioning and spatial description, positioning road signs are generated with a visual sensor and an object detection tool, each road sign being represented by a point carrying spatial position information and a semantic label; obstacle information is acquired with an environment structure sensor, and the acquired point cloud obstacle information is abstracted into straight line segments on the plane; finally, synchronous semantic positioning and spatial mapping of the robot are realized.
The method disclosed in this embodiment was verified experimentally.
An office corridor was selected as the robot's experimental environment; the doors, safety indicator lights, fire extinguishers, and so on in the corridor can serve as semantic objects. A data set was collected in the corridor, covering both daylight and artificial-light scenes. Each frame of collected data includes wheel odometer data, 2D laser odometer data, fisheye and pinhole camera images, and the color image and point cloud of an RGBD camera, together with laser grid maps of the two scenes. Fig. 8 shows the laser grid map of the experimental site, and fig. 9 shows the odometer traces of frames 30 to 1130 in the daylight scene, where the dashed line is the laser odometer trace and the dotted line is the wheel odometer trace; it can be seen that the wheel odometer drifts greatly over the round trip. Either the wheel odometer or the laser odometer may be selected as the source of the mileage information needed to generate the positioning road sign map. The wheel odometer has the larger accumulated error, and loop detection and optimization are performed on it in the subsequent loop detection experiment.
When the robot passes through the long, narrow corridor, semantic objects on both sides are not easy to observe; the fisheye camera has a wide viewing angle and is more favorable for observation, so the experiments use a fisheye camera. However, the circular view image produced by a fisheye camera is inconvenient for object detection. For this reason, in this embodiment the optical axis of the fisheye camera is placed vertically upward, so that the scene around the camera falls uniformly on the periphery of the circular view, and the circular view is then cylindrically unfolded into a rectangular view suitable for the object detection model, as shown in figs. 11-12.
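A sketch of the cylindrical unfolding under an equidistant-style sampling assumption follows; the radius range, output size, and nearest-neighbor sampling are illustrative choices, not taken from the patent:

```python
import numpy as np

def unfold_circular_view(img, out_w=720, out_h=200, r_min=60.0, r_max=260.0):
    """Cylindrically unfold a circular fisheye view (optical axis vertical,
    scene on the periphery) into a rectangle: column u <-> azimuth around
    the image center, row v <-> radius from the center. Nearest-neighbor
    sampling keeps the sketch short."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = 2.0 * np.pi * np.arange(out_w) / out_w          # azimuth per column
    r = r_min + (r_max - r_min) * np.arange(out_h) / (out_h - 1)  # radius per row
    xs = np.clip(np.round(cx + r[:, None] * np.cos(theta)[None, :]).astype(int), 0, w - 1)
    ys = np.clip(np.round(cy + r[:, None] * np.sin(theta)[None, :]).astype(int), 0, h - 1)
    return img[ys, xs]

# Synthetic circular view whose right half is bright: azimuth 0 (along +x)
# should sample bright pixels, azimuth pi should sample dark ones.
img = np.zeros((600, 600))
img[:, 300:] = 1.0
out = unfold_circular_view(img)
```

After unfolding, each image column corresponds to a fixed azimuth, which is what makes the rectangular view usable by a standard object detection model.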
The image captured by the fisheye camera is severely distorted. After cylindrical unfolding, the horizontal observation angle of a semantic object is in a linear relationship with the abscissa v_I of the pixel coordinate of the center of its detection box (the panoramic view is cylindrically unfolded along the negative half of the x axis, with the direction corresponding to the image center taken as the positive x direction). Owing to the way the fisheye lens is used and to the cylindrical unfolding, a planar model is more suitable, so the pitch angle component w_c of the observation vector is set to zero; then:
[Equation given in the source only as an image: Figure BDA0004111313040000171]
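Since the formula itself appears only as an image in the source, the following is a hedged sketch of the planar observation vector it describes: azimuth linear in the abscissa of the detection-box center, image center mapped to azimuth zero, and pitch component w_c set to zero. The sign and offset conventions here are assumptions:

```python
import numpy as np

def observation_vector(u, img_w):
    """Planar-model observation vector of a detection in the unfolded
    panorama: azimuth theta is linear in the abscissa u of the detection-box
    center, the image-center column maps to azimuth 0, and the pitch
    component w_c is zeroed. Sign/offset conventions are assumed, since the
    patent's formula is available only as an image."""
    theta = 2.0 * np.pi * (u - img_w / 2.0) / img_w
    return np.array([np.cos(theta), np.sin(theta), 0.0])
```

Under this convention, a detection at the center column looks along the positive x axis, and a detection a quarter of the width to the right looks along the positive y axis.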
A round-trip trace (daylight, frames 30-1130) was selected from the data set for testing. Figs. 13-15 show the results of generating the positioning road sign map from the two kinds of odometer data, where the short black arrows represent observation lines and the circular and square markers represent two different kinds of advanced semantic positioning road signs distributed along the odometer trace. The positioning landmarks in the two figures have not been data-associated by loop detection, so the points observed on the outward and return passes are independent. By comparison, the wheel odometer drifts noticeably, so the landmark points it generates for the same semantic object differ greatly between passes, whereas the landmark points generated before and after by the laser odometer lie close together, which matches the real situation. When a loop exists in the path, loop detection is first performed to data-associate the related landmark points, and the positioning road sign optimization equation then adjusts the landmark positions and the camera poses to obtain a globally consistent map. After loop detection and pose optimization, the wheel odometer trace and the generated landmark points agree much better with the real situation.
The environment structure diagram was generated from the loop-closed trace, as shown in fig. 16; the line segments represent obstacles with shapes similar to those in the grid map, but the data volume is greatly reduced. The positioning road sign map and the environment structure diagram are fused to produce the complete semantic map, as shown in fig. 17. Fig. 18 shows the semantic map generated over the whole experimental site. The final semantic map matches the actual situation, verifying the accuracy of the method disclosed in this embodiment.
Example 2
In this embodiment, a robot synchronous semantic locating and mapping system based on humanoid ideas is disclosed, comprising:
the data acquisition module is used for acquiring the environmental picture of each frame and the space structure point cloud data of the environment;
the positioning road sign graph acquisition module is used for identifying the environment pictures of each frame to obtain the corresponding observation straight line of the positioning road sign; carrying out cluster division on all the observation straight lines, and dividing the observation straight lines belonging to the same positioning road sign into a cluster; analyzing the observation straight lines in each cluster by a least square method, and determining the intersection point of the observation straight lines in each cluster as the space coordinate of each corresponding positioning road sign; optimizing the space coordinates of the positioning road sign to obtain the optimized coordinates of the positioning road sign; obtaining a positioning road sign graph according to the optimized coordinates of all the positioning road signs;
the environment structure diagram acquisition module is used for projecting each frame of point cloud data on the local path to the horizontal direction to generate a line segment diagram; correlating all the line segment graphs to obtain an environment structure diagram;
the semantic map acquisition module is used for combining the positioning road map and the environment structure map to form a semantic map.
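The least-squares intersection step performed by the positioning road sign graph acquisition module can be sketched as follows. This is an illustrative formulation in which each observation straight line is given by a camera position p_i and a unit bearing d_i, and the landmark is the point minimizing the summed squared point-to-line distances:

```python
import numpy as np

def least_squares_intersection(cam_positions, bearings):
    """Least-squares 'intersection' of observation straight lines: the point
    x minimizing sum_i dist(x, line_i)^2, where line_i passes through camera
    position p_i with unit direction d_i. Solves the normal equations
    sum_i (I - d_i d_i^T) x = sum_i (I - d_i d_i^T) p_i."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, d in zip(np.asarray(cam_positions, float),
                    np.asarray(bearings, float)):
        d = d / np.linalg.norm(d)        # normalize the bearing
        M = np.eye(3) - np.outer(d, d)   # projector orthogonal to the line
        A += M
        b += M @ p
    return np.linalg.solve(A, b)

# Two observation lines that meet at (1, 0, 0).
pt = least_squares_intersection([(0.0, 0.0, 0.0), (1.0, -1.0, 0.0)],
                                [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
```

With noisy observations the same formulation returns the point closest to all lines simultaneously, which is the per-cluster landmark coordinate the module computes.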
Example 3
In this embodiment, an electronic device is disclosed that includes a memory, a processor, and computer instructions stored on the memory and running on the processor, where the computer instructions, when executed by the processor, perform the steps described in the robot synchronous semantic localization and mapping method based on humanoid ideas disclosed in embodiment 1.
Example 4
In this embodiment, a computer readable storage medium is disclosed for storing computer instructions that, when executed by a processor, perform the steps described in the robot synchronous semantic localization and mapping method based on humanoid ideas disclosed in embodiment 1.
Finally, it should be noted that: the above embodiments are only for illustrating the technical aspects of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: modifications and equivalents may be made to the specific embodiments of the invention without departing from the spirit and scope of the invention, which is intended to be covered by the claims.

Claims (10)

1. The robot synchronous semantic positioning and mapping method based on the humanoid thought is characterized by comprising the following steps of:
acquiring an environment picture of each frame and space structure point cloud data of the environment;
identifying each frame of environment picture to obtain an observation straight line of a corresponding positioning road sign;
carrying out cluster division on all the observation straight lines, and dividing the observation straight lines belonging to the same positioning road sign into a cluster;
analyzing the observation straight lines in each cluster by a least square method, and determining the intersection point of the observation straight lines in each cluster as the space coordinate of each corresponding positioning road sign;
optimizing the space coordinates of the positioning road sign to obtain the optimized coordinates of the positioning road sign;
obtaining a positioning road sign graph according to the optimized coordinates of all the positioning road signs;
projecting each frame of point cloud data on the local path to the horizontal direction to generate a line segment graph;
correlating all the line segment graphs to obtain an environment structure diagram;
combining the positioning road map and the environment structure map to form a semantic map.
2. The robot synchronous semantic localization and mapping method based on the humanoid thought of claim 1, wherein the environment picture is identified to obtain pixel coordinates of a localization landmark in the picture;
converting the pixel coordinates into observation vectors under a camera coordinate system;
and converting the observation vector from the camera coordinate system to the world coordinate system, and translating the observation vector to the world coordinate of the camera to obtain the observation straight line for positioning the road sign.
3. The robot synchronous semantic positioning and mapping method based on the humanoid thought according to claim 1, wherein the observation straight lines are clustered according to the angle variation of the observation straight lines between adjacent frames, and observation straight lines whose angle variation is smaller than a set value are gathered into observation straight line sub-clusters;
calculating the intersection point of the observation straight lines in a historical observation straight line sub-cluster by the least square method;
merging a newly generated observation straight line sub-cluster, obtained by observing a target of the same kind, with the historical observation straight line sub-clusters one by one, and calculating the intersection point of the observation straight lines in each merged cluster by the least square method;
calculating the residual between the intersection point of the observation straight lines in the historical observation straight line sub-cluster and the intersection point of the observation straight lines in the merged cluster;
when the residual is larger than or equal to a set value, the newly generated observation straight line sub-cluster and the historical observation straight line sub-cluster are not divided into one group;
and when the residual is smaller than the set value, the newly generated observation straight line sub-cluster and the historical observation straight line sub-cluster are divided into one group.
4. The robot synchronous semantic localization and mapping method based on the humanoid thought of claim 1, wherein the space coordinates of the positioning roadmap are optimized through a positioning roadmap optimization equation to obtain the coordinates after positioning roadmap optimization;
the positioning road sign optimization equation is obtained by performing a least square calculation on the transformation relation errors between adjacent poses of the robot odometer and the errors between the observation straight lines and the positioning road signs.
5. The robot synchronous semantic localization and mapping method based on the humanoid thought according to claim 4, wherein the robot poses are taken as nodes, edges are generated between adjacent poses to construct a pose graph, and each edge in the graph represents the transformation relationship between the two poses it connects;
and the transformation relation errors between adjacent poses of the robot odometer are obtained according to the pose graph.
6. The robot synchronous semantic localization and mapping method based on the humanoid thought according to claim 1, wherein when all the line segment graphs are associated, the environment structure diagram is obtained as follows:
firstly, each line segment graph is clipped by a Voronoi polygon to obtain the retained line segments;
and then the retained line segments are associated to obtain the environment structure diagram.
7. The robot synchronous semantic localization and mapping method based on the humanoid thought according to claim 6, wherein the process of clipping the line segment map by voronoi polygons is as follows:
taking the central point of the local path corresponding to the line segment diagram as a seed to construct a Voronoi polygon;
and deleting the line segment parts extending beyond the polygon of that center point to obtain the retained line segments.
8. Robot synchronous semantic location and build drawing system based on imitative people thinking, characterized by comprising:
the data acquisition module is used for acquiring the environmental picture of each frame and the space structure point cloud data of the environment;
the positioning road sign graph acquisition module is used for identifying the environment pictures of each frame to obtain the corresponding observation straight line of the positioning road sign; carrying out cluster division on all the observation straight lines, and dividing the observation straight lines belonging to the same positioning road sign into a cluster; analyzing the observation straight lines in each cluster by a least square method, and determining the intersection point of the observation straight lines in each cluster as the space coordinate of each corresponding positioning road sign; optimizing the space coordinates of the positioning road sign to obtain the optimized coordinates of the positioning road sign; obtaining a positioning road sign graph according to the optimized coordinates of all the positioning road signs;
the environment structure diagram acquisition module is used for projecting each frame of point cloud data on the local path to the horizontal direction to generate a line segment diagram; correlating all the line segment graphs to obtain an environment structure diagram;
the semantic map acquisition module is used for combining the positioning road map and the environment structure map to form a semantic map.
9. An electronic device comprising a memory and a processor, and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the steps of the humanoid-thought-based robot synchronous semantic localization and mapping method of any one of claims 1-7.
10. A computer readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the humanoid thought-based robot synchronous semantic localization and mapping method of any one of claims 1-7.
CN202310207141.XA 2023-02-28 2023-02-28 Robot synchronous semantic positioning and mapping method and system based on humanoid thought Pending CN116337068A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310207141.XA CN116337068A (en) 2023-02-28 2023-02-28 Robot synchronous semantic positioning and mapping method and system based on humanoid thought


Publications (1)

Publication Number Publication Date
CN116337068A true CN116337068A (en) 2023-06-27

Family

ID=86888650

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310207141.XA Pending CN116337068A (en) 2023-02-28 2023-02-28 Robot synchronous semantic positioning and mapping method and system based on humanoid thought

Country Status (1)

Country Link
CN (1) CN116337068A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117214908A (en) * 2023-10-10 2023-12-12 深圳市宇讯通光电有限公司 Positioning control method and system based on intelligent cable cutting machine
CN117214908B (en) * 2023-10-10 2024-05-10 深圳市宇讯通光电有限公司 Positioning control method and system based on intelligent cable cutting machine


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination