CN118031976A - Man-machine cooperative system for exploring unknown environment - Google Patents

Man-machine cooperative system for exploring unknown environment

Info

Publication number
CN118031976A
Authority
CN
China
Prior art keywords
point cloud
point
points
semantic
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410445433.1A
Other languages
Chinese (zh)
Other versions
CN118031976B (en)
Inventor
申冰可
乔笑宁
谢文明
彭晓东
牛文龙
李运
王迪
吴涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Space Science Center of CAS
Original Assignee
National Space Science Center of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Space Science Center of CAS filed Critical National Space Science Center of CAS
Priority to CN202410445433.1A priority Critical patent/CN118031976B/en
Publication of CN118031976A publication Critical patent/CN118031976A/en
Application granted granted Critical
Publication of CN118031976B publication Critical patent/CN118031976B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G01C21/1652 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with ranging devices, e.g. LIDAR or RADAR
    • G01C21/1656 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with passive imaging devices, e.g. cameras
    • G01C21/20 Instruments for performing navigational calculations
    • G01C21/38 Electronic maps specially adapted for navigation; Updating thereof
    • G01C21/3804 Creation or updating of map data
    • G01C21/3833 Creation or updating of map data characterised by the source of data
    • G01C21/3841 Data obtained from two or more sources, e.g. probe vehicles
    • G01C22/00 Measuring distance traversed on the ground by vehicles, persons, animals or other moving solid bodies, e.g. using odometers, using pedometers
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/86 Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Electromagnetism (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a man-machine cooperative system for exploring an unknown environment, which comprises an unmanned equipment end, a server end and an augmented reality equipment end. The unmanned equipment end is used for acquiring point cloud frames of the environment, generating a three-dimensional semantic point cloud map and a positioning track, acquiring the conversion matrix between the camera and the Apriltag code, and publishing the semantic point cloud map, the positioning track and the conversion matrix. The server end is used for receiving the information published by the unmanned equipment end, converting the semantic point cloud map and the positioning track into the Apriltag code coordinate system according to the conversion matrix, and publishing the converted map and track. The augmented reality equipment end is used for identifying the Apriltag code to perform the spatial transformation, acquiring the coordinate positions of the point cloud points, and drawing and rendering the point cloud according to the positions of the point cloud points and their semantic tag information. The advantages of the application are that it improves the efficiency of exploring and sensing an unknown environment, and that it assists the augmented reality equipment in accurately understanding and displaying the actual environment.

Description

Man-machine cooperative system for exploring unknown environment
Technical Field
The application belongs to the field of unknown environment sensing, and particularly relates to a man-machine cooperative system for exploring an unknown environment.
Background
With the continuous development of unmanned systems, robot applications are gradually expanding to various industries and gradually replacing humans in some work and tasks. Robots have the advantages of flexibility, small size, few operating restrictions and strong environmental adaptability, so they are widely applied in fields such as security, disaster relief, rescue and commercial logistics. They can not only realize informatization and intelligence in specific application scenarios, but also increase the speed of environment perception and expand the perception range. However, human assistance and collaboration are still required for rescue and detection tasks in different complex environments.
Simultaneous Localization and Mapping (SLAM) collects and senses data about the surrounding environment through the sensors carried by a robot while it moves, thereby constructing a map and obtaining the robot's pose information. The classical SLAM framework mainly includes a front-end odometry part, closed-loop detection and a back-end optimization part. The front end obtains the pose transformation between consecutive frames through inter-frame registration of the sensor data. Closed-loop detection eliminates accumulated drift error by identifying whether the robot has visited the current scene before and finding potential loops. The back end generates the environment map and the robot pose by optimizing the robot poses and the observation constraints obtained by the front end. The point cloud map directly constructed by SLAM can provide the position information of objects in the scene, but it only reflects the structure of the scene and cannot provide a higher-level understanding of the environment, such as the category information of each object in the scene.
Disclosure of Invention
The application aims to overcome the defect that the prior art can only represent the structure of a scene and cannot achieve a higher-level understanding of the environment.
In order to achieve the above object, the present application proposes a human-machine collaboration system for exploring an unknown environment, the system comprising an unmanned equipment end, a server end and an augmented reality equipment end; wherein,
The unmanned equipment end is used for acquiring point cloud frames of the environment by using the environment information acquisition equipment, generating a three-dimensional semantic point cloud map and a positioning track, identifying the Apriltag code, acquiring the conversion matrix between the camera and the Apriltag code, and publishing the semantic point cloud map, the positioning track and the conversion matrix;
The server side is used for receiving the information issued by the unmanned equipment side, converting the semantic point cloud map and the positioning track into a Apriltag code coordinate system according to the conversion matrix and transmitting the semantic point cloud map and the positioning track to the augmented reality equipment side;
and the augmented reality equipment end is used for identifying Apriltag codes to perform space transformation, acquiring the coordinate positions of the point cloud points transmitted by the server end, drawing the point cloud according to the positions of the point cloud points and semantic tag information, and rendering.
As an improvement of the system, the unmanned equipment end comprises an environmental information acquisition device, a camera, an SLAM mapping module and an IMU inertial module;
the working process of the unmanned equipment end comprises the following steps:
Step S1: the SLAM mapping module takes the point cloud frames acquired by the environment information acquisition equipment as input and generates a three-dimensional semantic point cloud map and the pose of the unmanned equipment end in real time; feature extraction is optimized through point cloud segmentation, feature matching and pose estimation between two frames are carried out using the planar points and edge points extracted from the point cloud frame according to curvature values, a distance residual equation is constructed, and the associated features are iteratively optimized by the LM method to obtain and refine the relative pose;
Step S2: calculating a matrix of the semantic point cloud map converted into a Apriltag code coordinate system; and identifying Apriltag codes by a camera, and obtaining a conversion matrix from the world coordinate system to the camera coordinate system according to homography conversion calculation.
As an improvement of the system, the SLAM mapping module comprises a point cloud de-distortion sub-module, a point cloud semantic acquisition sub-module, a point cloud segmentation sub-module, a characteristic point extraction sub-module, a laser odometer sub-module and a semantic map construction sub-module;
The step S1 includes:
Step S1-1: the point cloud de-distortion sub-module carries out motion distortion correction on each point in the point cloud: the pose of the unmanned equipment end during scanning is estimated through integration of the IMU inertial module, the rotation angle of each point relative to the starting point is obtained, and the position of each point is compensated to the starting moment; the compensation relationship is as follows:

$$X_{start} = T_{cur}^{start}\, X_{cur}$$

wherein $X_{start}$ and $X_{cur}$ respectively represent the pose corresponding to the starting moment and the current moment of each point, and $T_{cur}^{start}$ represents the transformation matrix;
Step S1-2: the point cloud semantic acquisition sub-module receives the real-time point cloud frames acquired by the environment information acquisition equipment; semantic images are obtained through semantic segmentation of the camera image frames, the semantic information is mapped to the point cloud frames to obtain the category information of each object in the point cloud, and semantic tag information is assigned to the point cloud points; meanwhile, the position information, intensity information and semantic category information of the points in the point cloud frame are stored to construct the semantic point cloud frame;
Step S1-3: the point cloud segmentation sub-module projects the semantic point cloud frame into a range image and calculates the angle between each point and its adjacent points according to the scan ring of the point; if the angle is smaller than a second set threshold, the point is judged to be ground, and the ground point cloud is separated out; the point cloud remaining after ground segmentation is traversed by depth-first search to recursively group points, the relative angles of the four adjacent points are compared, and if the angle is larger than a third set threshold, the points are considered to belong to the same point cloud cluster; point cloud clusters whose number of points is smaller than a fourth set threshold are removed to denoise the point cloud, and the points of the same point cloud cluster are labeled;
Step S1-4: the feature point extraction sub-module equally divides the 360-degree direction of the point cloud frame into a plurality of regions and calculates the curvature of the points in each region; a point whose curvature is larger than a fifth set threshold is a line feature, and a point whose curvature is smaller than the fifth set threshold is a planar feature; several planar features and several line features are selected within each region;
The curvature of a point is calculated from the depth differences between the current point and several points to its left and right, as follows:

$$c = \frac{1}{|S| \cdot \left\| r_i \right\|} \left\| \sum_{j \in S,\, j \neq i} \left( r_j - r_i \right) \right\|$$

wherein $S$ represents the set of neighboring points, $r_i$ represents the depth of the current point, $r_j$ represents the depth of an adjacent point, and $c$ represents the curvature of the current point;
In the process of selecting feature points, the number and category of feature points selected in each direction are fixed, and points surrounding a point that has already been selected as a feature point are not selected;
Step S1-5: the laser odometer sub-module is used for carrying out inter-frame matching on the line characteristics and the surface characteristics extracted by the characteristic point extraction sub-module to obtain a transformation matrix between two frames;
Step S1-6: the semantic map construction sub-module is used for obtaining a frame-to-map matching pose through matching the feature set with a surrounding point cloud map, and then obtaining a transformation matrix by using an LM method on constraints among nodes; the method specifically comprises the following steps:
The feature point sets with semantic tag information within a set range of the current position are fused and spliced to obtain the semantic point cloud map; a group of feature sets closest to the current feature set is selected, the spatial distance constraint between the new node and the nodes in the point cloud map is calculated, and the pose transformation is obtained through LM optimization;
According to the correspondence between the time stamps of the IMU inertial module data and the scanning time stamps of the environment information acquisition equipment, the roll angle and pitch angle of the IMU inertial module at the current moment are calculated by linear interpolation; weights are set according to the square root of the LM residual, the angles from the IMU inertial module data are fused with the angles obtained from scan matching of the environment information acquisition equipment, and the final transformation matrix is obtained as:

$$T_{final} = \alpha\, T_{IMU} + (1-\alpha)\, T_{scan}$$

wherein $\alpha$ represents the fusion coefficient, $T_{IMU}$ represents the transformation matrix of the IMU inertial module, and $T_{scan}$ represents the transformation matrix obtained from scan matching of the environment information acquisition equipment.
As an improvement of the above system, the step S1-2 includes:
Processing the image frames acquired by the camera by using a real-time semantic segmentation network to acquire semantic images of each frame, and performing time stamp alignment on the point cloud frames after de-distortion and the semantic image frames after segmentation;
According to the conversion matrix, projecting the point cloud frame onto a corresponding image frame, and filtering out points which are not in the field of view of the camera; calculating row and column indexes of points in a distance image according to coordinates of the points, reserving points with closer distances, converting the points into 2D coordinate points according to a camera model, and obtaining corresponding semantic point labels according to the 2D coordinates;
Calculating whether a point which is different from the semantic label of the current point exists in the neighborhood range of each point cloud point mapped into the image frame, and if so, marking the point as an edge point, namely the point is a boundary between two kinds of semantics; traversing all pixel points in the image, judging whether edge points exist in adjacent ranges, if so, judging whether the intensity information between the two points is in a first set threshold range, and if so, assigning semantic labels of the points to the edge points.
As an improvement of the above system, the S1-2 further includes:
And setting a distance threshold for mapping the point cloud, and preventing the point cloud frame from being mapped to a sky area in the image frame by mistake.
As an improvement of the above system, the step S1-5 includes:
According to the result of the point cloud clustering in step S1-3, the label information is used as a constraint condition for inter-frame matching, and only points with the same label can be matched; the frame pose is obtained by the LM method. Through surface-feature matching, the distances from points of two adjacent frames to the matched plane are optimized to obtain $[t_z, \theta_{roll}, \theta_{pitch}]$, wherein $t_z$ represents the movement of the two frames along the z-axis, $\theta_{roll}$ represents the roll angle between the two frames, and $\theta_{pitch}$ represents the pitch angle between the two frames; through line-feature matching, the distances from points of two adjacent frames to the matched line are optimized to obtain $[t_x, t_y, \theta_{yaw}]$, wherein $t_x$ represents the movement of the two frames along the x-axis, $t_y$ represents the movement of the two frames along the y-axis, and $\theta_{yaw}$ represents the heading angle between the two frames; $[t_z, \theta_{roll}, \theta_{pitch}]$ and $[t_x, t_y, \theta_{yaw}]$ are fused to obtain the transformation matrix between the two frames;
The distance $d_{\mathcal{H}}$ of a surface feature point to the matched plane is calculated as:

$$d_{\mathcal{H}} = \frac{\left| \left( X^{L}_{(k+1,i)} - X^{L}_{(k,j)} \right) \cdot \left( \left( X^{L}_{(k,j)} - X^{L}_{(k,l)} \right) \times \left( X^{L}_{(k,j)} - X^{L}_{(k,m)} \right) \right) \right|}{\left\| \left( X^{L}_{(k,j)} - X^{L}_{(k,l)} \right) \times \left( X^{L}_{(k,j)} - X^{L}_{(k,m)} \right) \right\|}$$

wherein $X^{L}_{(k+1,i)}$ represents the coordinates of the $i$-th feature point in frame $k+1$; $X^{L}_{(k,j)}$ represents the coordinates of the point in frame $k$ closest to the $i$-th feature point, this point being called $j$; $X^{L}_{(k,l)}$ represents the coordinates of the point in frame $k$ closest to $j$ on the same scanning beam; and $X^{L}_{(k,m)}$ represents the coordinates of the point in frame $k$ closest to $j$ on an adjacent scanning beam;

the distance $d_{\mathcal{E}}$ of a line feature point to the matched line is calculated as:

$$d_{\mathcal{E}} = \frac{\left\| \left( X^{L}_{(k+1,i)} - X^{L}_{(k,j)} \right) \times \left( X^{L}_{(k+1,i)} - X^{L}_{(k,l)} \right) \right\|}{\left\| X^{L}_{(k,j)} - X^{L}_{(k,l)} \right\|}$$
As an improvement of the above system, the step S2 includes:
Step S2-1: calibrating the camera to obtain an internal reference matrix and distortion parameters of the camera as prior;
step S2-2: and forming four point pairs according to four corner points extracted from Apriltag codes and four corresponding assumed points in a world coordinate system, solving by a direct linear method to obtain a homography matrix, and obtaining a conversion matrix from the world coordinate system to a camera coordinate system by the homography matrix and camera internal parameters.
As an improvement of the system, the server side comprises a message acquisition module and a point cloud processing module;
The working process of the server side comprises the following steps:
Step S3: the message acquisition module of the server side acquires messages from the unmanned equipment side, with the unmanned equipment side and the server side configured as an ROS master and slave; the acquisition frequency is set for the ROS topics, and the conversion matrix, the semantic point cloud map constructed by the unmanned equipment end and the real-time positioning track information in the messages are read and stored;
Step S4: the point cloud processing module converts the quaternion in the acquired ROS message data into a rotation matrix, combines it with the translation vector to generate a conversion matrix, and multiplies the conversion matrix with the coordinates of each point of the point cloud to obtain the coordinates of the point cloud in the Apriltag code coordinate system:

The coordinate system transformation formula is expressed as follows:

$$T = \begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix}, \qquad P' = T\, P$$

wherein $T$ represents the conversion matrix, $R$ represents the rotation matrix, $t$ represents the translation vector, $P$ represents the coordinates of a point of the point cloud in the unmanned equipment coordinate system, and $P'$ represents its coordinates in the Apriltag coordinate system.
As an improvement of the system, the augmented reality equipment end comprises a coordinate transformation module and a display module;
The working process of the augmented reality equipment end comprises the following steps:
Step S5: the coordinate transformation module and the unmanned equipment end recognize Apriltag codes at the same position, calculate a transformation matrix and place the augmented reality equipment end under a Apriltag code coordinate system;
Step S6: the display module obtains the coordinate position of the point cloud midpoint and the position of the current unmanned equipment end sent by the server end, draws the point on the augmented reality equipment, and displays the map constructed by the unmanned equipment end in real time.
As an improvement of the system, the display module comprises a model drawing sub-module and a virtual-real fusion sub-module;
the step S6 includes:
step S6-1: the model drawing submodule expresses the points in the form of blocks through the Unity engine and presents different colors according to different semantics;
Step S6-2: the virtual-real fusion sub-module renders the modeled point cloud, and realizes the positioning registration of the virtual scene and the real scene according to the coordinate transformation among the model matrix, the view matrix and the projection matrix, and the virtual information is accurately presented in the real environment.
As an improvement of the system, the display module further comprises a man-machine interaction sub-module which is used for realizing the interaction function with a user, and the scene is controlled by identifying the user operation and generating a control command.
Compared with the prior art, the application has the advantages that:
1. A system for exploring and sensing an unknown environment through human-machine cooperation is provided; by designing an intelligent interaction framework for cooperation between humans and unmanned equipment, cooperation between the unmanned equipment and multiple augmented reality devices is realized based on multi-device pose calculation and semantic information fusion, which improves the efficiency of exploring and sensing an unknown environment;
2. An edge point cloud semantic optimization method is provided; based on the designed multi-sensor fusion positioning and semantic mapping framework, the unmanned equipment can provide more accurate information about complex environments, which assists the augmented reality equipment in accurately understanding and displaying the actual environment;
3. Multiple augmented reality devices can display the same scene at the same time, and different views of the scene can be displayed according to the positions of the different augmented reality devices.
Drawings
FIG. 1 is a diagram of a human-computer collaboration system architecture for exploring an unknown environment;
FIG. 2 is a diagram of a framework of a human-computer collaboration system module for exploring an unknown environment;
FIG. 3 is a diagram showing the semantic segmentation effect of an image frame;
FIG. 4 is a diagram showing the effect of mapping semantic point cloud frames;
FIG. 5 is a graph showing edge optimized region effects;
FIG. 6 (a) is a view of an effect of an environmental point cloud constructed by an unmanned device;
Fig. 6 (b) shows a second environmental point cloud effect diagram constructed by the unmanned device;
fig. 7 is a diagram showing a virtual-real fusion effect exhibited by the augmented reality device.
Detailed Description
The technical scheme of the application is described in detail below with reference to the accompanying drawings.
Augmented reality (AR) fuses the real world with virtual content: virtual information is presented in the real physical world through position matching, augmenting the real world and enabling real-time interaction together with visualization. Representative augmented reality products are Google Glass, released by Google in 2012, and HoloLens, released by Microsoft in 2015. Augmented reality technology renders a virtual scene to generate virtual objects; at the same time, the real scene captured by cameras and other sensors is measured through tracking and positioning, so that the virtual scene is finally superimposed on the real world, forming the augmented reality scene presented to the user, who can in turn issue control instructions through information such as gestures. Compared with the traditional graphical presentation on a PC, this not only improves the degree of environment perception but also improves portability.
Human-computer cooperation refers to that in a certain cooperation area, people and unmanned equipment interact cooperatively to achieve a target. The advantages of the human and the machine are complemented by the cooperation of the human and the unmanned equipment, so that tasks such as sensing, control and decision making can be more efficiently completed in a complex and changeable environment. Man-machine cooperation not only utilizes the advantage of machine automation, but also avoids the problem that the machine cannot cope with complex environments, and improves the decision making ability of people, thereby making decisions more in line with the current state.
The map construction effect of the robot in an unknown environment by utilizing the SLAM technology can be intuitively perceived through the augmented reality equipment, so that the environment information is identified, and meanwhile, the robot map reconstruction change, the robot motion trail and the like can be observed in real time. In addition, interaction with the three-dimensional map may be performed by means of gesture control or the like. Finally, the scene understanding is achieved through man-machine interaction collaboration.
Therefore, the application provides a man-machine cooperative system for exploring an unknown environment, which is used for controlling an unmanned vehicle to explore an unknown region of interest through a master-slave man-machine cooperative mode to realize real-time positioning and mapping, and presenting a semantic map of an explored region and the real-time position of the unmanned vehicle in augmented reality equipment through an augmented reality technology, so that man-machine interactive cooperation can be realized under the scenes of disaster relief, rescue, remote environment investigation and the like.
The invention provides a man-machine collaborative system for exploring an unknown environment, which can acquire and present a semantic map constructed by an unmanned equipment exploration position area in augmented reality equipment, and realize interaction fusion between man-machine environments, so that tasks such as target identification, situation analysis and the like can be carried out subsequently.
From a logical perspective, the human-machine collaborative system for exploring an unknown environment comprises a perception layer, a decision layer and an execution layer; the system architecture is shown in figure 1. The perception layer acquires environment information with the sensors on the unmanned equipment, constructs the semantic map, and provides the user with real-time feedback on environmental changes; the decision layer processes this information from the perception layer and issues feedback; and the execution layer acts according to the commands sent by the decision layer.
From a physical perspective, the human-machine collaborative system for exploring an unknown environment comprises an unmanned equipment end, a server end and an augmented reality equipment end. At the unmanned equipment end, a three-dimensional semantic point cloud map and a positioning track are generated by the SLAM mapping module; the Apriltag code is identified (Apriltag is a visual fiducial system widely used in tasks such as augmented reality, robot navigation and camera calibration, which rapidly detects a tag with a specific pattern and computes the tag's accurate 3D position, orientation and identity relative to the camera) and the conversion matrix between the camera and the Apriltag code is calculated; the semantic point cloud map, the positioning track and the conversion matrix are published as ROS messages (the data format used for communication between nodes in the Robot Operating System, ROS). At the server end, the information published by the unmanned equipment end is obtained from the ROS messages, and the map and the positioning track are converted into the Apriltag code coordinate system according to the conversion matrix; after the map is stored, it is transmitted to the augmented reality equipment end over TCP. At the augmented reality equipment end, the Apriltag code is first identified to perform the spatial transformation, then the coordinate positions of the point cloud points transmitted by the server end are obtained, and the point cloud is drawn and rendered according to the point positions and their semantic tag information. There can be one or more augmented reality equipment ends: multiple augmented reality devices can display the same scene at the same time and can show different views of the scene according to their positions. The unmanned equipment, the server and the augmented reality equipment are connected wirelessly via WiFi. The module composition of the system is shown in fig. 2.
The unmanned equipment end is equipped with an environment information acquisition device, a camera, a SLAM mapping module and an IMU inertial module; the environment information acquisition device is used for scanning the scene environment and acquiring position information.
The environmental information collection device may be a lidar, a camera, or the like.
The unmanned equipment end comprises the following modules, and the specific functional steps comprise:
Step A: the SLAM mapping module takes the point cloud frames acquired by the environment information acquisition equipment as input and generates a three-dimensional semantic point cloud map and the robot pose in real time; it comprises sub-modules for point cloud de-distortion, point cloud semantic acquisition, point cloud segmentation, feature extraction, laser odometry and laser mapping. Feature extraction is optimized through point cloud segmentation, and feature matching and pose estimation between two frames are carried out using the planar points and edge points extracted from the point cloud frame according to curvature values. A distance residual equation is constructed, the associated features are iteratively optimized by the Levenberg-Marquardt (LM) method to obtain and refine the relative pose, and the convergence rate is improved. In addition, a loop detection module is added to correct the drift caused by error accumulation. The specific process comprises the following steps:
Step A1: point cloud de-distortion sub-module. Motion distortion is corrected for each point in the point cloud according to the IMU data: the pose of the robot during scanning is estimated by IMU integration, the rotation of each point relative to the starting point is obtained, and the position of each point is compensated to the starting moment. The compensation relationship is shown in formula (1):

$$X_{start} = T_{cur}^{start}\, X_{cur} \tag{1}$$

wherein $X_{start}$ and $X_{cur}$ respectively represent the pose corresponding to the starting moment and the current moment of each point, and $T_{cur}^{start}$ represents the transformation matrix.
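For illustration only, the following minimal Python sketch (not part of the patent; function names and the source of the per-point transform are assumptions) shows the kind of per-point compensation that formula (1) describes, given a start-frame transform interpolated from the IMU for each point:

```python
import numpy as np

def compensate_point(p_cur, T_start_from_cur):
    """Map a 3D point from the pose at its capture time back to the scan start pose.

    p_cur            : (3,) point observed at the current moment
    T_start_from_cur : (4, 4) homogeneous transform from the current pose to the start pose,
                       e.g. obtained by integrating/interpolating IMU rotations
    """
    p_h = np.append(p_cur, 1.0)            # homogeneous coordinates
    return (T_start_from_cur @ p_h)[:3]

def undistort_scan(points, transforms):
    """Apply per-point compensation to a whole scan (equal-length sequences)."""
    return np.stack([compensate_point(p, T) for p, T in zip(points, transforms)])
```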
Step A2: point cloud semantic acquisition sub-module. Point cloud frames are acquired in real time from the 3D lidar installed on the unmanned equipment end; semantic images are obtained through semantic segmentation of the camera image frames, the semantic information is mapped to the point cloud frames to obtain the category information of each object in the point cloud, and semantic tag information is assigned to the point cloud points. Meanwhile, the position, intensity and semantic category information of the points in the point cloud frame are stored to construct the semantic point cloud frame. FIG. 3 shows the semantic segmentation effect of an image frame; FIG. 4 shows the effect of mapping a semantic point cloud frame.
First, image frames acquired by a camera are processed using a real-time semantic segmentation network to acquire semantic images of each frame. However, since the de-distortion processing of the laser radar point cloud frame and the semantic segmentation processing of the image frame affect the time stamp synchronization of the image frame and the point cloud frame, the time stamp alignment operation needs to be performed on the de-distorted point cloud frame and the segmented semantic image frame.
And after the time stamps are aligned, projecting the point cloud frames onto the corresponding image frames according to a conversion relation matrix of the camera and the radar, which is acquired in real time. Because of the difference in camera and radar visibility, the point cloud frames need to be processed prior to projection. The point cloud frames under the lidar coordinate system are converted to be represented under the camera FLU coordinate system, and then points that are not within the camera field of view are filtered out. And calculating the row-column index of the points in the distance image according to the coordinates of the points, and reserving the points with closer distances. And then, converting the 3D coordinate points into 2D coordinate points according to the camera model, and obtaining corresponding semantic point labels according to the 2D coordinates.
Due to the sensitivity of the mapping to the image frames and the point cloud frame time stamps, an edge point cloud semantic optimization method is designed. The method calculates whether a point which is different from the semantic label of the current point exists in the neighborhood range of each point cloud point mapped into the image frame, and marks the point as an edge point if the point exists, namely the point is the boundary between two kinds of semantics. Considering the characteristic that the intensities of the adjacent points are basically consistent, traversing the points of all pixels in the image, judging whether edge points exist in adjacent ranges, if so, judging whether the intensity information between the two points is in a threshold range, and if so, assigning the semantic label of the point to the edge point.
In addition, for the problem of mismapping of a laser point cloud frame to a sky region in an image frame, a distance threshold for mapping the point cloud is specified.
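A hedged Python sketch of the projection and label assignment described above follows; it is a simplification under assumed interfaces (a per-pixel class-id image, known intrinsics K and lidar-to-camera extrinsics), and the max_range cut-off stands in for the sky-region distance threshold:

```python
import numpy as np

def label_point_cloud(points_lidar, semantic_img, K, T_cam_lidar, max_range=50.0):
    """Assign a semantic label to each lidar point by projecting it into the segmented image.

    points_lidar : (N, 3) points in the lidar frame
    semantic_img : (H, W) per-pixel class ids from the segmentation network
    K            : (3, 3) camera intrinsic matrix
    T_cam_lidar  : (4, 4) extrinsic transform from the lidar frame to the camera frame
    max_range    : distance threshold that keeps far points (e.g. sky hits) unlabeled
    """
    H, W = semantic_img.shape
    labels = np.full(len(points_lidar), -1, dtype=int)   # -1 = unlabeled
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    for i, (x, y, z) in enumerate(pts_cam):
        if z <= 0 or np.linalg.norm((x, y, z)) > max_range:
            continue                                      # behind the camera or too far
        u, v, w = K @ np.array([x, y, z])
        u, v = int(u / w), int(v / w)
        if 0 <= u < W and 0 <= v < H:
            labels[i] = semantic_img[v, u]
    return labels
```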
Step A3: point cloud segmentation sub-module. First, the semantic point cloud frame is projected into a range image, and the angle between each point and its adjacent points is calculated according to the scan ring of the point; if the angle is smaller than a threshold, the point is judged to be ground, and the ground point cloud is separated out. The remaining point cloud is then traversed by depth-first search to recursively group points: the relative angles of the four adjacent points are compared, and if the angle is larger than a threshold, the points are considered to belong to the same point cloud cluster; clusters containing too few points are removed to denoise the point cloud, and the points of the same cluster are labeled.
Step A4: feature point extraction sub-module. The 360-degree direction of the point cloud frame is equally divided into 6 regions, and the curvature of the points in each region is calculated; a point whose curvature is larger than a threshold is a line feature, and a point whose curvature is smaller than the threshold is a planar feature. 4 planar features and 2 line features are selected within each region. The curvature is calculated from the depth differences between the current point and the 5 points on each side of it, as shown in formula (2):

$$c = \frac{1}{|S| \cdot \left\| r_i \right\|} \left\| \sum_{j \in S,\, j \neq i} \left( r_j - r_i \right) \right\| \tag{2}$$

wherein $S$ represents the set of neighboring points, $r_i$ represents the depth of the current point, $r_j$ represents the depth of an adjacent point, and $c$ represents the curvature of the current point.
In the feature point selection process, the number and category of feature points selected in each direction are fixed. At the same time, the points surrounding a point that has already been selected as a feature point are not selected.
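As an illustrative sketch only (the curvature threshold value and function names are assumptions, while the 5-point window and the 4/2 feature counts follow the text above), the per-region selection could look like:

```python
import numpy as np

def curvature(depths, i, half_window=5):
    """Curvature of point i from depth differences with its 5 neighbours on each side (formula (2)).

    depths : 1-D numpy array of ranges along one scan ring; i is assumed to lie away from the ends.
    """
    j = np.r_[i - half_window:i, i + 1:i + half_window + 1]
    S = depths[j]
    return np.abs(np.sum(S - depths[i])) / (len(S) * abs(depths[i]))

def select_features(depths, region, n_planar=4, n_line=2, thresh=0.1):
    """Pick line (high curvature) and planar (low curvature) features inside one angular region."""
    c = np.array([curvature(depths, i) for i in region])
    order = np.argsort(c)
    planar = [region[k] for k in order[:n_planar] if c[k] < thresh]
    line = [region[k] for k in order[::-1][:n_line] if c[k] > thresh]
    return planar, line
```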
Step A5: laser odometer sub-module. Inter-frame matching is performed on the line features and surface features extracted by the feature point extraction module to obtain the transformation matrix between two frames.

First, according to the result of the point cloud clustering in step A3, the label information is used as a constraint condition for inter-frame matching: only points with the same label can be matched. The frame pose is then obtained by the LM method. The distances from points of two adjacent frames to the matched plane are optimized through surface-feature matching to obtain $[t_z, \theta_{roll}, \theta_{pitch}]$ ($t_z$ denotes the movement along the z-axis, $\theta_{roll}$ the roll angle, $\theta_{pitch}$ the pitch angle); the distances from points of two adjacent frames to the matched line are optimized through line-feature matching to obtain $[t_x, t_y, \theta_{yaw}]$ ($t_x$ denotes the movement along the x-axis, $t_y$ the movement along the y-axis, $\theta_{yaw}$ the heading angle). Finally, $[t_z, \theta_{roll}, \theta_{pitch}]$ and $[t_x, t_y, \theta_{yaw}]$ are fused to obtain the transformation matrix.
The distance of a surface feature point to the matched plane is shown in formula (3):

$$d_{\mathcal{H}} = \frac{\left| \left( X^{L}_{(k+1,i)} - X^{L}_{(k,j)} \right) \cdot \left( \left( X^{L}_{(k,j)} - X^{L}_{(k,l)} \right) \times \left( X^{L}_{(k,j)} - X^{L}_{(k,m)} \right) \right) \right|}{\left\| \left( X^{L}_{(k,j)} - X^{L}_{(k,l)} \right) \times \left( X^{L}_{(k,j)} - X^{L}_{(k,m)} \right) \right\|} \tag{3}$$

wherein the superscript $L$ denotes a feature point of the radar, $X^{L}_{(k+1,i)}$ represents the coordinates of the point $i$ to be matched in frame $k+1$, $X^{L}_{(k,j)}$ represents the coordinates of the matched point $j$ in frame $k$ closest to $i$, $X^{L}_{(k,l)}$ represents the point closest to $j$ on the same scanning beam, and $X^{L}_{(k,m)}$ represents the point closest to $j$ on an adjacent scanning beam.

The distance of a line feature point to the matched line is shown in formula (4):

$$d_{\mathcal{E}} = \frac{\left\| \left( X^{L}_{(k+1,i)} - X^{L}_{(k,j)} \right) \times \left( X^{L}_{(k+1,i)} - X^{L}_{(k,l)} \right) \right\|}{\left\| X^{L}_{(k,j)} - X^{L}_{(k,l)} \right\|} \tag{4}$$

wherein $X^{L}_{(k,j)}$ represents the point in frame $k$ closest to $i$, and $X^{L}_{(k,l)}$ represents the point closest to $j$ on the same scanning beam.
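A small numpy sketch of the reconstructed distances (3) and (4) follows, together with an illustrative assembly of the two partial estimates $[t_z, \theta_{roll}, \theta_{pitch}]$ and $[t_x, t_y, \theta_{yaw}]$ into one transform; the yaw-pitch-roll composition order is an assumption not specified by the patent:

```python
import numpy as np

def point_to_plane_distance(x_i, x_j, x_l, x_m):
    """Formula (3): distance from feature point x_i (frame k+1) to the plane through x_j, x_l, x_m (frame k)."""
    normal = np.cross(x_j - x_l, x_j - x_m)
    return np.abs(np.dot(x_i - x_j, normal)) / np.linalg.norm(normal)

def point_to_line_distance(x_i, x_j, x_l):
    """Formula (4): distance from feature point x_i (frame k+1) to the line through x_j and x_l (frame k)."""
    return np.linalg.norm(np.cross(x_i - x_j, x_i - x_l)) / np.linalg.norm(x_j - x_l)

def fuse_partial_poses(t_z, roll, pitch, t_x, t_y, yaw):
    """Assemble the plane-matching estimate (t_z, roll, pitch) and the line-matching
    estimate (t_x, t_y, yaw) into one homogeneous transform (Z-Y-X rotation order assumed)."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx
    T[:3, 3] = [t_x, t_y, t_z]
    return T
```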
Step A6: semantic map construction sub-module. The frame-to-map matching pose is obtained by matching the feature set with the surrounding point cloud map, and the transformation matrix is obtained by applying the LM method to the constraints among the nodes.

The feature point sets with semantic tag information within a set range of the current position are fused and spliced to obtain the semantic point cloud map; a group of feature sets closest to the current feature set is selected, the spatial distance constraint between the new node and the nodes in the point cloud map is calculated, and the pose transformation is obtained through LM optimization.
Meanwhile, according to the correspondence between the IMU data time stamps and the laser scanning time stamps, the roll angle and pitch angle of the IMU at the current moment are calculated by linear interpolation; a weight is set according to the square root of the LM residual, the angles from the IMU data are fused with the angles obtained from laser scan matching, and the final transformation is updated as shown in formula (5):

$$T_{final} = \alpha\, T_{IMU} + (1-\alpha)\, T_{scan} \tag{5}$$

wherein $\alpha$ represents the fusion coefficient, $T_{IMU}$ represents the transformation from the IMU, and $T_{scan}$ represents the transformation from laser scan matching.
In addition, the loop detection module matches the current frame against historical point cloud frames using an ICP algorithm; if the matching succeeds, pose optimization is performed using iSAM (incremental Smoothing And Mapping). The historical point cloud frames are the point cloud frames used for extracting the feature points.
The mapping effect of SLAM is shown in FIG. 5.
Step B: the matrix that converts the semantic point cloud map into the Apriltag code coordinate system is calculated. The Apriltag code is identified by the camera on the unmanned equipment end, and the camera extrinsic parameters are obtained from a homography transformation.
Step B1: the camera is calibrated to obtain its intrinsic matrix and distortion parameters as priors.

Step B2: four point pairs are formed from the four corner points extracted from the Apriltag code and the four corresponding assumed points in the world coordinate system, and the homography matrix is solved by the direct linear transform method. The conversion matrix from the world coordinate system to the camera coordinate system is then obtained from the homography matrix and the camera intrinsics.
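For illustration, the sketch below recovers the tag-to-camera transform from the four detected corners using OpenCV's solvePnP rather than the explicit DLT homography solution described above; the tag size, corner ordering and function names are assumptions:

```python
import cv2
import numpy as np

def tag_pose_from_corners(corners_px, tag_size, K, dist_coeffs):
    """Recover the camera pose relative to an Apriltag from its four detected corners.

    corners_px  : (4, 2) pixel coordinates of the tag corners (from the tag detector)
    tag_size    : tag edge length in meters (assumed known)
    K           : (3, 3) camera intrinsic matrix from calibration
    dist_coeffs : distortion coefficients from calibration
    """
    s = tag_size / 2.0
    # Assumed world coordinates of the four corners: tag centered at the origin, z = 0
    obj_pts = np.array([[-s, -s, 0], [s, -s, 0], [s, s, 0], [-s, s, 0]], dtype=np.float64)
    ok, rvec, tvec = cv2.solvePnP(obj_pts, corners_px.astype(np.float64), K, dist_coeffs)
    R, _ = cv2.Rodrigues(rvec)            # rotation from the tag/world frame to the camera frame
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, tvec.ravel()
    return T                              # transform from the tag/world frame to the camera frame
```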
The server side comprises the following modules, and the specific functional steps comprise:
The server end obtains the semantic point cloud map and the conversion matrix messages published by the unmanned equipment end, converts the point cloud from the radar coordinate system to the Apriltag coordinate system according to the conversion matrix, and stores it at the server end. The point cloud coordinate information is then sent to the augmented reality equipment end over TCP.
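The patent only states that TCP is used for this transfer; the following Python sketch is one possible wire format (JSON payload with a length prefix, and an assumed port), shown purely for illustration:

```python
import json
import socket

def send_points(points_apriltag, host, port=9000):
    """Send the converted point coordinates (and labels) to the augmented reality equipment end over TCP.

    points_apriltag : list of (x, y, z, label) tuples already expressed in the Apriltag coordinate system
    host, port      : address of the augmented reality equipment end (assumed values)
    """
    payload = json.dumps(points_apriltag).encode("utf-8")
    with socket.create_connection((host, port)) as conn:
        conn.sendall(len(payload).to_bytes(4, "big"))   # simple length-prefixed framing
        conn.sendall(payload)
```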
Step C: message acquisition module. ROS supports distributed multi-machine communication. By configuring the unmanned equipment end and the server end in an ROS master-slave mode, a node running on the server end obtains messages from the unmanned equipment end on specific ROS topics at a set acquisition frequency, and reads and stores from the messages the conversion matrix, the point cloud map constructed by the unmanned equipment end, the real-time positioning track and other information.
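A minimal rospy sketch of such an acquisition node is shown below for illustration; the topic names, message types and the 1 Hz rate are assumptions that depend on the unmanned equipment's configuration:

```python
import rospy
from nav_msgs.msg import Odometry
from sensor_msgs.msg import PointCloud2

latest = {"map": None, "odom": None}

def map_cb(msg):
    latest["map"] = msg        # semantic point cloud map published by the unmanned equipment end

def odom_cb(msg):
    latest["odom"] = msg       # real-time positioning track

if __name__ == "__main__":
    rospy.init_node("server_message_acquisition")
    # Topic names are assumptions; they depend on how the unmanned equipment publishes its data.
    rospy.Subscriber("/semantic_map", PointCloud2, map_cb, queue_size=1)
    rospy.Subscriber("/odom", Odometry, odom_cb, queue_size=1)
    rate = rospy.Rate(1.0)     # acquisition frequency set by the server
    while not rospy.is_shutdown():
        rate.sleep()           # messages are stored by the callbacks above
```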
Step D: point cloud processing module. The quaternion obtained from the ROS message is converted into a rotation matrix and combined with the translation vector to generate a conversion matrix; the conversion matrix is multiplied with the coordinates of each point of the point cloud to obtain the point cloud coordinates in the Apriltag code coordinate system.
The coordinate system transformation is shown in formulas (6) and (7):

$$T = \begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix} \tag{6}$$

$$P' = T\, P \tag{7}$$

wherein $T$ is the conversion matrix, $R$ is the rotation matrix, and $t$ is the translation vector; $P$ is the coordinates of a point of the point cloud in the radar coordinate system, and $P'$ is its coordinates in the Apriltag coordinate system.
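For illustration, a Python sketch of formulas (6) and (7) applied to an ROS-style quaternion and translation (variable names are assumptions):

```python
import numpy as np

def quat_to_rot(qx, qy, qz, qw):
    """Convert a (normalized) ROS-style quaternion to a 3x3 rotation matrix."""
    return np.array([
        [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)],
        [2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)],
        [2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)],
    ])

def to_apriltag_frame(points, quat, trans):
    """Apply formulas (6) and (7): build T from (R, t) and map radar-frame points into the Apriltag frame."""
    T = np.eye(4)
    T[:3, :3] = quat_to_rot(*quat)
    T[:3, 3] = trans
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    return (T @ pts_h.T).T[:, :3]
```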
The augmented reality equipment end comprises the following modules, and the specific functional steps comprise:
Step E: coordinate transformation module. The augmented reality equipment and the unmanned equipment end recognize the Apriltag code at the same position, the conversion matrix is calculated, and the augmented reality equipment end is placed in the Apriltag code coordinate system, so that the augmented reality equipment end and the unmanned equipment end remain in the same coordinate system.
Step F: display module. The coordinates of the point cloud points sent by the server end over TCP and the current position of the unmanned equipment end are acquired; the points are drawn on the augmented reality equipment, and the map constructed by the unmanned equipment end is displayed in real time. The environmental point cloud constructed by the unmanned equipment is shown in fig. 6 (a) and 6 (b), and the display result is shown in fig. 7.
Step F1: model drawing sub-module. The points are represented as blocks by the Unity engine and rendered in different colors according to their semantics.
Step F2: virtual-real fusion sub-module. The modeled point cloud is rendered, and the positioning registration between the virtual scene and the real scene is realized through the coordinate transformations among the model matrix, the view matrix and the projection matrix, so that the virtual information is accurately presented in the real environment.
Step F3: human-machine interaction sub-module. It implements interaction with the user: the scene is controlled by recognizing user operations such as gestures and generating control commands.
Step G: the next movement of the unmanned equipment is decided based on the perception information received through the augmented reality equipment, and the unmanned equipment finally receives and executes the command.
The unmanned equipment end is provided with a movement control module for controlling the heading and speed of the unmanned equipment, which can move in any direction via the wireless remote controller. The scene environment acquired and constructed by the unmanned equipment in real time is converted into a virtual scene by the display unit of the augmented reality equipment; the user can view the position of the unmanned equipment and the structural and semantic information of the environment in real time, and thus cooperates with the unmanned equipment to realize large-scale exploration and improve the efficiency and accuracy of remote operation.
The invention provides a man-machine collaborative system for exploring an unknown environment, which aims at the problem of exploring a real environment by man-machine collaboration by using unmanned equipment such as an unmanned vehicle and augmented reality equipment in the unknown environment. The system detects the environment by using unmanned equipment, processes information through a server, finally realizes virtual-real registration display in the augmented reality equipment, and simultaneously presents semantic information of an object. Through the exploration map construction of the unmanned equipment and the real-time display of the enhanced equipment, the advancing direction of the unmanned equipment is decided and controlled, so that the exploration of the region of interest is realized and the obstacle is avoided in time. Meanwhile, the system can achieve the purposes of protecting the safety of people, avoiding potential hazards, improving exploration efficiency and accurately registering the position relationship among the people, unmanned equipment and the environment under the condition that the people do not enter the unknown environment, so that the spatial structure and object information invisible on the corresponding physical position in the unknown environment are comprehensively known. The invention effectively improves the interaction cooperativity between human and machine, enhances the perception level of the environment, improves the safety and is beneficial to better decision making of people.
Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present application and are not limiting. Although the present application has been described in detail with reference to the embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present application, which is intended to be covered by the appended claims.

Claims (11)

1. A human-machine collaboration system for exploring an unknown environment, the system comprising: the system comprises an unmanned equipment end, a server end and an augmented reality equipment end; wherein,
The unmanned equipment end is used for acquiring a point cloud frame of an environment by using the environment information acquisition equipment, generating a three-dimensional semantic point cloud map and a positioning track, identifying Apriltag codes, acquiring a conversion matrix between a camera and Apriltag codes, and releasing the semantic point cloud map, the positioning track and the conversion matrix;
The server side is used for receiving the information issued by the unmanned equipment side, converting the semantic point cloud map and the positioning track into a Apriltag code coordinate system according to the conversion matrix and transmitting the semantic point cloud map and the positioning track to the augmented reality equipment side;
and the augmented reality equipment end is used for identifying Apriltag codes to perform space transformation, acquiring the coordinate positions of the point cloud points transmitted by the server end, drawing the point cloud according to the positions of the point cloud points and semantic tag information, and rendering.
2. The human-computer collaborative system for exploring an unknown environment according to claim 1, wherein the unmanned device side comprises an environmental information acquisition device, a camera, a SLAM mapping module, and an IMU inertial module;
the working process of the unmanned equipment end comprises the following steps:
Step S1: the SLAM mapping module takes a point cloud frame acquired by the environment information acquisition equipment as input, generates a three-dimensional semantic point cloud map and the pose of the unmanned equipment end in real time, performs point cloud segmentation optimization and feature extraction, performs feature matching and pose estimation on two frames by performing planar points and edge points of the point cloud frame extracted according to curvature values, constructs a distance residual equation, and performs iterative optimization on associated features by using an LM method to obtain and optimize the relative pose;
Step S2: calculating a matrix of the semantic point cloud map converted into a Apriltag code coordinate system; and identifying Apriltag codes by a camera, and obtaining a conversion matrix from the world coordinate system to the camera coordinate system according to homography conversion calculation.
3. The human-computer collaborative system for exploring an unknown environment according to claim 2, wherein the SLAM mapping module comprises a point cloud de-distortion sub-module, a point cloud semantic acquisition sub-module, a point cloud segmentation sub-module, a feature point extraction sub-module, a laser odometer sub-module, and a semantic map construction sub-module;
The step S1 includes:
Step S1-1: the point cloud de-distortion submodule performs motion distortion correction on each point in the point cloud; the pose of the unmanned equipment end during scanning is estimated by integrating the IMU inertial module, the rotation angle of each point relative to the starting point is obtained, and the position of each point is compensated back to the starting moment; the compensation relationship is as follows:
$P_{start} = T \cdot P_{cur}$
wherein $P_{start}$ and $P_{cur}$ respectively represent the pose corresponding to the starting moment and the current moment of each point, and $T$ represents the transformation matrix between them;
Step S1-2: the point cloud semantic acquisition sub-module receives the real-time point cloud frames acquired by the environment information acquisition equipment; semantic images are obtained by semantic segmentation of the camera image frames, the semantic information is mapped onto the point cloud frames to obtain the category of each object in the point cloud, semantic tags are assigned to the point cloud points, and the position, intensity and semantic category information of the points in each point cloud frame are stored, thereby constructing the semantic point cloud frames;
Step S1-3: the point cloud segmentation sub-module projects the semantic point cloud frame as a range image, calculates the angle between each point and its adjacent points according to the scan line (ring) of the point, judges a point to be ground if the angle is smaller than a second set threshold, and separates the ground point cloud; the point cloud remaining after ground segmentation is traversed by depth-first search to find points recursively, the relative angles of the four adjacent points are compared, and if the angle is larger than a third set threshold the points are considered to belong to the same point cloud cluster; point cloud clusters whose number of points is smaller than a fourth set threshold are removed to denoise the point cloud, and points of the same cluster are given the same mark;
Step S1-4: the feature point extraction submodule divides the 360-degree sweep of the denoised point cloud frame evenly into a plurality of regions and calculates the curvature of the points in each region; if the curvature is larger than a fifth set threshold the point is a line feature, and if it is smaller than the fifth set threshold the point is a plane feature; several plane features and several line features are selected within each region;
The curvature of a point is calculated from the depth differences between the current point and several points to its left and right, according to the formula:
$c = \frac{1}{|S| \cdot r_i} \left\| \sum_{j \in S,\ j \neq i} (r_j - r_i) \right\|$
wherein $S$ represents the set of points, $r_i$ represents the depth of the current point, $r_j$ represents the depth of an adjacent point, and $c$ represents the curvature of the current point;
In selecting the feature points, the number and category of feature points in each direction are fixed, and points adjacent to points already selected as features are not selected again;
Step S1-5: the laser odometer sub-module performs inter-frame matching on the line features and surface features extracted by the feature point extraction sub-module to obtain the transformation matrix between two frames;
Step S1-6: the semantic map construction sub-module is used for obtaining a frame-to-map matching pose through matching the feature set with a surrounding point cloud map, and then obtaining a transformation matrix by using an LM method on constraints among nodes; the method specifically comprises the following steps:
Feature point sets carrying semantic tag information within the required range around the current position are fused and stitched to obtain a semantic point cloud map; the group of feature sets closest to the current feature set is selected, the spatial distance constraint between the new node and the nodes in the point cloud map is computed, and the pose transformation is obtained through LM optimization;
According to the timestamp of the IMU inertial module data and the scan timestamp of the environment information acquisition equipment, the roll and pitch angles of the IMU inertial module at the current moment are computed by linear interpolation; weights are set according to the square root of the LM residual, the angles from the IMU inertial module data are fused with the angles obtained from scan matching of the environment information acquisition equipment, and the final transformation matrix is updated as
$T_{final} = \alpha \cdot T_{IMU} + (1 - \alpha) \cdot T_{scan}$
wherein $\alpha$ represents the fusion coefficient, $T_{IMU}$ represents the transformation matrix from the IMU inertial module, and $T_{scan}$ represents the transformation matrix obtained from scan matching of the environment information acquisition equipment.
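
The curvature test of step S1-4 can be illustrated with the following minimal Python sketch, which sums depth differences over k neighbours on each side of a point within one scan line, normalizes by the neighbourhood size and the point's depth, and classifies points as line (edge) or plane features. The neighbourhood size and thresholds are illustrative values, not the patent's fifth threshold.

```python
# Sketch: curvature-based edge/plane classification along one scan line.
import numpy as np

def classify_features(depths, k=5, edge_thresh=0.2, plane_thresh=0.05):
    """depths: 1-D array of ranges along one scan line, ordered by azimuth."""
    n = len(depths)
    labels = np.full(n, "none", dtype=object)
    for i in range(k, n - k):
        neigh = np.r_[depths[i - k:i], depths[i + 1:i + k + 1]]
        c = np.abs(np.sum(neigh - depths[i])) / (2 * k * depths[i])  # normalised curvature
        if c > edge_thresh:
            labels[i] = "edge"      # line feature: curvature above the threshold
        elif c < plane_thresh:
            labels[i] = "plane"     # surface feature: curvature below the threshold
    return labels

scan = 5.0 + 0.01 * np.random.randn(360)   # mostly a flat wall
scan[180:] = 2.0                            # depth jump -> edge candidates near index 180
print(set(classify_features(scan)))
```
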
4. A human-machine collaboration system to explore an unknown environment as claimed in claim 3, wherein said step S1-2 comprises:
Processing the image frames acquired by the camera by using a real-time semantic segmentation network to acquire semantic images of each frame, and performing time stamp alignment on the point cloud frames after de-distortion and the semantic image frames after segmentation;
According to the conversion matrix, the point cloud frame is projected onto the corresponding image frame, and points outside the camera field of view are filtered out; the row and column indexes of each point in the range image are calculated from its coordinates, the closer points are retained, the points are converted into 2D coordinates according to the camera model, and the corresponding semantic labels are obtained from the 2D coordinates;
For each point cloud point mapped into the image frame, it is checked whether a point with a semantic label different from that of the current point exists within its neighbourhood; if so, the point is marked as an edge point, i.e. a boundary between two semantic classes; all pixel points in the image are then traversed to judge whether an edge point exists in their adjacent range, and if so, whether the intensity information between the two points lies within a first set threshold range; if it does, the semantic label of the point is assigned to the edge point.
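
A minimal sketch of the projection and label lookup described in claim 4 follows: LiDAR points are projected into the segmented image with assumed intrinsics and extrinsics, and each in-view point receives the semantic label of the pixel it falls on. The matrices and the label image are placeholders, and the edge-point refinement of the last paragraph is omitted.

```python
# Sketch: assign a semantic label to each LiDAR point via projection into the label image.
import numpy as np

K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])            # assumed camera intrinsics
T_cam_lidar = np.eye(4)                     # assumed LiDAR->camera extrinsics

def label_points(points_lidar, label_image):
    h, w = label_image.shape
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    in_front = pts_cam[:, 2] > 0.1                      # drop points behind the camera
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    labels = np.full(len(points_lidar), -1, dtype=int)  # -1: outside the camera FOV
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    labels[valid] = label_image[v[valid], u[valid]]
    return labels

label_img = np.zeros((480, 640), dtype=int); label_img[:, 320:] = 3   # fake segmentation
pts = np.array([[1.0, 0.0, 5.0], [-1.0, 0.0, 5.0], [0.0, 0.0, -1.0]])
print(label_points(pts, label_img))    # -> [3 0 -1]
```
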
5. The human-machine collaboration system to explore an unknown environment of claim 4, wherein S1-2 further comprises:
A distance threshold is set for the point cloud mapping to prevent point cloud points from being mistakenly mapped onto sky regions of the image frame.
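
An illustrative version of this distance threshold is sketched below; the 30 m cut-off is an example value, not the patent's.

```python
# Sketch: discard far returns before projection so they cannot land on sky pixels.
import numpy as np

def range_filter(points_xyz, max_range=30.0):
    dist = np.linalg.norm(points_xyz, axis=1)
    return points_xyz[dist <= max_range]

pts = np.array([[1.0, 2.0, 0.5], [80.0, 5.0, 10.0]])
print(range_filter(pts))        # keeps only the near point
```
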
6. A human-machine collaboration system to explore an unknown environment as claimed in claim 3, wherein said steps S1-5 comprise:
According to the result of the point cloud clustering in step S1-3, the tag information is used as a constraint for inter-frame matching, and only points with the same tag can be matched; the inter-frame pose is obtained by the LM method. Matching the surface features and optimizing the distances from points of two adjacent frames to the matched planes yields $[t_z, \theta_{roll}, \theta_{pitch}]$, wherein $t_z$ represents the movement of the two frames along the z-axis, $\theta_{roll}$ the roll angle between the two frames, and $\theta_{pitch}$ the pitch angle between the two frames; matching the line features and optimizing the distances from points of two adjacent frames to the matched lines yields $[t_x, t_y, \theta_{yaw}]$, wherein $t_x$ represents the movement of the two frames along the x-axis, $t_y$ the movement along the y-axis, and $\theta_{yaw}$ the heading angle between the two frames; $[t_z, \theta_{roll}, \theta_{pitch}]$ and $[t_x, t_y, \theta_{yaw}]$ are fused to obtain the transformation matrix between the two frames;
The distance $d_{plane}$ of a surface feature point is calculated as:
$d_{plane} = \dfrac{\left|\left(X_{(k+1,i)} - X_{(k,j)}\right) \cdot \left(\left(X_{(k,j)} - X_{(k,l)}\right) \times \left(X_{(k,j)} - X_{(k,m)}\right)\right)\right|}{\left|\left(X_{(k,j)} - X_{(k,l)}\right) \times \left(X_{(k,j)} - X_{(k,m)}\right)\right|}$
wherein $X_{(k+1,i)}$ represents the coordinates of the i-th feature point in the (k+1)-th frame; $X_{(k,j)}$ represents the coordinates of the point in the k-th frame closest to the i-th feature point, called $j$; $X_{(k,l)}$ represents the coordinates of the point in the k-th frame nearest to point $j$ on the same scanning beam; and $X_{(k,m)}$ represents the coordinates of the point in the k-th frame nearest to point $j$ on an adjacent scanning beam;
The distance $d_{line}$ of a line feature point is calculated as:
$d_{line} = \dfrac{\left|\left(X_{(k+1,i)} - X_{(k,j)}\right) \times \left(X_{(k+1,i)} - X_{(k,l)}\right)\right|}{\left|X_{(k,j)} - X_{(k,l)}\right|}$
wherein $X_{(k,j)}$ and $X_{(k,l)}$ are the two points of the k-th frame, taken on adjacent scanning beams, that are closest to the i-th feature point of the (k+1)-th frame.
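
The two residuals of claim 6 correspond to the standard point-to-plane and point-to-line distances; a minimal numpy sketch with toy coordinates follows, with variable names mirroring the definitions above.

```python
# Sketch: point-to-plane and point-to-line distances between consecutive frames.
import numpy as np

def point_to_plane(x_i, x_j, x_l, x_m):
    """Distance from x_i (frame k+1) to the plane through x_j, x_l, x_m (frame k)."""
    n = np.cross(x_j - x_l, x_j - x_m)
    return np.abs(np.dot(x_i - x_j, n)) / np.linalg.norm(n)

def point_to_line(x_i, x_j, x_l):
    """Distance from x_i (frame k+1) to the line through x_j and x_l (frame k)."""
    return np.linalg.norm(np.cross(x_i - x_j, x_i - x_l)) / np.linalg.norm(x_j - x_l)

x_i = np.array([0.0, 0.0, 1.0])
x_j, x_l, x_m = np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
print(point_to_plane(x_i, x_j, x_l, x_m))   # 1.0: one metre above the z = 0 plane
print(point_to_line(x_i, x_j, x_l))          # 1.0: one metre off the x-axis
```
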
7. The human-machine collaboration system for exploring an unknown environment of claim 2, wherein step S2 comprises:
Step S2-1: calibrating the camera to obtain the camera intrinsic matrix and distortion parameters as priors;
Step S2-2: four point pairs are formed from the four corner points extracted from the Apriltag code and the four corresponding points assumed in the world coordinate system; the homography matrix is solved by the direct linear transform (DLT) method, and the conversion matrix from the world coordinate system to the camera coordinate system is obtained from the homography matrix and the camera intrinsics.
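
A hedged OpenCV sketch of step S2-2 follows: the four detected tag corners and four assumed world-frame corners form the point pairs, findHomography solves the homography by DLT, and solvePnP is shown as an equivalent way to obtain the world-to-camera transform. Tag size, intrinsics and pixel coordinates are made-up example values.

```python
# Sketch: recover the camera pose from four Apriltag corner correspondences.
import cv2
import numpy as np

s = 0.08                                                   # assumed tag side length [m]
world_pts = np.array([[-s/2, -s/2, 0], [ s/2, -s/2, 0],
                      [ s/2,  s/2, 0], [-s/2,  s/2, 0]], dtype=np.float64)
image_pts = np.array([[300, 260], [340, 262], [338, 300], [298, 298]], dtype=np.float64)
K = np.array([[600.0, 0, 320], [0, 600.0, 240], [0, 0, 1]])
dist = np.zeros(5)                                         # assume calibration removed distortion

H, _ = cv2.findHomography(world_pts[:, :2], image_pts)     # DLT homography from 4 point pairs
ok, rvec, tvec = cv2.solvePnP(world_pts, image_pts, K, dist)
R, _ = cv2.Rodrigues(rvec)
T_cam_world = np.eye(4); T_cam_world[:3, :3] = R; T_cam_world[:3, 3] = tvec.ravel()
print(T_cam_world)                                         # world -> camera transform
```
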
8. The human-computer collaborative system for exploring an unknown environment according to claim 1, wherein the server side comprises a message acquisition module and a point cloud processing module;
The working process of the server side comprises the following steps:
Step S3: the message acquisition module of the server side acquires messages from the unmanned equipment side, the two being configured as ROS master and slave; the acquisition frequency is set according to the ROS topics, and the conversion matrix, the semantic point cloud map constructed by the unmanned equipment end and the real-time positioning track information contained in the messages are read and stored;
Step S4: the point cloud processing module converts the acquired ROS message data into a rotation matrix, combines it with the translation vector to generate the conversion matrix, and multiplies the conversion matrix with the coordinates of the point cloud points to obtain the coordinates of the point cloud in the Apriltag code coordinate system:
The coordinate system transformation formula is expressed as follows:
$P_{tag} = T \cdot P_{device} = R \cdot P_{device} + t$
wherein $T$ represents the transformation matrix, $R$ represents the rotation matrix, $t$ represents the translation vector, $P_{device}$ represents the coordinates of a point cloud point $p$ in the unmanned equipment coordinate system, and $P_{tag}$ represents the coordinates of the point $p$ in the Apriltag coordinate system.
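
A minimal sketch of the transformation applied in step S4 follows: the 4x4 matrix is assembled from the rotation matrix and translation vector and applied to every point of the cloud. The example rotation and translation are placeholders; a real server would derive R from the orientation received in the ROS message.

```python
# Sketch: map the point cloud from the unmanned-device frame into the Apriltag frame.
import numpy as np

def to_tag_frame(points_device, R, t):
    """points_device: Nx3 array in the unmanned-device frame; returns Nx3 in the tag frame."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    pts_h = np.hstack([points_device, np.ones((len(points_device), 1))])   # homogeneous coords
    return (T @ pts_h.T).T[:, :3]                                          # P_tag = R P + t

R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])          # example: 90-degree yaw
t = np.array([1.0, 0.0, 0.5])
print(to_tag_frame(np.array([[1.0, 0.0, 0.0]]), R, t))   # -> [[1. 1. 0.5]]
```
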
9. The human-machine collaboration system for exploring an unknown environment of claim 1, wherein the augmented reality device side comprises a coordinate transformation module and a display module;
The working process of the augmented reality equipment end comprises the following steps:
Step S5: the coordinate transformation module identifies an Apriltag code at the same position as the one identified by the unmanned equipment end, calculates the transformation matrix, and places the augmented reality equipment end in the Apriltag code coordinate system;
Step S6: the display module obtains the coordinate positions of the points in the point cloud and the current position of the unmanned equipment end sent by the server end, draws the points on the augmented reality equipment, and displays the map constructed by the unmanned equipment end in real time.
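
For illustration, once the augmented reality equipment end has its own pose relative to the same Apriltag (denoted T_ar_tag below, an assumed placeholder), the points received in the tag coordinate system can be re-expressed in the device's frame before drawing, as in this sketch.

```python
# Sketch: re-express tag-frame points in the AR device frame before they are drawn.
import numpy as np

T_ar_tag = np.eye(4)
T_ar_tag[:3, 3] = [0.0, 1.5, 2.0]        # example: tag 1.5 m up, 2 m ahead of the AR origin

def tag_to_ar(points_tag):
    pts_h = np.hstack([points_tag, np.ones((len(points_tag), 1))])
    return (T_ar_tag @ pts_h.T).T[:, :3]

cloud_tag = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])   # points received from the server
print(tag_to_ar(cloud_tag))    # positions at which the blocks should be drawn
```
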
10. The human-machine collaboration system for exploring an unknown environment of claim 9, wherein the display module comprises a model drawing sub-module and a virtual-real fusion sub-module;
the step S6 includes:
Step S6-1: the model drawing submodule represents the points as blocks through the Unity engine and presents different colors according to their semantics;
Step S6-2: the virtual-real fusion sub-module renders the modeled point cloud and achieves positioning registration between the virtual scene and the real scene through the coordinate transformations among the model matrix, the view matrix and the projection matrix, so that virtual information is accurately presented in the real environment.
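
Step S6-2's model-view-projection chain can be illustrated with the following numpy sketch; the matrices are example values, and inside Unity the engine performs the equivalent composition when each block is rendered.

```python
# Sketch: carry one point of the drawn cloud through model, view and projection matrices.
import numpy as np

def perspective(fov_y_deg, aspect, near, far):
    f = 1.0 / np.tan(np.radians(fov_y_deg) / 2.0)
    return np.array([[f / aspect, 0, 0, 0],
                     [0, f, 0, 0],
                     [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
                     [0, 0, -1, 0]])

model = np.eye(4); model[:3, 3] = [0.0, 0.0, -2.0]        # place the cloud 2 m ahead
view = np.eye(4)                                           # camera at the origin
proj = perspective(60.0, 16 / 9, 0.1, 100.0)

p_model = np.array([0.1, 0.0, 0.0, 1.0])                   # one point of the drawn cloud
p_clip = proj @ view @ model @ p_model
p_ndc = p_clip[:3] / p_clip[3]                             # normalised device coordinates
print(p_ndc)
```
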
11. The human-computer collaboration system for exploring an unknown environment of claim 10, wherein the display module further comprises a human-computer interaction sub-module for implementing interaction functions with the user, generating control commands by recognizing user operations so as to control the scene.
CN202410445433.1A 2024-04-15 2024-04-15 Man-machine cooperative system for exploring unknown environment Active CN118031976B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410445433.1A CN118031976B (en) 2024-04-15 2024-04-15 Man-machine cooperative system for exploring unknown environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410445433.1A CN118031976B (en) 2024-04-15 2024-04-15 Man-machine cooperative system for exploring unknown environment

Publications (2)

Publication Number Publication Date
CN118031976A true CN118031976A (en) 2024-05-14
CN118031976B CN118031976B (en) 2024-07-09

Family

ID=90986311

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410445433.1A Active CN118031976B (en) 2024-04-15 2024-04-15 Man-machine cooperative system for exploring unknown environment

Country Status (1)

Country Link
CN (1) CN118031976B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107564012A (en) * 2017-08-01 2018-01-09 中国科学院自动化研究所 Towards the augmented reality method and device of circumstances not known
CN110243370A (en) * 2019-05-16 2019-09-17 西安理工大学 A kind of three-dimensional semantic map constructing method of the indoor environment based on deep learning
CN112902953A (en) * 2021-01-26 2021-06-04 中国科学院国家空间科学中心 Autonomous pose measurement method based on SLAM technology
WO2023000294A1 (en) * 2021-07-23 2023-01-26 深圳市大疆创新科技有限公司 Pose estimation method, laser-radar-inertial odometer, movable platform and storage medium
WO2023104207A1 (en) * 2021-12-10 2023-06-15 深圳先进技术研究院 Collaborative three-dimensional mapping method and system
CN114779275A (en) * 2022-03-24 2022-07-22 南京理工大学 Mobile robot automatic following obstacle avoidance method based on AprilTag and laser radar
CN115248439A (en) * 2022-07-26 2022-10-28 中山大学 Laser radar slam method and system based on geometric information and intensity information
CN116029899A (en) * 2022-12-27 2023-04-28 同济大学 Multi-sensor fusion service robot collaborative mapping method and system
CN115774265A (en) * 2023-02-15 2023-03-10 江苏集萃清联智控科技有限公司 Two-dimensional code and laser radar fusion positioning method and device for industrial robot

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIUNN-KAI HUANG et al.: "LiDARTag: A Real-Time Fiducial Tag System for Point Clouds", IEEE Robotics and Automation Letters, vol. 6, no. 3, 31 December 2021 (2021-12-31), pages 4875-4882, XP011850593, DOI: 10.1109/LRA.2021.3070302 *
蒋欣: "Research on Localization and Mapping Methods Based on 3D LiDAR SLAM", Geomatics & Spatial Information Technology (测绘与空间地理信息), vol. 47, no. 3, 31 March 2024 (2024-03-31), pages 96-99 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118408553A (en) * 2024-07-01 2024-07-30 西南科技大学 Unmanned aerial vehicle navigation method for environment three-dimensional reconstruction and recognition

Also Published As

Publication number Publication date
CN118031976B (en) 2024-07-09

Similar Documents

Publication Publication Date Title
CN111462135B (en) Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation
CN118031976B (en) Man-machine cooperative system for exploring unknown environment
CN102313547B (en) Vision navigation method of mobile robot based on hand-drawn outline semantic map
CN108898676B (en) Method and system for detecting collision and shielding between virtual and real objects
CN114076956A (en) Lane line calibration method based on laser radar point cloud assistance
CN111735439B (en) Map construction method, map construction device and computer-readable storage medium
CN108406731A (en) A kind of positioning device, method and robot based on deep vision
CN107665505B (en) Method and device for realizing augmented reality based on plane detection
CN111968228B (en) Augmented reality self-positioning method based on aviation assembly
WO2019241782A1 (en) Deep virtual stereo odometry
CN112184812B (en) Method for improving identification and positioning precision of unmanned aerial vehicle camera to april tag and positioning method and system
CN107665508B (en) Method and system for realizing augmented reality
US11209277B2 (en) Systems and methods for electronic mapping and localization within a facility
Fang et al. Multi-sensor based real-time 6-DoF pose tracking for wearable augmented reality
CN116518984B (en) Vehicle road co-location system and method for underground coal mine auxiliary transportation robot
CN115272596A (en) Multi-sensor fusion SLAM method oriented to monotonous texture-free large scene
CN115936029A (en) SLAM positioning method and device based on two-dimensional code
CN114494150A (en) Design method of monocular vision odometer based on semi-direct method
CN116349222A (en) Rendering depth-based three-dimensional models using integrated image frames
CN116977628A (en) SLAM method and system applied to dynamic environment and based on multi-mode semantic framework
CN113920191B (en) 6D data set construction method based on depth camera
CN113034584B (en) Mobile robot visual positioning method based on object semantic road sign
CN117870716A (en) Map interest point display method and device, electronic equipment and storage medium
Tamaazousti et al. The constrained SLAM framework for non-instrumented augmented reality: Application to industrial training
Scheuermann et al. Mobile augmented reality based annotation system: A cyber-physical human system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant