CN109887057B - Method and device for generating high-precision map - Google Patents

Method and device for generating high-precision map

Info

Publication number
CN109887057B
Authority
CN
China
Prior art keywords
information
point cloud
attitude
camera
pose
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910156262.XA
Other languages
Chinese (zh)
Other versions
CN109887057A (en)
Inventor
沈栋
李昱辰
钱炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Fabu Technology Co Ltd
Original Assignee
Hangzhou Fabu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Fabu Technology Co Ltd filed Critical Hangzhou Fabu Technology Co Ltd
Publication of CN109887057A
Application granted granted Critical
Publication of CN109887057B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Navigation (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a method and a device for generating a high-precision map without depending on RTK (real-time kinematic) positioning. The method comprises the following steps: acquiring picture data from a camera; acquiring point cloud data from a laser radar; acquiring attitude information from a Global Navigation Satellite System (GNSS)/Inertial Measurement Unit (IMU); processing the picture data to extract visual feature information, and performing attitude estimation according to the visual feature information extracted from the current frame, the previously fused overall attitude information, and the previously maintained map visual feature information to obtain a camera attitude estimate; processing the point cloud data to obtain point cloud information, and performing attitude estimation according to the point cloud information and other information to obtain a radar attitude estimate; fusing the attitude information, the camera attitude estimate, and the radar attitude estimate to obtain a more accurate and stable attitude estimation result; and fusing the picture data and the point cloud data according to the more accurate and stable attitude estimation result to construct a high-precision map.

Description

Method and device for generating high-precision map
Technical Field
The present invention relates generally to the field of autonomous driving, and more particularly to methods and apparatus for generating high-precision maps and performing positioning based on cameras, lidar, and GNSS.
Background
As technology evolves, autonomous driving of robotic vehicles (e.g., unmanned aerial vehicles or driverless cars) has become a research hotspot. Robotic vehicles are being developed for a wide range of applications. A key requirement in automated driving is knowing where the vehicle is and where it is going, for which high-precision maps play an important role; how to generate high-precision maps is therefore a core problem in the field of automated driving.
A typical high-precision mapping scheme depends on a combined navigation system consisting of a high-precision GPS with Real-Time Kinematic (RTK) correction and an Inertial Measurement Unit (IMU). However, the cost of such a system is too high, and in environments with weak or no signal, such as among high-rise buildings, in tunnels, or in underground garages, or when the vehicle turns rapidly, shakes, or encounters bad weather, the error falls far short of the centimeter (cm)-level accuracy requirement.
Robotic vehicles are typically equipped with a camera capable of capturing images, image sequences, or video, a radar device capable of obtaining radar point cloud data, and a GNSS receiver capable of receiving and processing navigation signals. The robotic vehicle may use the captured images, the radar point cloud data, and the GNSS signals to generate high-precision maps and perform vision-based navigation and positioning, providing a flexible, scalable, and low-cost solution for navigating the robotic vehicle in various environments.
Disclosure of Invention
In view of the above, the present disclosure provides a method, an apparatus, a device, and a computer storage medium for generating a high-precision map and performing positioning based on a camera, a lidar, and a GNSS, which are used to achieve centimeter-level high-precision maps in various driving scenes, thereby improving the safety of automatic driving of a robotic vehicle.
In one aspect, embodiments of the present invention provide a method for generating a high-precision map, the method comprising: acquiring picture data from a camera; acquiring point cloud data from a laser radar; acquiring attitude information from a Global Navigation Satellite System (GNSS)/Inertial Measurement Unit (IMU); processing the picture data to extract visual feature information, and performing attitude estimation according to the visual feature information extracted from the current frame, the previously fused overall attitude information, and the previously maintained map visual feature information to obtain a camera attitude estimate; processing the point cloud data to obtain point cloud information, and performing attitude estimation according to the point cloud information and other information to obtain a radar attitude estimate; fusing the attitude information, the camera attitude estimate, and the radar attitude estimate to obtain a more accurate and stable attitude estimation result; and fusing the picture data and the point cloud data according to the more accurate and stable attitude estimation result to construct a high-precision map. The other information here may be previously fused attitude information and previously saved point cloud information.
In one embodiment of the present disclosure, the picture data, the point cloud data, and the pose information are pre-processed prior to local storage.
In one embodiment of the present disclosure, the pre-processing includes parsing, time synchronizing, and filtering the picture data, the point cloud data, and the pose information.
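As an illustrative sketch of the time-synchronization part of this preprocessing only (the data layout and tolerance value below are assumptions, not taken from the patent), each camera frame can be paired with the nearest lidar sweep and GNSS/IMU sample by timestamp:

```python
# Minimal time-synchronization sketch (assumed data layout: each record is
# (timestamp_in_seconds, payload)); pairs every camera frame with the nearest
# lidar sweep and GNSS/IMU sample, dropping frames without a close-enough match.
import bisect

def synchronize(camera, lidar, imu, tolerance=0.05):
    """camera/lidar/imu: lists of (timestamp, data) sorted by timestamp."""
    lidar_ts = [t for t, _ in lidar]
    imu_ts = [t for t, _ in imu]

    def nearest(ts_list, records, t):
        i = bisect.bisect_left(ts_list, t)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(records)]
        if not candidates:
            return None
        j = min(candidates, key=lambda k: abs(ts_list[k] - t))
        return records[j] if abs(ts_list[j] - t) <= tolerance else None

    synced = []
    for t, image in camera:
        l = nearest(lidar_ts, lidar, t)
        g = nearest(imu_ts, imu, t)
        if l is not None and g is not None:
            synced.append((t, image, l[1], g[1]))   # keep only complete triples
    return synced
```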
In one embodiment of the present disclosure, the attitude information includes acceleration, angular velocity, coordinates, etc. provided by the GNSS/IMU.
In an embodiment of the disclosure, processing the picture data acquired by the camera to extract the visual feature information, and performing pose estimation according to the visual feature information extracted from the current frame, the previously fused overall pose information, and the previously maintained map visual feature information to obtain the camera pose estimation includes: identifying dynamic obstacles by using deep-learning-related techniques; filtering out the dynamic obstacles to obtain final picture data; processing the final picture data to acquire the visual feature information; obtaining, based on the visual feature information, a rough estimate of the current camera pose by using the image pose information maintained from the previous frame, and then performing frame-to-frame matching to preliminarily optimize the camera pose; and performing feature matching with the previously maintained map information, constructing an optimization equation, and optimizing the camera pose again to obtain the camera pose estimation.
In one embodiment of the present disclosure, processing the point cloud data to obtain point cloud information, and performing attitude estimation according to the point cloud information and other information to obtain the radar attitude estimation includes: extracting feature points from the laser radar point cloud data by using structural information; calculating, from the motion data in the other information, the time interval of each point relative to the start of the frame by linear interpolation of that point's angle; performing velocity compensation using the time interval to obtain distortion-corrected point cloud data; matching with the structural feature points of the previous frame, and preliminarily optimizing the attitude information by using the matching information between the two adjacent frames; and then performing feature point matching against the maintained global map for further optimization to obtain more accurate attitude information.
In an embodiment of the disclosure, fusing the pose information, the camera pose estimation, and the radar pose estimation to obtain a more accurate and stable pose estimation result includes: calibrating the pose information, the camera pose estimate, and the radar pose estimate.
In an embodiment of the disclosure, fusing the picture data and the point cloud data to construct a high-precision map according to the more accurate and stable attitude estimation result includes: projecting the radar point cloud data to a specified coordinate system; projecting the picture data to the specified coordinate system; and fusing the radar point cloud data and the picture data projected to the specified coordinate system, based on the more accurate and stable attitude estimation result, to construct a high-precision map.
In one embodiment of the present disclosure, the specified coordinate system is a world coordinate system.
In one embodiment of the present disclosure, the calibration employs Kalman filtering.
In another aspect, embodiments of the present invention provide an apparatus for generating a high-precision map. The apparatus may include an acquisition module to: acquire picture data from a camera; acquire point cloud data from a laser radar; and acquire attitude information from the GNSS/IMU. The apparatus may also include a pre-processing module to pre-process the picture data, the point cloud data, and the pose information for local storage. The apparatus may further include a camera data processing module configured to process the picture data to extract visual feature information and perform pose estimation according to the visual feature information extracted from the current frame, the previously fused global pose information, and the previously maintained map visual feature information to obtain a camera pose estimate. The apparatus may also include a lidar data processing module to process the point cloud data to obtain point cloud information and to perform attitude estimation from the point cloud information and other information to obtain a radar attitude estimate. The apparatus may further include a fusion module to fuse the pose information, the camera pose estimate, and the radar pose estimate to obtain a more accurate and stable pose estimation result. The apparatus may also include a map construction module to fuse the picture data and the point cloud data according to the more accurate and stable pose estimation result to construct a high-precision map. The other information here may be previously fused pose information and previously saved point cloud information.
Various embodiments may also include a robotically driven vehicle having a high-precision map generation arrangement including a transceiver, a memory, and a processor configured with processor-executable instructions to perform the operations of the method outlined above. Various embodiments include a processing device for use in a robotic driven vehicle configured to perform the operations of the method outlined above. Various embodiments include a non-transitory processor-readable medium having stored thereon processor-executable instructions configured to cause a processor of a robotic-driven vehicle to perform operations of the method outlined above.
Drawings
The accompanying drawings, which are incorporated herein and constitute part of this specification, illustrate exemplary embodiments and, together with the general description given above and the detailed description given below, explain features of the various embodiments.
FIG. 1 illustrates an environment or system suitable for implementing embodiments of the present invention;
FIG. 2 is a block diagram illustrating components of a high-precision map generation device for use in a robotic-driven vehicle, in accordance with an embodiment of the present invention;
FIG. 3 shows a schematic diagram illustrating the point cloud distortion problem of acquired lidar data and its solution, in accordance with an embodiment of the invention;
FIG. 4 shows a schematic diagram of a method for fusing acquired processed data to obtain pose information of higher accuracy, according to an embodiment of the invention;
FIG. 5 illustrates a schematic diagram for projecting radar point cloud data and camera data onto a unified coordinate system to generate a high-precision map using final pose information, according to an embodiment of the present invention;
FIG. 6 shows a schematic flow chart diagram of a method for generating a high accuracy map, according to an embodiment of the present invention; and
FIG. 7 shows a schematic block diagram of an apparatus for generating a high-precision map according to an embodiment of the present invention.
In the drawings, the same or similar reference numerals are used to denote the same or similar elements.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure have been illustrated in the accompanying drawings, it is to be understood that the present disclosure may be embodied in other various forms and should not be limited to the specific embodiments described below. These specific embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The word "exemplary" is used herein to mean "serving as an example, instance, or illustration. Any aspect described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects.
As used herein, the terms "robotic vehicle" and "drone" refer to one of various types of vehicles that include an in-vehicle computing device configured to provide some autonomous or semi-autonomous capability. Examples of robotic vehicles include, but are not limited to: aircraft, such as Unmanned Aerial Vehicles (UAVs); land vehicles (e.g., autonomous or semi-autonomous cars, etc.); water-based vehicles (i.e., vehicles configured to operate on the surface of water or underwater); space-based vehicles (e.g., spacecraft or space probes); and/or some combination thereof. In some embodiments, the robotic vehicle may be manned. In other embodiments, the robotic vehicle may be unmanned. In some implementations, the robotic vehicle may be an aircraft (unmanned or manned), which may be a rotorcraft or a winged aircraft.
Various embodiments may be implemented in various robotic vehicles that may communicate with one or more communication networks, an example of which may be suitable for use in connection with various embodiments is shown in fig. 1.
Referring to FIG. 1, a system or environment 1 may include one or more robotic vehicles 10, GNSS 20, and communication network 50. Although the robotic vehicle 10 is shown in fig. 1 as communicating with the communication network 50, the robotic vehicle 10 may or may not communicate with any communication network relating to any of the methods described herein.
In various embodiments, the robotic vehicle 10 may include one or more cameras 140, the one or more cameras 140 configured to obtain images, provide the image data to the processing device 110 of the robotic vehicle 10.
In various embodiments, the robotic vehicle 10 may include one or more lidar 150 configured to obtain radar point cloud data, provide the obtained radar point cloud data to the processing device 110 of the robotic vehicle 10.
The robotic vehicle 10 may navigate or determine position using a navigation system such as a Global Navigation Satellite System (GNSS), Global Positioning System (GPS), etc., and attitude information of the robotic vehicle may be obtained using the GNSS/IMU. In some embodiments, the robotic vehicle 10 may use alternative positioning signal sources (i.e., other than GNSS, GPS, etc.).
The robotic vehicle 10 may include a processing device 110, and the processing device 110 may be configured to monitor and control various functions, subsystems, and/or other components of the robotic vehicle 10. For example, the processing device 110 may be configured to monitor and control various functions of the robotic vehicle 10, such as modules, software, instructions, circuitry, hardware related to propulsion, power management, sensor management, navigation, communication, actuation, steering, braking, and/or vehicle operating mode management.
The processing device 110 may house various circuits and devices for controlling the operation of the robotic vehicle 10. For example, the processing device 110 may include a processor 120 that instructs control of the robotic vehicle 10. The processor 120 may include one or more processors configured to execute processor-executable instructions (e.g., applications, routines, scripts, instruction sets, etc.) to control the operation of the robotic vehicle 10 (which includes the operation of various embodiments herein). In some embodiments, the processing device 110 may include a memory 122 coupled to the processor 120 that is configured to store data (e.g., picture data, acquired GNSS/IMU sensor data, radar point cloud data, received messages, applications, etc.). The processor 120 and memory 122, as well as other elements, may be configured as or include a system on a chip (SOC) 115. The processing device 110 may include more than one SOC115, thereby increasing the number of processors 120 and processor cores. The processing device 110 may also include a processor 120 that is not associated with the SOC 115. Each processor 120 may be a multi-core processor.
The term "system on a chip" or "SOC" as used herein refers to a set of interconnected electronic circuits that typically, but not exclusively, include one or more processors (e.g., 120), memory (e.g., 122), and communication interfaces. SOC115 may include various different types of processors 120 and processor cores, such as general purpose processors, central Processing Units (CPUs), digital Signal Processors (DSPs), graphics Processing Units (GPUs), accelerated Processing Units (APUs), subsystem processors of specific components of a processing device (e.g., an image processor for a high precision map generation apparatus (e.g., 130) or a display processor, an auxiliary processor, a single-core processor, and a multi-core processor for a display). The SOC115 may also include other hardware and hardware combinations, such as Field Programmable Gate Arrays (FPGAs), application Specific Integrated Circuits (ASICs), other programmable logic devices, discrete gate logic, transistor logic, performance monitoring hardware, watchdog hardware, and time references. An integrated circuit may be configured such that components of the integrated circuit reside on a single piece of semiconductor material (e.g., silicon).
The processing device 110 may also include or be connected to one or more sensors 136, which the processor 120 may use to determine information associated with vehicle operation and/or information associated with the external environment corresponding to the robotic vehicle 10 to control various processes on the robotic vehicle 10. Examples of such sensors 136 include accelerometers, gyroscopes, and electronic compasses configured to provide data to the processor 120 regarding changes in direction and motion of the robotic vehicle 10. For example, in some embodiments, the processor 120 may use data from the sensors 136 as input for determining or predicting motion data of the robotic-driven vehicle 10. Various components within the processing device 110 and/or the SOC115 may be coupled together by various circuitry (e.g., a bus or other similar circuitry).
The processing device 110 may further include a high-precision map generation apparatus 130 that may pre-process picture data obtained from the camera 140, point cloud data obtained from the lidar 150, and pose information obtained from the GNSS/IMU for local storage, process the locally stored picture data to extract visual feature information, perform pose estimation from the visual feature information extracted from the current frame, previously fused global pose information, and previously maintained map visual feature information to obtain camera pose estimation, process the locally stored point cloud data to obtain point cloud information, and then perform pose estimation from the point cloud information and other information to obtain radar pose estimation. Other information here may be previously fused pose information and previously saved point cloud information. The high-precision map generating device 130 may further fuse the obtained attitude information, the camera attitude estimation, and the radar attitude estimation to obtain a more accurate and stable attitude estimation result, and fuse the picture data and the point cloud data based on the more accurate and stable attitude estimation result to construct the high-precision map.
Although the various components of the processing device 110 are shown as separate components, some or all of the components (e.g., the processor 120, the memory 122, and other units) may be integrated together in a single device or module (e.g., a system-on-a-chip module).
Various embodiments may be implemented in a high-precision mapping device 200 of a robotic vehicle, an example of which is shown in fig. 2. Referring to fig. 1-2, a high precision map generation device 200 suitable for use with various embodiments may include a camera 140, a processor 208, a memory 210, a lidar element 212, and a map generation unit 214. Further, the high-precision mapping device 200 may include an Inertial Measurement Unit (IMU) 216 and an environmental detection system 218.
The camera 140 may include at least one image sensor 204 and at least one optical system 206 (e.g., one or more lenses). The camera 140 may obtain one or more digital images (sometimes referred to herein as image frames). The cameras 140 may include a single monocular camera, a stereo camera, and/or an omnidirectional camera. In some embodiments, the camera 140 may be physically separate from the high precision mapping device 200, for example, located outside of the robotic vehicle and connected to the processor 208 via a data cable (not shown). In some embodiments, camera 140 may include another processor (not shown) that may be configured with processor-executable instructions to perform one or more of the operations of the various embodiment methods.
In some embodiments, memory 210 or another memory such as an image buffer (not shown) may be present within camera 140. For example, the camera 140 may include a memory configured to buffer (i.e., temporarily store) image data from the image sensor 204 prior to processing the data (e.g., by the processor 208). In some embodiments, the high precision mapping device 200 may include an image data buffer configured to buffer (i.e., temporarily store) image data from the camera 140. Such cached image data may be provided to the processor 208 or may be accessed by the processor 208 or other processor configured to perform some or all of the operations in various embodiments.
Lidar element 212 may be configured to capture one or more lidar point cloud data. The captured one or more lidar point cloud data may be stored in memory 210.
The high-precision map generating apparatus 200 may include an Inertial Measurement Unit (IMU) 216 configured to measure various parameters of the robotic vehicle 10. The IMU216 may include one or more of a gyroscope, an accelerometer, and a magnetometer. The IMU216 may be configured to detect changes in the pitch, roll, and yaw axes associated with the robotic vehicle 10. The IMU216 output measurements may be used to determine the attitude, angular velocity, linear velocity, and/or position of the robotic vehicle.
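As a rough illustration of how such IMU measurements can be propagated into attitude, velocity, and position (a simple first-order dead-reckoning sketch under the assumption of small time steps; this is not the patent's own algorithm):

```python
# Minimal dead-reckoning sketch: integrate gyroscope rates into orientation and
# accelerometer readings into velocity/position over one IMU step dt.
import numpy as np

def imu_step(R, v, p, gyro, accel, dt, gravity=np.array([0.0, 0.0, -9.81])):
    """R: (3,3) body-to-world rotation; v, p: (3,) velocity and position;
    gyro: (3,) angular rate (rad/s); accel: (3,) specific force (m/s^2)."""
    wx, wy, wz = gyro * dt
    skew = np.array([[0, -wz, wy], [wz, 0, -wx], [-wy, wx, 0]])
    R_new = R @ (np.eye(3) + skew)          # first-order rotation update
    a_world = R @ accel + gravity           # remove gravity in the world frame
    v_new = v + a_world * dt
    p_new = p + v * dt + 0.5 * a_world * dt * dt
    return R_new, v_new, p_new
```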
In some embodiments, the map generation unit 214 may be configured to use information extracted from images captured by the camera 140, one or more lidar point cloud data captured by the lidar component 212, and pose information obtained from the IMU216 to generate a high-precision map and to determine various parameters for navigating within the environment of the robotic vehicle 10.
In addition, the high-precision mapping device 200 optionally includes an environmental detection system 218. The environment detection system 218 may be configured to detect various parameters associated with the environment surrounding the robotic vehicle 10. The environment detection system 218 may include one or more of an ambient light detector, a thermal imaging system, an ultrasound detector, a radar system, an ultrasound system, a piezoelectric sensor, a microphone, and so forth. In some embodiments, the parameters detected by the environment detection system 218 may be used to detect ambient light levels, detect various objects within the environment, identify the location of each object, identify object materials, and so forth. In some embodiments, attitude estimation may be based on measurements output by environment detection system 218.
In various embodiments, one or more of the images captured by one or more cameras of the camera 140, the measurements obtained by the IMU216, the one or more lidar point cloud data captured by the lidar element 212, and/or the measurements obtained by the environment detection system 218 may be time-stamped. The map generation unit 214 may use this timestamp information to extract information from one or more images captured by the camera 140 and/or from one or more lidar point cloud data captured by the lidar element 212 and/or to navigate in the environment of the robotic vehicle 10.
Processor 208 may be coupled to (e.g., in communication with) camera 140, one or more image sensors 204, one or more optical systems 206, memory 210, lidar element 212, map generation unit 214, and IMU216, and optional environment detection system 218. The processor 208 may be a general purpose single-or multi-chip microprocessor (e.g., an ARM processor), a special purpose microprocessor (e.g., a Digital Signal Processor (DSP)), a microcontroller, a programmable gate array, or the like. The processor 208 may be referred to as a Central Processing Unit (CPU). Although a single processor 208 is shown in fig. 2, the high-precision map generating device 200 may include multiple processors (e.g., a multi-core processor) or a combination of different types of processors (e.g., an ARM and a DSP).
The processor 208 may be configured to implement the methods of the various embodiments to generate high precision maps and/or navigate the robotic vehicle 10 in an environment.
The memory 210 may store data (e.g., image data, radar point cloud data, GNSS/IMU measurements, timestamps, data associated with the map generation unit 214, etc.) and instructions that may be executed by the processor 208. In various embodiments, examples of instructions and/or data that may be stored in the memory 210 may include image data, gyroscope measurement data, radar point cloud data, camera auto-calibration instructions, and so forth. The memory 210 may be any electronic component capable of storing electronic information, including, for example, Random Access Memory (RAM), Read-Only Memory (ROM), magnetic disk storage media, optical storage media, flash memory devices in RAM, on-board memory accompanying a processor, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), registers, and so forth, including combinations thereof.
Of course, it should be understood by those skilled in the art that the high-precision map generating device 200 may be, for example, a server or a computer, or may be an intelligent terminal, such as an electronic lock, a smart phone, a smart tablet, and the like, and the invention is not limited thereto.
The mechanisms and principles of embodiments of the present invention will be described in detail below. The term "based on" as used hereinafter and in the claims means "based, at least in part, on" unless specifically stated otherwise. The term "including" means the open inclusion, i.e., "including but not limited to". The term "plurality" means "two or more". The term "one embodiment" means "at least one embodiment". The term "another embodiment" means "at least one additional embodiment". Definitions of other terms will be given in the following description.
FIG. 3 shows a schematic diagram illustrating the point cloud distortion problem of acquired lidar data and its solution, according to an embodiment of the invention.
The laser radar obtains the distance of each point of the point cloud in the laser radar coordinate system from the reflection time difference, and conventional preprocessing is performed on the point cloud; more importantly, the problem of point cloud distortion must be solved.
The motion of the radar itself causes distortion of the resulting point cloud. If the frame rate of the radar is fast compared to the external motion, the distortion caused by motion within one scan has little effect; whereas if the scan rate is slow, especially for a 2-axis radar in which one axis is significantly slower than the other, the distortion problem is very significant. Other sensors are therefore typically used to derive the velocity for compensation.
The point cloud distortion problem can be explained with reference to fig. 3, in which the laser sweeps through a full 360-degree rotation plane. In the laser radar coordinate system, the object distance is obtained from the time interval between laser emission and laser reception multiplied by the speed of light; but because the carrier moves forward, the laser radar is not at the same position at the two moments of laser emission and laser reception, so the obtained distance is incorrect, which causes point cloud distortion.
The point cloud can be corrected once motion data in the radar coordinate system (calculated from other sensors) has been obtained. First, the time interval between a point and the start of scanning of the current frame needs to be calculated, and the point is then updated according to this time interval. Because the laser radar rotates at a uniform angular speed, the angle of the current point can be obtained from the point cloud itself, and linear interpolation gives the time interval between the point and the start of the frame, as shown in the following formulas, where T is the time required for one scan and Δt is the time interval between the point and the start of the frame.
θ = atan2(y, x)

Δt = (θ / 2π) · T
Velocity compensation is then performed according to the time interval Δt to obtain distortion-corrected point cloud data. Next, feature points are extracted from the laser radar point cloud data using structural information and matched with the structural feature points of the previous frame; the attitude information is preliminarily optimized using the matching information between the two adjacent frames, and feature point matching against the previously maintained global map is then used for further optimization to obtain more accurate attitude information.
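The distortion correction described above can be sketched as follows. This is a minimal illustration that assumes a horizontally scanning lidar, a known sweep period T, a constant carrier velocity over the sweep taken from the other sensors, and neglects rotation within the sweep; the function and variable names are illustrative, not the patent's.

```python
# Sketch of motion-distortion correction for one lidar sweep, assuming the
# sweep rotates at a uniform angular rate over period T and the carrier moves
# with (approximately) constant velocity during the sweep.
import numpy as np

def undistort_sweep(points, start_angle, T, velocity):
    """points: (N, 3) array in the lidar frame; start_angle: horizontal angle
    at the start of the sweep (rad); T: sweep period (s);
    velocity: (3,) carrier velocity expressed in the lidar frame (m/s)."""
    corrected = np.empty_like(points)
    for i, p in enumerate(points):
        angle = np.arctan2(p[1], p[0])                 # horizontal angle of the return
        delta = (angle - start_angle) % (2.0 * np.pi)  # fraction of the sweep completed
        dt = delta / (2.0 * np.pi) * T                 # time since the frame started
        corrected[i] = p + velocity * dt               # re-express in the sweep-start frame
    return corrected                                    # rotation within the sweep is neglected
```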
For the attitude estimation from the camera image data, dynamic obstacles are identified using deep-learning-related techniques and filtered out to obtain the final image data, which is then processed to extract visual feature information. Based on the visual feature information, a rough estimate of the current camera attitude is obtained using the image pose information maintained from the previous frame, and frame-to-frame matching is then performed to preliminarily optimize the camera attitude; finally, feature matching with the previously maintained map information is performed, an optimization equation is constructed, and the camera attitude is optimized again to obtain the camera attitude estimation.
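A hedged sketch of the frame-to-frame part of this visual pipeline, using ORB features and an essential-matrix pose recovery in OpenCV as stand-ins for the coarse-estimate-and-matching step described above; the dynamic-obstacle mask is assumed to come from a separate deep-learning detector, and the map-based re-optimization is omitted:

```python
# Sketch of frame-to-frame camera pose estimation: mask dynamic obstacles,
# extract features, match against the previous frame, and recover a relative
# pose from the essential matrix (map-based re-optimization omitted).
import cv2
import numpy as np

def relative_camera_pose(prev_img, curr_img, obstacle_mask, K):
    """obstacle_mask: uint8 mask, 0 on dynamic obstacles, 255 elsewhere."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(prev_img, None)
    kp2, des2 = orb.detectAndCompute(curr_img, obstacle_mask)  # ignore dynamic regions
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t   # rotation and (up-to-scale) translation of the current frame
```

The translation recovered this way is only up to scale; the map-based matching and multi-sensor fusion described in the text would refine the estimate further.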
FIG. 4 shows a schematic diagram of a method for fusing acquired processed data to obtain pose information of higher accuracy, according to an embodiment of the present invention.
As shown in fig. 4, the data processed by the camera, the radar, and the GPS/IMU modules are each processed and computed, and the results are fused to obtain attitude information with higher precision and better robustness.
From the pictures provided by the camera data processing module, a motion model is constructed to solve for a preliminary pose using SLAM-related techniques; bundle adjustment (BA) is then performed to further optimize the pose. The optimization goal is to minimize the projection error: according to the calculated pose, the commonly observed points are projected onto the corresponding frames, and the relative distances to the matched points are computed. Matching is then performed to recover the corresponding pose. Meanwhile, the result is calibrated based on the data provided by the other sensors. The calibration is performed because the frame rates and processing times of the multiple sensors differ, and each obtained pose estimate needs to be fused with the final pose result separately.
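For concreteness, the projection error minimized by the bundle adjustment can be written as the following residual (a generic pinhole-model sketch; the symbols are illustrative and not taken from the patent):

```python
# Reprojection residual used as the bundle-adjustment cost: project a shared
# 3-D observation into a frame whose pose maps world points into that camera
# frame, then compare with the matched 2-D detection in the same frame.
import numpy as np

def reprojection_error(X_world, R, T, K, uv_observed):
    """X_world: (3,) point; R: (3,3) and T: (3,) map the point into the camera
    frame; K: (3,3) intrinsics; uv_observed: (2,) matched pixel location."""
    X_cam = R @ X_world + T                   # point expressed in the camera frame
    uvw = K @ X_cam                           # pinhole projection (homogeneous)
    uv = uvw[:2] / uvw[2]                     # normalize to pixel coordinates
    return np.linalg.norm(uv - uv_observed)   # distance between matched points
```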
Because the laser radar has a somewhat downward elevation angle, when the carrier moves on an open road part of the obtained point cloud belongs to the ground. If attitude estimation were performed directly, the many ground points would affect the final accuracy, so the ground point cloud must be identified and its influence eliminated. Meanwhile, points belonging to dynamic obstacles are identified and filtered out using machine-learning-related techniques. Using the point cloud information provided by the laser radar, matching is performed with laser-SLAM-related techniques, an optimization equation is constructed, and the pose is solved by optimization. Meanwhile, the result is calibrated based on the data provided by the other sensors. This is because the frame rates and processing times of the multiple sensors differ, and each obtained pose estimate needs to be fused with the final pose result separately.
Finally, the results of the GPS/IMU, the camera, and the radar multi-sensor suite are fused to obtain a more accurate and more robust attitude estimation result.
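A minimal sketch of such a fusion step, assuming a linear Kalman measurement update in which the GNSS/IMU-predicted state is corrected by each arriving camera or radar pose estimate (the state is reduced to position only for brevity, and all matrices are assumptions):

```python
# Sketch of asynchronous fusion: whenever the camera or radar module produces a
# pose estimate, it is fused with the current GNSS/IMU-predicted state by a
# standard Kalman measurement update (position-only state for brevity).
import numpy as np

def kalman_update(x, P, z, R_meas):
    """x: (3,) predicted position; P: (3,3) covariance;
    z: (3,) measured position from camera or radar; R_meas: (3,3) noise."""
    H = np.eye(3)                              # measurement observes the state directly
    S = H @ P @ H.T + R_meas                   # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
    x_new = x + K @ (z - H @ x)                # corrected state
    P_new = (np.eye(3) - K @ H) @ P            # corrected covariance
    return x_new, P_new
```

Because the camera and radar modules run at different rates, each of their estimates is applied as a separate update against the latest predicted state, which mirrors the per-sensor calibration described above.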
Definition of coordinate system
A camera coordinate system and a radar coordinate system are defined, wherein the camera coordinate system takes the optical center of the camera as its origin and the radar coordinate system takes the emitting center of the radar as its origin. The coordinates of a point are defined as X = (x, y, z), and a pose is defined as the 4×4 homogeneous matrix

P = | R  T |
    | 0  1 |

where R is a 3×3 rotation matrix and T is a 3×1 translation vector. Assuming that the pose of the current camera frame (defined as the i-th frame) with respect to the camera start frame (frame 0) is P, the current camera has been rotated by R and translated by T with respect to the camera start frame. P thus describes the position of the current camera, and any point X_i = (x_i, y_i, z_i) of the current camera frame satisfies:

X_0 = R·X_i + T

where X_0 is the point in the start-frame coordinate system corresponding to X_i. The pose is clearly transitive:

P_{a→c} = P_{a→b}·P_{b→c}

where P_{a→c} is the pose change of the c-th frame relative to the a-th frame, P_{a→b} is the pose change of the b-th frame relative to the a-th frame, and P_{b→c} is the pose change of the c-th frame relative to the b-th frame.
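These relations can be expressed directly as 4×4 homogeneous transforms; the following sketch (with illustrative names, not the patent's) composes poses and maps a point of the i-th frame back to the start frame:

```python
# Homogeneous-transform view of the pose definition: P stacks R (3x3) and
# T (3x1); pose composition is matrix multiplication, and X_0 = R·X_i + T is a
# single matrix-vector product in homogeneous coordinates.
import numpy as np

def make_pose(R, T):
    """Build the 4x4 pose P = [[R, T], [0, 1]]."""
    P = np.eye(4)
    P[:3, :3] = R
    P[:3, 3] = T
    return P

def transform_point(P, X_i):
    """Map a point of the i-th frame into the start frame: X_0 = R @ X_i + T."""
    return (P @ np.append(X_i, 1.0))[:3]

def compose(P_ab, P_bc):
    """Transitivity: P_{a->c} = P_{a->b} @ P_{b->c}."""
    return P_ab @ P_bc
```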
Coordinate system conversion unification
For the fusion of multiple sensors, the results obtained by the different sensors in their own coordinate systems must be unified. Take the unification of the camera sensor and the radar sensor as an example. The pose P_{camera-i} output by the visual odometry is the pose of the current camera frame (defined as the i-th frame) with respect to the camera start frame (frame 0); what is needed is P_{lidar-i}, the pose of the current radar frame (defined as the i-th frame) with respect to the radar start frame (frame 0). Meanwhile, from the calibration, the camera-to-radar relationship is known and is defined as P_calib; P_calib transfers points from the radar coordinate system to the camera coordinate system. For any point X_{lidar-i} of the current radar frame (defined as the i-th frame), left-multiplying by P_calib gives the corresponding coordinates in the camera frame. That is, P_calib represents the pose change from the current camera to the radar.
As shown in fig. 5, there are two paths from the camera start frame to the i-th radar frame: from the camera start frame to the radar start frame and then to the i-th radar frame, and from the camera start frame to the i-th camera frame and then to the i-th radar frame. Thus, the following equation is obtained:

P_{camera-i}·P_calib = P_calib·P_{lidar-i}

which gives

P_{lidar-i} = P_calib^{-1}·P_{camera-i}·P_calib
After the pose of the i-th radar frame relative to the radar start frame has been obtained, and since the timestamp of the i-th frame is known, IMU-like data such as the velocity of the i-th radar frame in the radar start-frame coordinate system can be obtained using the data and times of the two adjacent frames.
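A small sketch of this unification under the stated convention that P_calib maps radar-frame points into the camera frame; the finite-difference velocity at the end is an illustrative simplification, not the patent's exact formulation:

```python
# Convert a camera-odometry pose into the radar coordinate chain using the
# camera-to-radar calibration, then estimate velocity from two adjacent frames.
import numpy as np

def lidar_pose_from_camera(P_camera_i, P_calib):
    """Solve P_camera_i @ P_calib = P_calib @ P_lidar_i for P_lidar_i."""
    return np.linalg.inv(P_calib) @ P_camera_i @ P_calib

def velocity_between_frames(P_lidar_prev, P_lidar_curr, t_prev, t_curr):
    """Finite-difference velocity of the radar origin in the radar start frame."""
    dp = P_lidar_curr[:3, 3] - P_lidar_prev[:3, 3]
    return dp / (t_curr - t_prev)
```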
Fig. 6 shows a schematic flow diagram of a method 600 for generating a high-precision map according to an exemplary embodiment of the present invention. The method 600 may be performed by the high-precision map generating apparatus 130 described with reference to fig. 1 or by the high-precision map generating device 200 described with reference to fig. 2. The steps included in the method 600 are described in detail below in conjunction with fig. 6.
The method 600 begins at step 602 by obtaining picture data from a camera. It will be understood by those skilled in the art that the acquiring of the picture data may be, for example, acquiring of the captured picture data, acquiring of the processed captured picture data, or other means. The invention is not so limited.
At step 604, point cloud data is acquired from the lidar.
At step 606, attitude information is obtained from the GNSS/IMU. In one aspect, the attitude information includes acceleration, angular velocity, coordinates, and the like, provided by the GNSS/IMU.
At step 608, the obtained picture data is processed to extract visual feature information, and pose estimation is performed according to the visual feature information extracted from the current frame, the previously fused global pose information, and the previously maintained map visual feature information to obtain a camera pose estimate.
In one aspect, the acquired picture data, point cloud data, and pose information are pre-processed prior to local storage. In one aspect, the preprocessing operation may include: parsing, time synchronizing, and filtering the picture data, the point cloud data, and the attitude information.
In one aspect, processing the picture data acquired by the camera to extract visual feature information, and performing pose estimation according to the visual feature information extracted from the current frame, the previously fused global pose information, and the previously maintained map visual feature information to obtain the camera pose estimate includes: identifying dynamic obstacles using deep-learning-related techniques; filtering out the dynamic obstacles to obtain final picture data; processing the final picture data to extract visual feature information; obtaining, based on the visual feature information, a rough estimate of the current camera pose using the image pose information maintained from the previous frame, and then performing frame-to-frame matching to preliminarily optimize the camera pose; and finally performing feature matching with the previously maintained map information, constructing an optimization equation, and optimizing the camera pose again to obtain the camera pose estimate.
At step 610, the obtained point cloud data is processed to obtain point cloud information, and attitude estimation is performed according to the point cloud information and other information to obtain a radar attitude estimate. The other information here may be previously fused pose information and previously saved point cloud information. In one aspect, processing the obtained point cloud data to obtain point cloud information and performing attitude estimation from the point cloud information and other information to obtain the radar attitude estimate includes: extracting feature points from the laser radar point cloud data by using structural information; calculating, from the motion data in the other information, the time interval of each point relative to the start of the frame by linear interpolation of that point's angle; performing velocity compensation using the time interval to obtain distortion-corrected point cloud data; matching with the structural feature points of the previous frame, and preliminarily optimizing the attitude information by using the matching information between the two adjacent frames; and then performing feature point matching against the maintained global map for further optimization to obtain more accurate attitude information.
At step 612, the pose information, the camera pose estimate, and the radar pose estimate are fused to obtain a more accurate and stable pose estimation result. In one aspect, fusing the pose information, the camera pose estimate, and the radar pose estimate to obtain a more accurate and stable pose estimation result includes: calibrating the pose information, the camera pose estimate, and the radar pose estimate. This is because the frame rates and processing times of the multiple sensors differ, and each time a pose estimate is obtained it needs to be fused with the final pose result. In one aspect, the calibration employs Kalman filtering.
At step 614, the picture data and the point cloud data are fused according to the more accurate and stable attitude estimation result to construct a high-precision map. In one aspect, fusing the picture data and the point cloud data according to the more accurate and stable attitude estimation result to construct a high-precision map includes: projecting the radar point cloud data to a specified coordinate system; projecting the picture data to the specified coordinate system; and fusing the radar point cloud data and the picture data projected to the specified coordinate system, based on the obtained more accurate and stable attitude estimation result, to construct a high-precision map. In one aspect, the specified coordinate system can be a world coordinate system.
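An illustrative sketch of this final fusion step, assuming the specified coordinate system is the world frame, lidar points stored as an (N, 3) array, and a pinhole camera with intrinsics K and lidar-to-camera extrinsics used to pick up per-point color (all names below are assumptions, not taken from the patent):

```python
# Sketch of map construction: transform the (distortion-corrected) lidar points
# into the world frame with the fused pose, color them by projecting into the
# synchronized camera image, and append the result to the accumulated map.
import numpy as np

def add_scan_to_map(world_map, points_lidar, P_world_lidar, P_cam_lidar, K, image):
    """world_map: list of (M, 6) arrays with rows (x, y, z, b, g, r)."""
    N = points_lidar.shape[0]
    homog = np.hstack([points_lidar, np.ones((N, 1))])        # (N, 4)
    pts_world = (P_world_lidar @ homog.T).T[:, :3]            # points in the world frame
    pts_cam = (P_cam_lidar @ homog.T).T[:, :3]                # same points in the camera frame
    uvw = (K @ pts_cam.T).T
    valid = uvw[:, 2] > 0                                     # keep points in front of the camera
    uv = (uvw[valid, :2] / uvw[valid, 2:3]).astype(int)
    h, w = image.shape[:2]
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    colored = np.hstack([pts_world[valid][inside], image[uv[inside, 1], uv[inside, 0]]])
    world_map.append(colored)
    return world_map
```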
Fig. 7 provides a schematic block diagram of an apparatus 700 for generating a high-precision map according to an embodiment of the present invention.
The apparatus 700 comprises: an acquisition module 702 configured to acquire picture data from a camera, point cloud data from a lidar, and attitude information from a GNSS/IMU. In an optional aspect, the apparatus 700 may further include a pre-processing module 704 configured to pre-process the picture data, the point cloud data, and the pose information for local storage. The apparatus 700 may further include a camera data processing module 706 configured to process locally stored picture data to extract visual feature information, perform pose estimation based on the extracted visual feature information of the current frame, previously fused global pose information, and previously maintained map visual feature information to obtain a camera pose estimate. The apparatus 700 may also include a lidar data processing module 708 configured to process locally stored point cloud data to obtain point cloud information, and perform attitude estimation from the point cloud information and other information to obtain radar attitude estimates. The apparatus 700 may further include a fusion module 710 configured to fuse the pose information, camera pose estimate, and radar pose estimate to obtain a more accurate and stable pose estimate. Other information here may be previously fused pose information and previously saved point cloud information. The apparatus 700 may further include a map building module 712 configured to fuse the picture data and the point cloud data to build a high precision map according to the obtained more accurate and stable pose estimation result.
For specific implementation of the apparatus 700 provided in this embodiment, reference may be made to corresponding method embodiments, which are not described herein again.
For clarity, not all optional elements or sub-elements included in apparatus 700 are shown in fig. 7, and optional modules are shown using dashed lines. All features and operations described in the above method embodiments and embodiments that can be obtained by reference to and in conjunction with the above embodiments are applicable to the apparatus 700, respectively, and therefore will not be described in detail herein.
It will be understood by those skilled in the art that the division of the units or sub-units in the apparatus 700 is not restrictive but exemplary; it is merely a logical description of the main functions or operations, intended to facilitate understanding by those skilled in the art. In the apparatus 700, the functions of one unit may be implemented by a plurality of units; conversely, a plurality of units may be implemented by one unit. The invention is not so limited.
Likewise, those skilled in the art will appreciate that the elements included in apparatus 700 may be implemented in a variety of ways, including but not limited to software, hardware, firmware or any combination thereof, and the present invention is not limited thereto.
The present invention may be a system, method, computer-readable storage medium, and/or computer program product. The computer readable storage medium may be, for example, a tangible device capable of holding and storing instructions for use by the instruction execution device.
The computer-readable/executable program instructions may be downloaded to various computing/processing devices from a computer-readable storage medium, or may be downloaded to an external computer or external storage device through various communication means. The invention is not limited in particular to the specific programming languages or instructions used to implement the computer-readable/executable program instructions.
Various aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable/executable program instructions.
The method descriptions and process flow diagrams described above are used merely as illustrative examples and are not intended to require or imply that the operations of the various embodiments must be performed in the order presented. As will be appreciated by one of ordinary skill in the art, the order of operations in the above-described embodiments may be performed in any order.
The various illustrative logical blocks, modules, circuits, and algorithm operations described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and operations have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
A general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof, configured to perform the functions described herein, may be used to implement or execute the hardware described in connection with the aspects disclosed herein to implement the various exemplary logics, logical blocks, modules, and circuits. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a number of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. Alternatively, some operations or methods may be performed by circuitry that is specific to a given function.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the following claims and the principles and novel features disclosed herein.

Claims (10)

1. A method for generating a high accuracy map, the method comprising:
acquiring picture data from a camera;
acquiring point cloud data from a laser radar;
acquiring attitude information from a global navigation satellite system GNSS/inertial measurement unit IMU;
processing the picture data to extract visual characteristic information, and performing attitude estimation according to the visual characteristic information extracted from the current frame, the previously fused overall attitude information and the previously maintained map visual characteristic information to obtain camera attitude estimation;
processing the point cloud data to obtain point cloud information, and performing attitude estimation according to the point cloud information and other information to obtain radar attitude estimation;
fusing the attitude information, the camera attitude estimation and the radar attitude estimation to obtain a more accurate and stable attitude estimation result;
and fusing the picture data and the point cloud data according to the more accurate and stable attitude estimation result to construct a high-precision map.
2. The method of claim 1, further comprising:
preprocessing the picture data, the point cloud data and the pose information before local storage.
3. The method of claim 2, wherein the pre-processing comprises parsing, time synchronizing, and filtering the picture data, the point cloud data, and the pose information.
4. The method of claim 1, wherein processing the picture data acquired by the camera to extract the visual feature information, performing pose estimation from the visual feature information extracted for a current frame, previously fused global pose information, and previously maintained map visual feature information to obtain the camera pose estimation comprises:
identifying dynamic obstacles by utilizing a deep learning correlation technique;
filtering out the dynamic obstacles to obtain final picture data;
processing the final picture data to acquire the visual characteristic information;
based on the visual characteristic information, rough estimation of the current camera attitude is obtained by using the image data attitude information maintained in the previous frame, and then frame-to-frame matching is carried out to preliminarily optimize the camera attitude;
and performing feature matching by using previously maintained map information, constructing an optimization equation, and optimizing the camera attitude again to obtain the camera attitude estimation.
5. The method of claim 1, wherein processing the point cloud data to obtain point cloud information and pose estimating from the point cloud information and other information to obtain radar pose estimates comprises:
extracting characteristic points of the laser radar point cloud data by using structural information;
calculating, by linear interpolation using the angle of a specific point, the time interval of the specific point with respect to the start of the frame according to the motion data of the other information;
performing speed compensation by using the time interval to obtain distortion-corrected point cloud data;
matching with the structural feature points of the previous frame, and preliminarily optimizing the attitude information by using the matching information between two adjacent frames;
and then, the maintained global map is utilized to carry out feature point matching for optimization so as to obtain more accurate attitude information.
6. The method of claim 1, wherein fusing the pose information, the camera pose estimate, and the radar pose estimate to obtain a more accurate and stable pose estimate comprises:
calibrating the pose information, the camera pose estimate, and the radar pose estimate.
7. The method of claim 6, wherein the calibration employs Kalman filtering.
8. The method of claim 1, wherein fusing the picture data and the point cloud data to construct a high-precision map according to the more accurate and stable pose estimation result comprises:
projecting the radar point cloud data to a specified coordinate system;
projecting the picture data to the specified coordinate system;
and fusing the radar point cloud data and the picture data projected to a specified coordinate system based on the more accurate and stable attitude estimation result to construct a high-precision map.
9. An apparatus for generating a high accuracy map for performing the method of any of claims 1 to 8.
10. A computer readable storage medium for generating a high accuracy map, the computer readable storage medium having stored thereon at least one executable computer program instruction comprising computer program instructions for performing the steps of the method of any of claims 1 to 8.
CN201910156262.XA 2019-01-30 2019-03-01 Method and device for generating high-precision map Active CN109887057B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910090347 2019-01-30
CN2019100903472 2019-01-30

Publications (2)

Publication Number Publication Date
CN109887057A CN109887057A (en) 2019-06-14
CN109887057B true CN109887057B (en) 2023-03-24

Family

ID=66930235

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910156262.XA Active CN109887057B (en) 2019-01-30 2019-03-01 Method and device for generating high-precision map

Country Status (1)

Country Link
CN (1) CN109887057B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132888B (en) * 2019-06-25 2024-04-26 黑芝麻智能科技(重庆)有限公司 Monocular camera positioning in large-scale indoor sparse laser radar point clouds
CN112445210B (en) * 2019-08-15 2023-10-27 纳恩博(北京)科技有限公司 Method and device for determining motion trail, storage medium and electronic device
CN110645998B (en) * 2019-09-10 2023-03-24 上海交通大学 Dynamic object-free map segmentation establishing method based on laser point cloud
CN113378867B (en) * 2020-02-25 2023-08-22 北京轻舟智航智能技术有限公司 Asynchronous data fusion method and device, storage medium and electronic equipment
CN111461980B (en) * 2020-03-30 2023-08-29 北京百度网讯科技有限公司 Performance estimation method and device of point cloud stitching algorithm
CN111912417B (en) * 2020-07-10 2022-08-02 上海商汤临港智能科技有限公司 Map construction method, map construction device, map construction equipment and storage medium
CN111561923B (en) * 2020-05-19 2022-04-15 北京数字绿土科技股份有限公司 SLAM (simultaneous localization and mapping) mapping method and system based on multi-sensor fusion
CN111538032B (en) * 2020-05-19 2021-04-13 北京数字绿土科技有限公司 Time synchronization method and device based on independent drawing tracks of camera and laser radar
CN111709990B (en) * 2020-05-22 2023-06-20 贵州民族大学 Camera repositioning method and system
CN112731450B (en) * 2020-08-19 2023-06-30 深圳市速腾聚创科技有限公司 Point cloud motion compensation method, device and system
CN114120795B (en) * 2020-09-01 2023-03-10 华为技术有限公司 Map drawing method and device
CN112388635B (en) * 2020-10-30 2022-03-25 中国科学院自动化研究所 Method, system and device for fusing sensing and space positioning of multiple sensors of robot
CN112710318B (en) * 2020-12-14 2024-05-17 深圳市商汤科技有限公司 Map generation method, path planning method, electronic device, and storage medium
CN112985416B (en) * 2021-04-19 2021-07-30 湖南大学 Robust positioning and mapping method and system based on laser and visual information fusion
CN113298941B (en) * 2021-05-27 2024-01-30 广州市工贸技师学院(广州市工贸高级技工学校) Map construction method, device and system based on laser radar aided vision
CN113495281B (en) * 2021-06-21 2023-08-22 杭州飞步科技有限公司 Real-time positioning method and device for movable platform
CN113724382A (en) * 2021-07-23 2021-11-30 北京搜狗科技发展有限公司 Map generation method and device and electronic equipment
CN113777635B (en) * 2021-08-06 2023-11-03 香港理工大学深圳研究院 Global navigation satellite data calibration method, device, terminal and storage medium
CN113865580B (en) * 2021-09-15 2024-03-22 北京易航远智科技有限公司 Method and device for constructing map, electronic equipment and computer readable storage medium
CN114413898B (en) * 2022-03-29 2022-07-29 深圳市边界智控科技有限公司 Multi-sensor data fusion method and device, computer equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9715016B2 (en) * 2015-03-11 2017-07-25 The Boeing Company Real time multi dimensional image fusing
EP3252714A1 (en) * 2016-06-03 2017-12-06 Univrses AB Camera selection in positional tracking
WO2019127445A1 (en) * 2017-12-29 2019-07-04 深圳前海达闼云端智能科技有限公司 Three-dimensional mapping method, apparatus and system, cloud platform, electronic device, and computer program product
CN109084732B (en) * 2018-06-29 2021-01-12 北京旷视科技有限公司 Positioning and navigation method, device and processing equipment

Also Published As

Publication number Publication date
CN109887057A (en) 2019-06-14

Similar Documents

Publication Publication Date Title
CN109887057B (en) Method and device for generating high-precision map
CN111156998B (en) Mobile robot positioning method based on RGB-D camera and IMU information fusion
US10565732B2 (en) Sensor fusion using inertial and image sensors
US10788830B2 (en) Systems and methods for determining a vehicle position
CN110068335B (en) Unmanned aerial vehicle cluster real-time positioning method and system under GPS rejection environment
CN109885080B (en) Autonomous control system and autonomous control method
EP3158293B1 (en) Sensor fusion using inertial and image sensors
JP2022019642A (en) Positioning method and device based upon multi-sensor combination
CN112005079B (en) System and method for updating high-definition map
EP3734394A1 (en) Sensor fusion using inertial and image sensors
WO2018182524A1 (en) Real time robust localization via visual inertial odometry
US20180075614A1 (en) Method of Depth Estimation Using a Camera and Inertial Sensor
CN111308415B (en) Online pose estimation method and equipment based on time delay
CN113763548B (en) Vision-laser radar coupling-based lean texture tunnel modeling method and system
CN110887486B (en) Unmanned aerial vehicle visual navigation positioning method based on laser line assistance
CN111623773B (en) Target positioning method and device based on fisheye vision and inertial measurement
CN109978954A (en) The method and apparatus of radar and camera combined calibrating based on cabinet
Pan et al. Tightly-coupled multi-sensor fusion for localization with LiDAR feature maps
Klavins et al. Unmanned aerial vehicle movement trajectory detection in open environment
Stowers et al. Optical flow for heading estimation of a quadrotor helicopter
WO2022037370A1 (en) Motion estimation method and apparatus
CN113034538B (en) Pose tracking method and device of visual inertial navigation equipment and visual inertial navigation equipment
Panahandeh et al. IMU-camera data fusion: Horizontal plane observation with explicit outlier rejection
WO2022033139A1 (en) Ego-motion estimation method and related apparatus
CN116380057B (en) Unmanned aerial vehicle autonomous landing positioning method under GNSS refusing environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant