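WO2022156175A1 - Detection method, system, and device for fusing image and point cloud information, and storage medium - Google Patents
Detection method, system, and device for fusing image and point cloud information, and storage medium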
- Publication number
- WO2022156175A1 (application number PCT/CN2021/108585)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- information
- point cloud
- pixel
- point
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S17/00—Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
- G01S17/86—Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S17/00—Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
- G01S17/88—Lidar systems specially adapted for specific applications
- G01S17/89—Lidar systems specially adapted for specific applications for mapping or imaging
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S17/00—Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
- G01S17/88—Lidar systems specially adapted for specific applications
- G01S17/93—Lidar systems specially adapted for specific applications for anti-collision purposes
- G01S17/931—Lidar systems specially adapted for specific applications for anti-collision purposes of land vehicles
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/48—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
- G01S7/4802—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/48—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
- G01S7/4808—Evaluating distance, position or velocity data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30204—Marker
- G06T2207/30208—Marker matrix
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Definitions
- The invention belongs to the technical field of 3D target detection, and in particular relates to an algorithm for 3D target detection that fuses a lidar sensor and an image sensor.
- 3D target detection algorithms in the field of unmanned driving fall mainly into three categories.
- The first type estimates image depth based on the principle of binocular stereo vision and then converts the 2D image detection results into 3D space; the second type uses methods such as machine learning to perform 3D target detection directly on point clouds; the third type fuses camera images with point cloud information from lidar sensors and then performs 3D target detection through convolutional neural networks and other mutual verification strategies.
- Owing to the limitations of binocular stereo vision, the first type of method has much lower depth measurement accuracy than lidar sensors; especially when the object is far from the camera, its detection accuracy and reliability are seriously reduced.
- Although the second type of method has higher distance measurement accuracy than the first, the point cloud data obtained by existing lidar sensors is very sparse and carries limited information, lacking auxiliary information such as the color information available in images.
- The third type of method should in principle combine the advantages of both sensors, but existing methods for fusing lidar and image data do not make full use of the characteristics of the two sensors, so their detection accuracy is still slightly lower than that of the pure lidar method, and their detection speed is relatively slow, making real-time performance difficult to achieve.
- The purpose of the present invention is to provide a detection method, system, device and storage medium for fusing image and point cloud information that overcome the difficulties of the prior art and make fuller use of the point cloud information of the lidar and the image information of the camera, achieving higher-precision real-time 3D target detection and significantly improving the detection accuracy of small objects.
- The present invention achieves both speed and accuracy in 3D target recognition, which helps provide self-driving vehicles with safer and more reliable environmental perception information.
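- An embodiment of the present invention provides a detection method for fusing image and point cloud information, comprising the following steps:
- S110 Synchronously acquire point cloud information and image information using a lidar sensor and an image sensor;
- S120 Input the image information into the trained first convolutional neural network to extract multiple feature information of each pixel in the image information, the multiple feature information including at least the color information and object identification information of each pixel;
- S130 Project the laser point cloud of the lidar sensor onto the image, match each laser point in the point cloud information to the corresponding pixel point, and then add the multiple feature information of the pixel point to the corresponding point cloud information;
- S140 Input the point cloud information with multiple feature information into the trained second convolutional neural network to output the category of each 3D object.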
- Before step S110, the method further includes jointly calibrating the lidar sensor and the image sensor to obtain the transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system.
- In step S130, the position of each laser point of the lidar sensor is converted into the visual image coordinate system through the transformation matrix, and the pixel point at the corresponding position is obtained.
- In step S130, the pixel points of the image are traversed, and the pixel point with the smallest distance from the position where the laser point is projected into the visual image coordinate system is taken as the pixel point matching that laser point.
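The nearest-pixel rule described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes pixel centres sit at integer coordinates with the origin in the upper-left corner, in which case rounding the projected position gives the same result as traversing every pixel. The function names `nearest_pixel` and `nearest_pixel_brute_force` are hypothetical, as is the choice to return `None` for projections that fall outside the image.

```python
import math
from typing import Optional, Tuple


def nearest_pixel(vx: float, vy: float, width: int, height: int) -> Optional[Tuple[int, int]]:
    """Return the (column, row) of the pixel closest to the projected position.

    Assumes pixel centres at integer coordinates, origin in the upper-left corner.
    Returns None when the rounded position falls outside the image.
    """
    col = math.floor(vx + 0.5)  # round half up (avoids Python's banker's rounding)
    row = math.floor(vy + 0.5)
    if 0 <= col < width and 0 <= row < height:
        return col, row
    return None


def nearest_pixel_brute_force(vx: float, vy: float, width: int, height: int) -> Tuple[int, int]:
    """Reference version that literally traverses every pixel of the image.

    Unlike nearest_pixel, it always returns some in-image pixel.
    """
    best, best_d2 = (0, 0), float("inf")
    for row in range(height):
        for col in range(width):
            d2 = (col - vx) ** 2 + (row - vy) ** 2  # squared Euclidean distance
            if d2 < best_d2:
                best, best_d2 = (col, row), d2
    return best
```

For projections inside the image the two functions agree; the rounding version simply avoids the full traversal for every laser point.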
- Step S120 includes the following: the multiple feature information of each pixel includes the RGB information of the pixel and the object identification information D of the image area where the pixel is located.
- Before step S110, the method further includes: training the second convolutional neural network with a large amount of point cloud information carrying multiple feature information, so that the second convolutional neural network outputs the category of the 3D object.
- The second convolutional neural network also outputs the 3D contour of each of the 3D objects and the length, width and height of the 3D contour.
- After step S140, the method further includes the following step:
- One of the minimum distance among the points corresponding to the 3D target, the average distance of all the corresponding points, or the distance to the center point of all the corresponding points is taken as the distance to the 3D target.
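The three candidate distances can be illustrated with a short sketch. It assumes that distances are measured from the lidar origin and that the "center point" is the centroid of the target's points; neither detail is stated in the text, and the function name `target_distances` is hypothetical.

```python
import math
from statistics import mean


def target_distances(points):
    """Compute three candidate distances from the lidar origin to one 3D target.

    points: non-empty list of (x, y, z) lidar points assigned to the target.
    """
    ranges = [math.sqrt(x * x + y * y + z * z) for x, y, z in points]
    n = len(points)
    # Assumption: the "center point" is the centroid of the target's points.
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return {
        "min": min(ranges),      # closest point of the target
        "mean": mean(ranges),    # average range over all points
        "center": math.sqrt(sum(c * c for c in centroid)),
    }
```

The minimum distance is the most conservative choice for collision avoidance, while the mean and centroid distances are less sensitive to a single outlier point.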
- After step S140, the method further includes the following step:
- The 3D contour of the 3D target is established from the point cloud corresponding to the 3D target, and the dimensions of this 3D contour are taken as the dimensions of the 3D target.
- The origin of the lidar sensor is used as the origin of the radar coordinate system, the forward direction of the vehicle as the X_L axis, the upward direction perpendicular to the vehicle body as the Z_L axis, and the left direction relative to the vehicle's forward direction as the Y_L axis; the coordinates in the radar coordinate system are (X_L, Y_L, Z_L);
- The optical center of the lens of the vision sensor is taken as the origin O_C of the camera coordinate system, the optical axis direction as the Z_C axis, the vertically downward direction as the Y_C axis, and the right direction relative to the vehicle's forward direction as the X_C axis; the coordinates in the camera coordinate system are (X_C, Y_C, Z_C);
- The coordinate origin of the image coordinate system (v_x, v_y) is in the upper left corner of the image.
- T is the 3-row, 4-column transformation matrix from the lidar sensor to the camera coordinate system obtained by joint calibration.
- The pixel coordinates after the point cloud is projected onto the camera imaging plane can be obtained by the following formula:
- f_x and f_y are the focal lengths of the lens along the X and Y axes;
- u0 and v0 are the image coordinates of the optical center.
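The projection formula itself is not reproduced in this text. Under the standard pinhole camera model, which is an assumption here (the patent's exact notation may differ), a projection consistent with the definitions of T, f_x, f_y, u0 and v0 given above would be:

```latex
\begin{bmatrix} X_C \\ Y_C \\ Z_C \end{bmatrix}
= T \begin{bmatrix} X_L \\ Y_L \\ Z_L \\ 1 \end{bmatrix},
\qquad
Z_C \begin{bmatrix} v_x \\ v_y \\ 1 \end{bmatrix}
= \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} X_C \\ Y_C \\ Z_C \end{bmatrix}
```

Written out per component, this gives v_x = f_x · X_C / Z_C + u0 and v_y = f_y · Y_C / Z_C + v0, valid only for points in front of the camera (Z_C > 0).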
- Embodiments of the present invention also provide a detection system for fusing image and point cloud information, which is used to implement the above-mentioned detection method for fusing image and point cloud information.
- The detection system for fusing image and point cloud information includes:
- a synchronous acquisition module, which uses a lidar sensor and an image sensor to obtain point cloud information and image information synchronously;
- a first network module, which inputs the image information into the trained first convolutional neural network to extract multiple feature information of each pixel in the image information, the multiple feature information including at least the color information and the object identification information of each pixel;
- a point cloud projection module, which projects the laser point cloud of the lidar sensor onto the image, matches each laser point in the point cloud information to the corresponding pixel point, and then adds the multiple feature information of the pixel point to the corresponding point cloud information;
- a second network module, which inputs the point cloud information with multiple feature information into the trained second convolutional neural network to output the category of each 3D object.
- Embodiments of the present invention also provide a detection device for fusing image and point cloud information, including:
- a processor configured to execute the steps of the above-mentioned detection method for fusing image and point cloud information by executing executable instructions.
- Embodiments of the present invention further provide a computer-readable storage medium for storing a program; when the program is executed, the steps of the above-mentioned detection method for fusing image and point cloud information are implemented.
- The detection method, system, device and storage medium for fusing image and point cloud information of the present invention make full use of the point cloud information of the lidar and the image information of the camera, and overcome the inability of the prior art to combine recognition speed with recognition accuracy. They achieve higher-precision real-time 3D target detection and significantly improve the detection accuracy of small objects.
- The present invention achieves both speed and accuracy in 3D target recognition, which helps provide unmanned vehicles with safer and more reliable environmental perception information.
- FIG. 1 is a flowchart of a detection method for fusing image and point cloud information according to an embodiment of the present invention.
- FIGS. 2 to 5 are schematic diagrams of the implementation process of the detection method for fusing image and point cloud information according to an embodiment of the present invention.
- FIG. 6 is a schematic structural diagram of a detection system for fusing image and point cloud information according to an embodiment of the present invention.
- FIG. 7 is a schematic structural diagram of a detection device for fusing image and point cloud information according to an embodiment of the present invention; and
- FIG. 8 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
- As shown in FIG. 1, the detection method for fusing image and point cloud information of the present invention includes the following steps:
- S110 Synchronously acquire point cloud information and image information using a lidar sensor and an image sensor.
- S120 Input the image information into the trained first convolutional neural network to extract multiple feature information of each pixel in the image information, where the multiple feature information includes at least the color information and object identification information of each pixel. Since the machine vision model performs convolution based only on the image information (RGB), it must segment the image and identify each region itself, so the object identification information D obtained here may be relatively inaccurate.
- S130 Project the laser point cloud of the lidar sensor onto the image, match each laser point in the point cloud information to a corresponding pixel point, and then add the multiple feature information of the pixel point to the corresponding point cloud information.
- S140 Input the point cloud information with multiple feature information into the trained second convolutional neural network to output the category of each 3D object.
- In this way, the image information (RGB), the point cloud information and the preliminary object identification information D are all fully used, so that the 3D target can be recognized more accurately.
- Before step S110, the method further includes jointly calibrating the lidar sensor and the image sensor to obtain the transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system.
- In step S130, the position of each laser point of the lidar sensor is converted into the visual image coordinate system through the transformation matrix, and the pixel point at the corresponding position is obtained.
- In step S130, the pixel points of the image are traversed, and the pixel point with the smallest distance from the position where the laser point is projected into the visual image coordinate system is taken as the pixel point matching that laser point, so as to improve the matching accuracy.
- Step S120 includes the following: the multiple feature information of each pixel includes the RGB information of the pixel and the object identification information D of the image area where the pixel is located, so that the image is divided into regions and preliminary object identification information D is obtained.
- Before step S110, the method further includes: training the second convolutional neural network with a large amount of point cloud information carrying multiple feature information, so that the second convolutional neural network outputs the category of the 3D object.
- The second convolutional neural network also outputs the 3D contour of each 3D object and the length, width and height of the 3D contour.
- After step S140, the method further includes the following step:
- One of the minimum distance among the points corresponding to the 3D target, the average distance of all the corresponding points, or the distance to the center point of all the corresponding points is taken as the distance to the 3D target.
- After step S140, the method further includes the following step:
- The 3D contour of the 3D target is established from the point cloud corresponding to the 3D target, and the dimensions of this 3D contour are taken as the dimensions of the 3D target.
- The origin of the lidar sensor is used as the origin of the radar coordinate system, the forward direction of the vehicle as the X_L axis, the upward direction perpendicular to the vehicle body as the Z_L axis, and the left direction relative to the vehicle's forward direction as the Y_L axis.
- The optical center of the vision sensor lens is taken as the origin O_C of the camera coordinate system, the optical axis direction as the Z_C axis, the vertically downward direction as the Y_C axis, and the right direction relative to the vehicle's forward direction as the X_C axis.
- The coordinate origin of the image coordinate system (v_x, v_y) is in the upper left corner of the image.
- T is the 3-row, 4-column transformation matrix from the lidar sensor to the camera coordinate system obtained by joint calibration; it includes rotation and translation, and applying it yields the 3D coordinates of the lidar point cloud in the camera coordinate system.
- The pixel coordinates after the point cloud is projected onto the camera imaging plane can be obtained by the following formula:
- f_x and f_y are the focal lengths of the lens along the X and Y axes;
- u0 and v0 are the image coordinates of the optical center.
- The lidar point cloud can thus be projected onto the camera image, so that the multiple features previously extracted from the image can be matched and fused to the lidar point cloud, finally yielding a lidar point cloud fused with multiple image features.
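The projection of a lidar point onto the image can be sketched in a few lines. This is an illustration under the standard pinhole model, which is an assumption because the formula is not reproduced in this text. The function name `project_lidar_point` and the example matrix `T_EXAMPLE` are hypothetical; the example matrix only encodes the axis conventions described above (X_L forward, Y_L left, Z_L up; X_C right, Y_C down, Z_C along the optical axis) with zero translation, whereas a real T comes from joint calibration.

```python
def project_lidar_point(point_l, T, fx, fy, u0, v0):
    """Project one lidar point (X_L, Y_L, Z_L) to image coordinates (v_x, v_y).

    T is a 3x4 row-major matrix [R | t] from the lidar frame to the camera frame.
    Returns None for points at or behind the camera plane (Z_C <= 0).
    """
    xl, yl, zl = point_l
    homogeneous = (xl, yl, zl, 1.0)
    # Camera-frame coordinates: one dot product per row of T.
    xc, yc, zc = (sum(T[r][c] * homogeneous[c] for c in range(4)) for r in range(3))
    if zc <= 0:
        return None
    # Pinhole projection with principal point (u0, v0).
    return fx * xc / zc + u0, fy * yc / zc + v0


# Hypothetical extrinsics matching only the axis conventions in the text:
# X_C = -Y_L (right), Y_C = -Z_L (down), Z_C = X_L (forward), no translation.
T_EXAMPLE = [
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
```

A point 10 m ahead, 1 m to the left and 0.5 m up lands to the left of and above the principal point, as expected for an image whose origin is in the upper left corner.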
- The detection method for fusing image and point cloud information of the present invention makes fuller use of the point cloud information of the lidar and the image information of the camera, overcomes the defect that the prior art cannot achieve both recognition speed and recognition accuracy, and achieves higher-precision real-time 3D target detection, significantly improving the detection accuracy of small objects.
- The present invention achieves both speed and accuracy in 3D target recognition, which helps provide unmanned vehicles with safer and more reliable environmental perception information.
- The implementation process of the detection method for fusing image and point cloud information in this embodiment is as follows. First, the lidar sensor 22 and the image sensor 21 installed on the unmanned truck 1 may be parallel to each other or set at a certain angle to each other; the two sensors are jointly calibrated to obtain the transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system.
- The coordinates in the radar coordinate system are (X_L, Y_L, Z_L).
- The optical center of the lens of the vision sensor is taken as the origin O_C of the camera coordinate system, the optical axis direction as the Z_C axis, the vertically downward direction as the Y_C axis, and the right direction relative to the vehicle's forward direction as the X_C axis; the coordinates in the camera coordinate system are (X_C, Y_C, Z_C).
- The coordinate origin of the image coordinate system (v_x, v_y) is in the upper left corner of the image.
- T is the 3-row, 4-column transformation matrix from the lidar sensor to the camera coordinate system obtained by joint calibration.
- The pixel coordinates after the point cloud is projected onto the camera imaging plane can be obtained by the following formula:
- f_x and f_y are the focal lengths of the lens along the X and Y axes;
- u0 and v0 are the image coordinates of the optical center.
- The transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system can also be obtained by other calibration methods; the invention is not limited in this respect.
- The difference between the first convolutional neural network and the second convolutional neural network is that the first has the functions of region differentiation and preliminary image recognition, whereas the second does not perform region differentiation and only has the function of accurately identifying regions.
- Point cloud information and image information are obtained synchronously using the calibrated lidar sensor 22 and image sensor 21 on the unmanned truck 1.
- The road surface 43 in front of the unmanned truck 1 includes a first obstacle 41 and a second obstacle 42.
- The image information is input into the trained first convolutional neural network to extract multiple feature information of each pixel in the image information, and the multiple feature information includes at least the color information and object identification information of each pixel.
- Image segmentation is performed by inputting the image information into a trained machine vision model.
- The object identification information D corresponding to each segmented image area in the image is obtained through the machine vision model.
- The multiple feature information of each pixel includes the RGB information of the pixel and the object identification information D of the image area where the pixel is located, so the multiple feature information of each pixel is (R, G, B, D).
- The image sensor 21 obtains a schematic image 21A of the image information, which includes the area of the first obstacle 41, the area of the second obstacle 42, and the area of the road surface 43.
- The object identification information D of each pixel in the area of the first obstacle 41 is pedestrian, so the multiple feature information of each pixel in this area is (R, G, B, pedestrian);
- the object identification information D of each pixel in the area of the second obstacle 42 is wall, so the multiple feature information of each pixel in this area is (R, G, B, wall);
- the object identification information D of each pixel in the area of the road surface 43 is ground, so the multiple feature information of each pixel in this area is (R, G, B, ground).
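Building the per-pixel (R, G, B, D) features from an image and a segmentation result can be sketched as below. The representation is an assumption made for illustration: the image is a nested list of (R, G, B) tuples, the segmentation output is a label mask of the same shape, and the function name `build_pixel_features` is hypothetical. The first convolutional neural network that would produce the mask is not modelled here.

```python
def build_pixel_features(rgb_image, label_mask):
    """Combine an RGB image and a per-pixel label mask into (R, G, B, D) tuples.

    rgb_image: rows of (R, G, B) tuples.
    label_mask: rows of object identification labels D, same shape as the image.
    """
    if len(rgb_image) != len(label_mask):
        raise ValueError("image and mask must have the same height")
    features = []
    for rgb_row, label_row in zip(rgb_image, label_mask):
        if len(rgb_row) != len(label_row):
            raise ValueError("image and mask must have the same width")
        # Every pixel inherits the label of the segmented region it lies in.
        features.append([(r, g, b, d) for (r, g, b), d in zip(rgb_row, label_row)])
    return features
```

In practice the label D would usually be encoded as an integer class index or a one-hot vector before being fed to a network; strings are used here only to mirror the pedestrian, wall and ground example.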
- The lidar sensor 22 obtains a schematic diagram 22A of the point cloud information.
- Each circular pattern in FIG. 4 is a laser point.
- The schematic point cloud information 22A likewise includes the first obstacle, the second obstacle, and the road surface.
- the transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system is used to project the laser point cloud of the lidar sensor onto the image, and each laser point in the point cloud information is matched to the corresponding pixel point. , and then the multiple feature information of the pixel is added to the corresponding point cloud information.
- the position of the laser point cloud of the lidar sensor is converted into the visual image coordinate system through the transformation matrix, and the pixel point of the corresponding position is obtained. For example: traverse the pixel points of the image, and take the pixel point with the smallest distance from the position where the laser point cloud is projected to the visual image coordinate system as the pixel point matching the laser point cloud.
- the multiple feature information of all laser points in the area 41A of the first obstacle in the point cloud is (R, G, B, pedestrian); the multiple feature information of all laser points in the area 42A of the second obstacle in the point cloud is (R, G, B, wall); the multiple feature information of all laser points in the road surface area 43A in the point cloud is (R, G, B, ground).
- the point cloud information with multiple feature information is input into the trained second convolutional neural network to output the category of each 3D target. Because the second convolutional neural network has been trained on a large amount of point cloud information with multiple feature information and, unlike the first convolutional neural network, does not have to perform several complex functions, it achieves higher label classification accuracy. In addition, the second convolutional neural network uses the object identification information D in the multiple feature information in a weighted manner; in some cases it will arrive at a 3D target category that differs from the object identification information D, which further improves the accuracy of 3D target recognition.
- the 3D contour of the 3D target is established according to the point cloud corresponding to the 3D target, and the size of the 3D contour is obtained as the size of the contour of the 3D target.
- for example, using existing techniques such as trigonometric calculation, the pedestrian's height of 1.8 meters is obtained from the spatial extent of the point cloud corresponding to the pedestrian, and the wall's width of 4 meters and height of 3 meters are obtained from the spatial extent of the point cloud corresponding to the wall, but the invention is not limited thereto.
- the second convolutional neural network can also directly output the 3D contour of each 3D object and the size of the 3D contour, but not limited thereto.
- FIG. 6 is a schematic structural diagram of a detection system for fusing image and point cloud information according to an embodiment of the present invention. As shown in FIG. 6, an embodiment of the present invention also provides a detection system 5 for fusing image and point cloud information, which is used to implement the above-mentioned detection method for fusing image and point cloud information and includes:
- the synchronous acquisition module 51 uses the lidar sensor and the image sensor to obtain point cloud information and image information synchronously;
- the first network module 52 inputs the image information into the trained first convolutional neural network to extract the multiple feature information of each pixel in the image information, the multiple feature information including at least the color information and the object identification information of each pixel;
- the point cloud projection module 53 projects the laser point cloud of the lidar sensor onto the image, matches each laser point in the point cloud information to the corresponding pixel, and then adds the multiple feature information of that pixel to the corresponding point cloud information;
- the second network module 54 inputs the point cloud information with multiple feature information into the trained second convolutional neural network to output the category of each 3D object.
- the detection system for fusing image and point cloud information of the present invention makes fuller use of the lidar point cloud information and the camera image information, overcomes the prior-art defect that recognition speed and recognition accuracy cannot both be achieved, achieves higher-precision real-time 3D target detection, and noticeably improves detection accuracy for small objects.
- the present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.
- an embodiment of the present invention also provides a detection device for fusing image and point cloud information, including a processor and a memory in which executable instructions for the processor are stored.
- the processor is configured to perform the steps of the detection method for fusing image and point cloud information by executing the executable instructions.
- as above, the detection device for fusing image and point cloud information of the present invention makes fuller use of the lidar point cloud information and the camera image information, overcomes the prior-art defect that recognition speed and recognition accuracy cannot both be achieved, achieves higher-precision real-time 3D target detection, and noticeably improves detection accuracy for small objects.
- the present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.
- those skilled in the art will understand that aspects of the present invention may be implemented as a system, method, or program product. Therefore, various aspects of the present invention can be embodied in the following forms: an entirely hardware implementation, an entirely software implementation (including firmware, microcode, etc.), or an implementation combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module", or "platform".
- FIG. 7 is a schematic structural diagram of a detection device for fusing image and point cloud information according to an embodiment of the present invention.
- An electronic device 600 according to this embodiment of the present invention is described below with reference to FIG. 7 .
- the electronic device 600 shown in FIG. 7 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present invention.
- electronic device 600 takes the form of a general-purpose computing device.
- Components of the electronic device 600 may include, but are not limited to, at least one processing unit 610, at least one storage unit 620, a bus 630 connecting different platform components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.
- the storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the above section of this specification on the detection method for fusing image and point cloud information.
- the processing unit 610 may perform the steps shown in FIG. 1 .
- the storage unit 620 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 6201 and/or a cache storage unit 6202 , and may further include a read only storage unit (ROM) 6203 .
- the storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, including, but not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
- the bus 630 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
- the electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. The electronic device 600 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 through the bus 630. It should be appreciated that, although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms.
- embodiments of the present invention further provide a computer-readable storage medium for storing a program which, when executed, implements the steps of the detection method for fusing image and point cloud information.
- in some possible implementations, various aspects of the present invention can also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to various exemplary embodiments of the present invention described in the above section of this specification on the detection method for fusing image and point cloud information.
- as shown above, when the program on the computer-readable storage medium of this embodiment is executed, the lidar point cloud information and the camera image information are used more fully, the prior-art defect that recognition speed and recognition accuracy cannot both be achieved is overcome, higher-precision real-time 3D target detection is achieved, and detection accuracy for small objects is noticeably improved.
- the present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.
- FIG. 8 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
- referring to FIG. 8, a program product 800 for implementing the above method according to an embodiment of the present invention is described; it may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer.
- the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
- the program product may employ any combination of one or more readable media.
- the readable medium may be a readable signal medium or a readable storage medium.
- the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
- a readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- a readable signal medium can also be any readable medium other than a readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
- Program code embodied on a readable storage medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
- Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
- the program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server.
- where a remote computing device is involved, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
- in summary, the detection method, system, device, and storage medium for fusing image and point cloud information of the present invention make fuller use of the lidar point cloud information and the camera image information, overcome the prior-art defect that recognition speed and recognition accuracy cannot both be achieved, achieve higher-precision real-time 3D target detection, and noticeably improve detection accuracy for small objects.
- the present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.
Abstract
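The present invention provides a detection method, system, device, and storage medium that fuse image and point cloud information. The method comprises the following steps: synchronously acquiring point cloud information and image information using a lidar sensor and an image sensor; inputting the image information into a trained first convolutional neural network to extract multiple feature information for each pixel in the image information, the multiple feature information including at least the color information and object identification information of each pixel; projecting the laser point cloud of the lidar sensor onto the image, matching each laser point in the point cloud information to the corresponding pixel, and then adding the multiple feature information of that pixel to the corresponding point cloud information; and inputting the point cloud information carrying the multiple feature information into a trained second convolutional neural network to output the category of each 3D target. The present invention achieves higher-precision real-time 3D target detection and noticeably improves detection accuracy for small objects.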
Description
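The present invention belongs to the technical field of 3D target detection, and in particular relates to an algorithm that performs 3D target detection by fusing a lidar sensor and an image sensor.
Current 3D target detection algorithms in the field of unmanned driving fall into three main categories. The first estimates image depth based on the principle of binocular stereo vision and then converts the 2D image detection results into 3D space. The second uses only the 3D point cloud of a lidar sensor and performs 3D target detection directly on the point cloud by means of convolutional neural networks or other machine learning methods. The third fuses camera images with the point cloud information of a lidar sensor and then performs 3D target detection by means of convolutional neural networks and other mutual verification strategies.
However, each of these three existing detection approaches has certain defects and shortcomings:
The first category is limited by the principle of binocular stereo vision, so its depth measurement accuracy is much lower than that of a lidar sensor; in particular, its detection accuracy and reliability degrade severely when the object is far from the camera.
The second category measures distance more accurately than the first, but owing to the operating principle of existing lidar sensors, the point cloud data obtained are very sparse and carry relatively limited information, lacking auxiliary information such as the color information available in images.
The third category should in principle combine the advantages of both sensors, but existing methods for fusing lidar sensor and image data do not make full use of the characteristics of the two sensors, so their detection accuracy is still slightly lower than that of lidar-only methods, and their detection speed is too slow for real-time operation.
It should be noted that the information disclosed in the above background section is only intended to aid understanding of the background of the present invention, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.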
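Summary of the Invention
In view of the problems in the prior art, the object of the present invention is to provide a detection method, system, device, and storage medium that fuse image and point cloud information, which overcome the difficulties of the prior art, make fuller use of the lidar point cloud information and the camera image information, achieve higher-precision real-time 3D target detection, and noticeably improve detection accuracy for small objects. The present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.
An embodiment of the present invention provides a detection method that fuses image and point cloud information, comprising the following steps:
S110: synchronously acquiring point cloud information and image information using a lidar sensor and an image sensor;
S120: inputting the image information into a trained first convolutional neural network to extract multiple feature information for each pixel in the image information, the multiple feature information including at least the color information and object identification information of each pixel;
S130: projecting the laser point cloud of the lidar sensor onto the image, matching each laser point in the point cloud information to the corresponding pixel, and then adding the multiple feature information of that pixel to the corresponding point cloud information;
S140: inputting the point cloud information carrying the multiple feature information into a trained second convolutional neural network to output the category of each 3D target.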
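In some embodiments, the following step is included before step S110:
S100: jointly calibrating the lidar sensor and the image sensor to obtain the transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system;
in step S130, the positions of the laser points of the lidar sensor are transformed into the visual image coordinate system by the transformation matrix to obtain the pixels at the corresponding positions.
In some embodiments, in step S130, the pixels of the image are traversed, and the pixel with the smallest distance to the position at which a laser point is projected into the visual image coordinate system is taken as the pixel matching that laser point.
In some embodiments, step S120 includes the following steps:
S121: inputting the image information into a trained machine vision model to perform image segmentation on the image;
S122: obtaining, through the machine vision model, the object identification information D corresponding to each segmented image region in the image;
S123: the multiple feature information of each pixel includes the RGB information of that pixel and the object identification information D of the image region in which the pixel is located.
In some embodiments, before step S110 the method further includes: training the second convolutional neural network with a large amount of point cloud information carrying multiple feature information, the second convolutional neural network outputting the category of a 3D target.
In some embodiments, the second convolutional neural network also outputs the 3D contour of each 3D target and the length, width, and height of the 3D contour.
In some embodiments, the following step is further included after step S140:
taking one of the following as the distance to the 3D target: the minimum distance among the points of the point cloud corresponding to the 3D target, the average distance of all corresponding points, or the distance to the center point of all corresponding points.
In some embodiments, the following step is further included after step S140:
establishing the 3D contour of the 3D target from the point cloud corresponding to the 3D target, and taking the size of the 3D contour as the size of the contour of the 3D target.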
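In some embodiments, in step S100, the origin of the lidar sensor is taken as the origin of the lidar coordinate system, the forward direction of the vehicle as the $X_L$ axis of the lidar coordinate system, the upward direction perpendicular to the vehicle body as the $Z_L$ axis, and the direction to the left of the vehicle's forward direction as the $Y_L$ axis, so that coordinates in the lidar coordinate system are $(X_L, Y_L, Z_L)$;
the optical center of the lens of the vision sensor is taken as the origin $O_C$ of the camera coordinate system, the optical axis direction as the $Z_C$ axis, the vertically downward direction as the $Y_C$ axis, and the direction directly to the right of the vehicle's forward direction as the $X_C$ axis, so that coordinates in the camera coordinate system are $(X_C, Y_C, Z_C)$;
the origin of the image coordinate system $(v_x, v_y)$ is at the upper-left corner of the image; the lidar sensor and the vision sensor are jointly calibrated using a checkerboard calibration board to obtain the transformation matrix from the lidar sensor to the camera coordinate system:

$$\begin{bmatrix} X_C \\ Y_C \\ Z_C \end{bmatrix} = T \begin{bmatrix} X_L \\ Y_L \\ Z_L \\ 1 \end{bmatrix}$$

where $T$ is the 3-row, 4-column transformation matrix from the lidar sensor to the camera coordinate system obtained by joint calibration;
the pixel coordinates of a point of the point cloud after projection onto the camera imaging plane are given by:

$$v_x = f_x \frac{X_C}{Z_C} + u_0, \qquad v_y = f_y \frac{Y_C}{Z_C} + v_0$$

where $f_x$ and $f_y$ are the focal lengths of the lens along the X and Y axes and $u_0$, $v_0$ are the image coordinates of the optical center, so that the transformation from the lidar sensor to the image coordinate system can be expressed as:

$$Z_C \begin{bmatrix} v_x \\ v_y \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} T \begin{bmatrix} X_L \\ Y_L \\ Z_L \\ 1 \end{bmatrix}$$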
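The axis conventions above can be made concrete with a small sketch. It assumes, purely for illustration, that the camera's optical axis points along the vehicle's forward direction and that the two sensors share an origin, so that T reduces to an axis permutation with zero translation; in practice T comes from the joint calibration. The names `T_ALIGNED` and `lidar_to_camera` are hypothetical and do not appear in the source.

```python
# Hypothetical 3x4 extrinsic matrix for the idealized, co-located case.
# Lidar axes:  X forward, Y left, Z up.
# Camera axes: X right,   Y down, Z forward (optical axis).
T_ALIGNED = [
    [0.0, -1.0, 0.0, 0.0],  # X_C = -Y_L (right is the opposite of left)
    [0.0, 0.0, -1.0, 0.0],  # Y_C = -Z_L (down is the opposite of up)
    [1.0, 0.0, 0.0, 0.0],   # Z_C =  X_L (optical axis assumed to point forward)
]


def lidar_to_camera(point_l, T):
    """Apply a 3x4 matrix T to a lidar point (X_L, Y_L, Z_L)."""
    homogeneous = (point_l[0], point_l[1], point_l[2], 1.0)
    return tuple(sum(t * p for t, p in zip(row, homogeneous)) for row in T)
```

For instance, a point 10 m ahead, 2 m to the left, and 1 m above the lidar origin maps to a camera-frame point 2 m to the left of and 1 m above the optical axis, at a depth of 10 m.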
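An embodiment of the present invention also provides a detection system that fuses image and point cloud information, used to implement the above detection method that fuses image and point cloud information, the detection system comprising:
a synchronous acquisition module, which synchronously acquires point cloud information and image information using a lidar sensor and an image sensor;
a first network module, which inputs the image information into a trained first convolutional neural network to extract multiple feature information for each pixel in the image information, the multiple feature information including at least the color information and object identification information of each pixel;
a point cloud projection module, which projects the laser point cloud of the lidar sensor onto the image, matches each laser point in the point cloud information to the corresponding pixel, and then adds the multiple feature information of that pixel to the corresponding point cloud information;
a second network module, which inputs the point cloud information carrying the multiple feature information into a trained second convolutional neural network to output the category of each 3D target.
An embodiment of the present invention also provides a detection device that fuses image and point cloud information, comprising:
a processor;
a memory in which executable instructions of the processor are stored;
wherein the processor is configured to perform the steps of the above detection method that fuses image and point cloud information by executing the executable instructions.
An embodiment of the present invention also provides a computer-readable storage medium for storing a program which, when executed, implements the steps of the above detection method that fuses image and point cloud information.
The detection method, system, device, and storage medium of the present invention that fuse image and point cloud information make fuller use of the lidar point cloud information and the camera image information, overcome the prior-art defect that recognition speed and recognition accuracy cannot both be achieved, achieve higher-precision real-time 3D target detection, and noticeably improve detection accuracy for small objects. The present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.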
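Other features, objects, and advantages of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings.
FIG. 1 is a flowchart of a detection method that fuses image and point cloud information according to an embodiment of the present invention.
FIGS. 2 to 5 are schematic diagrams of the implementation process of a detection method that fuses image and point cloud information according to an embodiment of the present invention.
FIG. 6 is a schematic structural diagram of a detection system that fuses image and point cloud information according to an embodiment of the present invention.
FIG. 7 is a schematic structural diagram of a detection device that fuses image and point cloud information according to an embodiment of the present invention; and
FIG. 8 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.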
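Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the embodiments set forth here. Rather, these embodiments are provided so that the present invention will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar structures, so repeated descriptions of them are omitted.
FIG. 1 is a flowchart of a detection method that fuses image and point cloud information according to an embodiment of the present invention. As shown in FIG. 1, the detection method of the present invention that fuses image and point cloud information comprises the following steps:
S110: synchronously acquiring point cloud information and image information using a lidar sensor and an image sensor.
S120: inputting the image information into a trained first convolutional neural network to extract multiple feature information for each pixel in the image information, the multiple feature information including at least the color information and object identification information of each pixel. Because the machine vision model performs convolution on the image information (RGB) alone and must both segment the image and recognize its contents, the object identification information D obtained here is comparatively prone to inaccuracy.
S130: projecting the laser point cloud of the lidar sensor onto the image, matching each laser point in the point cloud information to the corresponding pixel, and then adding the multiple feature information of that pixel to the corresponding point cloud information.
S140: inputting the point cloud information carrying the multiple feature information into a trained second convolutional neural network to output the category of each 3D target. Because the second convolutional neural network makes full use of the image information (RGB), the point cloud information, and the previously obtained object identification information D, it can recognize 3D targets more accurately, which safeguards the accuracy of 3D target recognition.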
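One way to picture the input to step S140 is to flatten each fused laser point into a numeric vector. The source does not specify how the second convolutional neural network encodes the object identification information D, so the following is only a sketch under assumptions: D is one-hot encoded over a hypothetical label set `CLASS_NAMES`, colors are scaled to [0, 1], and the function name `encode_point` is invented for illustration.

```python
CLASS_NAMES = ("pedestrian", "wall", "ground")  # hypothetical label set


def encode_point(xyz, rgb, label, class_names=CLASS_NAMES):
    """Flatten one fused laser point into [x, y, z, r, g, b, one-hot(D)]."""
    one_hot = [1.0 if label == name else 0.0 for name in class_names]
    color = [channel / 255.0 for channel in rgb]  # scale 8-bit color to [0, 1]
    return list(xyz) + color + one_hot
```

With this encoding the network still receives D only as an input feature, which is consistent with the statement that its output category may differ from D.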
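In a preferred embodiment, the following step is included before step S110:
S100: jointly calibrating the lidar sensor and the image sensor to obtain the transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system.
In step S130, the positions of the laser points of the lidar sensor are transformed into the visual image coordinate system by the transformation matrix to obtain the pixels at the corresponding positions.
In a preferred embodiment, in step S130, the pixels of the image are traversed, and the pixel with the smallest distance to the position at which a laser point is projected into the visual image coordinate system is taken as the pixel matching that laser point, so as to improve the accuracy of the correspondence.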
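The nearest-pixel matching described above can be sketched in two ways: a literal traversal of all pixels, and a shortcut that rounds the projected position and clamps it to the image. Both functions are illustrative and assume that pixel centers sit at integer coordinates `(col, row)`; the function names are hypothetical. The two versions agree except when the projected position is exactly halfway between two pixels.

```python
import math


def nearest_pixel_bruteforce(vx, vy, width, height):
    """Traverse every pixel and keep the one closest to (vx, vy)."""
    best, best_d2 = None, math.inf
    for row in range(height):
        for col in range(width):
            d2 = (col - vx) ** 2 + (row - vy) ** 2  # squared distance suffices
            if d2 < best_d2:
                best, best_d2 = (col, row), d2
    return best


def nearest_pixel_fast(vx, vy, width, height):
    """Round to the nearest pixel, then clamp to the image bounds."""
    col = min(max(math.floor(vx + 0.5), 0), width - 1)
    row = min(max(math.floor(vy + 0.5), 0), height - 1)
    return (col, row)
```

The traversal costs one distance evaluation per pixel for every laser point, whereas the rounding version is constant-time, which matters when real-time operation is the goal.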
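In a preferred embodiment, step S120 includes the following steps:
S121: inputting the image information into a trained machine vision model to perform image segmentation on the image.
S122: obtaining, through the machine vision model, the object identification information D corresponding to each segmented image region in the image.
S123: the multiple feature information of each pixel includes the RGB information of that pixel and the object identification information D of the image region in which the pixel is located, so that the image can be divided into regions and preliminary object identification information D can be obtained.
In a preferred embodiment, before step S110 the method further includes: training the second convolutional neural network with a large amount of point cloud information carrying multiple feature information, the second convolutional neural network outputting the category of a 3D target.
In a preferred embodiment, the second convolutional neural network also outputs the 3D contour of each 3D target and the length, width, and height of the 3D contour.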
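In a preferred embodiment, the following step is further included after step S140:
taking one of the following as the distance to the 3D target: the minimum distance among the points of the point cloud corresponding to the 3D target, the average distance of all corresponding points, or the distance to the center point of all corresponding points.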
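The three distance definitions can be compared side by side in a short sketch. It assumes that distances are measured from the origin of the sensor coordinate system to points given as `(x, y, z)` tuples; the function name `target_distance` and the mode strings are hypothetical.

```python
import math


def target_distance(points, mode="min"):
    """Distance from the sensor origin to a target's laser points.

    mode "min":    range of the closest point
    mode "mean":   average range over all points
    mode "center": range of the centroid of all points
    """
    if mode == "center":
        n = len(points)
        cx, cy, cz = (sum(p[i] for p in points) / n for i in range(3))
        return math.sqrt(cx * cx + cy * cy + cz * cz)
    ranges = [math.sqrt(x * x + y * y + z * z) for x, y, z in points]
    if mode == "min":
        return min(ranges)
    if mode == "mean":
        return sum(ranges) / len(ranges)
    raise ValueError(f"unknown mode: {mode}")
```

The minimum is the most conservative choice for obstacle avoidance, while the mean and centroid variants are less sensitive to a single stray point.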
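In a preferred embodiment, the following step is further included after step S140:
establishing the 3D contour of the 3D target from the point cloud corresponding to the 3D target, and taking the size of the 3D contour as the size of the contour of the 3D target.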
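The source does not say how the 3D contour is constructed. As a minimal sketch, assuming the simplest possible contour (an axis-aligned bounding box in the lidar coordinate system), the size can be read off from the extent of the target's points along each axis. The function name `bounding_box_size` is hypothetical.

```python
def bounding_box_size(points):
    """Axis-aligned extents (along X, Y, Z) of a set of (x, y, z) points."""
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))
```

With X forward, Y left, and Z up, the third component is the target's height; an oriented box would be needed for targets that are not aligned with the vehicle axes.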
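In a preferred embodiment, in step S100, the origin of the lidar sensor is taken as the origin of the lidar coordinate system, the forward direction of the vehicle as the $X_L$ axis of the lidar coordinate system, the upward direction perpendicular to the vehicle body as the $Z_L$ axis, and the direction to the left of the vehicle's forward direction as the $Y_L$ axis.
The optical center of the lens of the vision sensor is taken as the origin $O_C$ of the camera coordinate system, the optical axis direction as the $Z_C$ axis, the vertically downward direction as the $Y_C$ axis, and the direction directly to the right of the vehicle's forward direction as the $X_C$ axis.
The origin of the image coordinate system $(v_x, v_y)$ is at the upper-left corner of the image. The lidar sensor and the vision sensor are jointly calibrated using a checkerboard calibration board to obtain the transformation matrix from the lidar sensor to the camera coordinate system:

$$\begin{bmatrix} X_C \\ Y_C \\ Z_C \end{bmatrix} = T \begin{bmatrix} X_L \\ Y_L \\ Z_L \\ 1 \end{bmatrix}$$

where $T$ is the 3-row, 4-column transformation matrix from the lidar sensor to the camera coordinate system obtained by joint calibration, which contains both rotation and translation. This yields the 3D coordinates of the lidar point cloud in the camera coordinate system. By the pinhole imaging principle of the camera, the pixel coordinates of a point of the point cloud after projection onto the camera imaging plane are given by:

$$v_x = f_x \frac{X_C}{Z_C} + u_0, \qquad v_y = f_y \frac{Y_C}{Z_C} + v_0$$

where $f_x$ and $f_y$ are the focal lengths and $u_0$, $v_0$ are the image coordinates of the optical center. Combining the formulas above, the transformation from the lidar sensor to the image coordinate system can be expressed as:

$$Z_C \begin{bmatrix} v_x \\ v_y \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} T \begin{bmatrix} X_L \\ Y_L \\ Z_L \\ 1 \end{bmatrix}$$

Using this formula, the lidar point cloud can be projected onto the camera image, so that the multiple image features extracted earlier can be matched and fused onto the lidar point cloud, finally yielding a lidar point cloud fused with multiple image features.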
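The projection of a single laser point can be sketched as follows, assuming the standard pinhole model with a 3x4 extrinsic matrix T and intrinsics f_x, f_y, u_0, v_0. The matrix `T_EXAMPLE` and all numeric values are made up for illustration (an idealized pure axis permutation with zero translation); real values come from calibration. Returning `None` for points at or behind the camera plane is a design choice of this sketch, not something the source specifies.

```python
# Made-up extrinsics: lidar (X forward, Y left, Z up) to
# camera (X right, Y down, Z forward), sensors assumed co-located.
T_EXAMPLE = [
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]


def project_to_pixel(point_l, T, fx, fy, u0, v0):
    """Project a lidar point (X_L, Y_L, Z_L) to pixel coordinates (v_x, v_y).

    Returns None when the point is not in front of the camera (Z_C <= 0).
    """
    x, y, z = point_l
    # Camera-frame coordinates: [X_C, Y_C, Z_C]^T = T [X_L, Y_L, Z_L, 1]^T
    xc, yc, zc = (row[0] * x + row[1] * y + row[2] * z + row[3] for row in T)
    if zc <= 0:
        return None
    # Pinhole model: divide by depth, scale by focal length, shift by (u0, v0)
    return (fx * xc / zc + u0, fy * yc / zc + v0)
```

With focal lengths of 1000 pixels and an optical center at (640, 360), a point 10 m ahead, 2 m to the left, and 1 m up projects to the left of and above the image center, as expected for an image origin at the upper-left corner.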
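The detection method of the present invention that fuses image and point cloud information makes fuller use of the lidar point cloud information and the camera image information, overcomes the prior-art defect that recognition speed and recognition accuracy cannot both be achieved, achieves higher-precision real-time 3D target detection, and noticeably improves detection accuracy for small objects. The present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.
FIGS. 2 to 5 are schematic diagrams of the implementation process of the detection method of this embodiment. As shown in FIGS. 2 to 5, the implementation process is as follows. First, the lidar sensor 22 and the image sensor 21 mounted on the unmanned truck 1 may be parallel to each other or at a certain angle to each other; the lidar sensor 22 and the image sensor 21 mounted on the unmanned truck 1 are jointly calibrated to obtain the transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system.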
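The origin of the lidar sensor is taken as the origin of the lidar coordinate system, the forward direction of the vehicle as the $X_L$ axis of the lidar coordinate system, the upward direction perpendicular to the vehicle body as the $Z_L$ axis, and the direction to the left of the vehicle's forward direction as the $Y_L$ axis, so that coordinates in the lidar coordinate system are $(X_L, Y_L, Z_L)$.
The optical center of the lens of the vision sensor is taken as the origin $O_C$ of the camera coordinate system, the optical axis direction as the $Z_C$ axis, the vertically downward direction as the $Y_C$ axis, and the direction directly to the right of the vehicle's forward direction as the $X_C$ axis, so that coordinates in the camera coordinate system are $(X_C, Y_C, Z_C)$.
The origin of the image coordinate system $(v_x, v_y)$ is at the upper-left corner of the image. The lidar sensor and the vision sensor are jointly calibrated using a checkerboard calibration board to obtain the transformation matrix from the lidar sensor to the camera coordinate system:

$$\begin{bmatrix} X_C \\ Y_C \\ Z_C \end{bmatrix} = T \begin{bmatrix} X_L \\ Y_L \\ Z_L \\ 1 \end{bmatrix}$$

where $T$ is the 3-row, 4-column transformation matrix from the lidar sensor to the camera coordinate system obtained by joint calibration.
The pixel coordinates of a point of the point cloud after projection onto the camera imaging plane are given by:

$$v_x = f_x \frac{X_C}{Z_C} + u_0, \qquad v_y = f_y \frac{Y_C}{Z_C} + v_0$$

where $f_x$ and $f_y$ are the focal lengths of the lens along the X and Y axes and $u_0$, $v_0$ are the image coordinates of the optical center, so that the transformation from the lidar sensor to the image coordinate system can be expressed as:

$$Z_C \begin{bmatrix} v_x \\ v_y \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} T \begin{bmatrix} X_L \\ Y_L \\ Z_L \\ 1 \end{bmatrix}$$

The transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system may also be obtained by other calibration methods; the invention is not limited in this respect.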
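In addition, the first convolutional neural network is trained on a large number of photographs annotated with regions and object identifications, so that it can automatically divide an input photograph into regions, perform image recognition on each region separately, and output an object identification for each region. The second convolutional neural network is trained on a large amount of point cloud information carrying multiple feature information, so that it can accurately output the category of a 3D target. In the present invention, the difference between training the first and second convolutional neural networks is that the first has the functions of region division and preliminary image recognition, whereas the second, by comparison, has no region-division function and only performs precise recognition on regions.
Referring to FIG. 2, the calibrated lidar sensor 22 and image sensor 21 on the unmanned truck 1 synchronously acquire point cloud information and image information. At this time, the road surface 43 in front of the unmanned truck 1 carries a first obstacle 41 and a second obstacle 42.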
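Then, referring to FIG. 3, the image information is input into the trained first convolutional neural network to extract multiple feature information for each pixel in the image information, the multiple feature information including at least the color information and object identification information of each pixel. In this embodiment, image segmentation is performed by inputting the image information into a trained machine vision model. The object identification information D corresponding to each segmented image region in the image is obtained through the machine vision model. The multiple feature information of each pixel includes the RGB information of that pixel and the object identification information D of the image region in which the pixel is located, so the multiple feature information of each pixel is (R, G, B, D). The image sensor 21 obtains a schematic image 21A of the image information, which includes the area of the first obstacle 41, the area of the second obstacle 42, and the area of the road surface 43. The object identification information D of every pixel in the area of the first obstacle 41 is "pedestrian", and the multiple feature information of each pixel in this area is (R, G, B, pedestrian); the object identification information D of every pixel in the area of the second obstacle 42 is "wall", and the multiple feature information of each pixel in this area is (R, G, B, wall); the object identification information D of every pixel in the area of the road surface 43 is "ground", and the multiple feature information of each pixel in this area is (R, G, B, ground).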
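The construction of the per-pixel tuple (R, G, B, D) can be sketched as below. The sketch assumes that the segmentation output is available as a label map with one label per pixel; the names `pixel_features`, `rgb_image`, and `label_map` are hypothetical, and how the first network actually produces the labels is outside the scope of the example.

```python
def pixel_features(rgb_image, label_map):
    """Combine color and region label into (R, G, B, D) for every pixel.

    rgb_image: rows of (r, g, b) tuples
    label_map: rows of labels, same shape, one label D per pixel
    """
    return [
        [(r, g, b, d) for (r, g, b), d in zip(rgb_row, label_row)]
        for rgb_row, label_row in zip(rgb_image, label_map)
    ]
```

Every pixel inside one segmented region carries the same D, so a region labeled "pedestrian" yields tuples of the form (R, G, B, pedestrian) with varying color values.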
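Referring to FIG. 4, the lidar sensor 22 obtains a schematic diagram 22A of the point cloud information. Each circular pattern in FIG. 4 is a laser point. Likewise, the schematic diagram 22A of the point cloud information includes the first obstacle, the second obstacle, and the road surface.
Referring to FIG. 5, the transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system is then used to project the laser point cloud of the lidar sensor onto the image; each laser point in the point cloud information is matched to the corresponding pixel, and the multiple feature information of that pixel is added to the corresponding point cloud information. The positions of the laser points of the lidar sensor are transformed into the visual image coordinate system by the transformation matrix to obtain the pixels at the corresponding positions. For example, the pixels of the image are traversed, and the pixel with the smallest distance to the position at which a laser point is projected into the visual image coordinate system is taken as the pixel matching that laser point. In this embodiment, the multiple feature information of all laser points in the area 41A of the first obstacle in the point cloud is (R, G, B, pedestrian); the multiple feature information of all laser points in the area 42A of the second obstacle in the point cloud is (R, G, B, wall); and the multiple feature information of all laser points in the road surface area 43A in the point cloud is (R, G, B, ground).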
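The fusion step can be sketched as a loop that projects each laser point, looks up the matched pixel, and appends that pixel's features to the point. The projection is passed in as a callable so the sketch stays independent of any particular calibration; the names `fuse_points`, `features`, and `project` are hypothetical. Dropping points that fall outside the image or behind the camera is an assumption of this sketch.

```python
def fuse_points(points, features, project):
    """Attach (R, G, B, D) to each lidar point that lands inside the image.

    points:   iterable of (x, y, z) lidar points
    features: features[row][col] -> (R, G, B, D)
    project:  callable mapping (x, y, z) to pixel coords (v_x, v_y), or None
    """
    height, width = len(features), len(features[0])
    fused = []
    for p in points:
        pixel = project(p)
        if pixel is None:  # point not visible to the camera
            continue
        col, row = round(pixel[0]), round(pixel[1])  # nearest pixel
        if 0 <= col < width and 0 <= row < height:
            fused.append((*p, *features[row][col]))
    return fused
```

Each output element has the form (x, y, z, R, G, B, D), which is the point cloud with multiple feature information that the second network consumes.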
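Then, the point cloud information carrying the multiple feature information is input into the trained second convolutional neural network to output the category of each 3D target. Because the second convolutional neural network has been trained on a large amount of point cloud information carrying multiple feature information and, unlike the first convolutional neural network, does not have to perform several complex functions, it achieves higher label classification accuracy. In addition, the second convolutional neural network uses the object identification information D in the multiple feature information in a weighted manner; in some cases it will arrive at a 3D target category that differs from the object identification information D, which further improves the accuracy of 3D target recognition.
Finally, one of the following is taken as the distance to each 3D target: the minimum distance among the points of the point cloud corresponding to that target, the average distance of all corresponding points, or the distance to the center point of all corresponding points. In this way, the current distance between the pedestrian and the unmanned truck 1 is found to be 2 meters, and the current distance between the wall and the unmanned truck 1 is 4 meters. In addition, the 3D contour of each 3D target is established from the point cloud corresponding to the target, and the size of the 3D contour is taken as the size of the target's contour. For example, using existing techniques such as trigonometric calculation, the pedestrian's height of 1.8 meters is obtained from the spatial extent of the point cloud corresponding to the pedestrian, and the wall's width of 4 meters and height of 3 meters are obtained from the spatial extent of the point cloud corresponding to the wall, but the invention is not limited thereto.
In a preferred embodiment, the second convolutional neural network can also directly output the 3D contour of each 3D target and the size of the 3D contour, but the invention is not limited thereto.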
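FIG. 6 is a schematic structural diagram of a detection system that fuses image and point cloud information according to an embodiment of the present invention. As shown in FIG. 6, an embodiment of the present invention also provides a detection system 5 that fuses image and point cloud information, used to implement the above detection method, the detection system comprising:
a synchronous acquisition module 51, which synchronously acquires point cloud information and image information using a lidar sensor and an image sensor;
a first network module 52, which inputs the image information into a trained first convolutional neural network to extract multiple feature information for each pixel in the image information, the multiple feature information including at least the color information and object identification information of each pixel;
a point cloud projection module 53, which projects the laser point cloud of the lidar sensor onto the image, matches each laser point in the point cloud information to the corresponding pixel, and then adds the multiple feature information of that pixel to the corresponding point cloud information;
a second network module 54, which inputs the point cloud information carrying the multiple feature information into a trained second convolutional neural network to output the category of each 3D target.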
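The way the four modules hand data to one another can be sketched as a simple pipeline object. This is only a structural illustration: each module is represented by a plain callable, and the class name `FusionDetectionSystem`, its field names, and the `detect` method are all hypothetical rather than taken from the source.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FusionDetectionSystem:
    acquire: Callable[[], tuple]                 # module 51: -> (point_cloud, image)
    first_network: Callable[[Any], Any]          # module 52: image -> per-pixel features
    project_and_fuse: Callable[[Any, Any], Any]  # module 53: (cloud, features) -> fused cloud
    second_network: Callable[[Any], Any]         # module 54: fused cloud -> categories

    def detect(self):
        """Run one acquisition through modules 51 to 54 in order."""
        point_cloud, image = self.acquire()
        features = self.first_network(image)
        fused = self.project_and_fuse(point_cloud, features)
        return self.second_network(fused)
```

Keeping the modules as injected callables lets each stage be replaced or tested in isolation, for example by substituting stub functions for the two networks.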
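The detection system of the present invention that fuses image and point cloud information makes fuller use of the lidar point cloud information and the camera image information, overcomes the prior-art defect that recognition speed and recognition accuracy cannot both be achieved, achieves higher-precision real-time 3D target detection, and noticeably improves detection accuracy for small objects. The present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.
An embodiment of the present invention also provides a detection device that fuses image and point cloud information, comprising a processor and a memory in which executable instructions of the processor are stored, wherein the processor is configured to perform the steps of the detection method that fuses image and point cloud information by executing the executable instructions.
As above, the detection device of the present invention that fuses image and point cloud information makes fuller use of the lidar point cloud information and the camera image information, overcomes the prior-art defect that recognition speed and recognition accuracy cannot both be achieved, achieves higher-precision real-time 3D target detection, and noticeably improves detection accuracy for small objects. The present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.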
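Those skilled in the art will understand that aspects of the present invention may be implemented as a system, method, or program product. Therefore, aspects of the present invention may be embodied in the following forms: an entirely hardware implementation, an entirely software implementation (including firmware, microcode, etc.), or an implementation combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module", or "platform".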
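FIG. 7 is a schematic structural diagram of a detection device that fuses image and point cloud information according to an embodiment of the present invention. An electronic device 600 according to this embodiment of the present invention is described below with reference to FIG. 7. The electronic device 600 shown in FIG. 7 is only an example and should not impose any limitation on the function and scope of use of the embodiments of the present invention.
As shown in FIG. 7, the electronic device 600 takes the form of a general-purpose computing device. Components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 connecting different platform components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.
The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the above section of this specification on the detection method that fuses image and point cloud information. For example, the processing unit 610 may perform the steps shown in FIG. 1.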
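The storage unit 620 may include a readable medium in the form of a volatile storage unit, such as a random access memory (RAM) unit 6201 and/or a cache storage unit 6202, and may further include a read-only memory (ROM) unit 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, including but not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. The electronic device 600 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 through the bus 630. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms.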
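An embodiment of the present invention also provides a computer-readable storage medium for storing a program which, when executed, implements the steps of the detection method that fuses image and point cloud information. In some possible implementations, aspects of the present invention may also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to various exemplary embodiments of the present invention described in the above section of this specification on the detection method that fuses image and point cloud information.
As shown above, when the program on the computer-readable storage medium of this embodiment is executed, the lidar point cloud information and the camera image information are used more fully, the prior-art defect that recognition speed and recognition accuracy cannot both be achieved is overcome, higher-precision real-time 3D target detection is achieved, and detection accuracy for small objects is noticeably improved. The present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.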
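FIG. 8 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention. Referring to FIG. 8, a program product 800 for implementing the above method according to an embodiment of the present invention is described; it may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. However, the program product of the present invention is not limited thereto; in this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transport a program for use by or in conjunction with an instruction execution system, apparatus, or device. Program code contained on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the foregoing.
Program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).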
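In summary, the detection method, system, device, and storage medium of the present invention that fuse image and point cloud information make fuller use of the lidar point cloud information and the camera image information, overcome the prior-art defect that recognition speed and recognition accuracy cannot both be achieved, achieve higher-precision real-time 3D target detection, and noticeably improve detection accuracy for small objects. The present invention combines speed and accuracy in 3D target recognition, and the technology helps provide safer and more reliable environmental perception information for driverless vehicles.
The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the present invention should not be regarded as limited to these descriptions. For a person of ordinary skill in the technical field to which the present invention belongs, a number of simple deductions or substitutions may be made without departing from the concept of the present invention, and all of these should be regarded as falling within the scope of protection of the present invention.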
Claims (12)
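- A detection method that fuses image and point cloud information, characterized by comprising the following steps: S110, synchronously acquiring point cloud information and image information using a lidar sensor and an image sensor; S120, inputting the image information into a trained first convolutional neural network to extract multiple feature information for each pixel in the image information, the multiple feature information including at least the color information and object identification information of each pixel; S130, projecting the laser point cloud of the lidar sensor onto the image, matching each laser point in the point cloud information to the corresponding pixel, and then adding the multiple feature information of that pixel to the corresponding point cloud information; S140, inputting the point cloud information carrying the multiple feature information into a trained second convolutional neural network to output the category of each 3D target.
- The detection method that fuses image and point cloud information according to claim 1, characterized in that the following step is included before step S110: S100, jointly calibrating the lidar sensor and the image sensor to obtain the transformation matrix of the lidar sensor coordinate system relative to the visual image coordinate system; and in step S130, the positions of the laser points of the lidar sensor are transformed into the visual image coordinate system by the transformation matrix to obtain the pixels at the corresponding positions.
- The detection method that fuses image and point cloud information according to claim 1, characterized in that, in step S130, the pixels of the image are traversed, and the pixel with the smallest distance to the position at which a laser point is projected into the visual image coordinate system is taken as the pixel matching that laser point.
- The detection method that fuses image and point cloud information according to claim 1, characterized in that step S120 includes the following steps: S121, inputting the image information into a trained machine vision model to perform image segmentation on the image; S122, obtaining, through the machine vision model, the object identification information D corresponding to each segmented image region in the image; S123, the multiple feature information of each pixel includes the RGB information of that pixel and the object identification information D of the image region in which the pixel is located.
- The detection method that fuses image and point cloud information according to claim 4, characterized in that, before step S110, the method further includes: training the second convolutional neural network with a large amount of point cloud information carrying multiple feature information, the second convolutional neural network outputting the category of a 3D target.
- The detection method that fuses image and point cloud information according to claim 5, characterized in that the second convolutional neural network also outputs the 3D contour of each 3D target and the length, width, and height of the 3D contour.
- The detection method that fuses image and point cloud information according to claim 1, characterized in that the following step is further included after step S140: taking one of the following as the distance to the 3D target: the minimum distance among the points of the point cloud corresponding to the 3D target, the average distance of all corresponding points, or the distance to the center point of all corresponding points.
- The detection method that fuses image and point cloud information according to claim 1, characterized in that the following step is further included after step S140: establishing the 3D contour of the 3D target from the point cloud corresponding to the 3D target, and taking the size of the 3D contour as the size of the contour of the 3D target.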
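- The detection method that fuses image and point cloud information according to claim 2, characterized in that, in step S100, the origin of the lidar sensor is taken as the origin of the lidar coordinate system, the forward direction of the vehicle as the $X_L$ axis of the lidar coordinate system, the upward direction perpendicular to the vehicle body as the $Z_L$ axis, and the direction to the left of the vehicle's forward direction as the $Y_L$ axis, so that coordinates in the lidar coordinate system are $(X_L, Y_L, Z_L)$; the optical center of the lens of the vision sensor is taken as the origin $O_C$ of the camera coordinate system, the optical axis direction as the $Z_C$ axis, the vertically downward direction as the $Y_C$ axis, and the direction directly to the right of the vehicle's forward direction as the $X_C$ axis, so that coordinates in the camera coordinate system are $(X_C, Y_C, Z_C)$; the origin of the image coordinate system $(v_x, v_y)$ is at the upper-left corner of the image, and the lidar sensor and the vision sensor are jointly calibrated using a checkerboard calibration board to obtain the transformation matrix from the lidar sensor to the camera coordinate system: $[X_C, Y_C, Z_C]^{\mathsf T} = T\,[X_L, Y_L, Z_L, 1]^{\mathsf T}$, where $T$ is the 3-row, 4-column transformation matrix from the lidar sensor to the camera coordinate system obtained by joint calibration; the pixel coordinates of a point of the point cloud after projection onto the camera imaging plane are given by: $v_x = f_x X_C / Z_C + u_0$, $v_y = f_y Y_C / Z_C + v_0$, where $f_x$ and $f_y$ are the focal lengths of the lens along the X and Y axes and $u_0$, $v_0$ are the image coordinates of the optical center, so that the transformation from the lidar sensor to the image coordinate system can be expressed as: $Z_C\,[v_x, v_y, 1]^{\mathsf T} = K\,T\,[X_L, Y_L, Z_L, 1]^{\mathsf T}$, with $K = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}$.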
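- A detection system that fuses image and point cloud information, characterized in that it is used to implement the detection method that fuses image and point cloud information according to claim 1, and comprises: a synchronous acquisition module, which synchronously acquires point cloud information and image information using a lidar sensor and an image sensor; a first network module, which inputs the image information into a trained first convolutional neural network to extract multiple feature information for each pixel in the image information, the multiple feature information including at least the color information and object identification information of each pixel; a point cloud projection module, which projects the laser point cloud of the lidar sensor onto the image, matches each laser point in the point cloud information to the corresponding pixel, and then adds the multiple feature information of that pixel to the corresponding point cloud information; and a second network module, which inputs the point cloud information carrying the multiple feature information into a trained second convolutional neural network to output the category of each 3D target.
- A detection device that fuses image and point cloud information, characterized by comprising: a processor; and a memory in which executable instructions of the processor are stored; wherein the processor is configured to perform, by executing the executable instructions, the steps of the detection method that fuses image and point cloud information according to any one of claims 1 to 9.
- A computer-readable storage medium for storing a program, characterized in that, when the program is executed, the steps of the detection method that fuses image and point cloud information according to any one of claims 1 to 9 are implemented.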
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21920556.4A EP4283515A4 (en) | 2021-01-20 | 2021-07-27 | DETECTION METHOD, SYSTEM AND DEVICE BASED ON FUSION OF IMAGE AND POINT CLOUD INFORMATION, AND STORAGE MEDIUM |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110076345.5 | 2021-01-20 | ||
CN202110076345.5A CN112861653B (zh) | 2021-01-20 | 2021-01-20 | Detection method, system, device, and storage medium fusing image and point cloud information |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022156175A1 (zh) | 2022-07-28 |
Family
ID=76007756
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/108585 WO2022156175A1 (zh) | 2021-01-20 | 2021-07-27 | Detection method, system, device, and storage medium fusing image and point cloud information |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP4283515A4 (zh) |
CN (1) | CN112861653B (zh) |
WO (1) | WO2022156175A1 (zh) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114973006A (zh) * | 2022-08-02 | 2022-08-30 | 四川省机械研究设计院(集团)有限公司 | 花椒采摘方法、装置、系统及存储介质 |
CN115327524A (zh) * | 2022-07-29 | 2022-11-11 | 江苏集萃深度感知技术研究所有限公司 | 基于毫米波雷达与视觉融合的路侧端目标检测方法及装置 |
CN115616560A (zh) * | 2022-12-02 | 2023-01-17 | 广汽埃安新能源汽车股份有限公司 | 车辆避障方法、装置、电子设备和计算机可读介质 |
CN115830424A (zh) * | 2023-02-09 | 2023-03-21 | 深圳酷源数联科技有限公司 | 基于融合图像的矿废识别方法、装置、设备及存储介质 |
CN115904294A (zh) * | 2023-01-09 | 2023-04-04 | 山东矩阵软件工程股份有限公司 | 一种环境可视化方法、系统、存储介质和电子设备 |
CN115994854A (zh) * | 2023-03-22 | 2023-04-21 | 智洋创新科技股份有限公司 | 一种标志物点云和图像配准的方法和系统 |
CN116030628A (zh) * | 2023-01-06 | 2023-04-28 | 广州市杜格科技有限公司 | 基于双激光点云数据分析的车型分类方法及交通调查设备 |
CN116228854A (zh) * | 2022-12-29 | 2023-06-06 | 中科微至科技股份有限公司 | 一种基于深度学习的包裹自动分拣方法 |
CN116385528A (zh) * | 2023-03-28 | 2023-07-04 | 小米汽车科技有限公司 | 标注信息的生成方法、装置、电子设备、车辆及存储介质 |
CN116503828A (zh) * | 2023-04-24 | 2023-07-28 | 大连理工大学 | 一种基于激光雷达的列车轨道检测方法 |
CN116523970A (zh) * | 2023-07-05 | 2023-08-01 | 之江实验室 | 基于二次隐式匹配的动态三维目标跟踪方法及装置 |
CN116545122A (zh) * | 2023-07-06 | 2023-08-04 | 中国电力科学研究院有限公司 | 一种输电线路防外破监测装置和防外破监测方法 |
CN116883496A (zh) * | 2023-06-26 | 2023-10-13 | 小米汽车科技有限公司 | 交通元素的坐标重建方法、装置、电子设备及存储介质 |
CN117111055A (zh) * | 2023-06-19 | 2023-11-24 | 山东高速集团有限公司 | 一种基于雷视融合的车辆状态感知方法 |
CN117132519A (zh) * | 2023-10-23 | 2023-11-28 | 江苏华鲲振宇智能科技有限责任公司 | 基于vpx总线多传感器图像融合处理模块 |
CN117173257A (zh) * | 2023-11-02 | 2023-12-05 | 安徽蔚来智驾科技有限公司 | 3d目标检测及其标定参数增强方法、电子设备、介质 |
CN117268350A (zh) * | 2023-09-18 | 2023-12-22 | 广东省核工业地质局测绘院 | 一种基于点云数据融合的移动式智能测绘系统 |
CN117893797A (zh) * | 2023-12-26 | 2024-04-16 | 武汉天眸光电科技有限公司 | 基于车路协同的目标检测方法、装置、设备及存储介质 |
CN118097555A (zh) * | 2024-03-01 | 2024-05-28 | 山东大学 | 多视角多模态融合的区域监控检测方法及系统 |
CN118675025A (zh) * | 2024-08-21 | 2024-09-20 | 中国科学院自动化研究所 | 基于图像检测器原始输出的图像融合检测方法及装置 |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
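CN112861653B (zh) * | 2021-01-20 | 2024-01-23 | 上海西井科技股份有限公司 | Detection method, system, device, and storage medium fusing image and point cloud information |
CN115147328A (zh) * | 2021-03-29 | 2022-10-04 | 华为技术有限公司 | 3D object detection method and apparatus |
CN113281717A (zh) * | 2021-06-04 | 2021-08-20 | 上海西井信息科技有限公司 | LiDAR-based ground filtering method, system, device, and storage medium |
CN113093216A (zh) * | 2021-06-07 | 2021-07-09 | 山东捷瑞数字科技股份有限公司 | Irregular object measurement method fusing LiDAR and camera |
CN113296107B (zh) * | 2021-06-23 | 2024-07-23 | 上海西井科技股份有限公司 | Method, system, device, and storage medium for cooperative sensor detection of trailer angle |
CN114119992B (zh) * | 2021-10-28 | 2024-06-28 | 清华大学 | Multi-modal 3D object detection method and apparatus based on image and point cloud fusion |
CN114373105B (zh) * | 2021-12-20 | 2024-07-26 | 华南理工大学 | Method, system, apparatus, and medium for point cloud annotation and dataset creation |
CN114494415A (zh) * | 2021-12-31 | 2022-05-13 | 北京建筑大学 | Method for an autonomous loader to detect, identify, and measure sand and gravel piles |
CN114792417B (zh) * | 2022-02-24 | 2023-06-16 | 广州文远知行科技有限公司 | Model training method, image recognition method, apparatus, device, and storage medium |
CN114627224B (zh) * | 2022-03-07 | 2024-08-23 | 清华大学深圳国际研究生院 | Color point cloud generation method, system, and application method |
CN115213896A (zh) * | 2022-05-10 | 2022-10-21 | 浙江西图盟数字科技有限公司 | Robotic-arm-based object grasping method, system, device, and storage medium |
CN115265556A (zh) * | 2022-07-30 | 2022-11-01 | 重庆长安汽车股份有限公司 | Positioning method, apparatus, device, and medium for autonomous vehicles |
WO2024044887A1 (en) * | 2022-08-29 | 2024-03-07 | Huawei Technologies Co., Ltd. | Vision-based perception system |
CN118151126A (zh) * | 2022-12-07 | 2024-06-07 | 上海禾赛科技有限公司 | LiDAR, data processing method, and light detection and data acquisition and processing apparatus |
CN115760855B (zh) * | 2023-01-09 | 2023-05-23 | 中建科技集团有限公司 | Workpiece inspection method and related device |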
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
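CN109443369A (zh) * | 2018-08-20 | 2019-03-08 | 北京主线科技有限公司 | Method for constructing dynamic and static grid maps by fusing LiDAR and vision sensors |
CN110135485A (zh) * | 2019-05-05 | 2019-08-16 | 浙江大学 | Object recognition and localization method and system fusing a monocular camera and millimeter-wave radar |
CN110853037A (zh) * | 2019-09-26 | 2020-02-28 | 西安交通大学 | Lightweight color point cloud segmentation method based on spherical projection |
US20200160559A1 (en) * | 2018-11-16 | 2020-05-21 | Uatc, Llc | Multi-Task Multi-Sensor Fusion for Three-Dimensional Object Detection |
CN112861653A (zh) * | 2021-01-20 | 2021-05-28 | 上海西井信息科技有限公司 | Detection method, system, device, and storage medium fusing image and point cloud information |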
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
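CN110188696B (zh) * | 2019-05-31 | 2023-04-18 | 华南理工大学 | Multi-source perception method and system for unmanned surface equipment |
CN111522026B (zh) * | 2020-04-21 | 2022-12-09 | 北京三快在线科技有限公司 | Data fusion method and apparatus |
CN111583337B (zh) * | 2020-04-25 | 2023-03-21 | 华南理工大学 | Omnidirectional obstacle detection method based on multi-sensor fusion |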
2021
- 2021-01-20 CN CN202110076345.5A patent/CN112861653B/zh active Active
- 2021-07-27 WO PCT/CN2021/108585 patent/WO2022156175A1/zh unknown
- 2021-07-27 EP EP21920556.4A patent/EP4283515A4/en active Pending
Non-Patent Citations (1)
Title |
---|
See also references of EP4283515A4 * |
Cited By (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
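CN115327524A (zh) * | 2022-07-29 | 2022-11-11 | 江苏集萃深度感知技术研究所有限公司 | Roadside target detection method and apparatus based on millimeter-wave radar and vision fusion |
CN114973006B (zh) * | 2022-08-02 | 2022-10-18 | 四川省机械研究设计院(集团)有限公司 | Sichuan pepper picking method, apparatus, system, and storage medium |
CN114973006A (zh) * | 2022-08-02 | 2022-08-30 | 四川省机械研究设计院(集团)有限公司 | Sichuan pepper picking method, apparatus, system, and storage medium |
CN115616560A (zh) * | 2022-12-02 | 2023-01-17 | 广汽埃安新能源汽车股份有限公司 | Vehicle obstacle avoidance method, apparatus, electronic device, and computer-readable medium |
CN116228854B (zh) * | 2022-12-29 | 2023-09-08 | 中科微至科技股份有限公司 | Deep-learning-based automatic parcel sorting method |
CN116228854A (zh) * | 2022-12-29 | 2023-06-06 | 中科微至科技股份有限公司 | Deep-learning-based automatic parcel sorting method |
CN116030628A (zh) * | 2023-01-06 | 2023-04-28 | 广州市杜格科技有限公司 | Vehicle type classification method based on dual-laser point cloud data analysis, and traffic survey device |
CN115904294A (zh) * | 2023-01-09 | 2023-04-04 | 山东矩阵软件工程股份有限公司 | Environment visualization method, system, storage medium, and electronic device |
CN115904294B (zh) * | 2023-01-09 | 2023-06-09 | 山东矩阵软件工程股份有限公司 | Environment visualization method, system, storage medium, and electronic device |
CN115830424B (zh) * | 2023-02-09 | 2023-04-28 | 深圳酷源数联科技有限公司 | Mine waste identification method, apparatus, device, and storage medium based on fused images |
CN115830424A (zh) * | 2023-02-09 | 2023-03-21 | 深圳酷源数联科技有限公司 | Mine waste identification method, apparatus, device, and storage medium based on fused images |
CN115994854A (zh) * | 2023-03-22 | 2023-04-21 | 智洋创新科技股份有限公司 | Method and system for registering marker point clouds and images |
CN116385528A (zh) * | 2023-03-28 | 2023-07-04 | 小米汽车科技有限公司 | Annotation information generation method, apparatus, electronic device, vehicle, and storage medium |
CN116385528B (zh) * | 2023-03-28 | 2024-04-30 | 小米汽车科技有限公司 | Annotation information generation method, apparatus, electronic device, vehicle, and storage medium |
CN116503828A (zh) * | 2023-04-24 | 2023-07-28 | 大连理工大学 | LiDAR-based train track detection method |
CN117111055A (zh) * | 2023-06-19 | 2023-11-24 | 山东高速集团有限公司 | Vehicle state perception method based on radar-vision fusion |
CN116883496B (zh) * | 2023-06-26 | 2024-03-12 | 小米汽车科技有限公司 | Coordinate reconstruction method for traffic elements, apparatus, electronic device, and storage medium |
CN116883496A (zh) * | 2023-06-26 | 2023-10-13 | 小米汽车科技有限公司 | Coordinate reconstruction method for traffic elements, apparatus, electronic device, and storage medium |
CN116523970A (zh) * | 2023-07-05 | 2023-08-01 | 之江实验室 | Dynamic 3D target tracking method and apparatus based on secondary implicit matching |
CN116523970B (zh) * | 2023-07-05 | 2023-10-20 | 之江实验室 | Dynamic 3D target tracking method and apparatus based on secondary implicit matching |
CN116545122A (zh) * | 2023-07-06 | 2023-08-04 | 中国电力科学研究院有限公司 | Monitoring device and monitoring method for preventing external damage to power transmission lines |
CN116545122B (zh) * | 2023-07-06 | 2023-09-19 | 中国电力科学研究院有限公司 | Monitoring device and monitoring method for preventing external damage to power transmission lines |
CN117268350A (zh) * | 2023-09-18 | 2023-12-22 | 广东省核工业地质局测绘院 | Mobile intelligent surveying and mapping system based on point cloud data fusion |
CN117132519B (zh) * | 2023-10-23 | 2024-03-12 | 江苏华鲲振宇智能科技有限责任公司 | VPX-bus-based multi-sensor image fusion processing module |
CN117132519A (zh) * | 2023-10-23 | 2023-11-28 | 江苏华鲲振宇智能科技有限责任公司 | VPX-bus-based multi-sensor image fusion processing module |
CN117173257A (zh) * | 2023-11-02 | 2023-12-05 | 安徽蔚来智驾科技有限公司 | 3D object detection and calibration parameter enhancement method therefor, electronic device, and medium |
CN117173257B (zh) * | 2023-11-02 | 2024-05-24 | 安徽蔚来智驾科技有限公司 | 3D object detection and calibration parameter enhancement method therefor, electronic device, and medium |
CN117893797A (zh) * | 2023-12-26 | 2024-04-16 | 武汉天眸光电科技有限公司 | Target detection method, apparatus, device, and storage medium based on vehicle-road cooperation |
CN118097555A (zh) * | 2024-03-01 | 2024-05-28 | 山东大学 | Multi-view, multi-modal fusion method and system for area monitoring and detection |
CN118675025A (zh) * | 2024-08-21 | 2024-09-20 | 中国科学院自动化研究所 | Image fusion detection method and apparatus based on raw image-detector output |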
Also Published As
Publication number | Publication date |
---|---|
EP4283515A1 (en) | 2023-11-29 |
CN112861653A (zh) | 2021-05-28 |
EP4283515A4 (en) | 2024-06-12 |
CN112861653B (zh) | 2024-01-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
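WO2022156175A1 (zh) | | Detection method, system, device, and storage medium fusing image and point cloud information |
US11042762B2 (en) | | Sensor calibration method and device, computer device, medium, and vehicle |
CN110163930B (zh) | | Lane line generation method, apparatus, device, system, and readable storage medium |
WO2022022694A1 (zh) | | Autonomous driving environment perception method and system |
WO2021073656A1 (zh) | | Automatic image data annotation method and apparatus |
CN113657224B (zh) | | Method, apparatus, and device for determining object state in vehicle-road cooperation |
KR102420476B1 (ko) | | Vehicle position estimation apparatus, vehicle position estimation method, and computer-readable recording medium storing a computer program programmed to perform the method |
CN112639882B (zh) | | Positioning method, apparatus, and system |
US11144770B2 (en) | | Method and device for positioning vehicle, device, and computer readable storage medium |
CN111238494A (zh) | | Carrier, carrier positioning system, and carrier positioning method |
CN112232275B (zh) | | Obstacle detection method, system, device, and storage medium based on binocular recognition |
WO2023065342A1 (zh) | | Vehicle and positioning method, apparatus, and device therefor, and computer-readable storage medium |
US11842440B2 (en) | | Landmark location reconstruction in autonomous machine applications |
WO2021168854A1 (zh) | | Method and apparatus for drivable area detection |
TW202020734A (zh) | | Carrier, carrier positioning system, and carrier positioning method |
CN113469045A (zh) | | Visual positioning method, system, electronic device, and storage medium for unmanned container trucks |
Youssefi et al. | | Visual and light detection and ranging-based simultaneous localization and mapping for self-driving cars |
WO2021189420A1 (zh) | | Data processing method and apparatus |
US12039788B2 (en) | | Path planning method and system using the same |
CN115937449A (zh) | | High-definition map generation method, apparatus, electronic device, and storage medium |
CN115562076A (zh) | | Simulation system, method, and storage medium for unmanned mining trucks |
CN114462545A (zh) | | Map construction method and apparatus based on semantic SLAM |
Zhang | | Target-based calibration of 3D LiDAR and binocular camera on unmanned vehicles |
CN118603103B (zh) | | Visual-SLAM-based indoor navigation system for unmanned aerial vehicles |
WO2022133911A1 (zh) | | Target detection method, apparatus, movable platform, and computer-readable storage medium |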
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21920556; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | ENP | Entry into the national phase | Ref document number: 2021920556; Country of ref document: EP; Effective date: 20230821 |