WO2021134325A1 - Obstacle detection method and apparatus based on unmanned driving technology, and computer device - Google Patents

Obstacle detection method and apparatus based on unmanned driving technology, and computer device

Info

Publication number
WO2021134325A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature information
point cloud
current frame
data
extraction
Prior art date
Application number
PCT/CN2019/130155
Other languages
English (en)
Chinese (zh)
Inventor
邹晓艺
何明
叶茂盛
吴伟
许双杰
许家妙
曹通易
Original Assignee
深圳元戎启行科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳元戎启行科技有限公司 filed Critical 深圳元戎启行科技有限公司
Priority to PCT/CN2019/130155 priority Critical patent/WO2021134325A1/fr
Priority to CN201980037716.XA priority patent/CN113678136A/zh
Publication of WO2021134325A1 publication Critical patent/WO2021134325A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition

Definitions

  • This application relates to an obstacle detection method, device, computer equipment and storage medium based on unmanned driving technology.
  • In traditional methods, the point cloud data is projected into the image data to obtain multi-channel feature information, and obstacle detection is then performed based on the multi-channel feature information.
  • In such a projection, some information may be lost, so the effective feature information extracted from each source of data is not comprehensive enough, resulting in low obstacle detection accuracy.
  • an obstacle detection method, device, computer device, and storage medium based on an unmanned driving technology that can improve the accuracy of obstacle detection in an unmanned driving process are provided.
  • An obstacle detection method based on unmanned driving technology including:
  • the point cloud feature information corresponding to each perspective and the current frame image data are input into the corresponding feature extraction models, and the spatial feature information corresponding to each perspective and the image feature information corresponding to the current frame image data are extracted in parallel through the corresponding feature extraction models;
  • the fused feature information is input into a trained detection model, a prediction operation is performed on the fused feature information through the detection model, and an obstacle detection result is output.
  • An obstacle detection device based on unmanned driving technology including:
  • the acquisition module is used to acquire the current frame point cloud data and the current frame image data within a preset angle range
  • a projection module configured to project the point cloud data of the current frame on multiple viewing angles to obtain two-dimensional planes corresponding to the multiple viewing angles;
  • the first extraction module is used to perform feature extraction on the two-dimensional plane corresponding to each perspective to obtain point cloud feature information corresponding to each perspective;
  • the second extraction module is used to input the point cloud feature information corresponding to each perspective and the current frame image data into the corresponding feature extraction models, and to extract, in parallel through the corresponding feature extraction models, the spatial feature information corresponding to each perspective and the image feature information corresponding to the current frame image data;
  • the fusion module is used to fuse the spatial feature information corresponding to multiple perspectives with the image feature information to obtain the fused feature information;
  • the prediction module is used to input the fused feature information into the trained detection model, and perform prediction operations on the fused feature information through the detection model, and output obstacle detection results.
  • a computer device including a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
  • the point cloud feature information corresponding to each perspective and the current frame image data are input into the corresponding feature extraction models, and the spatial feature information corresponding to each perspective and the image feature information corresponding to the current frame image data are extracted in parallel through the corresponding feature extraction models;
  • the fused feature information is input into a trained detection model, a prediction operation is performed on the fused feature information through the detection model, and an obstacle detection result is output.
  • One or more non-volatile computer-readable storage media storing computer-readable instructions.
  • the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • the fused feature information is input into a trained detection model, a prediction operation is performed on the fused feature information through the detection model, and an obstacle detection result is output.
  • Fig. 1 is an application environment diagram of an obstacle detection method based on unmanned driving technology in one or more embodiments.
  • Fig. 2 is a schematic flowchart of an obstacle detection method based on unmanned driving technology in one or more embodiments.
  • FIG. 3 is a schematic flowchart of the step of fusing spatial feature information and image feature information corresponding to multiple viewing angles to obtain fused feature information in one or more embodiments.
  • Fig. 4 is a block diagram of an obstacle detection device based on unmanned driving technology in one or more embodiments.
  • Figure 5 is a block diagram of a computer device in one or more embodiments.
  • the obstacle detection method based on unmanned driving technology provided in this application can be applied to the obstacle detection scenario during unmanned driving shown in FIG. 1.
  • the first vehicle-mounted sensor 102 sends the collected point cloud data of the current frame to the vehicle-mounted computer device 104.
  • the first vehicle-mounted sensor may be a lidar.
  • On-board computer equipment can be referred to as computer equipment.
  • the second vehicle-mounted sensor 106 sends the collected image data of the current frame within the preset angle range to the computer device 104.
  • the second vehicle-mounted sensor may be a vehicle-mounted camera.
  • the computer device 104 projects the point cloud data of the current frame on multiple viewing angles to obtain two-dimensional planes corresponding to the multiple viewing angles.
  • the computer device 104 performs feature extraction on the two-dimensional plane corresponding to each view angle to obtain point cloud feature information corresponding to each view angle.
  • the computer device 104 inputs the point cloud feature information corresponding to each perspective and the current frame image data into the corresponding feature extraction models, and extracts, in parallel through the corresponding feature extraction models, the spatial feature information corresponding to each perspective and the image feature information corresponding to the current frame image data.
  • the computer device 104 fuses the spatial feature information corresponding to the multiple viewing angles with the image feature information to obtain the fused feature information.
  • the computer device 104 inputs the fused feature information into the trained detection model, and performs a prediction operation on the fused feature information through the detection model, and outputs an obstacle detection result.
  • an obstacle detection method based on unmanned driving technology is provided. Taking the method applied to the computer equipment in FIG. 1 as an example for description, the method includes the following steps:
  • Step 202 Obtain current frame point cloud data and current frame image data within a preset angle range.
  • The collected current frame point cloud data is transmitted to the computer device through the first vehicle-mounted sensor installed on the vehicle, and the current frame image data within the preset angle range collected by the second vehicle-mounted sensor installed on the vehicle is sent to the computer device.
  • the first vehicle-mounted sensor may be a lidar.
  • the current frame point cloud data is the current frame point cloud data within a 360-degree range collected by the first vehicle-mounted sensor.
  • the second vehicle-mounted sensor may be a vehicle-mounted camera.
  • the current frame image data within the preset angle range may be the current frame image data within a 360-degree range around the vehicle collected by multiple on-board cameras.
  • Step 204 Project the point cloud data of the current frame on multiple viewing angles to obtain two-dimensional planes corresponding to the multiple viewing angles.
  • Step 206 Perform feature extraction on the two-dimensional plane corresponding to each view angle to obtain point cloud feature information corresponding to each view angle.
  • the point cloud data of the current frame is 3D point cloud data.
  • the computer device projects the acquired current frame point cloud data onto multiple viewing angles, thereby projecting the 3D point cloud data into the two-dimensional planes corresponding to the multiple viewing angles and converting the 3D point cloud data into two-dimensional data in those planes.
  • Multiple viewing angles may include a bird's-eye view and a front view.
  • When the computer device projects the current frame point cloud data onto the bird's-eye view angle, a two-dimensional plane corresponding to the bird's-eye view can be obtained.
  • When the computer device projects the current frame point cloud data onto the front view angle, a two-dimensional plane corresponding to the front view can be obtained.
  • the two-dimensional plane corresponding to each view includes the point cloud data of the current frame after projection.
  • the computer device can extract the point cloud feature information corresponding to each perspective in the two-dimensional plane corresponding to each perspective.
  • the point cloud feature information may be the local feature information of each point in the current frame point cloud data corresponding to each pixel in the two-dimensional plane, and the local feature information may include local depth, point cloud density, and the like.
  • the trained neural network model is pre-stored in the computer equipment.
  • the neural network model can be a pointnet based on the attention layer.
  • the computer device can input the two-dimensional plane corresponding to each perspective into the trained neural network model, and perform prediction operations on the two-dimensional plane corresponding to each perspective through the neural network model to obtain the point cloud feature information corresponding to each perspective .
  • Step 208 Input the point cloud feature information corresponding to each perspective and the current frame image data into the corresponding feature extraction models, and extract, in parallel through the corresponding feature extraction models, the spatial feature information corresponding to each perspective and the image feature information corresponding to the current frame image data.
  • the computer device converts the point cloud feature information corresponding to each view angle and the current frame image data to obtain the point cloud feature vector corresponding to each view angle and the image matrix corresponding to the current frame image data.
  • a plurality of feature extraction models are pre-stored in the computer equipment.
  • the multiple feature extraction models may be the same type of feature extraction models.
  • the feature extraction model is obtained by training a large amount of sample data.
  • the feature extraction model may be a 2D convolutional neural network model.
  • the computer device inputs the point cloud feature vector corresponding to each perspective and the image matrix corresponding to the current frame image data into the corresponding feature extraction models, and performs parallel feature extraction through the feature extraction models to obtain the spatial feature information corresponding to each perspective and the image feature information corresponding to the current frame image data.
  • the feature extraction model can include a pooling layer. The computer device can perform dimensionality reduction on the point cloud feature information corresponding to each perspective according to a first resolution through the pooling layer of the corresponding feature extraction model, thereby obtaining the spatial feature information corresponding to each perspective.
  • The pooling layer of the corresponding feature extraction model likewise performs dimensionality reduction on the current frame image data according to a second resolution, thereby obtaining the image feature information corresponding to the current frame image data.
  • the spatial feature information may include information such as the shape of the obstacle.
  • the image feature information may include information such as the shape and color of the obstacle.
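  • The following is a minimal sketch of this parallel extraction step, assuming small 2D convolutional branches with a pooling layer as the feature extraction models; the layer sizes, channel counts, input shapes, and target resolutions are illustrative assumptions, not values from this application.

```python
import torch
import torch.nn as nn

class FeatureBranch(nn.Module):
    """One 2D-convolutional feature extraction model with a pooling layer
    that reduces its input to a chosen resolution (illustrative sketch)."""
    def __init__(self, in_channels: int, out_channels: int, target_hw: tuple):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Pooling layer used for dimensionality reduction to the chosen resolution.
        self.pool = nn.AdaptiveAvgPool2d(target_hw)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pool(self.conv(x))

# Hypothetical inputs: BEV / front-view point cloud feature maps and a camera image
# (sizes are downsampled here for brevity).
bev_features   = torch.randn(1, 8, 600, 600)   # point cloud features, bird's-eye view
front_features = torch.randn(1, 8, 64, 600)    # point cloud features, front view
image          = torch.randn(1, 3, 360, 640)   # current frame image data

branches = {
    "bev":   FeatureBranch(8, 32, (300, 300)),  # first resolution (assumed)
    "front": FeatureBranch(8, 32, (300, 300)),
    "image": FeatureBranch(3, 32, (300, 300)),  # second resolution (assumed)
}

# Each branch can be run independently (e.g. in separate threads or CUDA streams),
# which mirrors the "extracted in parallel" wording of the method.
spatial_bev   = branches["bev"](bev_features)
spatial_front = branches["front"](front_features)
image_feat    = branches["image"](image)
```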
  • Step 210 Fusion of spatial feature information and image feature information corresponding to multiple viewing angles to obtain fused feature information.
  • Step 212 Input the fused feature information into the trained detection model, and perform a prediction operation on the fused feature information through the detection model, and output an obstacle detection result.
  • the computer device can merge the spatial feature information and image feature information corresponding to multiple viewing angles.
  • the way of fusion may be to first stitch the spatial feature information and image feature information corresponding to multiple viewing angles according to preset parameters, and then align the stitched feature information to the preset viewing angles to obtain the fused feature information.
  • the computer equipment converts the fused feature information to obtain the fused feature vector.
  • the trained detection model is pre-stored in the computer equipment.
  • the detection model is obtained through training with a large amount of sample data.
  • the detection model may be a 2D convolutional neural network.
  • the detection model includes multiple network layers, for example, it may include an input layer, an attention layer, a convolutional layer, a pooling layer, a fully connected layer, and so on.
  • the computer device inputs the fused feature vector into the detection model, calculates the context vector and weight corresponding to the fused feature vector through the attention layer of the detection model, and generates a first extraction result according to the context vector and weight.
  • the convolutional layer extracts the context feature corresponding to the context vector according to the first extraction result to generate the second extraction result.
  • the second extraction result is reduced in dimensionality through the pooling layer of the detection model.
  • the second extraction result after dimensionality reduction is classified by the fully connected layer, and the classification result can be obtained.
  • the classification results are weighted and output through the output layer.
  • the computer equipment obtains the obstacle detection result according to the weighted classification results that are output.
  • the computer device obtains the current frame point cloud data and the current frame image data within a preset angle range, and projects the current frame point cloud data onto multiple viewing angles to obtain two-dimensional planes corresponding to the multiple viewing angles, which facilitates the subsequent fusion of the current frame point cloud data and the current frame image data.
  • the computer device performs feature extraction on the two-dimensional plane corresponding to each perspective, obtains the point cloud feature information corresponding to each perspective, and inputs the point cloud feature information corresponding to each perspective and the current frame image data into the corresponding feature extraction model.
  • the spatial feature information corresponding to each view angle and the image feature information corresponding to the current frame image data are extracted in parallel through the corresponding feature extraction model.
  • By performing multiple rounds of feature extraction on the two-dimensional plane corresponding to each viewing angle, the computer equipment can extract more comprehensive and effective feature information from the current frame point cloud data.
  • the computer equipment fuses the spatial feature information and the image feature information corresponding to multiple viewing angles to obtain the fused feature information. Based on the data characteristics of the multiple data sources, the sources can complement one another, yielding more comprehensive obstacle feature information.
  • the computer equipment predicts and calculates the fused feature information through the detection model, and outputs the obstacle detection result. Since the fused feature information is comprehensive, and the detection model is pre-trained, the accuracy of obstacle detection is effectively improved.
  • the steps of fusing the spatial feature information and the image feature information corresponding to multiple viewing angles to obtain the fused feature information include:
  • Step 302 Splice the spatial feature information and the image feature information corresponding to the multiple viewing angles according to preset parameters to obtain spliced feature information.
  • Step 304 Align the spliced feature information to a preset viewing angle according to the preset parameters to obtain aligned feature information, and use the aligned feature information as the fused feature information.
  • the computer device can perform dimensionality reduction on the point cloud feature information corresponding to each perspective according to the first resolution through the pooling layer of the corresponding feature extraction model, thereby obtaining the dimension-reduced spatial feature information, that is, the spatial feature information corresponding to the multiple perspectives.
  • the computer device can perform dimensionality reduction processing on the current frame image data according to the second resolution through the pooling layer of the corresponding feature extraction model to obtain the image feature information after the dimensionality reduction processing, that is, the image feature information corresponding to the current frame image data.
  • the preset parameter may be the coordinate conversion relationship between the point cloud data and the image data.
  • the computer device splices the spatial feature information corresponding to the bird's-eye view angle and the spatial feature information corresponding to the front view angle with the image feature information respectively according to preset parameters. After the computer device obtains the spliced feature information, it can align the spliced feature information to a preset viewing angle according to preset parameters.
  • the preset viewing angle may be a bird's-eye view.
  • the computer device then obtains the aligned feature information on the preset viewing angle, and uses the aligned feature information as the fused feature information.
  • the computer device stitches the spatial feature information and image feature information corresponding to the multiple viewing angles according to preset parameters, then aligns the stitched feature information to the preset viewing angle according to the preset parameters to obtain the aligned feature information, and uses the aligned feature information as the fused feature information.
  • The spatial feature information corresponding to the multiple viewing angles provides accurate 3D information but lacks color information, while the image feature information includes higher-resolution color information but lacks 3D information. By splicing and aligning the spatial feature information and the image feature information, complementary data are fused, so that performing obstacle detection based on the fused feature information can further improve the accuracy of obstacle detection.
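  • A minimal sketch of the splicing-and-alignment step is given below. The `project_front_to_bev` helper is hypothetical: a real system would use the calibration between the sensors (the preset parameters) to map features onto the preset bird's-eye view, whereas this sketch simply resamples to the BEV resolution; the tensor shapes match the assumptions of the earlier sketch.

```python
import torch
import torch.nn.functional as F

def project_front_to_bev(feat: torch.Tensor, bev_hw: tuple) -> torch.Tensor:
    """Hypothetical alignment of a feature map onto the bird's-eye grid.
    Stands in for a calibration-based coordinate transform (preset parameters)."""
    return F.interpolate(feat, size=bev_hw, mode="bilinear", align_corners=False)

def fuse_features(spatial_bev, spatial_front, image_feat):
    bev_hw = spatial_bev.shape[-2:]
    # Align every source onto the preset (bird's-eye) viewing angle ...
    front_on_bev = project_front_to_bev(spatial_front, bev_hw)
    image_on_bev = project_front_to_bev(image_feat, bev_hw)
    # ... then splice along the channel dimension to obtain the fused feature map.
    return torch.cat([spatial_bev, front_on_bev, image_on_bev], dim=1)

spatial_bev   = torch.randn(1, 32, 300, 300)  # assumed shapes
spatial_front = torch.randn(1, 32, 300, 300)
image_feat    = torch.randn(1, 32, 300, 300)
fused = fuse_features(spatial_bev, spatial_front, image_feat)  # (1, 96, 300, 300)
```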
  • the two-dimensional plane includes the two-dimensional data corresponding to each point in the current frame point cloud data, and performing feature extraction on the two-dimensional plane corresponding to each perspective to obtain the point cloud feature information includes: extracting multiple data dimensions from the two-dimensional data corresponding to each point in the current frame point cloud data; inputting the multiple data dimensions into the trained neural network model, and performing a prediction operation on the multiple data dimensions through the neural network model to obtain the point cloud feature information.
  • the computer device can extract multiple data dimensions from the two-dimensional data corresponding to each point in the point cloud data of the current frame. Multiple data dimensions may include the coordinates of points, reflectivity and other dimensions.
  • the trained neural network model is pre-stored in the computer equipment. The trained neural network model is obtained by training with a large amount of sample data.
  • the neural network model can be a pointnet based on the attention layer.
  • the neural network model can include multiple network layers.
  • the network layer may include an attention layer, a convolutional layer, and so on.
  • the computer device can input the extracted multiple data dimensions into the trained neural network model, and calculate the context vectors and weights corresponding to the multiple data dimensions through the attention layer of the neural network model.
  • the neural network model takes the context vector and weight as the input of the convolutional layer, and extracts the context features corresponding to the context vector through the convolutional layer.
  • the neural network model takes the context features and weights as the input of the pooling layer, and reduces the dimensionality of the context features through the pooling layer.
  • the output layer of the neural network model outputs the context features and weights after dimensionality reduction, and uses the context features after dimensionality reduction as point cloud feature information.
  • the computer device extracts multiple data dimensions from the two-dimensional data corresponding to each point in the point cloud data of the current frame, and performs prediction operations on the multiple data dimensions through a neural network model to obtain point cloud feature information. Since the neural network is pre-trained, the local feature information of each point in the current frame of point cloud data can be accurately extracted through the neural network model, which is beneficial to the subsequent extraction of spatial feature information of the current frame of point cloud data.
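  • A minimal sketch of this per-point feature extraction is shown below, assuming a small PointNet-style network with attention-weighted pooling over the points that fall in one grid cell; the layer sizes and the exact form of the attention are assumptions for illustration.

```python
import torch
import torch.nn as nn

class AttentionPointNet(nn.Module):
    """Per-point MLP followed by attention-weighted pooling over the points
    in one grid cell; the pooled vector is that cell's point cloud feature."""
    def __init__(self, in_dims: int = 4, feat_dims: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dims, 32), nn.ReLU(),
            nn.Linear(32, feat_dims), nn.ReLU(),
        )
        self.attn = nn.Linear(feat_dims, 1)  # scores each point in the cell

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (num_points_in_cell, in_dims), e.g. x, y, z, reflectivity
        feats = self.mlp(points)                           # per-point features
        weights = torch.softmax(self.attn(feats), dim=0)   # attention weights
        return (weights * feats).sum(dim=0)                # cell-level feature vector

model = AttentionPointNet()
cell_points = torch.randn(7, 4)      # 7 hypothetical points in one grid cell
cell_feature = model(cell_points)    # (16,) point cloud feature for this cell
```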
  • projecting the current frame point cloud data onto multiple viewing angles to obtain the two-dimensional planes corresponding to the multiple viewing angles includes: projecting the current frame point cloud data onto the bird's-eye view angle to obtain the two-dimensional plane corresponding to the bird's-eye view, and projecting the current frame point cloud data onto the front view angle to obtain the two-dimensional plane corresponding to the front view.
  • the computer device can project the point cloud data of the current frame to multiple perspectives.
  • the coordinates of the point cloud data of the current frame can be expressed as (x, y, z).
  • Computer equipment can project with preset resolutions, and the multiple viewing angles may include a bird's-eye view and a front view. For example, during bird's-eye projection, when the preset resolution is 0.1 m per grid cell, the current frame point cloud data within the range of -60 ≤ x ≤ 60 and -60 ≤ y ≤ 60 (in meters) can be projected into a two-dimensional plane with a size of 1200 × 1200.
  • Current frame point cloud data within 0.05 m of a grid cell center, that is, within half a grid cell, will all fall on the corresponding grid cell.
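  • A minimal sketch of the bird's-eye projection under the stated assumptions (0.1 m cells, a ±60 m range, and x, y, z point coordinates) follows; the choice of maximum height as the per-cell value is an illustrative assumption.

```python
import numpy as np

def project_to_bev(points: np.ndarray,
                   x_range=(-60.0, 60.0),
                   y_range=(-60.0, 60.0),
                   resolution=0.1) -> np.ndarray:
    """Project (N, 3) points (x, y, z) onto a bird's-eye grid.
    Each cell stores the maximum height of the points that fall in it
    (a simple illustrative per-cell feature); empty cells stay 0."""
    width  = int((x_range[1] - x_range[0]) / resolution)   # 1200
    height = int((y_range[1] - y_range[0]) / resolution)   # 1200
    bev = np.zeros((height, width), dtype=np.float32)

    # Keep only points inside the projection range.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]

    # Map metric coordinates to grid indices (0.1 m per cell).
    cols = ((pts[:, 0] - x_range[0]) / resolution).astype(np.int32)
    rows = ((pts[:, 1] - y_range[0]) / resolution).astype(np.int32)
    np.maximum.at(bev, (rows, cols), pts[:, 2])
    return bev

cloud = np.random.uniform(-60, 60, size=(10000, 3)).astype(np.float32)  # fake frame
bev_plane = project_to_bev(cloud)   # (1200, 1200) two-dimensional plane
```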
  • Through the bird's-eye view, the computer device can obtain an unobstructed and intuitive view of obstacles, avoiding the problem of inaccurate point cloud feature information caused by occluding objects.
  • Through the front view, the computer device can describe the shape of smaller targets, such as pedestrians, more intuitively. This is beneficial for the computer equipment to extract more comprehensive and accurate effective feature information from the two-dimensional planes corresponding to the multiple viewing angles.
  • the detection model includes a plurality of network layers, and performing a prediction operation on the fused feature information through the detection model and outputting an obstacle detection result includes: inputting the fused feature information into the input layer of the detection model;
  • inputting the fused feature information to the attention layer of the detection model through the input layer, and calculating the context vector and weight corresponding to the fused feature information through the attention layer to generate a first extraction result;
  • inputting the first extraction result to the convolutional layer, and extracting the context feature corresponding to the context vector through the convolutional layer to generate a second extraction result; inputting the second extraction result into the pooling layer, and performing dimensionality reduction on the second extraction result through the pooling layer;
  • inputting the dimension-reduced second extraction result to the fully connected layer, classifying it through the fully connected layer to obtain classification results, and weighting and outputting the classification results through the output layer; and
  • selecting the classification result with the largest weight among the weighted classification results as the obstacle detection result.
  • the trained detection model is pre-stored in the computer equipment.
  • the detection model may be a detection model obtained after pre-training with a large amount of sample data.
  • the detection model may be a 2D convolutional neural network based on the attention layer.
  • the detection model can include multiple network layers.
  • the detection model may include multiple network layers such as an input layer, an attention layer, a convolutional layer, a pooling layer, a fully connected layer, and an output layer.
  • the computer device inputs the fused feature information to the input layer of the detection model by calling the trained detection model.
  • the fused feature information is transmitted to the attention layer through the input layer, and the context vector and weight corresponding to the fused feature information are calculated through the attention layer, and the first extraction result is generated according to the context vector and weight.
  • the detection model uses the first extraction result as the input of the convolution layer, extracts the context feature corresponding to the context vector through the convolution layer, and generates the second extraction result according to the context feature and weight. Furthermore, the computer device uses the second extraction result as the input of the pooling layer, and performs dimensionality reduction processing on the second extraction result through the pooling layer to obtain the second extraction result after the dimensionality reduction processing. The computer device uses the second extraction result after the dimensionality reduction processing as the input of the fully connected layer, and classifies the second extraction result after the dimensionality reduction processing to obtain the classification result.
  • the classification result can include multiple categories of obstacles, multiple location information, and so on. Furthermore, the classification results are weighted and output through the output layer. Furthermore, the computer device selects the classification result with the largest weight among the weighted classification results, which is the obstacle detection result. Obstacle detection results may include the location information of the obstacle, the size of the obstacle, the shape of the obstacle, and so on.
  • the computer device calculates the context vector and weight corresponding to the fused feature information through the attention layer of the detection model, and generates the first extraction result. It can filter the interference information in the fused feature information, and realize the feature focus processing on the fused feature information.
  • the context feature corresponding to the context vector is extracted through the convolutional layer to generate the second extraction result, and the second extraction result is reduced by the pooling layer, which can extract the main context features and avoid the influence of redundant features.
  • the computer device classifies the second extraction result after dimensionality reduction to obtain the classification result, and then weights the classification result and outputs it.
  • the classification result with the largest weight among the weighted classification results is selected as the obstacle detection result, which normalizes the classification results and further improves the accuracy of obstacle detection.
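  • A minimal sketch of such a detection head is given below, assuming a simple per-location attention weighting over the fused feature map followed by convolution, pooling, and a fully connected classifier with softmax-weighted output; all layer sizes and the number of classes are assumptions.

```python
import torch
import torch.nn as nn

class DetectionHead(nn.Module):
    """Attention -> convolution -> pooling -> fully connected -> weighted output."""
    def __init__(self, in_channels: int = 96, num_classes: int = 5):
        super().__init__()
        self.attention = nn.Conv2d(in_channels, 1, kernel_size=1)  # per-location weight
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d((8, 8))                   # dimensionality reduction
        self.fc = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        weights = torch.sigmoid(self.attention(fused))  # attention weights
        attended = fused * weights                       # feature focusing (first extraction)
        feats = self.conv(attended)                      # context features (second extraction)
        pooled = self.pool(feats).flatten(1)
        logits = self.fc(pooled)
        return torch.softmax(logits, dim=1)              # weighted classification scores

head = DetectionHead()
fused = torch.randn(1, 96, 300, 300)                     # fused feature map (assumed shape)
scores = head(fused)
detection = scores.argmax(dim=1)                         # class with the largest weight
```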
  • the two-dimensional plane includes multiple pixels, and each pixel corresponds to the two-dimensional data of multiple points in the point cloud data of the current frame.
  • Before feature extraction is performed on the two-dimensional plane corresponding to each view angle, the method further includes: averaging the two-dimensional data of the multiple points corresponding to each pixel to obtain an average value; and normalizing the points corresponding to the corresponding pixel according to the average value.
  • the computer device may also perform normalization processing on multiple points in the current frame point cloud data in the two-dimensional plane.
  • the two-dimensional plane includes multiple pixels, and each pixel may be represented by a grid, and each grid includes multiple points in the point cloud data of the current frame.
  • the computer equipment averages the coordinates of the multiple points in each grid cell to obtain an average value. It then subtracts the average value from the coordinates of each point in the cell, thereby normalizing the current frame point cloud data within each grid cell.
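  • A minimal sketch of this per-cell normalization follows, assuming points carry x, y, z coordinates and that cell assignment uses the same grid resolution as the projection sketch above; the grouping scheme is a simplification for illustration.

```python
import numpy as np

def normalize_per_cell(points: np.ndarray, resolution: float = 0.1) -> np.ndarray:
    """Subtract the per-cell mean from each point's coordinates.
    points: (N, 3) array of x, y, z; cells are resolution x resolution in x/y."""
    cells = np.floor(points[:, :2] / resolution).astype(np.int64)
    # Build one integer key per cell so points in the same cell can be grouped.
    keys = cells[:, 0] * 1_000_000 + cells[:, 1]
    normalized = points.copy()
    for key in np.unique(keys):
        mask = keys == key
        normalized[mask] -= points[mask].mean(axis=0)  # difference with the cell average
    return normalized

cloud = np.random.uniform(-60, 60, size=(1000, 3))
cloud_normalized = normalize_per_cell(cloud)
```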
  • Performing feature extraction on the two-dimensional plane corresponding to each perspective to obtain the point cloud feature information corresponding to each perspective includes: invoking multiple threads to concurrently extract the point cloud feature information corresponding to each perspective from the two-dimensional plane corresponding to each perspective.
  • Before inputting the point cloud feature information corresponding to each perspective and the current frame image data into the corresponding feature extraction models, the method further includes: using multiple threads to convert, in parallel, the point cloud feature information corresponding to each perspective and the current frame image data to obtain the point cloud feature vector corresponding to each view angle and the image matrix corresponding to the current frame image data.
  • the computer device uses multiple threads to concurrently extract the point cloud feature information corresponding to each perspective from the two-dimensional plane corresponding to each perspective, thereby improving the extraction efficiency of the point cloud feature information.
  • Before inputting the point cloud feature information corresponding to each perspective and the current frame image data into the corresponding feature extraction models, the computer device can also use multiple threads to convert them in parallel; this parallel conversion effectively reduces the time consumed by the feature extraction models during feature extraction.
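  • A minimal sketch of this multi-threaded conversion is shown below; `to_feature_vector` and `to_image_matrix` are hypothetical placeholder conversions, not functions described in this application.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def to_feature_vector(view_features: np.ndarray) -> np.ndarray:
    """Placeholder conversion of one view's point cloud features to a vector layout."""
    return np.ascontiguousarray(view_features, dtype=np.float32)

def to_image_matrix(image: np.ndarray) -> np.ndarray:
    """Placeholder conversion of the current frame image data to an image matrix."""
    return image.astype(np.float32) / 255.0

bev_feat   = np.random.rand(1200, 1200, 8)
front_feat = np.random.rand(64, 1200, 8)
image      = np.random.randint(0, 255, size=(720, 1280, 3), dtype=np.uint8)

# Run the three conversions in parallel threads before feeding the extraction models.
with ThreadPoolExecutor(max_workers=3) as pool:
    bev_future   = pool.submit(to_feature_vector, bev_feat)
    front_future = pool.submit(to_feature_vector, front_feat)
    image_future = pool.submit(to_image_matrix, image)
    bev_vec, front_vec, image_mat = (bev_future.result(),
                                     front_future.result(),
                                     image_future.result())
```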
  • the computer device may also obtain historical trajectory information of multiple obstacles in the current environment according to the obstacle detection result, and at the same time obtain the current position information of the vehicle. Predict the trajectory of multiple obstacles within a preset time period based on historical trajectory information and current position information.
  • the computer device tracks the movement of each obstacle in the obstacle detection results, predicts the position information at the current time based on the obstacle's position at the previous time, and compares the predicted current position information with the actual position information to obtain error information. The computer device corrects the position information at the next moment according to the error information, thereby obtaining the historical trajectory information of multiple obstacles.
  • the computer equipment can obtain the current position information sent by the vehicle-mounted locator. The computer device can then render the acquired historical trajectory information of multiple obstacles into a feature map to obtain a trajectory rendering map.
  • the historical trajectory information may be the trajectory of each frame of the history of multiple obstacles.
  • the computer device renders the historical trajectory information of multiple obstacles in the current frame to obtain a trajectory rendering map.
  • the color of obstacles in each frame in the trajectory rendering diagram changes with the distance from the current frame. The farther away from the current frame, the lighter the color of the obstacle.
  • the obstacle itself and the surrounding environment information can be obtained, and the influence factors of the trajectory can be considered from various aspects, which is more conducive to improving the accuracy of trajectory prediction.
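  • A minimal sketch of such a trajectory rendering follows, assuming obstacle positions are already expressed as grid coordinates per historical frame and representing "lighter color" for older frames as lower intensity; these representation choices are illustrative assumptions.

```python
import numpy as np

def render_trajectories(history, grid_hw=(600, 600)):
    """history: list of frames, oldest first; each frame is an (M, 2) array of
    obstacle (row, col) grid positions. Older frames are drawn with lower intensity."""
    canvas = np.zeros(grid_hw, dtype=np.float32)
    num_frames = len(history)
    for age, frame in enumerate(history):
        intensity = (age + 1) / num_frames      # most recent frame -> 1.0, oldest -> faint
        rows, cols = frame[:, 0], frame[:, 1]
        canvas[rows, cols] = np.maximum(canvas[rows, cols], intensity)
    return canvas

history = [np.array([[300 + t, 300 + 2 * t]]) for t in range(10)]  # one obstacle moving
trajectory_map = render_trajectories(history)
```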
  • the computer equipment obtains the current position information collected by the vehicle-mounted locator.
  • the current location information may be the location information of the vehicle on the high-precision map at the current moment.
  • the current location information can be expressed in the form of latitude and longitude.
  • the computer equipment extracts map elements from the current location information. Map elements can include information such as lane lines, center lines, sidewalks, and stop lines.
  • the computer device may render the extracted map elements according to multiple channel dimensions, and render the map elements into a map element rendering map corresponding to the channel dimensions. When the map elements are different, the channel dimensions corresponding to the map elements can also be different.
  • Channel dimensions can include color channels, element channels, and so on.
  • the color channel can include three channels of red, green, and blue.
  • Element channels can include lane-line channels, center-line channels, and sidewalk channels. The current position of the obstacle can be rendered intuitively and accurately through the channel dimension corresponding to the map element, which is conducive to subsequent trajectory prediction.
  • the trajectory rendering image and the map element rendering image can be stitched together.
  • the computer device determines the corresponding channel dimensions of the trajectory rendering map and the map element rendering map, and performs image stitching on the trajectory rendering map and the map element rendering map in the corresponding channel dimensions to obtain a spliced image matrix.
  • the spliced image matrix may be a complete image including the trajectory rendering map and the map element rendering map.
  • the computer device has pre-trained a feature extractor before acquiring the historical trajectory information and current position information of multiple obstacles in the current environment.
  • the computer device calls the trained feature extractor, and inputs the spliced image matrix into the trained feature extractor.
  • the computer device extracts the image feature information and context feature information corresponding to the spliced image matrix through the feature extractor, and then outputs the feature extraction result corresponding to the spliced image matrix through the fully connected layer of the feature extractor. It realizes the combination of various influence factors of the obstacle trajectory, and further improves the comprehensiveness of the feature extraction results.
  • the computer equipment can perform calculation on the feature extraction results by means of regression prediction to obtain the trajectories of multiple obstacles within a preset time period. Because the obstacle detection result is comprehensive and accurate, and the feature extraction result includes the trajectories of multiple obstacles in the historical frames, the scope of environmental information is expanded and trajectory prediction based on various influencing factors is realized, thereby effectively improving the accuracy of the trajectory prediction.
  • an obstacle detection device based on unmanned driving technology including: an acquisition module 402, a projection module 404, a first extraction module 406, a second extraction module 408, The fusion module 410 and the prediction module 412, where:
  • the acquiring module 402 is used to acquire the point cloud data of the current frame and the image data of the current frame within a preset angle range.
  • the projection module 404 is configured to project the point cloud data of the current frame on multiple viewing angles to obtain two-dimensional planes corresponding to the multiple viewing angles.
  • the first extraction module 406 is configured to perform feature extraction on the two-dimensional plane corresponding to each view angle to obtain point cloud feature information corresponding to each view angle.
  • the second extraction module 408 is used to input the point cloud feature information and current frame image data corresponding to each perspective into the corresponding feature extraction model, and extract the spatial feature information corresponding to each perspective and the current frame through the corresponding feature extraction model in parallel. Image feature information corresponding to the frame image data.
  • the fusion module 410 is used for fusing spatial feature information and image feature information corresponding to multiple viewing angles to obtain fused feature information.
  • the prediction module 412 is configured to input the fused feature information into the trained detection model, perform prediction operations on the fused feature information through the detection model, and output obstacle detection results.
  • the fusion module 410 is further configured to splice the spatial characteristic information and image characteristic information corresponding to the multiple viewing angles according to preset parameters to obtain the spliced characteristic information; according to the preset parameters, the spliced characteristic information The alignment is performed to the preset viewing angle to obtain the aligned feature information, and the aligned feature information is used as the fused feature information.
  • the first extraction module 406 is also used to extract multiple data dimensions from the two-dimensional data corresponding to each point in the current frame point cloud data, input the multiple data dimensions into the trained neural network model, and perform prediction operations on the multiple data dimensions through the neural network model to obtain the point cloud feature information.
  • the projection module 404 is also used to project the current frame point cloud data onto the bird's-eye view angle to obtain a two-dimensional plane corresponding to the bird's-eye view, and to project the current frame point cloud data onto the front view angle to obtain a two-dimensional plane corresponding to the front view.
  • the prediction module 412 is also used to: input the fused feature information to the input layer of the detection model; input the fused feature information to the attention layer of the detection model through the input layer, and calculate the context vector and weight corresponding to the fused feature information through the attention layer to generate a first extraction result; input the first extraction result to the convolutional layer, and extract the context feature corresponding to the context vector through the convolutional layer to generate a second extraction result; input the second extraction result to the pooling layer, and perform dimensionality reduction on the second extraction result through the pooling layer; input the dimension-reduced second extraction result into the fully connected layer, classify it through the fully connected layer to obtain classification results, and weight and output the classification results through the output layer; and select the classification result with the largest weight among the weighted classification results as the obstacle detection result.
  • the above-mentioned device further includes a normalization processing module for averaging the two-dimensional data of the multiple points corresponding to each pixel to obtain an average value, and normalizing the points corresponding to the corresponding pixel according to the average value.
  • the first extraction module 406 is also used to call multiple threads to concurrently extract the point cloud feature information corresponding to each perspective in the two-dimensional plane corresponding to each perspective; the above-mentioned device further includes: a conversion module for The point cloud feature information corresponding to each view angle and the current frame image data are converted in parallel by using multiple threads to obtain the point cloud feature vector corresponding to each view angle and the image matrix corresponding to the current frame image data.
  • the various modules in the above-mentioned obstacle detection device based on unmanned driving technology can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a computer device is provided, and its internal structure diagram may be as shown in FIG. 5.
  • the computer equipment includes a processor, a memory, a communication interface and a database connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used to store obstacle detection results.
  • the communication interface of the computer device is used to connect and communicate with the first vehicle-mounted sensor, the second vehicle-mounted sensor, and the vehicle-mounted positioning sensor.
  • the computer readable instruction is executed by the processor to realize an obstacle detection method.
  • FIG. 5 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • the specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • a computer device that includes a memory and one or more processors.
  • the memory stores computer-readable instructions.
  • when the computer-readable instructions are executed by the one or more processors, the one or more processors execute the steps in each of the foregoing method embodiments.
  • One or more non-volatile computer-readable storage media storing computer-readable instructions.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors execute the steps in each of the foregoing method embodiments.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

An obstacle detection method based on unmanned driving technology comprises the steps of: obtaining current frame point cloud data and current frame image data within a preset angle range (202); projecting the current frame point cloud data onto a plurality of viewing angles to obtain two-dimensional planes corresponding to the plurality of viewing angles (204); performing feature extraction on the two-dimensional plane corresponding to each viewing angle to obtain point cloud feature information corresponding to each viewing angle (206); inputting the point cloud feature information corresponding to each viewing angle and the current frame image data into a corresponding feature extraction model, and extracting, in parallel by means of the corresponding feature extraction model, spatial feature information corresponding to each viewing angle and image feature information corresponding to the current frame image data (208); fusing the spatial feature information corresponding to the plurality of viewing angles and the image feature information to obtain fused feature information (210); and inputting the fused feature information into a trained detection model, performing prediction calculation on the fused feature information by means of the detection model, and outputting an obstacle detection result (212).
PCT/CN2019/130155 2019-12-30 2019-12-30 Appareil et procédé de détection d'obstacle basés sur une technologie sans conducteur et dispositif informatique WO2021134325A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/130155 WO2021134325A1 (fr) 2019-12-30 2019-12-30 Appareil et procédé de détection d'obstacle basés sur une technologie sans conducteur et dispositif informatique
CN201980037716.XA CN113678136A (zh) 2019-12-30 2019-12-30 基于无人驾驶技术的障碍物检测方法、装置和计算机设备

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/130155 WO2021134325A1 (fr) 2019-12-30 2019-12-30 Appareil et procédé de détection d'obstacle basés sur une technologie sans conducteur et dispositif informatique

Publications (1)

Publication Number Publication Date
WO2021134325A1 (fr)

Family

ID=76686174

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130155 WO2021134325A1 (fr) 2019-12-30 2019-12-30 Appareil et procédé de détection d'obstacle basés sur une technologie sans conducteur et dispositif informatique

Country Status (2)

Country Link
CN (1) CN113678136A (fr)
WO (1) WO2021134325A1 (fr)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113535877A (zh) * 2021-07-16 2021-10-22 上海高仙自动化科技发展有限公司 智能机器人地图的更新方法、装置、设备、介质及芯片
CN113724393A (zh) * 2021-08-12 2021-11-30 北京达佳互联信息技术有限公司 三维重建方法、装置、设备及存储介质
CN114173106A (zh) * 2021-12-01 2022-03-11 北京拙河科技有限公司 基于光场相机的实时视频流融合处理方法与系统
CN114419372A (zh) * 2022-01-13 2022-04-29 南京邮电大学 一种多尺度点云分类方法及系统
CN114429631A (zh) * 2022-01-27 2022-05-03 北京百度网讯科技有限公司 三维对象检测方法、装置、设备以及存储介质
CN114511717A (zh) * 2022-02-17 2022-05-17 广发银行股份有限公司 背景相似识别的方法、系统、设备及介质
CN114584850A (zh) * 2022-03-09 2022-06-03 合肥工业大学 一种面向点云视频流媒体传输的用户视角预测方法
CN114743001A (zh) * 2022-04-06 2022-07-12 合众新能源汽车有限公司 语义分割方法、装置、电子设备及存储介质
CN114913373A (zh) * 2022-05-12 2022-08-16 苏州轻棹科技有限公司 一种基于图像点云对序列的分类方法和装置
CN114972165A (zh) * 2022-03-24 2022-08-30 中山大学孙逸仙纪念医院 一种时间平均剪切力的测量方法和装置
CN115471561A (zh) * 2022-11-14 2022-12-13 科大讯飞股份有限公司 对象关键点定位方法、清洁机器人控制方法及相关设备
CN117456002A (zh) * 2023-12-22 2024-01-26 珠海市格努科技有限公司 无序抓取过程中对象的位姿估计方法、装置和电子设备

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115761702B (zh) * 2022-12-01 2024-02-02 广汽埃安新能源汽车股份有限公司 车辆轨迹生成方法、装置、电子设备和计算机可读介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140340518A1 (en) * 2013-05-20 2014-11-20 Nidec Elesys Corporation External sensing device for vehicle, method of correcting axial deviation and recording medium
CN107966700A (zh) * 2017-11-20 2018-04-27 天津大学 一种用于无人驾驶汽车的前方障碍物检测系统及方法
CN108269281A (zh) * 2016-12-30 2018-07-10 无锡顶视科技有限公司 基于双目视觉的避障技术方法
CN109145677A (zh) * 2017-06-15 2019-01-04 百度在线网络技术(北京)有限公司 障碍物检测方法、装置、设备及存储介质
US20190220650A1 (en) * 2015-08-24 2019-07-18 Qualcomm Incorporated Systems and methods for depth map sampling
CN110371108A (zh) * 2019-06-14 2019-10-25 浙江零跑科技有限公司 车载超声波雷达与车载环视系统融合方法
CN110488805A (zh) * 2018-05-15 2019-11-22 武汉小狮科技有限公司 一种基于3d立体视觉的无人车避障系统及方法

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229366B (zh) * 2017-12-28 2021-12-14 北京航空航天大学 基于雷达和图像数据融合的深度学习车载障碍物检测方法
CN109948661B (zh) * 2019-02-27 2023-04-07 江苏大学 一种基于多传感器融合的3d车辆检测方法
CN110045729B (zh) * 2019-03-12 2022-09-13 北京小马慧行科技有限公司 一种车辆自动驾驶方法及装置
CN110223223A (zh) * 2019-04-28 2019-09-10 北京清城同衡智慧园高新技术研究院有限公司 街道扫描方法、装置及扫描仪
CN110363820B (zh) * 2019-06-28 2023-05-16 东南大学 一种基于激光雷达、图像前融合的目标检测方法
CN110458112B (zh) * 2019-08-14 2020-11-20 上海眼控科技股份有限公司 车辆检测方法、装置、计算机设备和可读存储介质

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140340518A1 (en) * 2013-05-20 2014-11-20 Nidec Elesys Corporation External sensing device for vehicle, method of correcting axial deviation and recording medium
US20190220650A1 (en) * 2015-08-24 2019-07-18 Qualcomm Incorporated Systems and methods for depth map sampling
CN108269281A (zh) * 2016-12-30 2018-07-10 无锡顶视科技有限公司 基于双目视觉的避障技术方法
CN109145677A (zh) * 2017-06-15 2019-01-04 百度在线网络技术(北京)有限公司 障碍物检测方法、装置、设备及存储介质
CN107966700A (zh) * 2017-11-20 2018-04-27 天津大学 一种用于无人驾驶汽车的前方障碍物检测系统及方法
CN110488805A (zh) * 2018-05-15 2019-11-22 武汉小狮科技有限公司 一种基于3d立体视觉的无人车避障系统及方法
CN110371108A (zh) * 2019-06-14 2019-10-25 浙江零跑科技有限公司 车载超声波雷达与车载环视系统融合方法

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113535877A (zh) * 2021-07-16 2021-10-22 上海高仙自动化科技发展有限公司 智能机器人地图的更新方法、装置、设备、介质及芯片
CN113724393A (zh) * 2021-08-12 2021-11-30 北京达佳互联信息技术有限公司 三维重建方法、装置、设备及存储介质
CN113724393B (zh) * 2021-08-12 2024-03-19 北京达佳互联信息技术有限公司 三维重建方法、装置、设备及存储介质
CN114173106B (zh) * 2021-12-01 2022-08-05 北京拙河科技有限公司 基于光场相机的实时视频流融合处理方法与系统
CN114173106A (zh) * 2021-12-01 2022-03-11 北京拙河科技有限公司 基于光场相机的实时视频流融合处理方法与系统
CN114419372A (zh) * 2022-01-13 2022-04-29 南京邮电大学 一种多尺度点云分类方法及系统
CN114429631A (zh) * 2022-01-27 2022-05-03 北京百度网讯科技有限公司 三维对象检测方法、装置、设备以及存储介质
CN114429631B (zh) * 2022-01-27 2023-11-14 北京百度网讯科技有限公司 三维对象检测方法、装置、设备以及存储介质
CN114511717A (zh) * 2022-02-17 2022-05-17 广发银行股份有限公司 背景相似识别的方法、系统、设备及介质
CN114584850A (zh) * 2022-03-09 2022-06-03 合肥工业大学 一种面向点云视频流媒体传输的用户视角预测方法
CN114584850B (zh) * 2022-03-09 2023-08-25 合肥工业大学 一种面向点云视频流媒体传输的用户视角预测方法
CN114972165A (zh) * 2022-03-24 2022-08-30 中山大学孙逸仙纪念医院 一种时间平均剪切力的测量方法和装置
CN114972165B (zh) * 2022-03-24 2024-03-15 中山大学孙逸仙纪念医院 一种时间平均剪切力的测量方法和装置
CN114743001A (zh) * 2022-04-06 2022-07-12 合众新能源汽车有限公司 语义分割方法、装置、电子设备及存储介质
CN114913373A (zh) * 2022-05-12 2022-08-16 苏州轻棹科技有限公司 一种基于图像点云对序列的分类方法和装置
CN114913373B (zh) * 2022-05-12 2024-04-09 苏州轻棹科技有限公司 一种基于图像点云对序列的分类方法和装置
CN115471561A (zh) * 2022-11-14 2022-12-13 科大讯飞股份有限公司 对象关键点定位方法、清洁机器人控制方法及相关设备
CN117456002A (zh) * 2023-12-22 2024-01-26 珠海市格努科技有限公司 无序抓取过程中对象的位姿估计方法、装置和电子设备
CN117456002B (zh) * 2023-12-22 2024-04-02 珠海市格努科技有限公司 无序抓取过程中对象的位姿估计方法、装置和电子设备

Also Published As

Publication number Publication date
CN113678136A (zh) 2021-11-19

Similar Documents

Publication Publication Date Title
WO2021134325A1 (fr) Appareil et procédé de détection d'obstacle basés sur une technologie sans conducteur et dispositif informatique
JP6031554B2 (ja) 単眼カメラに基づく障害物検知方法及び装置
CN111797650B (zh) 障碍物的识别方法、装置、计算机设备和存储介质
EP3617944A1 (fr) Procédé et appareil de reconnaissance d'objet, dispositif, véhicule et support
CN111507166A (zh) 通过一起使用照相机和雷达来学习cnn的方法及装置
EP3822852B1 (fr) Méthode, appareil, support d'enregistrement de données lisible par ordinateur et produit programme d'ordinateur de formation d'un modèle de planification de trajectoire
CN111563415A (zh) 一种基于双目视觉的三维目标检测系统及方法
CN111461221B (zh) 一种面向自动驾驶的多源传感器融合目标检测方法和系统
CN110796104A (zh) 目标检测方法、装置、存储介质及无人机
CN115004259B (zh) 对象识别方法、装置、计算机设备和存储介质
CN116469079A (zh) 一种自动驾驶bev任务学习方法及相关装置
WO2021134357A1 (fr) Procédé et appareil de traitement d'informations de perception, dispositif informatique et support de stockage
Liu et al. Vehicle-related distance estimation using customized YOLOv7
CN115701864A (zh) 神经网络训练方法、目标检测方法、设备、介质及产品
EP4040400A1 (fr) Inspection guidée avec modèles de reconnaissance d'objet et planification de navigation
CN113160272B (zh) 目标跟踪方法、装置、电子设备及存储介质
US20230401748A1 (en) Apparatus and methods to calibrate a stereo camera pair
CN116894829A (zh) 焊缝缺陷检测的方法、装置、计算机设备及存储介质
CN116681739A (zh) 目标运动轨迹生成方法、装置及电子设备
CN116403186A (zh) 基于FPN Swin Transformer与Pointnet++ 的自动驾驶三维目标检测方法
Unger et al. Multi-camera bird’s eye view perception for autonomous driving
CN111460854A (zh) 一种远距离目标检测方法、装置及系统
CN112433193B (zh) 一种基于多传感器的模位置定位方法及系统
Zhang et al. A vision-centric approach for static map element annotation
CN115249407A (zh) 指示灯状态识别方法、装置、电子设备、存储介质及产品

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19958320

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19958320

Country of ref document: EP

Kind code of ref document: A1