CN117784168A - Construction area sensing method, device, equipment and storage medium in automatic driving

Construction area sensing method, device, equipment and storage medium in automatic driving

Info

Publication number
CN117784168A
Authority
CN
China
Prior art keywords
grid
image
points
current frame
point
Prior art date
Legal status
Pending
Application number
CN202311782491.5A
Other languages
Chinese (zh)
Inventor
王文斌
黄先楼
张丹
Current Assignee
Uisee Technologies Beijing Co Ltd
Original Assignee
Uisee Technologies Beijing Co Ltd
Priority date
Filing date
Publication date
Application filed by Uisee Technologies Beijing Co Ltd
Priority to CN202311782491.5A
Publication of CN117784168A

Landscapes

  • Image Analysis (AREA)

Abstract

The disclosure relates to the technical field of automatic driving, and discloses a construction area sensing method, device, equipment and storage medium in automatic driving. The method comprises the following steps: acquiring point cloud data of the current frame in the vehicle coordinate system; inputting the point cloud data into a semantic segmentation model to obtain a semantic label for each point in the point cloud data; converting the point cloud data containing the semantic labels into point cloud data under the bird's-eye view; rasterizing the point cloud data under the bird's-eye view to obtain a current frame raster image and a pixel value corresponding to each grid in the current frame raster image; binarizing the current frame raster image based on the pixel value corresponding to each grid; and determining a construction area image based on the binarized current frame raster image. According to the embodiments of the disclosure, construction area points are classified as background points through the semantic labels, so that the construction area is judged to be a static obstacle, the risk of misjudging the construction area as a dynamic target is reduced, and the safety and stability of driving are improved.

Description

Construction area sensing method, device, equipment and storage medium in automatic driving
Technical Field
The disclosure relates to the technical field of automatic driving, and in particular to a construction area sensing method, device, equipment and storage medium in automatic driving.
Background
A scenario often encountered during unmanned driving is the presence of a construction area on the vehicle's route. Such areas are usually temporary and are generally not updated in real time in the high-precision maps used for unmanned driving. To improve the safety and smoothness of unmanned operation, real-time perception of the attributes of such areas is of great significance. However, in the related art, a perception framework centered on lidar cannot accurately perceive construction areas, which affects the safety and smoothness of driving to some extent.
Disclosure of Invention
In order to solve the above technical problems, or at least partially solve them, embodiments of the present disclosure provide a construction area sensing method, device, equipment, and storage medium in automatic driving, which can accurately perceive construction areas encountered during unmanned driving and can effectively improve the safety and smoothness of unmanned driving.
In a first aspect, an embodiment of the present disclosure provides a construction area sensing method in automatic driving, the method including:
Acquiring point cloud data of a current frame under a vehicle coordinate system;
inputting the point cloud data into a semantic segmentation model to obtain semantic labels of each point in the point cloud data;
converting the point cloud data containing the semantic labels into point cloud data under the bird's-eye view;
performing rasterization processing on the point cloud data under the bird's-eye view to obtain a current frame raster image and a pixel value corresponding to each grid in the current frame raster image; the pixel value corresponding to a grid is determined according to the proportions of the semantic label categories of the points in the grid, wherein the semantic label categories include foreground points, background points and construction area points, and construction area points belong to the background points;
performing binarization processing on the current frame raster image based on the pixel value corresponding to each grid;
and determining a construction area image based on the binarized current frame raster image.
In a second aspect, an embodiment of the present disclosure further provides an apparatus for sensing a construction area in automatic driving, the apparatus including:
the data acquisition module is used for acquiring point cloud data of a current frame under a vehicle coordinate system;
the semantic tag module is used for inputting the point cloud data into the semantic segmentation model to obtain semantic tags of each point in the point cloud data;
the coordinate conversion module is used for converting the point cloud data containing the semantic labels into point cloud data under the bird's-eye view;
the rasterization processing module is used for rasterizing the point cloud data under the bird's-eye view to obtain a current frame raster image and a pixel value corresponding to each grid in the current frame raster image; the pixel value corresponding to a grid is determined according to the proportions of the semantic label categories of the points in the grid, wherein the semantic label categories include foreground points, background points and construction area points, and construction area points belong to the background points;
the binarization module is used for carrying out binarization processing on the grid image of the current frame based on the pixel values corresponding to the grids;
and the construction area determining module is used for determining a construction area image based on the binarized current frame grid image.
In a third aspect, embodiments of the present disclosure further provide an electronic device, including: one or more processors; a storage means for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the construction area sensing method in automatic driving as described above.
In a fourth aspect, embodiments of the present disclosure further provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the construction area sensing method in automatic driving as described above.
According to the construction area sensing method in automatic driving provided by the embodiments of the present disclosure, point cloud data containing semantic labels is converted into point cloud data under the bird's-eye view; the point cloud data under the bird's-eye view is rasterized to obtain a current frame raster image and a pixel value corresponding to each grid in the current frame raster image; the current frame raster image is binarized based on the pixel values corresponding to the grids; and a construction area image is determined based on the binarized current frame raster image. The pixel value corresponding to each grid is determined according to the proportions of the semantic label categories of the points in the grid, and construction area points are classified as background points, so that once the construction area is detected and perceived, it is judged to be a static obstacle; this reduces the risk of misjudging the construction area as a dynamic target and effectively improves the stability of driving.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow chart of a construction area sensing method in automatic driving in an embodiment of the present disclosure;
FIG. 2 is a schematic illustration of a construction area point cloud image in an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a construction area skeleton diagram in an embodiment of the present disclosure;
FIG. 4 is a flowchart of a grid-corresponding pixel value calculation process in an embodiment of the disclosure;
FIG. 5 is a flow chart of extracting keypoints in an embodiment of the disclosure;
FIG. 6 is a schematic diagram of a construction zone key point in an embodiment of the present disclosure;
FIG. 7 is a schematic structural view of a construction area sensing device in automatic driving according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of an electronic device in an embodiment of the disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the accompanying drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
As described in the Background section, in a perception framework centered on lidar there is the problem that construction areas cannot be accurately perceived, which affects the safety and smoothness of driving to some extent.
Specifically, in a perception framework centered on lidar, existing technical solutions generally treat the construction area as a common obstacle.
Treating the construction area as a common obstacle has two drawbacks. On one hand, dynamic/static false detection may occur, and part of the construction area may be detected as a dynamic target, affecting the smoothness of driving. On the other hand, because the semantic information related to the construction area cannot be accurately perceived, planning and control cannot take the necessary measures for the construction area (such as lane changing, obstacle avoidance, and early deceleration), which affects the safety and smoothness of driving to some extent.
To address these problems, embodiments of the present disclosure provide a construction area sensing method in automatic driving, which can reduce the risk of a construction area being misjudged as a dynamic target and effectively improve the stability of driving.
Fig. 1 is a flowchart of a construction area sensing method in automatic driving in an embodiment of the present disclosure. The method can be performed by a construction area sensing device in automatic driving, which can be implemented in software and/or hardware and deployed in an electronic device. As shown in fig. 1, the method may specifically include steps S110 to S160.
In S110, point cloud data of a current frame in a vehicle coordinate system is acquired.
The point cloud data may be position data (position information) of points (sampling points) on the outer surface of the object within a certain range.
The point cloud data can be obtained by detecting the surrounding environment in real time with a point cloud measurement device (such as a lidar). In an automatic driving scenario, point cloud data may be collected by a vehicle-mounted point cloud measurement device provided on the vehicle. The vehicle-mounted point cloud measurement device is a point cloud measurement device arranged on a vehicle (such as an automatic driving vehicle, in particular one with an automation level of L3 or above), and can collect raw point cloud data around the vehicle in real time.
In some embodiments, the vehicle-mounted point cloud measurement device may be a vehicle-mounted lidar. The vehicle-mounted lidar is one type of vehicle-mounted point cloud measurement device; it scans a certain range with emitted laser beams (i.e., sequentially emits laser beams in different directions within that range), and meanwhile continuously detects the laser echoes reflected by objects to determine the position of the closest object in each corresponding direction (i.e., the direction opposite to that of the reflected echo), thereby obtaining point cloud data. The point cloud data therefore comprises the positions, relative to the lidar, of a plurality of points (sampling points) on object surfaces.
In some embodiments, the vehicle-mounted lidar may be a single-line, 4-line, 16-line, 32-line, 64-line, or 128-line lidar, where the "line count" represents the number of laser beams that the vehicle-mounted lidar can emit simultaneously. Different laser beams may scan different, non-overlapping ranges (e.g., the front, sides, and rear of the vehicle respectively); alternatively, the scan ranges of different laser beams may also overlap.
In S120, the point cloud data is input into the semantic segmentation model to obtain a semantic label for each point in the point cloud data.
In some embodiments, steps S110 to S120 may include: obtaining the lidar point cloud data converted into the vehicle coordinate system, extracting the point cloud within the ROI (region of interest), and feeding it into the point cloud semantic segmentation model for inference to obtain the semantic label of each point in the point cloud data.
It should be noted that embodiments of the present disclosure do not require a high spatial density of point cloud semantic labels, while the lidars commonly used on current vehicles have high line counts and high spatial resolution. Therefore, in some embodiments, the point cloud data input into the semantic segmentation model may be downsampled by a preset proportion, which can further improve the inference speed of the model.
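By way of illustration only, and not as part of the original disclosure, downsampling by a preset proportion might look like the following sketch; the function name and the ratio value are assumptions:

```python
import numpy as np

def downsample_points(points: np.ndarray, keep_ratio: float = 0.25) -> np.ndarray:
    """Randomly keep a preset proportion of an (N, 3+) point cloud.

    `keep_ratio` is an illustrative value, not one specified by the disclosure.
    """
    n_keep = max(1, int(len(points) * keep_ratio))
    idx = np.random.choice(len(points), size=n_keep, replace=False)
    return points[idx]
```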
In S130, the point cloud data containing the semantic labels is converted into point cloud data under the bird's-eye view (BEV).
Note that the point cloud data under the bird's-eye view may correspond to a plane, which may be referred to as the reference plane.
In S140, the point cloud data under the bird's-eye view is rasterized to obtain a current frame raster image and a pixel value corresponding to each grid in the current frame raster image; the pixel value corresponding to a grid is determined according to the proportions of the semantic label categories of the points in the grid, where the semantic label categories include foreground points, background points and construction area points, and construction area points belong to the background points.
The rasterization of the point cloud data under the bird's-eye view may be performed by dividing the reference plane into a plurality of grids and determining the points corresponding to each grid. The points corresponding to a grid may be the points whose bird's-eye projections fall within that grid.
It should be noted that, the "point" mentioned in the disclosure is a point with a semantic tag after the processing in the step S120, and may also be referred to as a semantic tag point.
It should further be noted that in the present disclosure a "grid" may be a single grid cell into which the point cloud under the bird's-eye view is divided, while the whole image composed of the grids is called the "raster image"; each pixel value of the raster image corresponds to one grid of the point cloud under the bird's-eye view.
The disclosed embodiments divide semantic label points into foreground points and background points. Foreground points are points of traffic participants such as vehicles and pedestrians; background points are points of static obstacles such as buildings and the ground; construction area points are one kind of background point.
The construction area in the present disclosure may be an area enclosed by objects such as construction fences and water-filled barriers on the road on which the vehicle travels.
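To make the rasterization of step S140 concrete, the following sketch groups BEV-projected points into grids and counts the points of each semantic label category per grid; the grid size, ROI extent, and label encoding are assumptions rather than values given in the disclosure:

```python
import numpy as np

GRID_SIZE = 0.2          # meters per grid cell (assumed)
X_RANGE = (-40.0, 40.0)  # assumed ROI extent along x in the vehicle frame
Y_RANGE = (-40.0, 40.0)  # assumed ROI extent along y

def rasterize_bev(points_xy: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Count points per grid and per label category.

    `labels` uses an assumed encoding: 0 = foreground, 1 = other background,
    2 = construction area. Returns an (H, W, 3) array of counts.
    """
    w = int((X_RANGE[1] - X_RANGE[0]) / GRID_SIZE)
    h = int((Y_RANGE[1] - Y_RANGE[0]) / GRID_SIZE)
    counts = np.zeros((h, w, 3), dtype=np.int32)
    ix = ((points_xy[:, 0] - X_RANGE[0]) / GRID_SIZE).astype(int)
    iy = ((points_xy[:, 1] - Y_RANGE[0]) / GRID_SIZE).astype(int)
    ok = (ix >= 0) & (ix < w) & (iy >= 0) & (iy < h)
    np.add.at(counts, (iy[ok], ix[ok], labels[ok]), 1)  # scatter-add per category
    return counts
```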
In S150, the current frame raster image is binarized based on the pixel value corresponding to each raster.
In S160, a construction area image is determined based on the binarized current frame raster image.
In the above steps S150 to S160, the raster image may be binarized, and the skeleton or contour of the construction area may be extracted to obtain the construction area image. In one embodiment, the construction area point cloud map may be as shown in fig. 2, where box 201 marks the point cloud corresponding to a first construction area and box 202 marks the point cloud corresponding to a second construction area. The result after binarization and skeleton extraction may include the first construction area 301 and the second construction area 302 shown in fig. 3.
According to the construction area sensing method in automatic driving provided by the embodiments of the present disclosure, construction area points are classified as background points, so that once the construction area is detected and perceived, it is judged to be a static obstacle; this reduces the risk of misjudging the construction area as a dynamic target and effectively improves the stability of driving.
The inventors found that, in the related art, the pixel value corresponding to a grid is conventionally set according to the probability distribution of construction area points in the grid, which has the following disadvantages. First, the probability is a decimal between 0 and 1, generally represented as a floating-point number, which occupies more memory and increases computational complexity. Second, false detections in the semantic labels are not accounted for, so the scheme is sensitive to the accuracy of the semantic segmentation model, which affects the robustness of perception. Third, the case where a grid lacks valid semantic label points is not considered; this case becomes more likely when the downsampling proportion of the input point cloud is large, and simply treating such a grid as one containing no construction area points is not reasonable. In addition, it is inconvenient to choose a threshold when the raster image is subsequently binarized.
In view of the above problems, as shown in fig. 4, the calculation process of the pixel value corresponding to the grid in the embodiment of the present disclosure may include steps S401 to S404.
In S401, when the area where the grid is located does not have any point with a semantic tag, setting the pixel value corresponding to the grid to a first negative value;
in S402, when the area where the grid is located includes points with semantic tags, determining the category of each point semantic tag in the grid according to the semantic tags of each point in the grid;
in S403, the proportion of the construction area points in the grid to the total points in the grid, the proportion of the foreground points in the grid to the total points in the grid, and the proportion of other background points in the grid to the total points in the grid are counted, wherein the other background points are background points except the construction area points;
in S404, calculating a pixel value corresponding to the grid based on a proportion of construction area points in the grid to total points in the grid, a proportion of foreground points in the grid to total points in the grid, a proportion of other background points in the grid to total points in the grid, and weights of the construction area points, the foreground points and the other background points;
wherein the pixel value corresponding to the grid is positively correlated with the proportion of construction area points in the grid to the total points in the grid, negatively correlated with the proportion of foreground points in the grid to the total points in the grid, and negatively correlated with the proportion of other background points in the grid to the total points in the grid.
It should be noted that, in the above embodiments, "total points in the grid" represents the total number of points with semantic labels in the grid.
In some embodiments, in step S404, the pixel value corresponding to the grid may be calculated by the following formula:
I(i_k, j_k) = Round(α·P_W − β·P_F − γ·P_B)    (1)
where I(i_k, j_k) is the pixel value corresponding to the grid whose coordinates under the bird's-eye view are (i_k, j_k), Round denotes the rounding function, P_W denotes the proportion of construction area points in the grid to the total points in the grid, P_F denotes the proportion of foreground points in the grid to the total points in the grid, P_B denotes the proportion of other background points in the grid to the total points in the grid, α denotes the weight of the construction area points, β denotes the weight of the foreground points, and γ denotes the weight of the other background points.
In the above formula, α, β, and γ are positive numbers.
Furthermore, the inventors found that, for safety reasons, β may take a larger value than α and γ, since construction area points are generally more prone to false detection than other background points, and misperceiving a foreground object as a construction area object poses a greater safety risk.
In the above pixel value calculation process, the calculated result is discretized (rounded). When the selected grid contains no point cloud with semantic labels, the pixel value corresponding to the grid may be set to a first negative value c as a measure of uncertainty. In one embodiment, a typical value is c = −0.5γ; then, when the raster image is binarized, 0 may be used directly as the threshold.
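Combining formula (1) with the first negative value c, a minimal sketch of the per-grid pixel value is given below; the specific weight values are assumptions chosen only to respect the guidance above that β is the largest weight:

```python
# Assumed weights; the disclosure only requires alpha, beta, gamma > 0
# and suggests beta larger than alpha and gamma.
ALPHA, BETA, GAMMA = 2.0, 4.0, 2.0
C_UNKNOWN = -0.5 * GAMMA  # "first negative value" c = -0.5*gamma (typical value)

def grid_pixel_value(n_construction: int, n_foreground: int, n_other_bg: int) -> int:
    """Formula (1): I = Round(alpha*P_W - beta*P_F - gamma*P_B)."""
    total = n_construction + n_foreground + n_other_bg
    if total == 0:                    # no semantically labeled points in the grid
        return round(C_UNKNOWN)
    p_w = n_construction / total      # construction area point proportion
    p_f = n_foreground / total        # foreground point proportion
    p_b = n_other_bg / total          # other background point proportion
    return round(ALPHA * p_w - BETA * p_f - GAMMA * p_b)
```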
The inventors found that, because the accuracy of the point cloud semantic segmentation model is limited, the point cloud semantic labels obtained in step S120 may contain errors. To improve the robustness of construction area perception, information from historical frames can be fused into the current frame.
In some embodiments, the fused historical frames may be the unfused raster images of the previous N frames, each of which is converted into the coordinate system of the current frame for fusion; this approach is simpler.
In some embodiments, the fused historical frame may instead be the fused raster image of the previous frame only; since only one inter-frame coordinate transformation is performed during fusion, this approach has lower computational complexity and lower storage requirements.
In some embodiments, before the binarization of the current frame raster image based on the pixel values corresponding to the grids in step S150, a previous frame fused image may further be obtained, where the previous frame fused image fuses a plurality of historical frame raster images; the previous frame fused image is converted into the coordinate system of the current frame raster image; and the coordinate-transformed previous frame fused image is fused into the current frame raster image, updating the pixel value corresponding to each grid in the current frame raster image.
In some embodiments, let the previous frame fused image (historical frame raster image) be T_{k−1}(i, j) and the current frame raster image be I_k(i, j); the raster image H_k(i, j) obtained by transforming the previous frame fused image into the coordinate system of the current frame raster image is calculated as follows:
H_k(i, j) = TR × T_{k−1}(i, j)    (2)
where TR is the transformation matrix between the previous frame and the current frame, which can be obtained from the positioning information of the vehicle in the two frames.
The inventors found that the fusion process in the above embodiments is relatively sensitive to the accuracy of the positioning information, especially to the heading angle. To solve this problem, embodiments of the present disclosure devise a simple method of evaluating the positioning accuracy, and historical frame data is fused only when the positioning accuracy is satisfactory.
In some embodiments, fusing the coordinate-transformed previous frame fused image into the current frame raster image and updating the pixel value corresponding to each grid in the current frame raster image include: calculating the positioning accuracy between the coordinate-transformed fused image and the current frame raster image; and, when the positioning accuracy is greater than a preset accuracy threshold, fusing the coordinate-transformed previous frame fused image into the current frame raster image and updating the pixel value corresponding to each grid in the current frame raster image.
In some embodiments, the positioning accuracy between the coordinate-transformed fused image and the current frame raster image may be calculated according to formula (3), where μ represents the positioning accuracy, N(I_k(i, j)) represents the set of grids with positive pixel values in the current frame raster image I_k(i, j), and N(H_k(i, j)) represents the set of grids with positive pixel values in the coordinate-transformed fused image H_k(i, j).
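Formula (3) itself is not reproduced in this text. As one assumption consistent with the definitions above, namely that μ is computed from the two positive-pixel grid sets and compared with a threshold in the range 0 to 1.0, an intersection-over-union sketch is:

```python
import numpy as np

def positioning_accuracy(current: np.ndarray, warped_prev: np.ndarray) -> float:
    """Assumed form of mu: IoU of the positive-pixel grid sets N(I_k) and N(H_k).

    The exact formula (3) of the disclosure may differ.
    """
    pos_cur = current > 0
    pos_prev = warped_prev > 0
    union = np.logical_or(pos_cur, pos_prev).sum()
    if union == 0:
        return 0.0
    inter = np.logical_and(pos_cur, pos_prev).sum()
    return float(inter) / float(union)
```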
In some embodiments, the coordinate-transformed previous frame fused image may be fused into the current frame raster image as shown in formula (4), where F_k(i, j) denotes the fused raster image, a weight coefficient denotes the contribution of the previous frame fused image in the fusion, and T represents the positioning accuracy threshold. In some embodiments, the weight coefficient may take values in the range 0 to 1.0, and T may also take values in the range 0 to 1.0.
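Formula (4) is likewise not reproduced here. A plausible sketch, assuming a weighted blend applied only when the positioning accuracy μ exceeds the threshold T (and reusing the positioning_accuracy sketch above), is:

```python
import numpy as np

def fuse_frames(current: np.ndarray, warped_prev: np.ndarray,
                weight: float = 0.5, t_threshold: float = 0.3) -> np.ndarray:
    """Assumed form of formula (4): blend the warped previous fused image into
    the current raster image when positioning accuracy is sufficient.

    `weight` and `t_threshold` are illustrative values in the stated 0-1.0 range.
    """
    mu = positioning_accuracy(current, warped_prev)
    if mu > t_threshold:
        return weight * warped_prev + (1.0 - weight) * current
    return current.copy()  # fall back to the current frame alone
```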
In some embodiments, binarizing the current frame raster image based on the pixel values corresponding to the grids includes: thresholding the current frame raster image fused with the previous frame fused image to obtain a first image, and storing the first image to serve as the previous frame fused image for the next frame; and binarizing the current frame raster image or the first image.
In some embodiments, the current frame raster image fused with the previous frame fused image may be thresholded according to formula (5), where T(i_k, j_k) represents the first image, F(i_k, j_k) is the pixel value corresponding to the grid whose coordinates under the bird's-eye view are (i_k, j_k), H_t represents the upper threshold, and L_t represents the lower threshold; H_t takes a positive value and L_t takes a negative value.
In some embodiments, given that H_t takes a positive value and L_t takes a negative value, and considering that perception pays more attention to the problem of missed detection of the construction area, it is generally required that the absolute value of H_t be greater than the absolute value of L_t.
In some embodiments, F(i, j) after thresholding is binarized with a threshold value of 0 to obtain a binary image B(i, j); that is, a pixel of B(i, j) is set to 1 where the thresholded pixel value is greater than 0, and to 0 otherwise.
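A sketch of one possible reading of formulas (5) and (6), namely clipping the fused image to the interval [L_t, H_t] and then binarizing at 0, is given below; the clip interpretation and the numeric bounds are assumptions:

```python
import numpy as np

def threshold_and_binarize(fused: np.ndarray,
                           h_t: float = 3.0, l_t: float = -1.0):
    """Clip to [L_t, H_t] (H_t positive, L_t negative, |H_t| > |L_t|),
    then binarize at 0. Returns (first_image, binary_image).
    """
    first_image = np.clip(fused, l_t, h_t)       # "first image", kept for the next frame
    binary = (first_image > 0).astype(np.uint8)  # B(i, j)
    return first_image, binary
```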
in some embodiments, as shown in fig. 5, after the binarization process, an output center point or boundary point may be selected to describe the construction area as required by the task.
If center points are extracted, B(i, j) is thinned to generate a one-pixel-wide skeleton image; if boundary points are extracted, the contour image of each connected domain in B(i, j) that exceeds a certain size or area is extracted.
In some embodiments, determining the construction area image based on the binarized current frame raster image may include: thinning the binarized current frame raster image, extracting the center points of the construction area enclosure, and generating a skeleton map of the construction area enclosure; extracting key points from the points constituting the skeleton map according to the curvature of each point in the skeleton map; and determining the construction area image based on the key points.
In some embodiments, determining the construction area image based on the binarized current frame raster image may include: extracting the boundary points of connected domains exceeding a preset size in the binarized current frame raster image, and generating a contour map of the construction area fence; extracting key points from the points constituting the contour map according to the curvature of each point in the contour map; and determining the construction area image based on the key points.
In the above embodiments, the principle of extracting key points from the points constituting the skeleton map according to their curvatures is the same as that of extracting key points from the points constituting the contour map according to their curvatures; the following description takes the skeleton map as an example.
The construction area may be perceived and described using an ordered sequence of points. In the above embodiments, the number of points in the skeleton map is relatively large, and key points need to be extracted from it to reduce the computational load of planning and control and the consumption of transmission resources.
The key points should preserve both the overall shape and the local details of the construction area, and are extracted as follows:
Cluster the skeleton map obtained in the above embodiment according to connectivity to obtain a cluster set {C_1, C_2, …, C_n}.
For each cluster in the cluster set, calculate the curvature of each point except the end points, following the arrangement order of the points. All end points are extracted as key points; among points whose curvature is lower than a certain threshold T, key points are extracted at a certain distance interval d; points whose curvature is greater than or equal to the threshold T are all extracted as key points. The purpose is to thin out the key points in gently changing regions while keeping sharply changing points and boundary points, so that the boundary information of the construction area is preserved as much as possible while the number of key points is reduced.
For the k-th point P(i_k, j_k), whose preceding and following points are P(i_{k−1}, j_{k−1}) and P(i_{k+1}, j_{k+1}), the curvature γ_k of the point is calculated from these three points.
the key points extracted in the above embodiment may be shown as white dots in fig. 6. And finally, converting the extracted key points from the grid image to the vehicle coordinate system, and converting the extracted key points from the vehicle coordinate system to the world coordinate system for output to obtain the construction area image.
Fig. 7 is a schematic structural diagram of an automatic driving construction area sensing device according to an embodiment of the present disclosure. As shown in fig. 7: the apparatus includes a data acquisition module 710, a semantic tag module 720, a coordinate conversion module 730, a rasterization processing module 740, a binarization module 750, and a construction area determination module 760.
The data acquisition module 710 is configured to acquire point cloud data of a current frame in a vehicle coordinate system;
the semantic tag module 720 is configured to input the point cloud data into a semantic segmentation model to obtain semantic tags of points in the point cloud data;
the coordinate conversion module 730 is configured to convert the point cloud data containing the semantic tag into point cloud data under the bird's eye view angle;
the rasterizing processing module 740 is configured to perform rasterizing processing on the point cloud data under the aerial view angle to obtain a current frame raster image and a pixel value corresponding to each raster in the current frame raster image; the pixel value corresponding to the grid is determined according to the duty ratio of the category of the semantic tag of each point in the grid, wherein the category of the semantic tag comprises a foreground point, a background point and a construction area point, and the construction area point belongs to the background point;
a binarization module 750, configured to binarize the current frame raster image based on pixel values corresponding to the grids;
the construction area determining module 760 is configured to determine a construction area image based on the binarized current frame raster image.
In some embodiments, the calculation process of the pixel value corresponding to a grid may include: setting the pixel value corresponding to the grid to a first negative value when the area where the grid is located does not contain any point with a semantic label; when the area where the grid is located contains points with semantic labels, determining the semantic label category of each point in the grid according to the semantic label of each point in the grid; counting respectively the proportion of construction area points in the grid to the total points in the grid, the proportion of foreground points in the grid to the total points in the grid, and the proportion of other background points in the grid to the total points in the grid, where the other background points are background points other than the construction area points; and calculating the pixel value corresponding to the grid based on these three proportions and the weights of the construction area points, the foreground points, and the other background points; wherein the pixel value corresponding to the grid is positively correlated with the proportion of construction area points in the grid to the total points in the grid, and negatively correlated with the proportion of foreground points and with the proportion of other background points in the grid to the total points in the grid.
In some embodiments, the pixel value corresponding to the grid is calculated by equation (1).
In some embodiments, the apparatus may further comprise an image fusion module.
The image fusion module is used for acquiring a previous frame fusion image before binarizing the grid image of the current frame based on the pixel value corresponding to each grid, and the previous frame fusion image fuses a plurality of historical frame grid images; converting the previous frame fusion image into a coordinate system of a grid image of the current frame; and fusing the previous frame fused image after the coordinate system is converted into the grid image of the current frame, and updating the pixel value corresponding to each grid in the grid image of the current frame.
In some embodiments, the image fusion module fuses a previous frame fusion image after the coordinate system is converted into a current frame raster image, and updates pixel values corresponding to each raster in the current frame raster image, including: calculating the positioning accuracy between the fusion image after the coordinate system conversion and the grid image of the current frame; and under the condition that the positioning accuracy is greater than a preset accuracy threshold, fusing the previous frame fused image after converting the coordinate system into the grid image of the current frame, and updating the pixel value corresponding to each grid in the grid image of the current frame.
In some embodiments, the positioning accuracy between the fused image after converting the coordinate system and the grid image of the current frame is calculated by formula (3).
In some embodiments, the binarization module 750 is configured to threshold the current frame raster image fused with the previous frame fused image to obtain a first image, and to store the first image so that it serves as the previous frame fused image for the next frame; and to binarize the first image.
In some embodiments, thresholding is performed on the current frame raster image fused with the previous frame fused image by equation (5).
In some embodiments, the construction area determining module 760 is configured to thin the binarized current frame raster image, extract the center points of the construction area enclosure, and generate a skeleton map of the construction area enclosure; extract key points from the points constituting the skeleton map according to the curvature of each point in the skeleton map; and determine the construction area image based on the key points.
In some embodiments, the construction area determining module 760 is configured to extract boundary points of the connected domain exceeding a preset size in the binarized current frame raster image, and generate a contour map of the construction area enclosure; extracting key points from a plurality of points forming the contour map according to the curvature of each point in the contour map; and determining a construction area image based on the key points.
The construction area sensing device in automatic driving provided by the embodiments of the present disclosure can execute the steps of the construction area sensing method in automatic driving provided by the embodiments of the present disclosure, and has the corresponding execution steps and beneficial effects, which are not repeated here.
Fig. 8 is a schematic structural diagram of an electronic device in an embodiment of the disclosure. Referring now in particular to fig. 8, a schematic diagram of an electronic device 800 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device shown in fig. 8 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 8, the electronic device 800 may include a processing means (e.g., a central processor, a graphics processor, etc.) 801 that may perform various suitable actions and processes to implement the methods of embodiments as described in the present disclosure according to programs stored in a Read Only Memory (ROM) 802 or loaded from a storage 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the electronic device 800 are also stored. The processing device 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method shown in the flowchart, thereby implementing the construction area sensing method in automatic driving as described above. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 809, or installed from the storage device 808, or installed from the ROM 802. When the computer program is executed by the processing device 801, the above-described functions defined in the methods of the embodiments of the present disclosure are performed.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring point cloud data of a current frame under a vehicle coordinate system; inputting the point cloud data into a semantic segmentation model to obtain semantic labels of each point in the point cloud data; converting the point cloud data containing the semantic tag into point cloud data under the aerial view angle; performing rasterization processing on the point cloud data under the aerial view angle to obtain a current frame raster image and a pixel value corresponding to each raster in the current frame raster image; the pixel value corresponding to the grid is determined according to the duty ratio of the category of the semantic tag of each point in the grid, wherein the category of the semantic tag comprises a foreground point, a background point and a construction area point, and the construction area point belongs to the background point; based on the pixel value corresponding to each grid, carrying out binarization processing on the grid image of the current frame; and determining a construction area image based on the binarized current frame raster image.
Alternatively, the electronic device may perform other steps described in the above embodiments when the above one or more programs are executed by the electronic device.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Scheme 1, a construction area sensing method in automatic driving, the method including:
acquiring point cloud data of a current frame under a vehicle coordinate system;
inputting the point cloud data into a semantic segmentation model to obtain semantic tags of each point in the point cloud data;
converting the point cloud data containing the semantic labels into point cloud data under the bird's-eye view;
performing rasterization processing on the point cloud data under the bird's-eye view to obtain a current frame raster image and a pixel value corresponding to each grid in the current frame raster image; the pixel value corresponding to a grid is determined according to the proportions of the semantic label categories of the points in the grid, wherein the semantic label categories include foreground points, background points and construction area points, and construction area points belong to the background points;
based on the pixel value corresponding to each grid, carrying out binarization processing on the grid image of the current frame;
and determining a construction area image based on the binarized current frame raster image.
Scheme 2, the method according to scheme 1, the calculation process of the pixel value corresponding to the grid includes:
when the area where the grid is located does not have any point with semantic tags, setting the pixel value corresponding to the grid to be a first negative value;
when the area of the grid contains points with semantic tags, determining the category of the semantic tags of each point in the grid according to the semantic tags of each point in the grid;
Respectively counting the proportion of construction area points in the grid to the total points in the grid, the proportion of foreground points in the grid to the total points in the grid and the proportion of other background points in the grid to the total points in the grid, wherein the other background points are background points except the construction area points;
calculating a pixel value corresponding to the grid based on the proportion of construction area points in the grid to total points in the grid, the proportion of foreground points in the grid to total points in the grid, the proportion of other background points in the grid to total points in the grid, the weight of construction area points, the weight of foreground points and the weight of other background points;
wherein the pixel value corresponding to the grid is positively correlated with the proportion of construction area points in the grid to the total points in the grid, negatively correlated with the proportion of foreground points in the grid to the total points in the grid, and negatively correlated with the proportion of other background points in the grid to the total points in the grid.
Scheme 3, according to the method of scheme 2, the pixel value corresponding to the grid is calculated by the following formula:
I(i_k, j_k) = Round(α·P_W − β·P_F − γ·P_B)
where I(i_k, j_k) is the pixel value corresponding to the grid whose coordinates under the bird's-eye view are (i_k, j_k), Round denotes the rounding function, P_W denotes the proportion of construction area points in the grid to the total points in the grid, P_F denotes the proportion of foreground points in the grid to the total points in the grid, P_B denotes the proportion of other background points in the grid to the total points in the grid, α denotes the weight of the construction area points, β denotes the weight of the foreground points, and γ denotes the weight of the other background points.
Scheme 4, the method according to scheme 1, wherein before the binarizing of the current frame raster image based on the pixel values corresponding to the grids, the method further includes:
acquiring a previous frame fusion image, wherein the previous frame fusion image fuses a plurality of historical frame grid images;
converting the previous frame fusion image into a coordinate system of a grid image of the current frame;
and fusing the previous frame fused image after the coordinate system conversion into a grid image of the current frame, and updating pixel values corresponding to grids in the grid image of the current frame.
Scheme 5, the method according to scheme 4, wherein the fusing of the coordinate-transformed previous frame fused image into the current frame raster image and the updating of the pixel value corresponding to each grid in the current frame raster image include:
Calculating the positioning accuracy between the fusion image after the coordinate system conversion and the grid image of the current frame;
and under the condition that the positioning accuracy is greater than a preset accuracy threshold, fusing the previous frame fused image after converting the coordinate system into the grid image of the current frame, and updating the pixel value corresponding to each grid in the grid image of the current frame.
Scheme 6, according to the method of scheme 5, the positioning accuracy between the coordinate-transformed fused image and the current frame raster image is calculated according to formula (3), where μ represents the positioning accuracy, N(I_k(i, j)) represents the set of grids with positive pixel values in the current frame raster image I_k(i, j), and N(H_k(i, j)) represents the set of grids with positive pixel values in the coordinate-transformed fused image H_k(i, j).
Scheme 7, the method according to scheme 4, wherein the binarizing of the current frame raster image based on the pixel values corresponding to the grids includes:
performing threshold processing on the current frame raster image fused with the previous frame fused image to obtain a first image and storing the first image, wherein the first image serves as the previous frame fused image for the next frame raster image;
and carrying out binarization processing on the first image.
Scheme 8, the method according to scheme 7, wherein the current frame raster image fused with the previous frame fused image is thresholded according to formula (5), where T(i_k, j_k) represents the first image, F(i_k, j_k) is the pixel value corresponding to the grid whose coordinates under the bird's-eye view are (i_k, j_k), H_t represents the upper threshold, and L_t represents the lower threshold; H_t takes a positive value and L_t takes a negative value.
Scheme 9, the method according to scheme 1, wherein the determining of the construction area image based on the binarized current frame raster image includes:
thinning the binarized current frame raster image, extracting the center points of the construction area enclosure, and generating a skeleton map of the construction area enclosure;
extracting key points from a plurality of points forming the skeleton diagram according to the curvature of each point in the skeleton diagram;
and determining a construction area image based on the key points.
Scheme 10, the method according to scheme 1, wherein the determining of the construction area image based on the binarized current frame raster image includes:
extracting boundary points of connected domains exceeding a preset size in the binarized current frame grid image, and generating a contour map of a construction area fence;
Extracting key points from a plurality of points forming the contour map according to the curvature of each point in the contour map;
and determining a construction area image based on the key points.
Scheme 11, a construction zone sensing device in autopilot, said device comprising:
the data acquisition module is used for acquiring point cloud data of a current frame under a vehicle coordinate system;
the semantic tag module is used for inputting the point cloud data into a semantic segmentation model to obtain semantic tags of each point in the point cloud data;
the coordinate conversion module is used for converting the point cloud data containing the semantic labels into point cloud data under the bird's-eye view;
the rasterization processing module is used for rasterizing the point cloud data under the bird's-eye view to obtain a current frame raster image and a pixel value corresponding to each grid in the current frame raster image; the pixel value corresponding to a grid is determined according to the proportions of the semantic label categories of the points in the grid, wherein the semantic label categories include foreground points, background points and construction area points, and construction area points belong to the background points;
the binarization module is used for carrying out binarization processing on the grid image of the current frame based on the pixel value corresponding to each grid;
And the construction area determining module is used for determining a construction area image based on the binarized current frame grid image.
Scheme 12, an electronic device, the electronic device comprising:
one or more processors;
a storage means for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the construction area sensing method in automatic driving according to any one of schemes 1 to 10.
Scheme 13, a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the construction area sensing method in automatic driving according to any one of schemes 1 to 10.
The foregoing description is merely a description of the preferred embodiments of the present disclosure and of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to the specific combinations of features described above, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, technical solutions formed by substituting the above features with technical features having similar functions disclosed in the present disclosure (but not limited thereto).

Claims (10)

1. A construction area sensing method in automatic driving, the method comprising:
acquiring point cloud data of a current frame under a vehicle coordinate system;
inputting the point cloud data into a semantic segmentation model to obtain semantic tags of each point in the point cloud data;
converting the point cloud data containing the semantic tag into point cloud data under the aerial view angle;
performing rasterization processing on the point cloud data under the aerial view angle to obtain a grid image of the current frame and a pixel value corresponding to each grid in the grid image of the current frame; the pixel value corresponding to the grid is determined according to the proportions of the semantic label categories of the points in the grid, wherein the categories of the semantic labels comprise foreground points, background points and construction area points, and the construction area points belong to the background points;
based on the pixel value corresponding to each grid, carrying out binarization processing on the grid image of the current frame;
and determining a construction area image based on the binarized current frame raster image.
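To make the rasterization step of claim 1 concrete, the following sketch bins bird's-eye-view points into a square grid and counts points per semantic category in each cell; the grid resolution, extent, and label encoding are assumptions for illustration:

    import numpy as np

    def rasterize_bev(points_xy, labels, grid_res=0.2, extent=50.0):
        # Bin bird's-eye-view points into a square grid centered on the
        # vehicle and count, per cell, how many points carry each label.
        # Assumed label encoding: 0 = other background, 1 = foreground,
        # 2 = construction area point.
        size = int(2 * extent / grid_res)
        counts = np.zeros((size, size, 3), dtype=np.int32)
        ij = np.floor((points_xy + extent) / grid_res).astype(int)
        inside = (ij >= 0).all(axis=1) & (ij < size).all(axis=1)
        np.add.at(counts, (ij[inside, 0], ij[inside, 1], labels[inside]), 1)
        return counts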
2. The method according to claim 1, wherein the calculation process of the pixel value corresponding to the grid includes:
when the area where the grid is located does not have any point with semantic tags, setting the pixel value corresponding to the grid to be a first negative value;
When the area of the grid contains points with semantic tags, determining the category of the semantic tags of each point in the grid according to the semantic tags of each point in the grid;
respectively counting the proportion of construction area points in the grid to the total points in the grid, the proportion of foreground points in the grid to the total points in the grid and the proportion of other background points in the grid to the total points in the grid, wherein the other background points are background points except the construction area points;
calculating a pixel value corresponding to the grid based on the proportion of construction area points in the grid to total points in the grid, the proportion of foreground points in the grid to total points in the grid, the proportion of other background points in the grid to total points in the grid, the weight of construction area points, the weight of foreground points and the weight of other background points;
wherein the pixel value corresponding to the grid is positively correlated with the proportion of construction area points in the grid to the total points in the grid, negatively correlated with the proportion of foreground points in the grid to the total points in the grid, and negatively correlated with the proportion of other background points in the grid to the total points in the grid.
3. The method of claim 2, wherein the pixel value corresponding to the grid is calculated by the formula:
I(i_k, j_k) = Round(αP_w − βP_F − γP_B)

wherein I(i_k, j_k) represents the pixel value corresponding to the grid whose coordinates under the bird's eye view angle are (i_k, j_k), Round represents a rounding function, P_w represents the proportion of construction area points in the grid to the total points in the grid, P_F represents the proportion of foreground points in the grid to the total points in the grid, P_B represents the proportion of other background points in the grid to the total points in the grid, α represents the weight of construction area points, β represents the weight of foreground points, and γ represents the weight of other background points.
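The formula of claim 3 can be applied per grid cell as follows, reusing the per-cell label counts from the rasterization sketch above; the weights and the empty-cell value are illustrative, not values from the disclosure:

    import numpy as np

    def grid_pixel_values(counts, alpha=100.0, beta=50.0, gamma=20.0,
                          empty_value=-1):
        # Apply I = Round(alpha*P_w - beta*P_F - gamma*P_B) to every cell
        # of the (H, W, 3) per-cell label counts; cells without any labeled
        # point receive the (assumed) first negative value of claim 2.
        total = counts.sum(axis=2)
        img = np.full(total.shape, empty_value, dtype=np.int32)
        occ = total > 0
        p_b = counts[..., 0][occ] / total[occ]   # other background points
        p_f = counts[..., 1][occ] / total[occ]   # foreground points
        p_w = counts[..., 2][occ] / total[occ]   # construction area points
        img[occ] = np.round(alpha * p_w - beta * p_f
                            - gamma * p_b).astype(np.int32)
        return img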
4. The method according to claim 1, wherein before binarizing the current frame raster image based on the pixel values corresponding to each of the grids, the method further comprises:
acquiring a previous frame fusion image, wherein the previous frame fusion image fuses a plurality of historical frame grid images;
converting the previous frame fusion image into a coordinate system of a grid image of the current frame;
and fusing the previous frame fused image after the coordinate system conversion into a grid image of the current frame, and updating pixel values corresponding to grids in the grid image of the current frame.
5. The method according to claim 4, wherein fusing the previous frame fused image after the coordinate system conversion into the current frame raster image and updating the pixel value corresponding to each grid in the current frame raster image includes:
calculating the positioning accuracy between the fusion image after the coordinate system conversion and the grid image of the current frame;
and under the condition that the positioning accuracy is greater than a preset accuracy threshold, fusing the previous frame fused image after converting the coordinate system into the grid image of the current frame, and updating the pixel value corresponding to each grid in the grid image of the current frame.
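A sketch of the fusion step of claims 4 and 5, assuming the inter-frame ego-motion is available as a 2x3 affine transform; the warp-then-accumulate structure is one plausible reading of converting the previous frame fused image into the current frame's grid coordinate system:

    import cv2
    import numpy as np

    def fuse_previous(current, prev_fused, affine_2x3):
        # Warp the previous fused image into the current frame's grid
        # coordinate system using the inter-frame ego-motion (assumed to
        # be given as a 2x3 affine matrix), then accumulate it into the
        # current frame raster image.
        h, w = current.shape
        warped = cv2.warpAffine(prev_fused.astype(np.float32), affine_2x3,
                                (w, h), flags=cv2.INTER_NEAREST,
                                borderValue=0)
        return current + warped.astype(current.dtype)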
6. The method of claim 5, wherein the positioning accuracy between the fused image after converting the coordinate system and the grid image of the current frame is calculated by the following formula:
wherein μ represents the positioning accuracy, N(I_k(i, j)) represents the set of grids with positive pixel values in the current frame raster image I_k(i, j), and N(H_k(i, j)) represents the set of grids with positive pixel values in the fused image H_k(i, j) after converting the coordinate system.
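The definitions above fix only that μ compares the two positive-pixel grid sets; a sketch assuming an intersection-over-union form, which is an assumption rather than the formula stated in the disclosure:

    import numpy as np

    def positioning_accuracy(current, warped_prev):
        # Overlap ratio (assumed IoU form) between the positive-pixel grid
        # sets of the current raster image and the coordinate-converted
        # fused image; a low value suggests the inter-frame warp is
        # unreliable and fusion should be skipped.
        cur_pos = current > 0
        prev_pos = warped_prev > 0
        union = np.logical_or(cur_pos, prev_pos).sum()
        inter = np.logical_and(cur_pos, prev_pos).sum()
        return float(inter / union) if union else 0.0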
7. The method of claim 4, wherein binarizing the current frame raster image based on pixel values corresponding to each of the grids, comprises:
performing threshold processing on the current frame raster image fused with the previous frame fused image to obtain a first image and storing the first image, wherein the first image serves as the previous frame fused image for the next frame raster image;
and performing binarization processing on the first image.
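Finally, a sketch of the binarization of the first image in claim 7, following the clamping sketch given after scheme 8; the binarization threshold is an assumption:

    import numpy as np

    def binarize_first_image(first_image, bin_thresh=0):
        # Cells whose clamped, accumulated score exceeds the (assumed)
        # threshold are kept as construction-fence evidence.
        return (first_image > bin_thresh).astype(np.uint8)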
8. A construction area sensing device in automatic driving, the device comprising:
the data acquisition module is used for acquiring point cloud data of a current frame under a vehicle coordinate system;
the semantic tag module is used for inputting the point cloud data into a semantic segmentation model to obtain semantic tags of each point in the point cloud data;
the coordinate conversion module is used for converting the point cloud data containing the semantic tag into the point cloud data under the aerial view angle;
the rasterization processing module is used for carrying out rasterization processing on the point cloud data under the aerial view angle to obtain a current frame raster image and a pixel value corresponding to each raster in the current frame raster image; the pixel value corresponding to the grid is determined according to the proportions of the semantic label categories of the points in the grid, wherein the categories of the semantic labels comprise foreground points, background points and construction area points, and the construction area points belong to the background points;
The binarization module is used for carrying out binarization processing on the grid image of the current frame based on the pixel value corresponding to each grid;
and the construction area determining module is used for determining a construction area image based on the binarized current frame grid image.
9. An electronic device, the electronic device comprising:
one or more processors;
a storage means for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the construction area sensing method in automatic driving according to any one of claims 1-8.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the construction area sensing method in automatic driving according to any one of claims 1-8.
CN202311782491.5A 2023-12-22 2023-12-22 Construction area sensing method, device, equipment and storage medium in automatic driving Pending CN117784168A (en)

Publications (1)

Publication Number Publication Date
CN117784168A 2024-03-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination