CN117078800A - Method and device for synthesizing ground identification based on BEV image - Google Patents
- Publication number: CN117078800A (application CN202310951584.XA)
- Authority
- CN
- China
- Prior art keywords: image, ground, BEV, identifier, ground identifier
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T11/60—Editing figures and text; Combining figures or text (2D image generation)
- G06T7/11—Region-based segmentation (image analysis)
- G06T7/194—Segmentation involving foreground-background segmentation
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06V20/588—Recognition of the road, e.g. of lane markings; recognition of the vehicle driving pattern in relation to the road
- G06T2207/20221—Image fusion; Image merging
- G06T2207/30256—Lane; Road marking
- Y02T10/40—Engine management systems
Abstract
The invention discloses a method and a device for synthesizing ground identifications based on BEV images. The method comprises: acquiring a ground identifier template and a background image, wherein the ground identifier template comprises a ground arrow and a text identifier; projecting the background image to a bird's-eye view to obtain a BEV image, and identifying position coordinates in the BEV image at which to add the ground identifier; and fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, then converting the target BEV image into an ordinary image under an ordinary viewing angle, the ordinary image containing the ground identifier template. Using the background image and the ground identifier, and aided by road surface segmentation results for lane lines and drivable areas, the method can automatically synthesize labeled target data in batches; in the process, no labels on the data to be synthesized are needed, so new categories can be synthesized directly, which largely alleviates the model-training problem caused by data shortage.
Description
Technical Field
The invention relates to the field of automatic driving image data synthesis, in particular to a method and a device for synthesizing ground marks based on BEV images.
Background
Data augmentation methods are divided into offline and online augmentation; common techniques include geometric transforms and the online augmentation widely used in deep learning, which can improve model generalization to a certain extent. More advanced means such as CycleGAN can also be used for image style transfer. However, in supervised learning these augmentation techniques can only operate on the original labeled data and cannot generate new categories.
The model pre-labeling method refers to obtaining pseudo-labels by running a pre-trained model on the data to be labeled, for example the yolox and yolov7 models widely used in engineering practice. Such a model is trained on a large amount of labeled data and is sensitive to the categories it was trained on, but has no discrimination ability for categories it has not learned.
At present, data synthesis technology has been widely applied in the 2D and 3D fields of automatic driving and is expected to make perception tasks in automatic driving highly efficient. In the 2D field, data synthesis is mostly performed on images; in the 3D field, more complex factors such as background, viewing angle, camera depth, illumination and pose need to be considered. Synthetic data has very wide application and is suitable for almost all machine learning and deep learning tasks. Data can also be generated with generative adversarial networks such as StyleGAN, but StyleGAN-type methods are currently applied mainly to face generation and give unsatisfactory results in other fields.
Disclosure of Invention
Aiming at the technical problems, the invention provides a method and a device for synthesizing a ground identifier based on BEV images, which can realize ground identifier synthesis based on the BEV images.
In a first aspect of the invention, there is provided a method of synthesizing a ground identification based on BEV images, comprising:
acquiring a ground identifier template and a background image, wherein the ground identifier template comprises a ground arrow and a text identifier;
projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image;
and fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template.
In an alternative embodiment, the projecting the background image to the overhead view obtains a BEV image, and identifies position coordinates in the BEV image for adding the ground identifier, including:
during recognition, the panoramic segmentation model is utilized to segment the background image into a lane line mask and a pavement travelable area mask;
and determining position coordinates between two adjacent lane lines for placing the ground identifier template according to the constraint condition.
In an alternative embodiment, the identifying, using the panorama segmentation model to segment the background image into the lane line and the drivable area of the road surface, comprises:
projecting the background image, the lane line mask and the pavement travelable area mask into the BEV image by inverse perspective transformation;
the road vehicle is filtered according to the road exercisable area mask, and four position point coordinates for placing the ground identifier template in the BEV image are calculated.
In an alternative embodiment, the determining the position coordinates between two adjacent lane lines for placing the ground identifier template according to the constraint condition includes:
and calculating and determining the position coordinates between two adjacent lane lines for placing the ground identifier template according to the width of the lane lines, the distance between the adjacent lane lines and the size of the traffic identifier.
In an alternative embodiment, the method for synthesizing the ground identifier based on the BEV image further comprises performing enhancement processing on the ground identifier template by means of size transformation, shading transformation, gaussian blur, texture transformation and noise addition; the background of the ground identifier is processed into pixels similar to the background image.
In an alternative embodiment, the fusing the ground identifier and the BEV image according to the location coordinates to a target BEV image comprises:
the ground identifier is fused with the background image projected into the BEV image according to the position coordinates.
In an alternative embodiment, the method for synthesizing the ground identifier based on the BEV image further includes outputting a synthesized data tag corresponding to the ground identifier when the target BEV image is transformed into a normal image under a normal viewing angle.
In a second aspect of the invention, there is provided an apparatus for synthesizing a ground identification based on BEV images, comprising:
the acquisition module is used for acquiring a ground identifier template and a background image, wherein the ground identifier template comprises a ground arrow and a text identifier;
the processing module is used for projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image;
and the synthesis module is used for fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template.
In a third aspect of the present invention, there is provided an electronic apparatus comprising:
at least one processor; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method according to the first aspect of the embodiments of the invention.
In a fourth aspect of the invention, a computer-readable storage medium is provided, on which a computer program is stored which, when run by a computer, performs the method according to the first aspect of the embodiment of the invention.
According to the method, using the background image and the ground identifier, and aided by road surface segmentation results for the lane lines and the drivable area, labeled target data can be synthesized automatically in batches; in the process, new categories can be synthesized directly without any labels on the data to be synthesized, which largely alleviates the model-training problem caused by data shortage.
Drawings
FIG. 1 is a flow chart of a method for synthesizing a ground identification based on BEV images in accordance with an embodiment of the present invention.
Fig. 2 is a schematic diagram of background color replacement of a ground identifier in an embodiment of the invention.
Fig. 3 is a schematic diagram of a result of recognition of a background image and panoramic segmentation in an embodiment of the present invention.
FIG. 4 is a schematic illustration of the projection of the diagram of FIG. 3 into BEV space.
Fig. 5 is a diagram showing the comparison between the ground identifier before and after synthesis at a common viewing angle in an embodiment of the present invention.
FIG. 6 is a block diagram of an apparatus for synthesizing a ground identification based on BEV images in accordance with an embodiment of the present invention.
Fig. 7 is a schematic structural view of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The invention relates to 2D data synthesis of ground identification data such as ground arrows. Unlike data synthesis in purely 2D scenes, the invention realizes 2D data synthesis based on BEV images, that is, ground identification data synthesis for 2D ordinary images within a 3D scene. The invention first obtains the template of the data to be synthesized and determines its placement position. Template positioning must respect, for example, that the road surface identifier lies on the ground and satisfies the forward-camera rule that near objects appear large and far objects appear small; the background image can therefore be projected to the bird's eye view through inverse perspective transformation (IPM, Inverse Perspective Mapping) to obtain the corresponding bird's eye view. Under the bird's eye view, the positions of the lane lines are first located by the lane line segmentation model, and the positions of vehicles on the road surface are filtered out by the mask of the road surface drivable area; the template position between two adjacent lane lines is then located according to the positional relation of the lane lines and certain prior position constraints, thereby determining the final position of the template. The ground identifier is then Poisson-fused with the background image, enabling it to blend more naturally with the background. Finally, the synthesized data and labels can be used for downstream detection and segmentation tasks. The method comprises the following steps:
referring to fig. 1, fig. 1 is a flowchart of a method for synthesizing a ground identifier based on BEV images according to an embodiment of the present invention. The invention provides a method for synthesizing a ground identifier based on BEV images, which comprises the following steps:
step 100: a ground identifier template and a background image are acquired, wherein the ground identifier template comprises a ground arrow and a text identifier.
This step is used to obtain templates for data synthesis. For a 2D automatic driving scene, a ground identifier template is mainly acquired from an existing template library, from an open-source dataset, or by manual creation; the background image is a road surface image captured by the forward-view camera, chosen so that the road surface is clean, i.e. contains no arrow or text identifier within a certain range. The template library refers to prepared picture templates covering all the data types to be synthesized; some open-source datasets release ready-made templates, for example TT100K (a traffic sign dataset) has released templates for 128 traffic signs, and cut-out targets can also be used as templates. Illustratively, the ground identifiers include indications of going straight, going straight or turning right, going straight or turning left, turning left or right ahead, and other combinations of straight-ahead, left-turn and right-turn indications.
The background image of the BEV image can be obtained from a cloud data lake, and the background image can be projected into the BEV image by using a transformation method. Data enhancement is required for the acquired ground identifier template. The enhancement of the ground identifier template comprises common enhancement, background replacement and other methods, and aims to enhance the diversity and generalization of the template, and the details are described later.
Step 200: and projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image.
In automatic/assisted driving, detection of lane lines is very important. In the image captured by the forward-view camera, lines that are in fact parallel intersect in the image because of the perspective effect; the inverse perspective transformation (IPM transformation) eliminates this perspective effect. In this step, the background image is projected to the bird's eye view by the inverse perspective transformation to obtain the corresponding bird's eye view, that is, the background image displayed under the bird's eye view. Application scenes of the invention include, but are not limited to, data synthesis under cloud-side automatic driving BEV and, given sufficient vehicle-side computing resources, data synthesis under vehicle-side automatic driving BEV as well.
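The IPM step above is, at its core, a planar homography. As a minimal illustrative sketch (not the patent's calibrated implementation), the NumPy code below shows how pixel coordinates are mapped through an assumed 3x3 transformation matrix `H`; in practice `H` would come from camera intrinsics/extrinsics or from four ground-plane point correspondences:

```python
import numpy as np

def apply_homography(H: np.ndarray, pts: np.ndarray) -> np.ndarray:
    """Map Nx2 pixel coordinates through a 3x3 homography H.

    Converts to homogeneous coordinates, applies H, and normalizes by
    the last component -- the core operation behind IPM warping.
    """
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # N x 3
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Illustrative (assumed) homography: identity plus a shift, standing in
# for the real camera-to-BEV transform.
H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, 20.0],
              [0.0, 0.0, 1.0]])
print(apply_homography(H, np.array([[5.0, 5.0]])))  # [[15. 25.]]
```

Warping a whole image with such an `H` is what `cv2.warpPerspective` does in one call.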
After the BEV image is acquired, the lane lines and the road surface drivable areas in the background image are identified using an image segmentation model. The lane lines can then be projected into the BEV image by the inverse perspective transformation; since the identified lane lines exist in the bird's eye view, the placement position of the ground identifier can be located in the bird's eye view according to the lane lines.
Step 300: and fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template.
After the processing of the steps, a lane line mask and a pavement travelable area mask corresponding to each background image can be obtained, and then a basic model for automatic driving training can be obtained. The base model is added with a ground identifier to synthesize training data with labels for automatic driving training.
After determining the location coordinates for adding the ground identifier, the ground identifier and the BEV image are fused into a target BEV image according to the location coordinates.
After synthesizing the road surface identifier under the bird's eye view, the bird's eye view is transformed back to the normal camera viewing angle through a perspective transformation using the inverse of the transformation matrix, so that the road surface image under the forward-view camera is obtained; it satisfies the actual physical rule that near objects appear large and far objects appear small, the corresponding synthesized data label is obtained, and the synthesized data can be used for downstream detection and segmentation tasks.
From the above, the invention utilizes the background image and the ground identifier and, by means of the road surface segmentation results for the lane lines and the drivable area, realizes automatic batch synthesis of labeled target data, thereby achieving data enhancement of BEV images and image enhancement at ordinary viewing angles; in the process, new categories can be synthesized directly without any labels on the data to be synthesized, which largely alleviates the model-training problem caused by data shortage.
Further, the method comprises performing data enhancement and preprocessing on the ground identifier template before data synthesis, including background color processing of the ground identifier, background image preprocessing and the like. Referring to fig. 2, fig. 2 is a schematic diagram of template background color substitution for a ground identifier. For example, the background color of the ground identifier template may be changed directly to the same hue as the background image; illustratively, with 12 arrow templates, all 12 templates undergo background color replacement directly.
In addition, the arrow background color may also be replaced after data synthesis. The ground identifier can be detected with a yolo-series model, and its background processed into pixels similar to the background image, so that the finally synthesized picture looks more realistic.
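As a hedged sketch of the background replacement idea (the function and toy data here are illustrative, not from the patent), the template's non-arrow pixels can be overwritten with the dominant color of the road background:

```python
import numpy as np

def replace_template_background(template, arrow_mask, background):
    """Replace the template's background pixels with the mean color of
    the road-surface background image, keeping arrow pixels intact."""
    bg_color = background.reshape(-1, 3).mean(axis=0)
    out = template.astype(float).copy()
    out[~arrow_mask] = bg_color          # non-arrow pixels get road color
    return out.astype(np.uint8)

# Toy data: 4x4 white template with a 2x2 "arrow", uniform gray road.
template = np.full((4, 4, 3), 255, np.uint8)
arrow_mask = np.zeros((4, 4), bool)
arrow_mask[1:3, 1:3] = True
background = np.full((8, 8, 3), 120, np.uint8)
result = replace_template_background(template, arrow_mask, background)
```

A real pipeline would use the segmented arrow mask and a local patch of road rather than the global mean.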
The ground identifier and the background image can be enhanced by means of size transformation, shading transformation, gaussian blur, texture transformation and noise addition.
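A minimal sketch of such enhancement, using simple brightness scaling and additive Gaussian noise as stand-ins for the full set of transforms (size, shading, blur, texture, noise) named above; the function is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_template(template: np.ndarray) -> np.ndarray:
    """Brightness (shading) scaling plus additive Gaussian noise; size,
    texture and blur transforms from the text would be applied similarly."""
    out = template.astype(float)
    out *= rng.uniform(0.7, 1.3)            # shading transformation
    out += rng.normal(0.0, 5.0, out.shape)  # added Gaussian noise
    return np.clip(out, 0, 255).astype(np.uint8)

aug = augment_template(np.full((8, 8, 3), 128, np.uint8))
```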
Further, in the above step 200, the projecting the background image to the overhead view to obtain a BEV image, and identifying, by using a panorama segmentation model, a position coordinate for adding the ground identifier in the BEV image specifically includes:
when the model is used for recognition, the background image is divided into a lane line mask (mask) and a pavement travelable area mask (mask) by using a Panoptic-deep Lab model; of course, other models that achieve similar results may be used. As shown in fig. 3, the left side view (1) is a background image, the center view (2) is an identified lane line mask, and the right side view (3) is a road surface travelable region mask. Wherein the road vehicles can be filtered according to the road exercisable area mask to reduce the impact on data.
The lane line of the background image is identified through the panoramic segmentation model, so that the position of the lane line can be determined better, and the placement position of the ground identifier is determined according to the lane line. The default lane lines are parallel to each other in the bird's eye view. And determining position coordinates between two adjacent lane lines for placing the ground identifier template according to the constraint condition.
Further, as shown in FIG. 4, the background image, the lane line mask, and the pavement travelable area mask are projected into the BEV image by an inverse perspective transformation. The left view (4) is the background image under the bird's eye view, the center view (5) is the lane line mask under the bird's eye view, and the right view (6) is the pavement travelable area mask under the bird's eye view. As can be seen from fig. 4, the lane lines are parallel to each other in the bird's eye view, and the position point coordinates of the ground identifier can be located from the mutually parallel lane lines in the lane line mask.
Specifically, the position where the road surface identifier can be placed is located by setting constraint conditions. For example, the position coordinates for placing the ground identifier template between two adjacent lane lines are calculated from the width of the lane lines, the distance between adjacent lane lines and the size of the traffic identifier. Illustratively, lane lines are 15 cm wide, adjacent lane lines on an urban road are 3.5 meters apart, and an urban road surface identifier is 4.5 meters long. The road surface identifier is constrained to lie midway between adjacent lane lines, i.e. its coordinate origin is at the midpoint between the adjacent lane lines, and its lateral extent is smaller than the distance between the adjacent lane lines.
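The constraint check described above can be sketched as follows; the function name and the metric lane coordinates are illustrative assumptions, not the patent's code:

```python
def place_identifier(left_x_m: float, right_x_m: float,
                     template_width_m: float):
    """Return the x-origin of a ground identifier centered between two
    adjacent lane lines, or None if it violates the lateral constraint
    that the identifier must be narrower than the lane spacing."""
    spacing = right_x_m - left_x_m
    if template_width_m >= spacing:
        return None
    return left_x_m + spacing / 2.0

# Urban-road figure from the description: adjacent lane lines 3.5 m apart.
origin = place_identifier(0.0, 3.5, 2.0)  # centered at x = 1.75 m
```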
Four position point coordinates for placing the ground identifier template in the BEV image are calculated from the center line between adjacent lane lines and the size of the ground identifier template itself. A homography projective transformation is determined by four point correspondences (eight coordinates); for example, the road surface identifier used for background replacement in fig. 2 is a rectangular icon, and its four corner coordinates can be converted into coordinates in BEV space by a homography transformation algorithm. Since the lane line mask is projected using the same coordinate transformation as the background image under the bird's eye view, the coordinates of the road surface identifier on the lane line mask can also be transformed into BEV space based on the coordinates of the original background image. The ground identifier is then fused, at the position coordinates, with the background image projected into the BEV image by means of Poisson fusion, yielding the target BEV image.
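The corner computation and fusion steps can be sketched as below. The `paste` helper is a deliberately naive stand-in for Poisson fusion (OpenCV's `cv2.seamlessClone` implements the real technique); all names and sizes are illustrative:

```python
import numpy as np

def template_corners(cx, cy, w, h):
    """Four corner coordinates (clockwise from top-left) of an
    axis-aligned template centered at (cx, cy) in BEV space -- the four
    point correspondences a homography solver would consume."""
    return np.array([[cx - w / 2, cy - h / 2],
                     [cx + w / 2, cy - h / 2],
                     [cx + w / 2, cy + h / 2],
                     [cx - w / 2, cy + h / 2]])

def paste(bev: np.ndarray, template: np.ndarray, top: int, left: int):
    """Naive rectangular paste; the patent instead uses Poisson fusion
    so the identifier blends naturally with the road surface."""
    out = bev.copy()
    h, w = template.shape[:2]
    out[top:top + h, left:left + w] = template
    return out

corners = template_corners(10.0, 20.0, 4.0, 6.0)
```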
Further, after synthesizing the road surface arrow identifier under the bird's eye view, the bird's eye view is transformed back to the normal camera viewing angle through a perspective transformation using the inverse of the transformation matrix; when the target BEV image is converted into an ordinary image under an ordinary viewing angle, the synthesized data label corresponding to the ground identifier is output. The ordinary image output by the invention satisfies the physical rule that near objects appear large and far objects appear small, the corresponding synthesized data label is obtained at the same time, and the synthesized data can be used for downstream detection and segmentation tasks. The effect before and after synthesis is shown in fig. 5, in which the left side (7) is the image before synthesis and the right side (8) is the image after synthesis.
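A hedged sketch of the round trip back to the camera view: mapping BEV coordinates through the inverse homography recovers the original image coordinates, which is also how the synthesized labels follow the image back. The matrix `H` below is an arbitrary invertible example, not a calibrated transform:

```python
import numpy as np

def to_camera_view(H_bev: np.ndarray, pts_bev: np.ndarray) -> np.ndarray:
    """Map BEV-space points back to the camera view using the inverse
    of the camera-to-BEV homography."""
    H_inv = np.linalg.inv(H_bev)
    pts_h = np.hstack([pts_bev, np.ones((len(pts_bev), 1))])
    back = pts_h @ H_inv.T
    return back[:, :2] / back[:, 2:3]

# Arbitrary invertible homography standing in for the real calibration.
H = np.array([[1.0, 0.2, 5.0],
              [0.0, 1.5, -3.0],
              [0.0, 0.001, 1.0]])
pt = np.array([[100.0, 200.0]])
fwd = np.hstack([pt, [[1.0]]]) @ H.T   # project into BEV space
fwd = fwd[:, :2] / fwd[:, 2:3]
recovered = to_camera_view(H, fwd)     # round trip recovers pt
```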
The invention performed a synthetic-data ablation experiment on the CeyMo (road marking dataset) public dataset based on the single class of Straight Arrow data, with yolov7 as the model, as shown in Table 1 below. Using self-collected data as background images, 1614 synthesized samples were generated. As can be seen from Table 1, fine-tuning on the synthesized data together with a randomly selected 50% of the Straight Arrow training set from the public CeyMo dataset reaches 0.773 mAP@0.5, which demonstrates that the data synthesis method provided by the invention is effective and reliable.
TABLE 1

| Type | Synthesized data | CeyMo training set | Total training set | CeyMo test set | mAP@0.5 | mAP@0.5:0.95 |
|------|------------------|--------------------|--------------------|----------------|---------|--------------|
| 1 (SA) | 0 | 677 | 677 | 256 | 0.975 | 0.804 |
| 1 (SA) | 1614 | 677 × 0.3 = 203 | 1817 | 256 | 0.327 | 0.215 |
| 1 (SA) | 1614 | 677 × 0.5 = 338 | 1952 | 256 | 0.773 | 0.594 |
The method provided by the invention synthesizes data quickly, can synthesize data automatically in batches, and ensures data diversity. The synthesized data comes with labels: when data are synthesized, the corresponding labels are generated together without manual labeling, saving substantial labeling cost. The synthesized data is also safer in terms of privacy protection.
As shown in fig. 6, the present invention further provides an apparatus for synthesizing a ground identifier based on BEV images, including:
the obtaining module 61 is configured to obtain a ground identifier template and a background image, where the ground identifier template includes a ground arrow and a text identifier.
During recognition, the panoramic segmentation model is utilized to segment the background image into a lane line mask and a pavement travelable area mask; and determining position coordinates between two adjacent lane lines for placing the ground identifier template according to the constraint condition.
Specifically, the background image, the lane line mask and the drivable road area mask are projected into the BEV image through inverse perspective transformation; road vehicles are filtered out according to the drivable road area mask, and the coordinates of the four position points for placing the ground identifier template in the BEV image are calculated.
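The vehicle-filtering step can be sketched as a mask check: a candidate placement region is accepted only if every pixel inside it belongs to the drivable area, which automatically rejects regions overlapping vehicles, since vehicle pixels are not part of the drivable-area mask. A minimal illustration (the function name and arguments are assumptions, not from the patent):

```python
import numpy as np

def region_is_drivable(drivable_mask, corners):
    """Return True if the axis-aligned box spanned by the four corner
    points lies entirely inside the drivable-area mask (nonzero pixels).
    Regions overlapping vehicles fail this check because vehicle pixels
    are excluded from the drivable area."""
    corners = np.asarray(corners)
    xs, ys = corners[:, 0].astype(int), corners[:, 1].astype(int)
    region = drivable_mask[ys.min():ys.max(), xs.min():xs.max()]
    return bool(region.size) and bool(region.all())
```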
A processing module 62 is configured to project the background image into an overhead view to obtain a BEV image, and identify position coordinates in the BEV image for adding the ground identifier.
For example, the position coordinates between two adjacent lane lines for placing the ground identifier template are calculated and determined according to the width of the lane lines, the distance between the adjacent lane lines and the size of the traffic identifier.
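The constraint-based calculation described above might look like the following sketch, which centers the template horizontally between the two lane lines; the function and its parameters are hypothetical illustrations of the constraints (lane-line positions and marking size), not the patent's implementation:

```python
def placement_corners(x_left, x_right, y_top, marking_w, marking_h):
    """Center a marking template of size (marking_w, marking_h) between
    two lane lines at x_left and x_right, and return its four corner
    points in clockwise order starting at the top-left."""
    x0 = x_left + ((x_right - x_left) - marking_w) / 2.0
    return [(x0, y_top),
            (x0 + marking_w, y_top),
            (x0 + marking_w, y_top + marking_h),
            (x0, y_top + marking_h)]
```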
And a synthesis module 63, configured to fuse the ground identifier and the BEV image into a target BEV image according to the position coordinates, and transform the target BEV image into a normal image under a normal viewing angle, where the normal image includes the ground identifier template. For example, the ground identifier is fused with the background image projected into the BEV image according to the position coordinates. The synthesis module 63 is further configured to output a synthetic data tag corresponding to the ground identifier when the target BEV image is transformed into a normal image under a normal viewing angle.
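The fusion performed by the synthesis module can be illustrated as alpha blending at the computed position; `fuse_template` and its signature are assumptions for illustration, not the patent's code:

```python
import numpy as np

def fuse_template(bev, template, alpha, x0, y0):
    """Alpha-blend a marking template into a copy of the BEV image with
    its top-left corner at (x0, y0). alpha is a per-pixel opacity map
    in [0, 1]: 1 on the marking, 0 on the template background."""
    h, w = template.shape[:2]
    out = bev.copy()
    roi = out[y0:y0 + h, x0:x0 + w].astype(float)
    a = alpha[..., None] if alpha.ndim == 2 else alpha
    out[y0:y0 + h, x0:x0 + w] = (a * template + (1.0 - a) * roi).astype(bev.dtype)
    return out
```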
Further, the device for synthesizing the ground identifier based on the BEV image further comprises a preprocessing module, configured to perform enhancement processing on the ground identifier template by means of size transformation, shading transformation, Gaussian blur, texture transformation and noise addition; the background of the ground identifier is processed into pixels similar to the background image.
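Two of the listed enhancement transforms, shading and noise addition, can be sketched as follows; this is a toy illustration under assumed names, not the patent's implementation, and the size, texture and blur transforms would be chained in the same way:

```python
import numpy as np

def augment_template(template, brightness=1.0, noise_std=0.0, seed=0):
    """Apply a shading transform (global brightness scale) and additive
    Gaussian noise to a uint8 marking template."""
    rng = np.random.default_rng(seed)
    out = template.astype(float) * brightness
    if noise_std > 0:
        out = out + rng.normal(0.0, noise_std, out.shape)
    return np.clip(out, 0, 255).astype(np.uint8)
```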
As shown in fig. 7, the present invention further provides an electronic device, including:
at least one processor; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the method of synthesizing a ground identification based on BEV images described above.
The invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the above method of synthesizing a ground identification based on BEV images.
It is understood that the computer-readable storage medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a software distribution medium, and so forth. The computer program comprises computer program code, which may be in source code form, object code form, the form of an executable file, some intermediate form, or the like.
In some embodiments of the present invention, the apparatus for synthesizing a ground identification based on BEV images may include a controller, i.e. a single-chip microcomputer chip integrating a processor, a memory, a communication module and the like. The processor may refer to the processor comprised by the controller. The processor may be a central processing unit (Central Processing Unit, CPU), but may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes additional implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functionality involved, as would be understood by those reasonably skilled in the art.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, in computer software, or in a combination of the two; to clearly illustrate the interchangeability of hardware and software, the elements and steps of the examples have been described above generally in terms of their function. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A method for synthesizing a ground identification based on BEV images, comprising:
acquiring a ground identifier template and a background image, wherein the ground identifier template comprises a ground arrow and a text identifier;
projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image;
and fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template.
2. The method of claim 1, wherein projecting the background image into an overhead view to obtain a BEV image and identifying position coordinates in the BEV image for adding the ground identifier comprises:
during identification, segmenting the background image into a lane line mask and a drivable road area mask by using a panoptic segmentation model;
and determining position coordinates between two adjacent lane lines for placing the ground identifier template according to the constraint condition.
3. The method for synthesizing a ground identification based on BEV images according to claim 2, wherein the segmenting, by the panoptic segmentation model, of the background image into the lane line mask and the drivable road area mask comprises:
projecting the background image, the lane line mask and the drivable road area mask into the BEV image through inverse perspective transformation;
filtering out road vehicles according to the drivable road area mask, and calculating the coordinates of four position points for placing the ground identifier template in the BEV image.
4. The method of synthesizing a ground identification based on BEV images according to claim 2, wherein said determining the position coordinates between two adjacent lane lines for placing the ground identifier template according to the constraint condition comprises:
and calculating and determining the position coordinates between two adjacent lane lines for placing the ground identifier template according to the width of the lane lines, the distance between the adjacent lane lines and the size of the traffic identifier.
5. The method of claim 1, further comprising enhancing the ground identifier template by means of size transformation, shading transformation, Gaussian blur, texture transformation and noise addition; and processing the background of the ground identifier into pixels similar to the background image.
6. The method of claim 1, wherein the fusing the ground identifier with the BEV image to a target BEV image according to the location coordinates comprises:
the ground identifier is fused with the background image projected into the BEV image according to the position coordinates.
7. The method of claim 1, further comprising outputting a synthesized data label corresponding to the ground identifier when the target BEV image is transformed into a normal image under a normal viewing angle.
8. An apparatus for synthesizing a ground identification based on BEV images, comprising:
the acquisition module is used for acquiring a ground identifier template and a background image, wherein the ground identifier template comprises a ground arrow and a text identifier;
the processing module is used for projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image;
and the synthesis module is used for fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template.
9. An electronic device, comprising:
at least one processor; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310951584.XA CN117078800A (en) | 2023-07-31 | 2023-07-31 | Method and device for synthesizing ground identification based on BEV image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117078800A true CN117078800A (en) | 2023-11-17 |
Family
ID=88703315
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310951584.XA Pending CN117078800A (en) | 2023-07-31 | 2023-07-31 | Method and device for synthesizing ground identification based on BEV image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117078800A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150103173A1 (en) * | 2013-10-16 | 2015-04-16 | Denso Corporation | Synthesized image generation device |
US20180129887A1 (en) * | 2016-11-07 | 2018-05-10 | Samsung Electronics Co., Ltd. | Method and apparatus for indicating lane |
CN112017262A (en) * | 2020-08-10 | 2020-12-01 | 当家移动绿色互联网技术集团有限公司 | Pavement marker generation method and device, storage medium and electronic equipment |
CN114445592A (en) * | 2022-01-29 | 2022-05-06 | 重庆长安汽车股份有限公司 | Bird view semantic segmentation label generation method based on inverse perspective transformation and point cloud projection |
CN114677458A (en) * | 2022-03-28 | 2022-06-28 | 智道网联科技(北京)有限公司 | Road mark generation method and device for high-precision map, electronic equipment and storage medium |
CN115713678A (en) * | 2022-11-23 | 2023-02-24 | 武汉中海庭数据技术有限公司 | Arrow picture data augmentation method and system, electronic device and storage medium |
CN116245960A (en) * | 2023-02-24 | 2023-06-09 | 武汉光庭信息技术股份有限公司 | BEV top view generation method, system, electronic equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
CAO P.: "Multi-View Frustum PointNet for Object Detection in Autonomous Driving", 2019 IEEE International Conference on Image Processing (ICIP), 15 April 2020 (2020-04-15) *
Yuan Haiwen; Xiao Changshi; Xiu Supu; Wen Yuanqiao; Zhou Chunhui; Xu Zhouhua: "High-speed vehicle detection and localization method based on homography plane projection", Journal of Wuhan University of Technology (Transportation Science & Engineering), no. 01, 15 February 2017 (2017-02-15) *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Shin et al. | Vision-based navigation of an unmanned surface vehicle with object detection and tracking abilities | |
Taneja et al. | Image based detection of geometric changes in urban environments | |
Muad et al. | Implementation of inverse perspective mapping algorithm for the development of an automatic lane tracking system | |
CN107784038B (en) | Sensor data labeling method | |
CN111666805B (en) | Class marking system for autopilot | |
US10185880B2 (en) | Method and apparatus for augmenting a training data set | |
Bruls et al. | The right (angled) perspective: Improving the understanding of road scenes using boosted inverse perspective mapping | |
Kum et al. | Lane detection system with around view monitoring for intelligent vehicle | |
CN109741241B (en) | Fisheye image processing method, device, equipment and storage medium | |
CN113096003B (en) | Labeling method, device, equipment and storage medium for multiple video frames | |
WO2021155558A1 (en) | Road marking identification method, map generation method and related product | |
CN112651881B (en) | Image synthesizing method, apparatus, device, storage medium, and program product | |
CN111160328A (en) | Automatic traffic marking extraction method based on semantic segmentation technology | |
CN112258610B (en) | Image labeling method and device, storage medium and electronic equipment | |
CN110809766B (en) | Advanced driver assistance system and method | |
CN107798010A (en) | A kind of annotation equipment of sensing data | |
CN117078800A (en) | Method and device for synthesizing ground identification based on BEV image | |
CN116245960A (en) | BEV top view generation method, system, electronic equipment and storage medium | |
CN115861733A (en) | Point cloud data labeling method, model training method, electronic device and storage medium | |
CN112767412B (en) | Vehicle part classification method and device and electronic equipment | |
CN110827340A (en) | Map updating method, device and storage medium | |
Du et al. | Validation of vehicle detection and distance measurement method using virtual vehicle approach | |
Li et al. | Lane detection and road surface reconstruction based on multiple vanishing point & symposia | |
CN202058178U (en) | Character and image correction device | |
JP7383659B2 (en) | Navigation target marking methods and devices, electronic equipment, computer readable media |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||