CN117078800A - Method and device for synthesizing ground identification based on BEV image - Google Patents

Method and device for synthesizing ground identification based on BEV image

Info

Publication number
CN117078800A
CN117078800A (application CN202310951584.XA)
Authority
CN
China
Prior art keywords
image
ground
bev
identifier
ground identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310951584.XA
Other languages
Chinese (zh)
Inventor
别晓芳
张松
梅近仁
李剑
孟超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zero Beam Technology Co ltd
Original Assignee
Zero Beam Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zero Beam Technology Co ltd filed Critical Zero Beam Technology Co ltd
Priority to CN202310951584.XA priority Critical patent/CN117078800A/en
Publication of CN117078800A publication Critical patent/CN117078800A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 - 2D [Two Dimensional] image generation
    • G06T11/60 - Editing figures and text; Combining figures or text
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/11 - Region-based segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/194 - Segmentation; Edge detection involving foreground-background segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/70 - Determining position or orientation of objects or cameras
    • G06T7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588 - Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20212 - Image combination
    • G06T2207/20221 - Image fusion; Image merging
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30248 - Vehicle exterior or interior
    • G06T2207/30252 - Vehicle exterior; Vicinity of vehicle
    • G06T2207/30256 - Lane; Road marking
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method and a device for synthesizing ground identification based on BEV images, wherein the method comprises the following steps: acquiring a ground identifier template and a background image, wherein the ground identifier template comprises a ground arrow and a text identifier; projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image; and fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template. Using only the background image and the ground identifier, together with the segmentation results for the lane lines and the drivable road area, the method automatically synthesizes labeled target data in batches; no labels are required for the data to be synthesized, so new categories can be synthesized directly, which largely solves the model-training problem caused by data shortage.

Description

Method and device for synthesizing ground identification based on BEV image
Technical Field
The invention relates to the field of automatic driving image data synthesis, and in particular to a method and a device for synthesizing ground identification based on BEV images.
Background
Data augmentation methods are divided into offline and online augmentation. Common techniques, such as geometric transformations and the online augmentation widely used in deep learning, can improve model generalization to a certain extent. More advanced techniques such as CycleGAN can also be used for image style transfer. However, in supervised learning these augmentation methods can only enhance the original labeled data and cannot generate new categories.
Model pre-labeling refers to obtaining pseudo labels by running a pre-trained model, such as the YOLOX and YOLOv7 models widely used in engineering practice, on the data to be labeled. Such models are trained on a large amount of labeled data, are sensitive to the categories they were trained on, and have no discrimination capability for categories they have not learned.
At present, data synthesis has been widely applied in the 2D and 3D fields of automatic driving and is expected to make perception tasks in this field more efficient. In the 2D field, synthesis is mostly performed on images; in the 3D field, more complex factors such as background, viewing angle, camera depth, illumination and pose must be considered. Synthetic data has very broad applications and is suitable for almost all machine learning and deep learning tasks. Data can also be generated with generative adversarial networks such as StyleGAN, but StyleGAN is currently applied mainly to face generation and its results in other fields are unsatisfactory.
Disclosure of Invention
Aiming at the technical problems, the invention provides a method and a device for synthesizing a ground identifier based on BEV images, which can realize ground identifier synthesis based on the BEV images.
In a first aspect of the invention, there is provided a method of synthesizing a ground identification based on BEV images, comprising:
acquiring a ground identifier template and a background image, wherein the ground identifier template comprises a ground arrow and a text identifier;
projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image;
and fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template.
In an alternative embodiment, projecting the background image to an overhead view to obtain a BEV image and identifying position coordinates in the BEV image for adding the ground identifier comprises:
during recognition, segmenting the background image into a lane line mask and a road-surface drivable-area mask by using a panoptic segmentation model;
and determining, according to constraint conditions, position coordinates between two adjacent lane lines at which the ground identifier template is to be placed.
In an alternative embodiment, segmenting the background image into the lane line mask and the road-surface drivable-area mask by using the panoptic segmentation model comprises:
projecting the background image, the lane line mask and the road-surface drivable-area mask into the BEV image by inverse perspective transformation;
and filtering out road vehicles according to the road-surface drivable-area mask, and calculating four position point coordinates in the BEV image at which the ground identifier template is to be placed.
In an alternative embodiment, determining, according to the constraint conditions, the position coordinates between two adjacent lane lines at which the ground identifier template is to be placed comprises:
calculating and determining the position coordinates between two adjacent lane lines for placing the ground identifier template according to the width of the lane lines, the distance between adjacent lane lines and the size of the traffic identifier.
In an alternative embodiment, the method for synthesizing a ground identification based on BEV images further comprises performing enhancement processing on the ground identifier template by means of size transformation, shading transformation, Gaussian blur, texture transformation and noise addition; the background of the ground identifier is processed into pixels similar to the background image.
In an alternative embodiment, the fusing the ground identifier and the BEV image according to the location coordinates to a target BEV image comprises:
the ground identifier is fused with the background image projected into the BEV image according to the position coordinates.
In an alternative embodiment, the method for synthesizing the ground identifier based on the BEV image further includes outputting a synthesized data tag corresponding to the ground identifier when the target BEV image is transformed into a normal image under a normal viewing angle.
In a second aspect of the invention, there is provided an apparatus for synthesizing a ground identification based on BEV images, comprising:
the acquisition module is used for acquiring a ground identifier template and a background image, wherein the ground identifier template comprises a ground arrow and a text identifier;
the processing module is used for projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image;
and the synthesis module is used for fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template.
In a third aspect of the present invention, there is provided an electronic apparatus comprising:
at least one processor; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method according to the first aspect of the embodiments of the invention.
In a fourth aspect of the invention, a computer-readable storage medium is provided, on which a computer program is stored which, when run by a computer, performs the method according to the first aspect of the embodiment of the invention.
By using the background image and the ground identifier, together with the segmentation results for the lane lines and the drivable road area, the method automatically synthesizes labeled target data in batches; no labels are required for the data to be synthesized, so new categories can be synthesized directly, which largely solves the model-training problem caused by data shortage.
Drawings
FIG. 1 is a flow chart of a method for synthesizing a ground identification based on BEV images in accordance with an embodiment of the present invention.
Fig. 2 is a schematic diagram of background color replacement of a ground identifier in an embodiment of the invention.
Fig. 3 is a schematic diagram of a background image and its panoptic segmentation results in an embodiment of the present invention.
FIG. 4 is a schematic illustration of the projection of the diagram of FIG. 3 into BEV space.
Fig. 5 is a diagram showing the comparison between the ground identifier before and after synthesis at a common viewing angle in an embodiment of the present invention.
FIG. 6 is a block diagram of an apparatus for synthesizing a ground identification based on BEV images in accordance with an embodiment of the present invention.
Fig. 7 is a schematic structural view of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The invention relates to 2D data synthesis of ground identification data such as ground arrows. Unlike data synthesis in a purely 2D scene, the invention realizes 2D data synthesis based on BEV images, i.e. ground identification data for 2D common images is synthesized in a 3D scene. The invention first obtains the template of the data to be synthesized and determines its placement position. Template positioning must respect physical constraints; for example, the road-surface identifier should lie on the ground and obey the forward-camera perspective rule that near objects appear large and far objects appear small. To this end, the background image is projected to a bird's-eye view through inverse perspective mapping (IPM) to obtain the corresponding bird's-eye image. Under the bird's-eye view, the positions of the lane lines are first located with a lane line segmentation model, and vehicles on the road surface are filtered out through the drivable-area mask; the template is then positioned between two adjacent lane lines according to the positional relationship of the lane lines and certain prior position constraints, which determines its final position. The ground identification is then Poisson-fused with the background image so that it blends naturally with the background. Finally, the synthesized data and labels can be used for downstream detection and segmentation tasks. The method comprises the following steps:
referring to fig. 1, fig. 1 is a flowchart of a method for synthesizing a ground identifier based on BEV images according to an embodiment of the present invention. The invention provides a method for synthesizing a ground identifier based on BEV images, which comprises the following steps:
step 100: a ground identifier template and a background image are acquired, wherein the ground identifier template comprises a ground arrow and a text identifier.
This step is used to obtain templates for data synthesis. For a 2D automatic driving scene, the ground identifier template can be obtained from an existing template library, taken from an open-source dataset, or made manually. The background image is a road-surface image acquired by the forward-view camera; a clean road surface is selected, i.e. one without arrows or text identifiers within a certain range. The template library refers to prepared picture templates covering all data categories to be synthesized; some open-source datasets release ready-made templates, for example TT100K (a traffic sign dataset) provides templates for 128 traffic signs, and cropped targets can also be used as templates. Illustratively, the ground identifiers include arrows indicating straight ahead, straight or right turn, straight or left turn, left turn, right turn, left or right turn, and other combinations of lane-direction indications on the road ahead.
The background image can be obtained from a cloud data lake and projected into a BEV image by the transformation method described below. Data enhancement is required for the acquired ground identifier template; it includes common augmentation, background replacement and similar methods, and aims to increase the diversity and generalization of the template, as described in detail later.
Step 200: and projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image.
In automated/assisted driving, detection of lane lines is very important. In the image captured by the forward-view camera, lines that are parallel in the real world appear to converge because of the perspective effect, and inverse perspective mapping (IPM) eliminates this effect. In this step, the background image is projected to the bird's-eye view by inverse perspective transformation to obtain the corresponding bird's-eye image, i.e. the background image is displayed from the bird's-eye view. The application scenarios of the invention include, but are not limited to, data synthesis for cloud-side automatic-driving BEV pipelines; it is also suitable for data synthesis on the vehicle side provided sufficient computing resources are available.
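The following Python/OpenCV sketch illustrates one way the IPM projection and its inverse could be implemented; the four point correspondences, image sizes and file name are illustrative assumptions rather than values taken from the patent, and in practice the correspondences would come from camera calibration.

```python
import cv2
import numpy as np

def build_ipm_homography():
    # Four points on the road plane in the front-view image (pixels) ...
    src_pts = np.float32([[520, 460], [760, 460], [1180, 700], [100, 700]])
    # ... and their target locations in the BEV image (pixels). Both sets are
    # illustrative placeholders; real values come from calibration.
    dst_pts = np.float32([[300, 0], [500, 0], [500, 800], [300, 800]])
    H = cv2.getPerspectiveTransform(src_pts, dst_pts)   # front view -> BEV
    H_inv = np.linalg.inv(H)                            # BEV -> front view
    return H, H_inv

def to_bev(image, H, bev_size=(800, 800)):
    # The same warp is applied to the background image and, identically,
    # to the lane line mask and the drivable-area mask.
    return cv2.warpPerspective(image, H, bev_size)

if __name__ == "__main__":
    H, H_inv = build_ipm_homography()
    frame = cv2.imread("background.jpg")                # hypothetical input path
    bev = to_bev(frame, H)
    back = cv2.warpPerspective(bev, H_inv, (frame.shape[1], frame.shape[0]))
```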
After the BEV image is acquired, the lane lines and the drivable road area in the background image are identified with an image segmentation model. The lane lines are then projected into the BEV image by the inverse perspective transformation; since the identified lane lines exist in the bird's-eye view, the placement position of the ground identifier can be located in the bird's-eye view according to the lane lines.
Step 300: and fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template.
After the above processing, a lane line mask and a road-surface drivable-area mask corresponding to each background image are obtained, and a base for automatic-driving training is thereby available; a ground identifier is added to this base to synthesize labeled training data for automatic driving.
After determining the location coordinates for adding the ground identifier, the ground identifier and the BEV image are fused into a target BEV image according to the location coordinates.
After the road-surface identifier has been synthesized under the bird's-eye view, the bird's-eye image is warped back to the normal camera view through the inverse of the perspective transformation matrix, yielding a road-surface image under the forward-camera view that obeys the physical rule that near objects appear large and far objects appear small, together with the corresponding synthesized data label; the synthesized data can then be used for downstream detection and segmentation tasks.
In summary, by using the background image and the ground identifier, together with the segmentation results for the lane lines and the drivable area, the invention automatically synthesizes labeled target data in batches, enhancing both BEV images and images at the common viewing angle. No labels are required for the data to be synthesized, so new categories can be synthesized directly, which largely solves the model-training problem caused by data shortage.
Further, before data synthesis the method performs data enhancement and preprocessing on the ground identifier template, including processing the background color of the ground identifier and preprocessing the background image. Referring to fig. 2, fig. 2 is a schematic diagram of the template background-color replacement of a ground identifier. For example, the background color of the ground identifier template can be changed directly to the same hue as the background image; with 12 arrow templates, all 12 templates are given the replaced background color directly.
Alternatively, the background color of the arrow can be replaced after data synthesis. The ground identifier can be detected with a YOLO-series model and its background processed into pixels similar to the background image, so that the finally synthesized picture looks more realistic.
The ground identifier and the background image can also be enhanced by size transformation, shading transformation, Gaussian blur, texture transformation and noise addition, as in the sketch below.
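As a hedged illustration of the background-color replacement of fig. 2 and of the enhancement operations just listed (size transformation, shading transformation, Gaussian blur, noise addition), the sketch below uses OpenCV and NumPy; the brightness threshold and all parameter ranges are assumptions, not values prescribed by the patent.

```python
import cv2
import numpy as np

def replace_background(template, road_patch, thresh=200):
    # Replace the template's bright background pixels with the mean road color
    # so the pasted identifier blends with the pavement (cf. fig. 2).
    # `thresh` is an assumed brightness threshold for "background" pixels.
    bg = np.all(template > thresh, axis=-1)
    out = template.copy()
    out[bg] = road_patch.reshape(-1, 3).mean(axis=0).astype(np.uint8)
    return out

def augment_template(template, rng=None):
    rng = rng or np.random.default_rng()
    h, w = template.shape[:2]
    # Size transformation.
    scale = rng.uniform(0.8, 1.2)
    out = cv2.resize(template, (int(w * scale), int(h * scale)))
    # Shading (brightness/contrast) transformation.
    out = cv2.convertScaleAbs(out, alpha=rng.uniform(0.7, 1.3), beta=rng.uniform(-20, 20))
    # Gaussian blur.
    out = cv2.GaussianBlur(out, (5, 5), rng.uniform(0.5, 1.5))
    # Additive noise.
    noise = rng.normal(0, 8, out.shape).astype(np.float32)
    return np.clip(out.astype(np.float32) + noise, 0, 255).astype(np.uint8)
```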
Further, in step 200 above, projecting the background image to an overhead view to obtain a BEV image and identifying, by using a panoptic segmentation model, the position coordinates for adding the ground identifier in the BEV image specifically includes:
when the model is used for recognition, the background image is divided into a lane line mask (mask) and a pavement travelable area mask (mask) by using a Panoptic-deep Lab model; of course, other models that achieve similar results may be used. As shown in fig. 3, the left side view (1) is a background image, the center view (2) is an identified lane line mask, and the right side view (3) is a road surface travelable region mask. Wherein the road vehicles can be filtered according to the road exercisable area mask to reduce the impact on data.
Identifying the lane lines of the background image with the panoptic segmentation model makes it easier to determine their positions and hence the placement position of the ground identifier; by default the lane lines are parallel to each other in the bird's-eye view. The position coordinates between two adjacent lane lines at which the ground identifier template is placed are then determined according to the constraint conditions.
Further, as shown in fig. 4, the background image, the lane line mask and the road-surface drivable-area mask are projected into the BEV image by inverse perspective transformation, where the left view (4) is the background image under the bird's-eye view, the center view (5) is the lane line mask under the bird's-eye view, and the right view (6) is the road-surface drivable-area mask under the bird's-eye view. As can be seen from fig. 4, the lane lines are parallel to each other in the bird's-eye view, and the position point coordinates of the ground identifier can be located from the mutually parallel lane lines in the lane line mask.
Specifically, the position where the road identifier can be pasted is located by setting constraint conditions; for example, the position coordinates for placing the ground identifier template between two adjacent lane lines are calculated from the width of the lane lines, the distance between adjacent lane lines and the size of the traffic identifier. Typically the lane line width is 15 cm, the distance between adjacent lane lines on an urban road is 3.5 m, and the road-surface identifier on an urban road is 4.5 m long. The road-surface identifier is constrained to lie in the middle between adjacent lane lines, i.e. its coordinate origin is located at the midpoint between the adjacent lane lines, and its lateral extent must be smaller than the distance between the adjacent lane lines. A minimal placement sketch follows.
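The placement constraint described above can be sketched as follows. The metric lane x-positions and the 0.9 m identifier width used here are hypothetical inputs; only the roughly 3.5 m lane spacing and 4.5 m identifier length quoted above come from the text.

```python
def place_between_lanes(left_x_m, right_x_m, anchor_y_m,
                        marking_w_m=0.9, marking_len_m=4.5):
    # left_x_m / right_x_m: metric x-positions of two adjacent lane lines in the
    # BEV ground plane; anchor_y_m: longitudinal position of the marking's near
    # edge. Returns four corner points (meters) or None if the width constraint
    # is violated; converting to BEV pixels requires the map scale.
    lane_gap = right_x_m - left_x_m            # e.g. about 3.5 m on urban roads
    if marking_w_m >= lane_gap:
        return None                            # lateral extent must be smaller than the lane gap
    cx = 0.5 * (left_x_m + right_x_m)          # identifier is centered between the lane lines
    return [
        (cx - marking_w_m / 2, anchor_y_m),                   # top-left
        (cx + marking_w_m / 2, anchor_y_m),                   # top-right
        (cx + marking_w_m / 2, anchor_y_m + marking_len_m),   # bottom-right
        (cx - marking_w_m / 2, anchor_y_m + marking_len_m),   # bottom-left
    ]
```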
The four position point coordinates at which the ground identifier template is placed in the BEV image are calculated from the center line between adjacent lane lines and the size of the ground identifier template itself. A homography projective transformation is fully determined by four point correspondences (eight coordinate values); for example, the road-surface identifier used for background replacement in fig. 2 is a rectangular icon, and its four corner coordinates can be converted into coordinates in BEV space by the homography transformation. Since the projective transformation of the lane line mask uses the same coordinate transformation as the background image under the bird's-eye view, the coordinates of the road-surface identifier on the lane line mask can likewise be transformed into BEV space based on the coordinates of the original background image. The ground identifier is then fused with the background image projected into the BEV image at the position coordinates using Poisson fusion to obtain the target BEV image, as sketched below.
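A sketch of this fusion step under the assumption that OpenCV is used for both the homography warp and the Poisson blending; the corner ordering, the use of cv2.seamlessClone and the placement of the blend center are implementation choices for illustration, not details fixed by the patent.

```python
import cv2
import numpy as np

def fuse_template(bev_img, template, corners_px):
    # Warp the rectangular template onto the four target corners (pixel coords,
    # ordered top-left, top-right, bottom-right, bottom-left) in the BEV image.
    th, tw = template.shape[:2]
    src = np.float32([[0, 0], [tw, 0], [tw, th], [0, th]])
    dst = np.float32(corners_px)
    H = cv2.getPerspectiveTransform(src, dst)
    warped = cv2.warpPerspective(template, H, (bev_img.shape[1], bev_img.shape[0]))
    # Build a mask covering the pasted quadrilateral and blend with Poisson
    # (seamless) cloning so the identifier merges naturally with the road texture.
    mask = np.zeros(bev_img.shape[:2], np.uint8)
    cv2.fillConvexPoly(mask, dst.astype(np.int32), 255)
    x, y, w, h = cv2.boundingRect(dst.astype(np.int32))
    center = (x + w // 2, y + h // 2)  # seamlessClone anchors the mask's bounding box here
    return cv2.seamlessClone(warped, bev_img, mask, center, cv2.NORMAL_CLONE)
```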
Further, after the road-surface arrow identifier has been synthesized under the bird's-eye view, the bird's-eye image is warped back to the normal camera view through the inverse of the perspective transformation matrix, and the synthesized data label corresponding to the ground identifier is output when the target BEV image is converted into a common image at the common viewing angle. The common image output by the invention obeys the physical rule that near objects appear large and far objects appear small, and the corresponding synthesized data label is obtained at the same time, so the synthesized data can be used for downstream detection and segmentation tasks. The effect before and after synthesis is shown in fig. 5, where the left side (7) is the image before synthesis and the right side (8) is the image after synthesis.
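One possible way to derive the synthesized label in the camera view, assuming the same H_inv matrix as in the IPM sketch above; the label dictionary layout and class name are illustrative assumptions.

```python
import cv2
import numpy as np

def make_label(corners_bev_px, H_inv, class_name="straight_arrow"):
    # Project the four BEV corner points of the pasted identifier back into the
    # original camera image with the inverse IPM homography.
    pts = np.float32(corners_bev_px).reshape(-1, 1, 2)
    pts_cam = cv2.perspectiveTransform(pts, H_inv).reshape(-1, 2)
    # Derive an axis-aligned box for detection tasks; keep the precise quad for segmentation.
    x, y, w, h = cv2.boundingRect(pts_cam)
    return {"class": class_name,
            "polygon": pts_cam.tolist(),
            "bbox": [int(x), int(y), int(w), int(h)]}
```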
The invention was validated with a synthetic-data ablation experiment on the class 1 Straight Arrow (SA) data of the public CeyMo road-marking dataset, using YOLOv7 as the model, as shown in Table 1 below. Based on self-collected data used as background images, 1614 synthesized samples were generated. As can be seen from Table 1, fine-tuning on the synthesized data together with a randomly selected 50% of the Straight Arrow training set of the public CeyMo dataset reaches an mAP of about 77%, which shows that the proposed data synthesis method is effective and reliable.
TABLE 1
Type | Synthesized data | CeyMo training set | Total training set | CeyMo test set | mAP@0.5 | mAP@0.5:0.95
1 (SA) | 0 | 677 | 677 | 256 | 0.975 | 0.804
1 (SA) | 1614 | 677 x 0.3 = 203 | 1817 | 256 | 0.327 | 0.215
1 (SA) | 1614 | 677 x 0.5 = 338 | 1952 | 256 | 0.773 | 0.594
The method provided by the invention synthesizes data quickly and can do so automatically in batches while ensuring data diversity. The synthesized data comes with labels, which are generated together with the data and require no manual annotation, saving considerable labeling cost; the synthesized data is also safer with respect to privacy protection.
As shown in fig. 6, the present invention further provides an apparatus for synthesizing a ground identifier based on BEV images, including:
the obtaining module 61 is configured to obtain a ground identifier template and a background image, where the ground identifier template includes a ground arrow and a text identifier.
During recognition, a panoptic segmentation model is used to segment the background image into a lane line mask and a road-surface drivable-area mask, and the position coordinates between two adjacent lane lines at which the ground identifier template is to be placed are determined according to the constraint conditions.
Specifically, the background image, the lane line mask and the road-surface drivable-area mask are projected into the BEV image by inverse perspective transformation; road vehicles are filtered out according to the drivable-area mask, and the four position point coordinates at which the ground identifier template is placed in the BEV image are calculated.
A processing module 62 is configured to project the background image into an overhead view to obtain a BEV image, and identify position coordinates in the BEV image for adding the ground identifier.
For example, the position coordinates between two adjacent lane lines for placing the ground identifier template are calculated and determined according to the width of the lane lines, the distance between the adjacent lane lines and the size of the traffic identifier.
And a synthesis module 63, configured to fuse the ground identifier and the BEV image into a target BEV image according to the position coordinates, and transform the target BEV image into a normal image under a normal viewing angle, where the normal image includes the ground identifier template. For example, the ground identifier is fused with the background image projected into the BEV image according to the position coordinates. The synthesis module 63 is further configured to output a synthetic data tag corresponding to the ground identifier when the target BEV image is transformed into a normal image under a normal viewing angle.
Further, the device for synthesizing a ground identification based on BEV images further comprises a preprocessing module for performing enhancement processing on the ground identifier template by means of size transformation, shading transformation, Gaussian blur, texture transformation and noise addition, and for processing the background of the ground identifier into pixels similar to the background image.
As shown in fig. 7, the present invention further provides an electronic device, including:
at least one processor; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, which invoke the program instructions to perform the method of synthesizing a ground identification based on BEV images described above.
The invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the above method of synthesizing a ground identification based on BEV images.
It is understood that the computer-readable storage medium may include: any entity or device capable of carrying a computer program, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), a software distribution medium, and so forth. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form.
In some embodiments of the present invention, the apparatus for synthesizing a ground identification based on BEV images may include a controller, such as a single-chip microcomputer integrating a processor, a memory, a communication module and so on. The processor may be the processor comprised by the controller; it may be a central processing unit (CPU), or another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code that include one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes additional implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two; the elements and steps of the examples have been described above generally in terms of their function in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for synthesizing a ground identification based on BEV images, comprising:
acquiring a ground identifier template and a background image, wherein the ground identifier template comprises a ground arrow and a text identifier;
projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image;
and fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template.
2. The method of claim 1, wherein projecting the background image into an overhead view to obtain a BEV image and identifying position coordinates in the BEV image for adding the ground identifier comprises:
during recognition, segmenting the background image into a lane line mask and a road-surface drivable-area mask by using a panoptic segmentation model;
and determining, according to constraint conditions, position coordinates between two adjacent lane lines at which the ground identifier template is to be placed.
3. The method for synthesizing a ground identification based on BEV images according to claim 2, wherein segmenting the background image into the lane line mask and the road-surface drivable-area mask by using the panoptic segmentation model comprises:
projecting the background image, the lane line mask and the road-surface drivable-area mask into the BEV image by inverse perspective transformation;
and filtering out road vehicles according to the road-surface drivable-area mask, and calculating four position point coordinates in the BEV image at which the ground identifier template is to be placed.
4. The method of synthesizing a ground identification based on BEV images according to claim 2, wherein determining, according to the constraint conditions, the position coordinates between two adjacent lane lines at which the ground identifier template is to be placed comprises:
calculating and determining the position coordinates between two adjacent lane lines for placing the ground identifier template according to the width of the lane lines, the distance between adjacent lane lines and the size of the traffic identifier.
5. The method of claim 1, further comprising enhancing the ground identifier template by size transformation, shading transformation, Gaussian blur, texture transformation and noise addition, wherein the background of the ground identifier is processed into pixels similar to the background image.
6. The method of claim 1, wherein the fusing the ground identifier with the BEV image to a target BEV image according to the location coordinates comprises:
the ground identifier is fused with the background image projected into the BEV image according to the position coordinates.
7. The method of claim 1, further comprising outputting a composite data tag corresponding to the ground identifier when transforming the target BEV image into a normal image at normal viewing angles.
8. An apparatus for synthesizing a ground identification based on BEV images, comprising:
the acquisition module is used for acquiring a ground identifier template and a background image, wherein the ground identifier template comprises a ground arrow and a text identifier;
the processing module is used for projecting the background image to an overhead view to obtain a BEV image, and identifying position coordinates for adding the ground identifier in the BEV image;
and the synthesis module is used for fusing the ground identifier and the BEV image into a target BEV image according to the position coordinates, and converting the target BEV image into a common image under a common viewing angle, wherein the common image comprises the ground identifier template.
9. An electronic device, comprising:
at least one processor; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being run by a computer, performs the method according to any one of claims 1 to 7.
CN202310951584.XA 2023-07-31 2023-07-31 Method and device for synthesizing ground identification based on BEV image Pending CN117078800A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310951584.XA CN117078800A (en) 2023-07-31 2023-07-31 Method and device for synthesizing ground identification based on BEV image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310951584.XA CN117078800A (en) 2023-07-31 2023-07-31 Method and device for synthesizing ground identification based on BEV image

Publications (1)

Publication Number Publication Date
CN117078800A true CN117078800A (en) 2023-11-17

Family

ID=88703315

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310951584.XA Pending CN117078800A (en) 2023-07-31 2023-07-31 Method and device for synthesizing ground identification based on BEV image

Country Status (1)

Country Link
CN (1) CN117078800A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150103173A1 (en) * 2013-10-16 2015-04-16 Denso Corporation Synthesized image generation device
US20180129887A1 (en) * 2016-11-07 2018-05-10 Samsung Electronics Co., Ltd. Method and apparatus for indicating lane
CN112017262A (en) * 2020-08-10 2020-12-01 当家移动绿色互联网技术集团有限公司 Pavement marker generation method and device, storage medium and electronic equipment
CN114445592A (en) * 2022-01-29 2022-05-06 重庆长安汽车股份有限公司 Bird view semantic segmentation label generation method based on inverse perspective transformation and point cloud projection
CN114677458A (en) * 2022-03-28 2022-06-28 智道网联科技(北京)有限公司 Road mark generation method and device for high-precision map, electronic equipment and storage medium
CN115713678A (en) * 2022-11-23 2023-02-24 武汉中海庭数据技术有限公司 Arrow picture data augmentation method and system, electronic device and storage medium
CN116245960A (en) * 2023-02-24 2023-06-09 武汉光庭信息技术股份有限公司 BEV top view generation method, system, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Cao P: "Multi-View Frustum PointNet for Object Detection in Autonomous Driving", 2019 IEEE International Conference on Image Processing (ICIP), 15 April 2020 *
元海文, 肖长诗, 修素朴, 文元桥, 周春辉, 徐周华: "High-speed vehicle detection and localization method based on homography plane projection" (基于单应性平面投影的高速车辆检测与定位方法), Journal of Wuhan University of Technology (Transportation Science & Engineering), no. 01, 15 February 2017 *

Similar Documents

Publication Publication Date Title
Shin et al. Vision-based navigation of an unmanned surface vehicle with object detection and tracking abilities
Taneja et al. Image based detection of geometric changes in urban environments
Muad et al. Implementation of inverse perspective mapping algorithm for the development of an automatic lane tracking system
CN107784038B (en) Sensor data labeling method
CN111666805B (en) Class marking system for autopilot
US10185880B2 (en) Method and apparatus for augmenting a training data set
Bruls et al. The right (angled) perspective: Improving the understanding of road scenes using boosted inverse perspective mapping
Kum et al. Lane detection system with around view monitoring for intelligent vehicle
CN109741241B (en) Fisheye image processing method, device, equipment and storage medium
CN113096003B (en) Labeling method, device, equipment and storage medium for multiple video frames
WO2021155558A1 (en) Road marking identification method, map generation method and related product
CN112651881B (en) Image synthesizing method, apparatus, device, storage medium, and program product
CN111160328A (en) Automatic traffic marking extraction method based on semantic segmentation technology
CN112258610B (en) Image labeling method and device, storage medium and electronic equipment
CN110809766B (en) Advanced driver assistance system and method
CN107798010A (en) A kind of annotation equipment of sensing data
CN117078800A (en) Method and device for synthesizing ground identification based on BEV image
CN116245960A (en) BEV top view generation method, system, electronic equipment and storage medium
CN115861733A (en) Point cloud data labeling method, model training method, electronic device and storage medium
CN112767412B (en) Vehicle part classification method and device and electronic equipment
CN110827340A (en) Map updating method, device and storage medium
Du et al. Validation of vehicle detection and distance measurement method using virtual vehicle approach
Li et al. Lane detection and road surface reconstruction based on multiple vanishing point & symposia
CN202058178U (en) Character and image correction device
JP7383659B2 (en) Navigation target marking methods and devices, electronic equipment, computer readable media

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination