CN117456509A - Signal lamp detection method, device and equipment for vehicle and storage medium - Google Patents
- Publication number
- CN117456509A CN117456509A CN202311502757.6A CN202311502757A CN117456509A CN 117456509 A CN117456509 A CN 117456509A CN 202311502757 A CN202311502757 A CN 202311502757A CN 117456509 A CN117456509 A CN 117456509A
- Authority
- CN
- China
- Prior art keywords
- signal lamp
- detection result
- lamp detection
- current frame
- image data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
- G06V20/584—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a signal lamp detection method, device, equipment and storage medium for a vehicle, wherein the method comprises: acquiring surround-view image data based on a bird's-eye-view perception backbone network, and extracting a front-view feature map from the surround-view image data; performing signal lamp detection on the front-view feature map to obtain a current-frame signal lamp detection result; and acquiring historical signal lamp detection results, and determining a final signal lamp detection result according to the current-frame result and the historical results. By fusing signal lamp detection with the bird's-eye-view perception network, the method simplifies the overall perception framework, reduces system complexity, shortens inference time, and improves data utilization efficiency; by using historical detection results to corroborate the current-frame result, it mitigates the signal lamp strobe problem, stabilizes the detection output, and improves detection accuracy.
Description
Technical Field
The present invention relates to the field of automatic driving technologies, and in particular, to a method, an apparatus, a device, and a storage medium for detecting a signal lamp of a vehicle.
Background
Traffic light detection and recognition is an indispensable part of an unmanned driving system, and its accuracy directly affects the safety of intelligent driving. Perception is likewise one of the crucial modules in an autonomous driving system. The BEV perception framework is a bird's-eye-view-based perception method: it models the environment around the vehicle as a two-dimensional plane and maps surrounding point cloud data onto the bird's eye view, enabling understanding and recognition of the vehicle's surroundings.
Existing bird's-eye-view-based autonomous driving perception frameworks support only tasks such as 3D object detection and segmentation, and do not support signal lamp detection. The bird's-eye-view network and the signal lamp detection network are independent of each other, with no unified framework, so a separate network model is required when performing signal lamp detection.
These existing methods therefore suffer from slow model inference, an inability to use large volumes of data efficiently, and the impossibility of an end-to-end development mode, which makes development work complex and cumbersome.
Disclosure of Invention
The invention provides a signal lamp detection method, device, equipment and storage medium for a vehicle, realizing signal lamp detection fused with a bird's-eye-view perception network.
According to an aspect of the present invention, there is provided a signal lamp detection method of a vehicle, including:
acquiring surround-view image data based on a bird's-eye-view perception backbone network, and extracting a front-view feature map from the surround-view image data;
performing signal lamp detection on the front-view feature map to obtain a current-frame signal lamp detection result;
and acquiring historical signal lamp detection results, and determining a final signal lamp detection result according to the current-frame signal lamp detection result and the historical signal lamp detection results.
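The three claimed steps can be sketched as a minimal orchestration loop; each stage is a stub, and all function names here are illustrative assumptions, not the patent's API:

```python
# Minimal orchestration sketch of the three claimed steps; every stage is a
# stand-in stub, and the function names are assumptions.

def extract_front_view_features(surround_images):
    """Step 1 stub: the backbone would encode images and slice the front view."""
    return surround_images["front"]

def detect_signal_lamp(front_feat):
    """Step 2 stub: detection + classification on the front-view features."""
    return "red" if front_feat else None

def fuse_with_history(current, history):
    """Step 3 stub: accept the current result only if the history agrees."""
    return current if all(h == current for h in history) else history[-1]

history = ["red", "red", "red"]
front = extract_front_view_features({"front": [0.1, 0.9], "rear": []})
current = detect_signal_lamp(front)
final = fuse_with_history(current, history)
print(final)  # "red"
```

The point of the sketch is only the data flow: front-view features feed per-frame detection, and the history buffer arbitrates the final output.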
Further, performing signal lamp detection on the front-view feature map to obtain a current-frame signal lamp detection result includes:
obtaining signal lamp position and size information from the front-view feature map using a trained signal lamp detection model;
cropping the front-view feature map according to the signal lamp position and size information to obtain a signal lamp feature map;
and obtaining signal lamp category and confidence information from the signal lamp feature map using a trained signal lamp classification model, and taking the signal lamp category and confidence information as the current-frame signal lamp detection result.
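A sketch of this two-stage flow — detect a box, crop the feature map, classify the crop — is below. Both "models" are stand-in functions, and the label set is an assumption:

```python
import numpy as np

# Sketch of the claimed two-stage flow: a detection model yields position/size,
# the front-view feature map is cropped, and a classification model yields
# category + confidence. Both "models" are stand-in functions, not the
# patent's trained networks.

CLASSES = ["red", "green", "yellow", "off"]  # assumed label set

def detect_light(front_feat):
    """Stand-in detection model: returns (x, y, w, h) in feature-map pixels."""
    h, w = front_feat.shape[1:]
    return w // 4, h // 4, w // 8, h // 8          # fixed box for illustration

def classify_light(patch):
    """Stand-in classifier: pooled patch -> softmax over classes."""
    pooled = patch.mean(axis=(1, 2))               # global average pool per channel
    logits = pooled[: len(CLASSES)]                # pretend linear head
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return CLASSES[int(probs.argmax())], float(probs.max())

def detect_current_frame(front_feat):
    x, y, w, h = detect_light(front_feat)          # 1) position and size
    patch = front_feat[:, y : y + h, x : x + w]    # 2) crop the feature map
    cls, conf = classify_light(patch)              # 3) category + confidence
    return {"box": (x, y, w, h), "class": cls, "confidence": conf}

result = detect_current_frame(np.random.rand(16, 64, 64))
print(result["class"], round(result["confidence"], 3))
```

In practice the detection model would emit many candidate boxes and the classifier would run on each crop; the sketch keeps one box to show the crop-then-classify hand-off.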
Further, the historical signal lamp detection results include the signal lamp detection results of a set number of consecutive frames preceding the current frame, and determining a final signal lamp detection result according to the current-frame and historical results includes:
comparing the current-frame signal lamp detection result with the historical signal lamp detection results, and if they are consistent, taking the current-frame signal lamp detection result as the final signal lamp detection result.
Further, before acquiring the surround-view image data based on the bird's-eye-view perception backbone network and extracting the front-view feature map, the method further includes:
initiating a signal lamp detection request according to high-definition map data.
Further, initiating a signal lamp detection request according to the high-definition map data includes:
acquiring the high-definition map data, wherein the high-definition map data includes signal lamp position information;
and when it is determined from the signal lamp position information and the current vehicle position that a signal lamp appears in front of the vehicle, initiating a signal lamp detection request.
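The claimed trigger can be sketched with plain geometry: fire a request when a map lamp lies within a forward cone of the vehicle. The range and field-of-view thresholds are assumptions:

```python
import math

# Sketch of the claimed trigger: using signal-lamp positions from the HD map
# plus the current vehicle pose, initiate a detection request when a lamp lies
# ahead of the vehicle within range. The 150 m range and 30-degree half-FOV
# are illustrative assumptions.

def lamp_ahead(vehicle_xy, heading_rad, lamp_xy,
               max_dist=150.0, half_fov_rad=math.radians(30)):
    """True if the lamp is within max_dist and inside the forward cone."""
    dx = lamp_xy[0] - vehicle_xy[0]
    dy = lamp_xy[1] - vehicle_xy[1]
    if math.hypot(dx, dy) > max_dist:
        return False
    bearing = math.atan2(dy, dx) - heading_rad
    bearing = math.atan2(math.sin(bearing), math.cos(bearing))  # wrap to [-pi, pi]
    return abs(bearing) <= half_fov_rad

def should_request_detection(vehicle_xy, heading_rad, hd_map_lamps):
    return any(lamp_ahead(vehicle_xy, heading_rad, p) for p in hd_map_lamps)

# Vehicle at the origin heading along +x; one lamp ~100 m ahead, one behind.
print(should_request_detection((0, 0), 0.0, [(100, 5), (-50, 0)]))  # True
```

Gating detection on the map this way keeps the detector idle on stretches of road the map says have no signal lamps.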
Further, after acquiring the surround-view image data based on the bird's-eye-view perception backbone network, the method further includes:
performing bird's-eye-view perception according to the surround-view image data.
Further, performing bird's-eye-view perception according to the surround-view image data includes:
extracting surround-view image features from the surround-view image data;
performing coordinate system conversion on the surround-view image features to obtain visual bird's-eye-view features;
and fusing the visual bird's-eye-view features with laser point cloud features extracted from laser point cloud data and radar point cloud features extracted from radar point cloud data, taking the resulting fused features as the bird's-eye-view perception result.
According to another aspect of the present invention, there is provided a signal lamp detection apparatus for a vehicle, including:
a front-view feature map extraction module, configured to acquire surround-view image data based on a bird's-eye-view perception backbone network and extract a front-view feature map from the surround-view image data;
a current-frame signal lamp detection result acquisition module, configured to perform signal lamp detection on the front-view feature map to obtain a current-frame signal lamp detection result;
and a final signal lamp detection result determination module, configured to acquire historical signal lamp detection results and determine a final signal lamp detection result according to the current-frame and historical signal lamp detection results.
Optionally, the current-frame signal lamp detection result acquisition module is further configured to:
obtain signal lamp position and size information from the front-view feature map using a trained signal lamp detection model;
crop the front-view feature map according to the signal lamp position and size information to obtain a signal lamp feature map;
and obtain signal lamp category and confidence information from the signal lamp feature map using a trained signal lamp classification model, taking the signal lamp category and confidence information as the current-frame signal lamp detection result.
Optionally, the historical signal lamp detection results include the signal lamp detection results of a set number of consecutive frames preceding the current frame, and the final signal lamp detection result determination module is further configured to:
compare the current-frame signal lamp detection result with the historical signal lamp detection results, and if they are consistent, take the current-frame signal lamp detection result as the final signal lamp detection result.
Optionally, the apparatus further includes a signal lamp detection request initiation module, configured to initiate a signal lamp detection request according to high-definition map data.
Optionally, the signal lamp detection request initiation module is further configured to:
acquire the high-definition map data, wherein the high-definition map data includes signal lamp position information;
and when it is determined from the signal lamp position information and the current vehicle position that a signal lamp appears in front of the vehicle, initiate a signal lamp detection request.
Optionally, the apparatus further includes a bird's-eye-view perception module, configured to perform bird's-eye-view perception according to the surround-view image data.
Optionally, the bird's-eye-view perception module is further configured to:
extract surround-view image features from the surround-view image data;
perform coordinate system conversion on the surround-view image features to obtain visual bird's-eye-view features;
and fuse the visual bird's-eye-view features with laser point cloud features extracted from laser point cloud data and radar point cloud features extracted from radar point cloud data, taking the resulting fused features as the bird's-eye-view perception result.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor, the computer program enabling the at least one processor to perform the signal lamp detection method of a vehicle according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions for causing a processor to execute the signal lamp detection method of a vehicle according to any one of the embodiments of the present invention.
According to the signal lamp detection method of a vehicle provided by the invention, surround-view image data is first acquired based on a bird's-eye-view perception backbone network and a front-view feature map is extracted from it; signal lamp detection is then performed on the front-view feature map to obtain a current-frame signal lamp detection result; finally, historical signal lamp detection results are acquired, and a final signal lamp detection result is determined according to the current-frame and historical results. By fusing signal lamp detection with the bird's-eye-view perception network, the method simplifies the overall perception framework, reduces system complexity, shortens inference time, and improves data utilization efficiency; by using historical detection results to corroborate the current-frame result, it mitigates the signal lamp strobe problem, stabilizes the detection output, and improves detection accuracy.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a signal lamp detection method of a vehicle according to a first embodiment of the present invention;
fig. 2 is a schematic diagram of a signal light detection and bird's eye view sensing fusion framework according to a first embodiment of the present invention;
fig. 3 is a flowchart of a signal lamp detection method of a vehicle according to a second embodiment of the present invention;
fig. 4 is a schematic structural view of a signal lamp detection device of a vehicle according to a third embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device implementing a signal lamp detection method of a vehicle according to a fourth embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a signal lamp detection method for a vehicle according to an embodiment of the present invention. The method may be performed by a signal lamp detection apparatus of a vehicle, which may be implemented in hardware and/or software and configured in an electronic device. As shown in Fig. 1, the method includes:
S110, acquiring surround-view image data based on the bird's-eye-view perception backbone network, and extracting a front-view feature map from the surround-view image data.
Here, bird's-eye-view (BEV) perception refers to algorithms that, given a sequence of input images, convert these perspective-view inputs into BEV features and perform a perception task, such as detecting 3D bounding boxes of targets or generating a semantic map of the surroundings from the bird's-eye view. The surround-view image data may be RGB images captured at multiple angles around the vehicle by an image acquisition module such as a camera, and the front-view feature map is the feature map of the area in front of the vehicle extracted from the surround-view image data.
In this embodiment, the feature map used for signal lamp detection may be obtained through the bird's-eye-view perception backbone network. When the backbone network performs its bird's-eye-view perception task, the surround-view image data around the vehicle may be captured by multiple surround-view cameras installed around the vehicle. Besides serving bird's-eye-view perception, the surround-view image data is also provided to the signal lamp detection module for signal lamp detection.
Optionally, the signal lamp detection module may be connected to the bird's-eye-view perception backbone network: the surround-view image data is processed by an image encoder to obtain surround-view image features, from which the front-view feature map is extracted as the input of the signal lamp detection module.
S120, performing signal lamp detection on the front-view feature map to obtain a current-frame signal lamp detection result.
The current-frame signal lamp detection result is the detection result obtained by performing signal lamp detection on the currently obtained front-view feature map, such as the category, confidence, position, and size of the signal lamp.
In this embodiment, after the front-view feature map is obtained, it may be fed as input to a trained signal lamp detection module, whose output may be the category of the signal lamp and a confidence score. This output may serve as the current-frame signal lamp detection result.
Preferably, the signal lamp detection module may employ a convolutional neural network (CNN) model. A convolutional neural network is constructed by imitating the biological visual perception mechanism and supports supervised learning. Its shared convolution kernel parameters and sparse inter-layer connections allow it to learn grid-like features (such as pixels and audio) with a small amount of computation, yielding stable results with no additional feature engineering requirements on the data.
S130, acquiring historical signal lamp detection results, and determining a final signal lamp detection result according to the current-frame and historical signal lamp detection results.
The historical signal lamp detection results may be the detection results obtained by performing signal lamp detection on the feature maps of one or more frames preceding the current front-view feature map.
In this embodiment, because the camera may suffer from strobe when capturing images, a captured image may show a signal lamp as unlit, reducing detection accuracy. Therefore, to ensure the accuracy of the detection result, the current-frame signal lamp detection result may be combined with the historical results, and the comprehensively considered result is taken as the final signal lamp detection result.
Optionally, a tracking module may be introduced on top of the signal lamp detection module. The tracking module may record historical detection results, for example the six results preceding the current one. The detection results of the preceding six frames are compared with the current-frame result: if they are consistent, the current-frame result is accurate; if the result jumps, signal lamp strobe may have occurred and the result may be wrong. Specifically, strobe can occur when the camera films a traffic light: a constantly lit signal lamp appears lit in one frame and unlit in the next. With the tracking module, the unlit frames still yield a detection result, so the output remains stable.
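The tracking idea above can be sketched with a short sliding-window buffer. The six-frame window and the majority-vote fallback are assumptions standing in for whatever arbitration rule the patent's tracking module actually uses:

```python
from collections import Counter, deque
from typing import Optional

# Sketch of the tracking idea described above: keep the last six per-frame
# results and use them to confirm or override the current frame, so an
# "unlit" frame caused by camera strobe still reports a stable state.
# The window size and majority rule are assumptions.

class TrafficLightTracker:
    def __init__(self, window: int = 6):
        self.history = deque(maxlen=window)

    def update(self, current: Optional[str]) -> Optional[str]:
        """current is the per-frame class, or None when no light was detected."""
        lit_history = [c for c in self.history if c is not None]
        if current is None and lit_history:
            # Strobe frame: fall back to the majority of recent lit frames.
            final = Counter(lit_history).most_common(1)[0][0]
        elif current is not None and lit_history and current not in lit_history:
            # Sudden jump: keep the majority vote until the new state recurs.
            final = Counter(lit_history + [current]).most_common(1)[0][0]
        else:
            final = current
        self.history.append(current)
        return final

tracker = TrafficLightTracker()
frames = ["red", "red", None, "red", None, "red"]  # strobe drops two frames
print([tracker.update(f) for f in frames])         # every frame reports "red"
```

A genuine red-to-green transition still propagates: once "green" appears in the window, subsequent green frames pass through unchanged.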
Further, before the surround-view image data is acquired based on the bird's-eye-view perception backbone network and the front-view feature map is extracted, the method may also include: initiating a signal lamp detection request according to high-definition map data.
The high-definition map data may be a map built from actual environment information; road information, such as the actual positions of signal lamps on the road, can be obtained from it.
In this embodiment, according to the established high-definition map, a signal lamp detection request may be initiated when a signal lamp appears in front of the vehicle, so that the signal lamp detection module can perform the detection task in response to the request.
Optionally, initiating a signal lamp detection request according to the high-definition map data may proceed as follows: acquiring the high-definition map data, which includes signal lamp position information; and when it is determined from the signal lamp position information and the current vehicle position that a signal lamp appears in front of the vehicle, initiating a signal lamp detection request.
Specifically, the high-definition map data includes the actual position of each signal lamp on the road; whether a signal lamp is in front of the vehicle can be judged from the current vehicle position and the signal lamp positions in the high-definition map, at which point a signal lamp detection request can be initiated.
Further, with the high-definition map data, the detected signal lamp positions can also be matched to the signal lamp positions in the map to ensure a correct detection result. For example, the signal lamp detection module may spuriously detect 5 signal lamps, but if only 4 of them match signal lamp positions on the map, only those 4 are retained in the detection result.
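The map-matching filter described above reduces to a nearest-neighbor check against the map positions; coordinates and the 2 m tolerance below are illustrative assumptions:

```python
import math

# Sketch of the map-matching filter described above: a detected lamp is kept
# only if it lies within a tolerance of some lamp position from the HD map.
# The coordinates and the 2 m tolerance are illustrative assumptions.

def filter_by_map(detected, map_lamps, tol=2.0):
    kept = []
    for d in detected:
        if any(math.hypot(d[0] - m[0], d[1] - m[1]) <= tol for m in map_lamps):
            kept.append(d)
    return kept

map_lamps = [(0, 0), (3, 0), (6, 0), (9, 0)]           # 4 lamps in the HD map
detected = [(0.2, 0.1), (3.1, -0.2), (5.8, 0.0),
            (9.3, 0.1), (40.0, 12.0)]                   # 5 detections, 1 spurious
print(len(filter_by_map(detected, map_lamps)))          # 4
```

This mirrors the example in the text: 5 detections, 4 map matches, 4 results retained.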
Further, after the surround-view image data is acquired based on the bird's-eye-view perception backbone network, the method may also include: performing bird's-eye-view perception according to the surround-view image data.
In this embodiment, besides providing the surround-view image data to the signal lamp detection module for signal lamp detection, the bird's-eye-view perception backbone network may also perform its own bird's-eye-view perception task.
Optionally, bird's-eye-view perception according to the surround-view image data may proceed as follows: extracting surround-view image features from the surround-view image data; performing coordinate system conversion on the surround-view image features to obtain visual bird's-eye-view features; and fusing the visual bird's-eye-view features with laser point cloud features extracted from laser point cloud data and radar point cloud features extracted from radar point cloud data, taking the resulting fused features as the bird's-eye-view perception result.
Fig. 2 is a schematic diagram of a signal lamp detection and aerial view perception fusion frame provided by an embodiment of the present invention, where as shown in the drawing, surrounding image data 01 may be processed by an image encoder to obtain surrounding image features 02, coordinate system conversion is performed on the surrounding image features 02, and after the camera coordinate system is converted into aerial view coordinate system, visual aerial view features 03 may be obtained; the looking-around image data 01 and the looking-around image features 02 are also connected to a signal lamp detection module 09 for extracting a front view feature map. Meanwhile, the bird's eye view perception backbone network can also acquire laser point cloud data 04 and Lei Dadian cloud data 06, wherein the laser point cloud data 04 can be processed by a laser point cloud encoder to obtain laser point cloud characteristics 05, and the Lei Dadian cloud data 06 can be processed by the Lei Dadian cloud encoder to obtain Lei Dadian cloud characteristics 07.
The look-around image data 01, the laser point cloud data 04, and the radar point cloud data 06 are independently acquired by their respective sensors and are synchronized in the time dimension. The look-around image data 01 is acquired by a plurality of look-around cameras installed around the vehicle; the laser point cloud data 04 is acquired by a lidar installed on the roof of the vehicle; and the radar point cloud data 06 is acquired by millimeter-wave radars mounted at the front and rear of the vehicle. The image encoder, the laser point cloud encoder, and the radar point cloud encoder are mutually independent and operate simultaneously, so the resulting visual bird's eye view features 03, laser point cloud features 05, and radar point cloud features 07 are also synchronized in the time dimension; each encoder is essentially a convolutional neural network. The visual bird's eye view features 03, the laser point cloud features 05, and the radar point cloud features 07 are stored as rasterized data matrices with identical spatial dimensions.
Then, the visual bird's eye view features 03, the laser point cloud features 05, and the radar point cloud features 07 are concatenated and input into a bird's eye view encoder to obtain bird's eye view fused features 08, which can serve various downstream tasks such as 3D object detection, map segmentation, trajectory prediction, and localization.
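The concatenation step described above can be sketched as follows. This is a minimal illustration in NumPy: the grid size and channel counts are assumptions, and a simple channel-wise concatenation stands in for preparing the bird's eye view encoder input; it is not the patent's actual implementation.

```python
import numpy as np

# Illustrative shapes (assumptions): each rasterized feature map is a
# (channels, H, W) data matrix sharing the same spatial grid, as the
# embodiment requires for fusion.
H, W = 200, 200
visual_bev_feat = np.random.rand(64, H, W)   # feature 03
lidar_feat = np.random.rand(64, H, W)        # feature 05
radar_feat = np.random.rand(32, H, W)        # feature 07

def fuse_bev_features(*features):
    """Concatenate per-sensor BEV features along the channel dimension."""
    spatial_shapes = {f.shape[1:] for f in features}
    assert len(spatial_shapes) == 1, "rasterized features must share one grid"
    return np.concatenate(features, axis=0)

# The concatenated tensor would then be fed to the BEV encoder.
fused = fuse_bev_features(visual_bev_feat, lidar_feat, radar_feat)
print(fused.shape)  # (160, 200, 200)
```

Because the three encoders are time-synchronized, the concatenation needs no temporal alignment step beyond matching frame timestamps.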
According to the signal lamp detection method for a vehicle provided by this embodiment, look-around image data is first acquired based on the bird's eye view perception backbone network, and a front view feature map is extracted from the look-around image data; signal lamp detection is then performed on the front view feature map to obtain a current frame signal lamp detection result; finally, historical signal lamp detection results are acquired, and a final signal lamp detection result is determined according to the current frame signal lamp detection result and the historical signal lamp detection results. By fusing signal lamp detection with the bird's eye view perception network, the method simplifies the overall perception framework, reduces system complexity, shortens inference time, and improves data utilization efficiency. By using the historical signal lamp detection results to assist in judging the current frame signal lamp detection result, the method overcomes the signal lamp flicker problem, stabilizes the detection result, and improves detection accuracy.
Example Two
Fig. 3 is a flowchart of a signal lamp detection method for a vehicle according to a second embodiment of the present invention, and this embodiment is a refinement of the foregoing embodiment. As shown in fig. 3, the method includes:
S210, acquiring look-around image data based on the bird's eye view perception backbone network, and extracting a front view feature map from the look-around image data.
In this embodiment, after the look-around image data is acquired based on the bird's eye view perception backbone network, look-around image features may be extracted from the look-around image data by the image encoder, and the front view feature map may then be extracted from the look-around image features as the input of the signal lamp detection model.
Preferably, the image encoder may use the convolutional neural network ResNet-50 as the image feature extractor, since the residual units in ResNet-50 alleviate the vanishing gradient problem, yielding higher-quality image features and correspondingly stronger detection and classification performance. The image encoder sequentially feeds the look-around images obtained by the look-around cameras into the feature extractor, and after the corresponding look-around image features are extracted, the front view feature map is screened out as the final input for signal lamp detection.
In this scheme, the signal lamp detection network and the bird's eye view perception network share a backbone network, which reduces redundant computation and shortens inference time.
S220, obtaining signal lamp position and size information according to the front view feature map in combination with the trained signal lamp detection model.
In this embodiment, the mathematical model adopted in the signal lamp detection module may include a signal lamp detection model and a signal lamp classification model, where the signal lamp detection model is used for outputting the position and the size of the signal lamp, and the signal lamp classification model is used for outputting the category and the confidence information of the signal lamp.
Preferably, the front view feature map obtained through the bird's eye view perception backbone network can be used as the input of the signal lamp detection model, which detects the position and size of the signal lamp using a convolutional neural network without performing classification.
S230, cropping the front view feature map according to the signal lamp position and size information to obtain a signal lamp feature map.
The signal lamp feature map is the feature map of the signal lamp region with redundant information removed.
In this embodiment, the front view feature map may be cropped according to the position and size information obtained from the signal lamp detection model, so as to obtain image features containing only the signal lamp region.
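The cropping step can be sketched as follows. The box format `(cx, cy, w, h)` in feature-map coordinates and the array shapes are assumptions for illustration; the patent only specifies that the feature map is cropped to the detected position and size.

```python
import numpy as np

def crop_signal_lamp(front_view_feat, cx, cy, w, h):
    """Crop a (C, H, W) feature map to the detected signal lamp box,
    clamping the box to the feature map boundaries."""
    C, H, W = front_view_feat.shape
    x0 = max(int(cx - w / 2), 0)
    y0 = max(int(cy - h / 2), 0)
    x1 = min(int(cx + w / 2), W)
    y1 = min(int(cy + h / 2), H)
    # Keep all channels; restrict only the spatial extent.
    return front_view_feat[:, y0:y1, x0:x1]

# Toy feature map and a hypothetical detected box centered at (4, 4).
feat = np.arange(1 * 8 * 8, dtype=float).reshape(1, 8, 8)
lamp_feat = crop_signal_lamp(feat, cx=4, cy=4, w=2, h=2)
print(lamp_feat.shape)  # (1, 2, 2)
```

The resulting crop contains only the signal lamp region and becomes the classification model's input.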
S240, obtaining signal lamp category and confidence information according to the signal lamp feature map in combination with the trained signal lamp classification model, and taking the signal lamp category and confidence information as the current frame signal lamp detection result.
In this embodiment, after the signal lamp feature map is obtained, the signal lamp feature map may be input into the signal lamp classification model.
Preferably, the signal lamp classification model also uses a convolutional neural network, which outputs the category of the signal lamp together with a confidence score. At this point, the signal lamp detection module has produced its complete output.
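The category-plus-confidence output can be illustrated with a softmax over the classifier's logits. The class set and logit values below are assumptions for the sketch; the patent does not enumerate the categories or the network head.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

# Hypothetical signal lamp classes and raw classifier logits.
classes = ["red", "yellow", "green", "off"]
logits = np.array([4.0, 0.5, 1.0, 0.2])

probs = softmax(logits)
category = classes[int(np.argmax(probs))]   # signal lamp category
confidence = float(np.max(probs))           # confidence score
print(category)  # red
```

The `(category, confidence)` pair is what the tracking module later compares against the historical detection results.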
S250, acquiring historical signal lamp detection results, comparing the current frame signal lamp detection result with the historical signal lamp detection results, and, if they are consistent, determining the current frame signal lamp detection result as the final signal lamp detection result.
The historical signal lamp detection results comprise the signal lamp detection results of a set number of consecutive frames preceding the current frame signal lamp detection result.
In this embodiment, a tracking module may be added to the signal lamp detection module, which records the signal lamp detection results of the current frame and of a set number of consecutive frames preceding it.
Preferably, the signal lamp detection results of the six frames preceding the current frame may be taken as the historical signal lamp detection results. The tracking module compares the detection results of these six frames with the current frame result, ensuring that the current frame detection result is stable and effective and improving signal lamp detection and recognition performance.
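The tracking module's comparison can be sketched as follows. The fallback-to-majority rule for inconsistent frames is an assumption (the patent only specifies accepting the current result when it is consistent with the history), as are the state labels.

```python
from collections import deque

class SignalLampTracker:
    """Keep the last `history_len` detection results and stabilize the
    current frame result against them, suppressing flicker dropouts."""

    def __init__(self, history_len=6):
        self.history = deque(maxlen=history_len)

    def update(self, current_state):
        if not self.history or all(s == current_state for s in self.history):
            # Consistent with the history (or no history yet): accept.
            final = current_state
        else:
            # Inconsistent (e.g. a flicker-induced misdetection):
            # fall back to the majority state in the history.
            final = max(set(self.history), key=list(self.history).count)
        self.history.append(current_state)
        return final

tracker = SignalLampTracker(history_len=6)
states = ["red", "red", "red", "red", "off", "red"]  # "off" = flicker dropout
finals = [tracker.update(s) for s in states]
print(finals)  # ['red', 'red', 'red', 'red', 'red', 'red']
```

Note how the single "off" frame, which an LED signal lamp's strobing can cause, is overruled by the stable history.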
According to the signal lamp detection method for a vehicle provided by this embodiment, look-around image data is first acquired based on the bird's eye view perception backbone network, and a front view feature map is extracted from the look-around image data. Signal lamp position and size information is then obtained from the front view feature map in combination with the trained signal lamp detection model, and the front view feature map is cropped according to this information to obtain a signal lamp feature map. Signal lamp category and confidence information is obtained from the signal lamp feature map in combination with the trained signal lamp classification model and taken as the current frame signal lamp detection result. Finally, historical signal lamp detection results are acquired and compared with the current frame signal lamp detection result; if they are consistent, the current frame signal lamp detection result is determined as the final signal lamp detection result. By fusing signal lamp detection with the bird's eye view perception network, the method simplifies the overall perception framework, reduces system complexity, shortens inference time, and improves data utilization efficiency. By using the historical signal lamp detection results to assist in judging the current frame signal lamp detection result, the method overcomes the signal lamp flicker problem, stabilizes the detection result, and improves detection accuracy.
Example Three
Fig. 4 is a schematic structural diagram of a signal lamp detection device for a vehicle according to a third embodiment of the present invention. As shown in fig. 4, the apparatus includes: the device comprises a front view feature map extracting module 310, a current frame signal lamp detection result obtaining module 320 and a final signal lamp detection result determining module 330.
The front view feature map extracting module 310 is configured to acquire look-around image data based on the bird's eye view perception backbone network and extract a front view feature map from the look-around image data.
The current frame signal lamp detection result obtaining module 320 is configured to perform signal lamp detection on the front view feature map to obtain a current frame signal lamp detection result.
The final signal lamp detection result determining module 330 is configured to acquire historical signal lamp detection results and determine a final signal lamp detection result according to the current frame signal lamp detection result and the historical signal lamp detection results.
Optionally, the current frame signal lamp detection result obtaining module 320 is further configured to:
according to the front view feature map, obtaining signal lamp position and size information in combination with the trained signal lamp detection model; cropping the front view feature map according to the signal lamp position and size information to obtain a signal lamp feature map; and, according to the signal lamp feature map, obtaining signal lamp category and confidence information in combination with the trained signal lamp classification model, and taking the signal lamp category and confidence information as the current frame signal lamp detection result.
Optionally, the historical signal lamp detection results comprise the signal lamp detection results of a set number of consecutive frames preceding the current frame signal lamp detection result, and the final signal lamp detection result determining module 330 is further configured to:
compare the current frame signal lamp detection result with the historical signal lamp detection results and, if they are consistent, determine the current frame signal lamp detection result as the final signal lamp detection result.
Optionally, the apparatus further includes a signal lamp detection request initiating module 340, configured to initiate a signal lamp detection request according to high-definition map data.
Optionally, the signal lamp detection request initiating module 340 is further configured to:
acquire high-definition map data, wherein the high-definition map data includes signal lamp position information; and initiate a signal lamp detection request when it is determined, according to the signal lamp position information in the high-definition map data and the current vehicle position information, that a signal lamp is present ahead of the vehicle.
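The map-based trigger can be sketched as follows. The 150 m range and the heading-projection test for "ahead of the vehicle" are assumptions for illustration; the patent specifies only that the request is initiated when the map and vehicle positions indicate a signal lamp ahead.

```python
import math

def should_request_detection(vehicle_pos, vehicle_heading, lamp_positions,
                             max_range=150.0):
    """Return True if any mapped signal lamp lies ahead of the vehicle
    within max_range meters. Positions are (x, y) in a common map frame;
    vehicle_heading is in radians."""
    vx, vy = vehicle_pos
    hx, hy = math.cos(vehicle_heading), math.sin(vehicle_heading)
    for lx, ly in lamp_positions:
        dx, dy = lx - vx, ly - vy
        dist = math.hypot(dx, dy)
        # A positive projection onto the heading vector means "ahead".
        ahead = (dx * hx + dy * hy) > 0
        if ahead and dist <= max_range:
            return True
    return False

# Vehicle at the origin heading east; one lamp 100 m ahead, one behind.
print(should_request_detection((0, 0), 0.0, [(100, 0), (-50, 0)]))  # True
```

Gating the detector this way avoids running signal lamp detection on frames where the map guarantees no signal lamp is visible.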
Optionally, the apparatus further includes a bird's eye view perception module 350, configured to perform bird's eye view perception according to the look-around image data.
Optionally, the bird's eye view sensing module 350 is further configured to:
extract look-around image features from the look-around image data; perform coordinate system conversion on the look-around image features to obtain visual bird's eye view features; and fuse the visual bird's eye view features with the laser point cloud features extracted from the laser point cloud data and the radar point cloud features extracted from the radar point cloud data, taking the obtained fused features as the bird's eye view perception result.
The signal lamp detection device of the vehicle provided by the embodiment of the invention can execute the signal lamp detection method of the vehicle provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example Four
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic equipment may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11 and a memory communicatively connected to the at least one processor 11, such as a read-only memory (ROM) 12 and a random access memory (RAM) 13. The memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the ROM 12 or loaded from the storage unit 18 into the RAM 13. The RAM 13 may also store various programs and data required for the operation of the electronic device 10. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14, to which an input/output (I/O) interface 15 is also connected.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the respective methods and processes described above, such as a signal light detection method of a vehicle.
In some embodiments, the signal lamp detection method of the vehicle may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the signal lamp detection method of the vehicle described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the signal lamp detection method of the vehicle in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may be implemented in one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose and may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network; the client-server relationship arises from computer programs running on the respective computers. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of high management difficulty and weak service scalability in traditional physical hosts and VPS services.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.
Claims (10)
1. A signal lamp detection method of a vehicle, characterized by comprising:
acquiring looking-around image data based on a bird's eye view perception backbone network, and extracting a front-view feature map in the looking-around image data;
performing signal lamp detection on the front view feature map to obtain a current frame signal lamp detection result;
and acquiring historical signal lamp detection results, and determining a final signal lamp detection result according to the current frame signal lamp detection result and the historical signal lamp detection results.
2. The method of claim 1, wherein performing signal lamp detection on the front view feature map to obtain a current frame signal lamp detection result comprises:
according to the front view feature map, obtaining signal lamp position and size information in combination with a trained signal lamp detection model;
cropping the front view feature map according to the signal lamp position and size information to obtain a signal lamp feature map;
and according to the signal lamp feature map, combining the trained signal lamp classification model to obtain signal lamp category and confidence information, and taking the signal lamp category and the confidence information as the signal lamp detection result of the current frame.
3. The method of claim 1, wherein the historical signal lamp detection results comprise the signal lamp detection results of a set number of consecutive frames preceding the current frame signal lamp detection result, and wherein determining a final signal lamp detection result according to the current frame signal lamp detection result and the historical signal lamp detection results comprises:
comparing the current frame signal lamp detection result with the historical signal lamp detection results, and if the current frame signal lamp detection result is consistent with the historical signal lamp detection results, determining the current frame signal lamp detection result as the final signal lamp detection result.
4. The method of claim 1, wherein before acquiring the look-around image data based on the bird's eye view perception backbone network and extracting the front view feature map from the look-around image data, the method further comprises:
and initiating a signal lamp detection request according to the high-definition map data.
5. The method of claim 4, wherein initiating a signal light detection request based on high definition map data comprises:
acquiring the high-definition map data, wherein the high-definition map data comprises signal lamp position information;
and initiating a signal lamp detection request when it is determined, according to the signal lamp position information in the high-definition map data and the current vehicle position information, that a signal lamp is present ahead of the vehicle.
6. The method of claim 1, further comprising, after acquiring the look-around image data based on the bird's eye view perception backbone network:
performing bird's eye view perception according to the look-around image data.
7. The method of claim 6, wherein performing bird's eye view perception from the look-around image data comprises:
extracting look-around image features from the look-around image data;
performing coordinate system conversion on the look-around image features to obtain visual bird's eye view features;
and fusing the visual bird's eye view features with the laser point cloud features extracted from the laser point cloud data and the radar point cloud features extracted from the radar point cloud data, and taking the obtained fused features as the bird's eye view perception result.
8. A signal lamp detection device for a vehicle, comprising:
a front view feature map extraction module, configured to acquire look-around image data based on a bird's eye view perception backbone network and extract a front view feature map from the look-around image data;
a current frame signal lamp detection result obtaining module, configured to perform signal lamp detection on the front view feature map to obtain a current frame signal lamp detection result;
and a final signal lamp detection result determining module, configured to acquire historical signal lamp detection results and determine a final signal lamp detection result according to the current frame signal lamp detection result and the historical signal lamp detection results.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the signal lamp detection method of a vehicle according to any one of claims 1-7.
10. A computer-readable storage medium storing computer instructions for causing a processor to perform the signal lamp detection method of a vehicle according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311502757.6A CN117456509A (en) | 2023-11-10 | 2023-11-10 | Signal lamp detection method, device and equipment for vehicle and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117456509A true CN117456509A (en) | 2024-01-26 |
Family
ID=89594633
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311502757.6A Pending CN117456509A (en) | 2023-11-10 | 2023-11-10 | Signal lamp detection method, device and equipment for vehicle and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117456509A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||