CN116958639A - Method for detecting and enhancing false behavior of traffic light recognition model - Google Patents
Method for detecting and enhancing false behavior of traffic light recognition model
- Publication number
- CN116958639A (application number CN202310537149.2A)
- Authority
- CN
- China
- Prior art keywords
- traffic light
- image
- traffic
- model
- original
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/765—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/60—Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
Abstract
The invention belongs to the technical field of software engineering, and specifically relates to a method for detecting erroneous behavior of traffic light recognition models, and for enhancing those models, based on road-scene image augmentation. Starting from real road-scene images, two classes of metamorphic relations and three classes of image transformations are designed from a systematic understanding of weather conditions, camera properties, and traffic light properties, and are used to generate augmented real road-scene images. The augmented images are then used, via the two classes of metamorphic relations, to detect erroneous behavior of a traffic light recognition model, and the model's performance is improved by retraining.
Description
Technical Field
The invention belongs to the technical field of software engineering, and specifically relates to a method for detecting erroneous behavior of traffic light recognition models and enhancing their performance.
Background
In recent years, autonomous driving technology has attracted increasing attention, and advances in sensor and computing technology have led to significant progress in the field. An automated driving system (ADS) is generally divided into two subsystems: a perception system and a decision system. The perception system uses data captured by vehicle sensors (e.g., cameras, lidar, radar, and GPS) to determine the state of the vehicle itself and of the surrounding environment, while the decision system navigates the vehicle from an initial location to a final destination specified by the user.
For an ADS, ensuring accuracy and robustness is critical. Because real-world driving environments are complex and diverse, an ADS is easily affected by real extreme cases and may then exhibit erroneous behavior. Such erroneous behavior can lead to catastrophic consequences and irrecoverable losses. Many companies therefore perform as many road tests as possible to ensure the quality and safety of the ADS. However, replicating extreme situations for further testing in a real-world environment is difficult and expensive, so enterprises widely test with simulation-based methods.
To ensure the accuracy and robustness of ADSs, the software engineering community has developed a number of testing methods. One main line of work applies search-based approaches to detect safety violations of an ADS: the test scenario is formulated as a set of variables (e.g., vehicle speed and fog concentration), and an algorithm searches for scenarios that violate safety requirements. Another major line of work tests the DNN (deep neural network) based modules in an ADS, for example by using transformation-based approaches to generate images and/or point clouds of real driving scenes, which are then used to test the corresponding DNN module and expose its erroneous behavior.
Despite this progress in testing erroneous behavior in the various modules of an ADS, little attention has been paid to testing traffic light detection models. Traffic lights control the movement of vehicles and thus play an important role in ensuring traffic safety. An ADS typically employs a traffic light detection model to detect the positions of one or more traffic lights in a driving scene and identify their states (e.g., red, green, and yellow). When an ADS cannot correctly detect and identify traffic lights, serious traffic accidents may result. It is therefore important to specifically test the traffic light detection model in an ADS.
Testing traffic light detection relies largely on labeled traffic light data (i.e., traffic light images), which are typically collected manually by capturing images with cameras during road tests. However, collecting and annotating road-scene data under different driving environments requires significant resources, and it is difficult to cover all real scenes.
Disclosure of Invention
The invention aims to provide a method for detecting erroneous behavior of traffic light recognition models, and for enhancing those models, based on road-scene image augmentation, so as to improve the performance of a traffic light recognition model.
The method automatically generates labeled traffic light images in batches and tests the traffic light detection model in an ADS. Starting from real road-scene images, two classes of metamorphic relations and three classes of image transformations are designed from a systematic understanding of weather conditions, camera properties, and traffic light properties, and are used to generate augmented real road-scene images. The augmented images are then used, via the two classes of metamorphic relations, to detect erroneous behavior of the traffic light recognition model, and the model's performance is improved by retraining.
The method comprises a data augmentation method based on image transformation, a model error-behavior detection method based on metamorphic relations, and a model performance enhancement method based on retraining, with the following steps:
(1) Image transformation-based data augmentation
Based on real-world labeled traffic light images, and through a study of weather conditions, camera properties, and traffic light properties, the factors influencing traffic light images captured by a camera in a real environment are analyzed, and twelve transformations are designed and implemented to generate, in a short time, synthetic data close to real-world data. These transformations are Rain (RN), Snow (SW), Fog (FG), lens halation (LF), Overexposure (OE), Underexposure (UE), Motion Blur (MB), changing traffic light color (CC), moving traffic light position (MP), adding traffic lights (AD), rotating traffic lights (RT), and scaling traffic lights (SC). They fall into three categories: weather transformations (simulating the effects of different weather conditions), camera transformations (simulating different camera effects), and traffic light transformations (enriching the positions and states of traffic lights). The method specifically comprises the following steps:
(1.1) Design three types of image transformations, namely weather transformations, camera transformations, and traffic light transformations, based on an understanding of how traffic light images in a real road scene are formed under different weather conditions, camera properties, and traffic light properties.
(1.2) Implement the weather transformations, which simulate different weather conditions: Rain (RN), Snow (SW), Fog (FG), and lens halation (LF). RN, SW, and FG are implemented with the Python library imgaug: a perturbation layer is generated and superimposed on each pixel of the image to simulate coverage by raindrops, snowflakes, or dense fog. LF is implemented with Adobe Photoshop to simulate the glare effect captured by a camera: a lens-halation layer provided by Photoshop is composited onto the real-world traffic light image, and the brightness and contrast of the image are adjusted automatically.
(1.3) Implement the camera transformations, which simulate different shooting effects: Overexposure (OE), Underexposure (UE), and Motion Blur (MB). As with the weather transformations, OE, UE, and MB are implemented with the Python library imgaug. For OE and UE, image brightness is adjusted linearly in the HSL color space, simulating overexposure and underexposure respectively. For MB, the degree of blurring is controlled by a convolution kernel of size k = 15 (i.e., a 15×15-pixel kernel).
(1.4) Implement the traffic light transformations, which enrich the diversity of traffic light positions and states in the image: changing traffic light color (CC), moving traffic light position (MP), adding traffic lights (AD), rotating traffic lights (RT), and scaling traffic lights (SC). These are realized as follows.
Changing traffic light color (CC): First, randomly select some of the traffic lights in the image, and locate and extract them using the bounding box position information in their labels. Fill the blank region at the original traffic light position from the neighboring pixels. Next, change the hue of the extracted traffic light in HSV color space, turning red to green and green to red, and update the state in the corresponding label. Then flip the extracted traffic light so that the red bulb is at the top or left side. Finally, paste the transformed traffic light back into the original region using an image fusion algorithm (e.g., Poisson blending).
Moving traffic light position (MP): First, randomly select some of the traffic lights in the image, and locate and extract them using the bounding box position information in their labels. Fill the blank region at the original traffic light position from the neighboring pixels. Next, shift the extracted traffic light by a fixed offset (set here to the width of the original bounding box) and fuse it into its new position. Finally, modify the corresponding bounding box coordinates in the label file.
Adding traffic lights (AD): First, randomly select some of the traffic lights in the image, and locate and extract them using the bounding box position information in their labels. Then shift the extracted traffic light by a fixed offset (set here to the width of the original bounding box) and fuse it into its new position. Finally, add the bounding box coordinates and state information of the newly added traffic light to the label file.
Rotating traffic lights (RT): First, randomly select some of the traffic lights in the image, and locate and extract them using the bounding box position information in their labels. Fill the blank region at the original traffic light position from the neighboring pixels. Next, rotate the extracted traffic light by 90 degrees clockwise or counterclockwise around its center point, such that after rotation the red bulb is at the top or left side. Finally, paste the transformed traffic light at the center of the original region and modify the bounding box coordinates in the label accordingly.
Scaling traffic lights (SC): First, import each image into Adobe Photoshop and expand the canvas in proportion to its original size (e.g., for a 16:9 image, increase the width by 320 pixels and the height by 180 pixels). Fill the extension area with a single color, then repair it using Photoshop's image-inpainting tools. Finally, scale the new image back to the original size and modify the bounding box coordinate values in the image label file accordingly.
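The hue swap at the heart of the CC transformation can be sketched for a single pixel as follows. This simplified, pure-Python version is illustrative only (the function name and the hue thresholds are assumptions, not the method's exact implementation): it rotates a red hue to green and a green hue back to red.

```python
import colorsys

def swap_red_green(rgb):
    """Swap a red hue to green (and vice versa) by rotating the hue in
    HSV space, as in the CC transform; rgb values are in 0..255 and the
    function operates on one pixel."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Red sits near h = 0 (or 1), green near h = 1/3; rotate between them.
    if h < 1 / 6 or h > 5 / 6:
        h = (h + 1 / 3) % 1.0          # red-ish -> green-ish
    else:
        h = (h - 1 / 3) % 1.0          # green-ish -> red-ish
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))
```

A full implementation would apply this to the masked bulb region and then re-composite the light with Poisson blending (e.g., OpenCV's `seamlessClone`).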
(2) Model error behavior detection based on metamorphic relation
The object under test is a recognition model trained with original real-world traffic light images as the training data set. A test data set is prepared and augmented using the method of step (1), and the augmented images serve as test cases. Each test case is fed into the model, and the recognition output is checked against the metamorphic relations, defined in step (1), between the original real-world image and the transformed image, to detect erroneous behavior of the traffic light recognition model. The step specifically comprises:
(2.1) Train a recognition model (for example, a YOLOv5 model) on an original data set of manually collected real-world traffic light images, yielding the original traffic light recognition model (hereinafter, the original model).
(2.2) Prepare a test data set of traffic light images (for example, a portion of the LISA data set), augment it using the method of step (1) to generate new images, and use the newly generated images as test cases.
(2.3) Feed the test cases into the original model and record its recognition results, which contain the traffic-light-related label information (for example, bounding box information and state labels such as stop and go in the LISA data set).
(2.4) Compare the output recognition results with the expected outputs of the test cases based on the metamorphic relations defined in step (1).
(2.5) Analyze the comparison results, e.g., by computing the mAP (mean Average Precision), to determine the accuracy and robustness of the recognition model; record the differences between actual and expected outputs to detect the erroneous behavior of the original model.
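Step (2.4) can be sketched as follows. For weather and camera transformations, the metamorphic relation is that the detections should be unchanged; the helper below flags a violation when the transformed image yields a different set of lights or states. The function name, the detection tuple format, and the IoU threshold are illustrative assumptions, not the patent's exact formulation.

```python
def violates_invariance_mr(orig_dets, trans_dets, iou_thresh=0.5):
    """Metamorphic check for weather/camera transforms: the model should
    report the same lights with the same states on the transformed image;
    any mismatch counts as erroneous behavior. Detections are
    (x1, y1, x2, y2, state) tuples."""
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        union = area(a) + area(b) - inter
        return inter / union if union else 0.0
    if len(orig_dets) != len(trans_dets):
        return True                    # a light appeared or disappeared
    for o in orig_dets:
        # each original detection must match a transformed one in both
        # position (IoU) and state
        if not any(iou(o, t) >= iou_thresh and o[4] == t[4]
                   for t in trans_dets):
            return True
    return False
```

For the traffic light transformations (CC, MP, AD, RT, SC), the expected output would instead be derived by applying the same label edit the transformation performed.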
(3) Model performance enhancement based on retraining
The original training data set is augmented by the method of step (1), and the augmented data set is used to retrain the traffic light recognition model, with the aims of strengthening the model's recognition capability, enlarging its recognition range, and recognizing more traffic lights in complex scenes. The step specifically comprises:
(3.1) Augment the training data set of the original model using the method of step (1) to obtain an augmented data set.
(3.2) Randomly sample 20% of the images in the augmented data set and add them to the original training data set to form a new enhanced data set.
(3.3) Retrain the recognition model with the enhanced data set to obtain a traffic light recognition model with stronger recognition capability, a larger recognition range, and the ability to handle more complex scenes (hereinafter, the enhanced model).
(3.4) Test the enhanced model using the method of step (2) to determine whether its accuracy and robustness are improved.
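Steps (3.1) and (3.2) amount to sampling 20% of the augmented images into the original training set. A minimal sketch, with assumed function and variable names:

```python
import random

def build_enhanced_set(original, augmented, ratio=0.2, seed=0):
    """Mix randomly sampled augmented images (20% by default, as in
    step (3.2)) into the original training set; items are
    (image_path, label_path) pairs. A fixed seed keeps the sampling
    reproducible across retraining runs."""
    rng = random.Random(seed)
    k = int(len(augmented) * ratio)
    return original + rng.sample(augmented, k)
```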
A schematic of the basic process of the invention is shown in FIG. 1. The method designs two classes of metamorphic relations and three classes of image transformations to simulate how weather conditions, camera properties, and traffic light properties affect the traffic light images captured in a real driving environment. It can generate augmented traffic light images of complex scenes efficiently and at low cost; the generated images are used to detect erroneous model behavior and to improve the performance of the traffic light detection model in an ADS.
Drawings
Fig. 1 is a schematic diagram of the basic process of the present invention.
FIG. 2 shows examples of the image augmentation effects of the present invention on the LISA data set.
Detailed Description
The invention is further described below using the open-source LISA road-scene traffic light data set and the YOLOv5 model; the main process is as follows:
(1) Image augmentation
The LISA open-source road-scene traffic light data set contains 36,256 images, divided into training, validation, and test sets at a ratio of 4:1:1. First, raindrop noise is added to each original image with the Rain augmenter of the Python library imgaug to realize the RN transformation, with the drop_size parameter set to (0.1, 0.2) to control raindrop size and the speed parameter set to (0.2, 0.3) to control raindrop density. Snow and fog effects are added to each original image with the imgaug Snowflakes and Fog augmenters, respectively, to realize the SW and FG transformations, with the severity parameter set to 2 to control the density of the snow or fog. LF is realized with Adobe Photoshop by compositing a lens-halation layer onto the original image and automatically adjusting its brightness and contrast. Then, the brightness of each original image is adjusted with the imgcorruptlike.Brightness method of imgaug, with the severity parameter set to 4 and to 1 to control the degree of exposure, simulating overexposure and underexposure respectively and realizing the OE and UE transformations; a motion-blur effect is added to each original image with the imgaug MotionBlur augmenter to realize the MB transformation, with the kernel-size parameter k set to 15 to control the degree of blur.
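For readers without imgaug at hand, the effect of the rain augmentation can be approximated with plain NumPy as below. This is only a stand-in for the library's Rain augmenter; the streak parameters are loose, assumed analogues of its drop_size and speed parameters.

```python
import numpy as np

def add_rain_noise(img, density=0.05, streak_len=6, value=200, seed=0):
    """Overlay short bright vertical streaks as a stand-in for raindrop
    noise; `density` controls how many streaks are drawn and
    `streak_len` their length. Illustrative only, not imgaug's
    implementation."""
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    out = img.copy()
    n = int(h * w * density)                  # number of streak start points
    ys = rng.integers(0, h, size=n)
    xs = rng.integers(0, w, size=n)
    for y, x in zip(ys, xs):
        out[y:y + streak_len, x] = value      # one vertical streak
    return out
```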
Finally, for each original image: randomly select one traffic light, locate and extract it using the bounding box position information in the label file, and paste it into the image by Poisson fusion at an offset equal to the bounding box width, realizing the AD transformation. Randomly select one traffic light, extract it, fill the original position from the neighboring pixels with an image-inpainting algorithm, and paste the light into the image by Poisson fusion at an offset equal to the bounding box width, realizing the MP transformation. Randomly select one traffic light, extract it, fill the original position from the neighboring pixels with an image-inpainting algorithm, change the hue of the extracted light in HSV color space (red to green, green to red), update the state in the corresponding label, flip the extracted light so that the red bulb is at the top or left side, and paste the transformed light back into the original region by Poisson fusion, realizing the CC transformation. Randomly select one traffic light, extract it, fill the original position from the neighboring pixels with an image-inpainting algorithm, rotate the extracted light 90 degrees clockwise or counterclockwise around its center so that the red bulb ends up at the top or left side, paste the transformed light at the center of the original region, and modify the bounding box coordinates in the label accordingly, realizing the RT transformation. Import each original LISA image into Adobe Photoshop, increase its width by 320 pixels and its height by 180 pixels, fill the extension area with a single color, repair the extension area with image-inpainting tools, scale the new image back to the original size, and modify the bounding box coordinate values in the label file accordingly, realizing the SC transformation. At this point all 12 augmentation modes (their effects are shown in FIG. 2) have been applied to the LISA images, each generating 36,256 images, again divided into training, validation, and test sets at a ratio of 4:1:1.
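The bounding-box update required by the SC transformation follows from the canvas geometry: a box in the original W×H image is offset by the padding and then scaled by W/(W+320) and H/(H+180) when the enlarged canvas is resized back. A sketch, assuming the canvas expansion is centred (the text does not state where the padding is placed):

```python
def rescale_bbox_for_sc(bbox, w, h, dw=320, dh=180):
    """Map an original bounding box through the SC transform: canvas
    grown by (dw, dh), then the whole image scaled back to (w, h).
    Centred padding is an assumption for illustration."""
    sx, sy = w / (w + dw), h / (h + dh)     # shrink factors after resize
    x1, y1, x2, y2 = bbox
    return (round((x1 + dw / 2) * sx), round((y1 + dh / 2) * sy),
            round((x2 + dw / 2) * sx), round((y2 + dh / 2) * sy))
```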
(2) Metamorphic relationship based testing
First, a YOLOv5 recognition model is trained on the training-set portion of the original (pre-augmentation) LISA data set from step (1) to obtain the original model. The test set generated by the augmentation in step (1) is then used as the set of test cases. Each test case is fed into the original model, and the recognition result, containing the traffic light bounding box coordinates and state labels such as stop and go, is recorded. The output recognition results are compared with the expected outputs of the test cases based on the metamorphic relations defined in step (1). Finally, the comparison results are analyzed: the model's mAP (mean Average Precision) is computed to determine the accuracy and robustness of the recognition model, and the differences between actual and expected outputs are recorded to detect the erroneous behavior of the original model.
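The mAP analysis reduces, per class, to an average-precision computation over confidence-ranked detections. A minimal sketch using the basic precision-recall summation (not the exact VOC/COCO interpolation that a YOLOv5 evaluation would use):

```python
def average_precision(matches, num_gt):
    """Compute AP for one class from detections sorted by descending
    confidence; `matches` is a list of booleans (True = the detection
    matched a ground-truth box). Accumulates precision weighted by the
    recall increment at each detection."""
    tp = fp = 0
    ap, prev_recall = 0.0, 0.0
    for m in matches:
        tp, fp = tp + m, fp + (not m)
        recall = tp / num_gt
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

mAP is then the mean of this value over all classes (e.g., stop and go in the LISA labels).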
(3) Model retraining
First, the training set generated by the augmentation in step (1) is used as the augmented training data set; 20% of it is randomly sampled and added to the original training data set to obtain the enhanced data set. The YOLOv5 recognition model is then retrained with the enhanced data set to obtain the enhanced model. Finally, the enhanced model is tested with the method of step (2) to determine whether its accuracy and robustness are improved.
Claims (4)
1. The traffic light recognition model error behavior detection and enhancement method based on road scene image augmentation is characterized by comprising the following specific steps:
(1) Image transformation-based data augmentation
Based on real-world traffic road-scene images and the corresponding traffic light labels, and through a study of weather conditions, camera properties, and traffic light properties, the influence of these factors on traffic light images captured by a camera in a real environment is analyzed, and 12 transformations are designed and implemented to generate, in a short time, synthetic data close to real-world data; the transformations fall into three categories: weather transformations, simulating the effects of different weather conditions; camera transformations, simulating different camera effects; and traffic light transformations, enriching the positions and states of traffic lights; the 12 transformations are specifically Rain (RN), Snow (SW), Fog (FG), lens halation (LF), Overexposure (OE), Underexposure (UE), Motion Blur (MB), changing traffic light color (CC), moving traffic light position (MP), adding traffic lights (AD), rotating traffic lights (RT), and scaling traffic lights (SC);
(2) Model error behavior detection based on metamorphic relation
The detected object is a traffic light recognition model trained with original real-world traffic light images as the training data set; a test data set is prepared and augmented using the method of step (1), and the augmented images serve as test cases; the test cases are input into the model for recognition, the traffic light recognition results are output, and the erroneous behavior of the traffic light recognition model is detected according to the metamorphic relations, defined in step (1), between the original real-world images and the transformed images;
(3) Model performance enhancement based on retraining
The original training data set is augmented by the method of step (1) to obtain an augmented data set, which is used to retrain the traffic light recognition model, with the aims of strengthening the model's recognition capability, enlarging its recognition range, and recognizing more traffic lights in complex scenes.
2. The method for detecting and enhancing false behavior of traffic light recognition model based on road scene image augmentation as claimed in claim 1, wherein said image transformation-based data augmentation of step (1) specifically comprises the following sub-steps:
(1.1) designing three types of image transformation, namely weather transformation, camera transformation and traffic light transformation according to the understanding of how traffic light images in a real road scene are imaged under different weather environments, camera properties and traffic light properties;
(1.2) weather transformations simulating different weather conditions, namely Rain (RN), Snow (SW), Fog (FG), and lens halation (LF); RN, SW, and FG are implemented with the Python library imgaug: a perturbation layer is generated and superimposed on each pixel of the image to simulate coverage by raindrops, snowflakes, or dense fog; LF is implemented with Adobe Photoshop to simulate the glare effect captured by a camera, specifically by compositing a lens-halation layer provided by Adobe Photoshop onto the real-world traffic light image and automatically adjusting the brightness and contrast of the image;
(1.3) camera transformations simulating different shooting effects, namely Overexposure (OE), Underexposure (UE), and Motion Blur (MB); as with the weather transformations, OE, UE, and MB are implemented with the Python library imgaug; for OE and UE, image brightness is adjusted linearly in the HSL color space, simulating overexposure and underexposure respectively; for MB, the degree of blurring is controlled by a convolution kernel of size k = 15, i.e., a 15×15-pixel kernel;
(1.4) traffic light transformation to enrich the diversity of traffic light positions and states in the image, including changing traffic light color (CC), moving traffic light position (MP), adding traffic light (AD), rotating traffic light (RT), and scaling traffic light (SC); the method is realized by the following steps:
changing traffic light color (CC): first, randomly select some traffic lights in the image and locate and extract them using the bounding-box coordinates in their labels; next, fill the blank region left at the original position using neighboring pixels; then, change the hue of the extracted traffic light in HSV color space, turning red to green and green to red, and update the state in the corresponding label accordingly; then, flip the extracted traffic light so that the red bulb is at the top or on the left side; finally, paste the transformed traffic light back onto the original region using an image fusion algorithm;
moving traffic light position (MP): first, randomly select some traffic lights in the image and locate and extract them using the bounding-box coordinates in their labels; next, fill the blank region left at the original position using neighboring pixels; then, shift the extracted traffic lights by a given offset and fuse them at the new position; finally, update the corresponding bounding-box coordinates in the label file;
adding a traffic light (AD): first, randomly select some traffic lights in the image and locate and extract them using the bounding-box coordinates in their labels; then, shift the extracted traffic lights by a given offset and fuse them at the new position; finally, add the bounding-box coordinates and state information of the newly added traffic lights to the label file;
rotating a traffic light (RT): first, randomly select some traffic lights in the image and locate and extract them using the bounding-box coordinates in their labels; next, fill the blank region left at the original position using neighboring pixels; then, rotate the extracted traffic light 90 degrees clockwise or counterclockwise about its center point so that the red bulb ends up at the top or on the left side after rotation; finally, paste the transformed traffic light at the center of the original region and update the bounding-box coordinates in the label accordingly;
scaling a traffic light (SC): first, import each image into Adobe Photoshop and expand the canvas relative to the image's original size; then, fill the extended area with a single color; next, repair the filled area using Photoshop's image-inpainting tools; finally, scale the new image back to the original size and update the bounding-box coordinate values in the image's label file accordingly.
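The hue change at the heart of the CC transform can be sketched with the standard-library `colorsys` module: rotate red hues (~0°) to green (~120°) and vice versa, pixel by pixel. This is a slow, dependency-free illustration with hypothetical names, not the patent's actual implementation:

```python
import colorsys
import numpy as np

def swap_red_green_hue(patch):
    """Rotate the hue of an RGB patch so red-ish pixels become green and
    green-ish pixels become red, as in the CC transform.  Per-pixel colorsys
    sketch; a real pipeline would vectorize this in HSV space."""
    out = patch.astype(np.float32) / 255.0
    h_, w_, _ = out.shape
    for y in range(h_):
        for x in range(w_):
            h, s, v = colorsys.rgb_to_hsv(*out[y, x])
            deg = h * 360.0
            if deg < 60 or deg >= 300:        # red-ish -> shift to green
                h = (h + 120 / 360.0) % 1.0
            elif 60 <= deg < 180:             # green-ish -> shift to red
                h = (h - 120 / 360.0) % 1.0
            out[y, x] = colorsys.hsv_to_rgb(h, s, v)
    return (out * 255).round().astype(np.uint8)

red_patch = np.zeros((2, 2, 3), dtype=np.uint8)
red_patch[..., 0] = 255                       # pure red bulb
print(swap_red_green_hue(red_patch)[0, 0])    # -> [  0 255   0]
```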
3. The method for detecting and enhancing false behavior of a traffic light recognition model based on road scene image augmentation as claimed in claim 1, wherein the metamorphic-relation-based detection of model false behavior in step (2) comprises the following sub-steps:
(2.1) training a traffic light recognition model on an original data set of manually collected real-world traffic light images to obtain an original traffic light recognition model, hereinafter referred to as the original model;
(2.2) preparing a test data set of traffic light images, augmenting it with the method of step (1) to generate new images, and using the newly generated images as test cases;
(2.3) feeding the test cases into the original model for recognition and outputting the recognition results, which contain the traffic-light-related label information;
(2.4) comparing the output recognition results with the expected outputs of the test cases derived from the metamorphic relations defined in step (1);
(2.5) analyzing the comparison results to assess the accuracy and robustness of the recognition model; differences between the actual and expected outputs are recorded as false behaviors of the original model.
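The test loop in sub-steps (2.3)-(2.5) follows the standard metamorphic-testing pattern: run the model on the original and the transformed input, and flag any pair whose outputs violate the expected relation. A minimal sketch with illustrative names (the toy "model" and transform below are not from the patent):

```python
def metamorphic_test(model, images, transform, expected_label_map):
    """Check one metamorphic relation: after `transform` (e.g. the CC color
    swap), the model's label must change according to `expected_label_map`;
    any other outcome is recorded as a false behavior."""
    failures = []
    for idx, img in enumerate(images):
        original = model(img)
        follow_up = model(transform(img))
        if follow_up != expected_label_map[original]:
            failures.append((idx, original, follow_up))
    return failures

# toy demo: a "model" reading the dominant channel, a transform swapping channels
model = lambda img: "red" if img[0] > img[1] else "green"
transform = lambda img: (img[1], img[0])
images = [(255, 0), (0, 255), (200, 0)]
expected = {"red": "green", "green": "red"}
print(metamorphic_test(model, images, transform, expected))  # -> []
```

A model that ignores the transform would populate the failure list, which is exactly the "false behavior" record of sub-step (2.5).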
4. The method for detecting and enhancing false behavior of a traffic light recognition model based on road scene image augmentation as claimed in claim 1, wherein the retraining-based model performance enhancement of step (3) comprises the following sub-steps:
(3.1) augmenting the training data set of the original model with the method of step (1) to obtain an augmented data set;
(3.2) randomly sampling 20% of the images from the augmented data set and adding them to the original training data set to form a new, enhanced data set;
(3.3) retraining the recognition model on the enhanced data set to obtain a traffic light recognition model with stronger recognition capability, a wider recognition range, and support for more complex scenes, hereinafter referred to as the enhanced model;
(3.4) testing the enhanced model with the method of step (2) to determine whether its accuracy and robustness have improved.
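Sub-step (3.2) amounts to sampling a fixed fraction of the augmented set without replacement and concatenating it with the original training set. A sketch with illustrative file names:

```python
import random

def build_enhanced_dataset(original, augmented, ratio=0.2, seed=42):
    """Randomly draw `ratio` of the augmented images (without replacement)
    and append them to the original training set, as in step (3.2).
    Names and the seed are illustrative, not from the patent."""
    rng = random.Random(seed)
    k = round(len(augmented) * ratio)
    return original + rng.sample(augmented, k)

orig = [f"orig_{i}.png" for i in range(100)]
aug = [f"aug_{i}.png" for i in range(100)]
enhanced = build_enhanced_dataset(orig, aug)
print(len(enhanced))  # -> 120
```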
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310537149.2A CN116958639A (en) | 2023-05-13 | 2023-05-13 | Method for detecting and enhancing false behavior of traffic light recognition model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116958639A true CN116958639A (en) | 2023-10-27 |
Family
ID=88448167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310537149.2A Pending CN116958639A (en) | 2023-05-13 | 2023-05-13 | Method for detecting and enhancing false behavior of traffic light recognition model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116958639A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118014997A (en) * | 2024-04-08 | 2024-05-10 | 湖南联智智能科技有限公司 | Pavement disease identification method based on improved YOLOv5 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US12056209B2 (en) | Method for image analysis | |
US11170254B2 (en) | Method for image analysis | |
US10019652B2 (en) | Generating a virtual world to assess real-world video analysis performance | |
CN107563372B (en) | License plate positioning method based on deep learning SSD frame | |
CN109816024B (en) | Real-time vehicle logo detection method based on multi-scale feature fusion and DCNN | |
Johnson-Roberson et al. | Driving in the matrix: Can virtual worlds replace human-generated annotations for real world tasks? | |
CN109086668B (en) | Unmanned aerial vehicle remote sensing image road information extraction method based on multi-scale generation countermeasure network | |
US11113864B2 (en) | Generative image synthesis for training deep learning machines | |
US20220189145A1 (en) | Unpaired image-to-image translation using a generative adversarial network (gan) | |
CN106683119A (en) | Moving vehicle detecting method based on aerially photographed video images | |
CN104504744A (en) | License plate image composition simulating method and device | |
CN111859674A (en) | Automatic driving test image scene construction method based on semantics | |
Ren et al. | Environment influences on uncertainty of object detection for automated driving systems | |
CN116503825A (en) | Semantic scene completion method based on fusion of image and point cloud in automatic driving scene | |
CN115235493A (en) | Method and device for automatic driving positioning based on vector map | |
CN116958639A (en) | Method for detecting and enhancing false behavior of traffic light recognition model | |
Bu et al. | Carla simulated data for rare road object detection | |
CN115424217A (en) | AI vision-based intelligent vehicle identification method and device and electronic equipment | |
CN111160282A (en) | Traffic light detection method based on binary Yolov3 network | |
CN116776288A (en) | Optimization method and device of intelligent driving perception model and storage medium | |
Du et al. | Validation of vehicle detection and distance measurement method using virtual vehicle approach | |
Wang et al. | Task-driven image preprocessing algorithm evaluation strategy | |
CN118097301B (en) | Surface defect detection method, surface defect detection model training method and device | |
CN114973161B (en) | On-line data enhancement method and system for vehicle real-time detection deep neural network input end | |
Sielemann et al. | Synset Signset Germany: A Synthetic Dataset for German Traffic Sign Recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||