CN110532903B - Traffic light image processing method and equipment

Traffic light image processing method and equipment

Info

Publication number
CN110532903B
CN110532903B
Authority
CN
China
Prior art keywords
traffic light
position area
processed
pixel
value
Prior art date
Legal status
Active
Application number
CN201910741204.3A
Other languages
Chinese (zh)
Other versions
CN110532903A
Inventor
庄明磊
Current Assignee
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN201910741204.3A
Publication of CN110532903A
Application granted
Publication of CN110532903B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/584 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a traffic light image processing method and equipment, which are used for solving the problems that the existing traffic light state identification algorithm has low detection precision and may paint the wrong traffic light position. A first position area and a color state of each traffic light in a first rectangular frame are determined through a neural network model. If the color state of a traffic light is red, the first position area detected by the neural network model is compared with a second position area determined from the positions of that traffic light in the previous N frames of traffic light images in which its color state was red. If the distance between the first position area and the second position area does not exceed a second preset threshold, the pixels to be processed are determined through the first position area; otherwise, the pixels to be processed are determined through the second position area. Because the position of the pixels to be processed is determined through multi-dimensional information, the accuracy of image processing is improved, and erroneous image processing caused by wrong position information output by a single neural network model is avoided.

Description

Traffic light image processing method and equipment
Technical Field
The invention relates to the technical field of image processing, in particular to a method and equipment for processing traffic light images.
Background
At present, electronic police cameras in traffic management are widely applied to capturing vehicle violations at intersections, and provide the evidentiary basis for traffic violation penalties. This mainly involves recording violating vehicle behavior together with traffic light color state information to generate a violation evidence image, so that penalties for vehicle violations are well founded and penalty disputes are avoided.
Under different illumination intensities, the colors of the traffic lights imaged by an electronic police camera are not necessarily the three intuitive colors of red, green and yellow. For example, in scenes with insufficient or overly strong ambient light, the center of a red light may appear yellowish or whitish, and the centers of green and yellow lights may appear whitish. This easily leads to a violation evidence image in which the red light is not red, the green light is not green and the yellow light is not yellow, so that traffic violation penalties lack persuasiveness. Therefore, strengthening the colors of the traffic lights is an indispensable function of an electronic police camera.
In the prior art, the actual color state and position of a traffic light can be detected through a traffic light state identification algorithm, and the position of a detected traffic light in the red-light state is forcibly painted red.
In summary, the existing traffic light state identification algorithm has low detection accuracy, and the position of a traffic light may be mistakenly painted.
Disclosure of Invention
The invention provides a traffic light image processing method and equipment, which are used for solving the problems that the existing traffic light state identification algorithm has low detection precision and the color of a traffic light may be mistakenly painted.
In a first aspect, an embodiment of the present invention provides a method for processing a traffic light image, including:
for any first rectangular frame containing traffic lights in the current frame of the traffic light image, determining, through a neural network model, a first position area of each traffic light contained in the first rectangular frame and the color state of the traffic light at the time the traffic light image was acquired;
determining a second position area for the traffic light whose color state is red, based on the position areas output by the neural network model for that traffic light in the previous N frames of traffic light images in which its color state was red;
comparing the first position area with the second position area of the same traffic light;
if the distance between the first position area and the second position area of the traffic light does not exceed a second preset threshold value, determining that the first position area is a target position area containing pixels to be processed in the current traffic light image;
and if the distance between the first position area and the second position area of the traffic light exceeds a second preset threshold, determining that the second position area is a target position area containing pixels to be processed in the current traffic light image.
The method roughly screens out the positions of the traffic lights in the current frame of the traffic light image through second rectangular frames, and determines the first rectangular frames corresponding to the current frame according to the second rectangular frames of the traffic lights. The accurate position (namely the first position area) and the corresponding color state of each traffic light in a first rectangular frame are then determined through a neural network model, where the color state is not the color displayed by the traffic light in the current frame but the actual color of the traffic light when the frame was shot. If a traffic light is determined to be red, the first position area detected by the neural network model is compared with the second position area derived from the previous N frames of traffic light images in which that traffic light's color state was red. If the distance between the first position area and the second position area is too large, namely exceeds the second preset threshold, the pixels to be processed are determined through the second position area; if it does not exceed the threshold, the pixels to be processed are determined through the first position area. This improves the precision of image processing and avoids erroneous image processing caused by wrong position information output by the neural network model.
In an optional embodiment, the first rectangular frame includes at least one second rectangular frame, the distance between any two second rectangular frames included in the same first rectangular frame is smaller than a first preset threshold, and each second rectangular frame contains at least one traffic light in the traffic light image.
In an alternative embodiment, the first rectangular box is determined by:
determining the distance between two second rectangular frames according to a point at the same position on any two second rectangular frames in the traffic light image;
dividing second rectangular frames whose points are separated by a distance not exceeding a first preset threshold into the same first rectangular frame; wherein each second rectangular frame is located in only one first rectangular frame.
In an alternative embodiment, the determining, by the neural network model, a first position area of each traffic light contained in the first rectangular frame includes:
determining the first position area of each traffic light in the first rectangular frame according to the coordinates, output by the neural network model, of the center point of the traffic light's first position area within the first rectangular frame, together with the size information of the first position area.
In an optional embodiment, the determining a second position region corresponding to the traffic light with the color status of red light output by the neural network model when the color status of the traffic light in the first N frames of traffic light images is red light includes:
performing a weighted calculation on the first position areas output by the neural network model for the traffic light in the previous N frames of traffic light images in which its color state was red, so as to obtain the second position area corresponding to the traffic light.
In an optional implementation manner, after determining that the first location area is a target location area containing a pixel to be processed, the method further includes:
determining the pixels contained in the target position area whose brightness exceeds a third preset threshold as the pixels to be processed;
and adjusting the HS (hue and saturation) values of the pixels to be processed so that the adjusted color displays the actual color state of the traffic light, as determined by the neural network model, at the time the traffic light image was acquired.
In an optional implementation manner, the HS values of the pixels to be processed contained in the target position area are adjusted by:
determining the average brightness value of the pixel to be processed;
determining a color correction parameter corresponding to the pixel to be processed according to the determined average brightness value;
if the color correction parameter is not 0, adjusting the H value of the pixel to be processed to be a preset H value; judging whether the S value of the pixel to be processed is smaller than the color correction parameter or not;
if the S value of the pixel to be processed is determined to be smaller than the color correction parameter, adjusting the S value of the pixel to be processed to m1 times the current S value, and returning to the step of judging whether the S value is smaller than the color correction parameter, until the S value is not smaller than the color correction parameter or a preset number of adjustments of the S value has been reached;
if the S value of the pixel to be processed is determined to be not smaller than the color correction parameter, adjusting the S value of the pixel to be processed to be m2 times of the current S value.
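For illustration, a minimal Python sketch of this adjustment loop follows; lookup_correction_param, preset_h, m1, m2 and max_adjustments are hypothetical stand-ins, since the embodiment does not fix their concrete values.

```python
# A minimal sketch of the HS adjustment steps above, assuming pixels are given
# as mutable [h, s, v] triples. lookup_correction_param, preset_h, m1, m2 and
# max_adjustments are hypothetical stand-ins; the embodiment does not fix them.

def lookup_correction_param(avg_v):
    # Hypothetical mapping from average brightness to the color correction
    # parameter; a real implementation would use its own calibrated table.
    return 0.8 if avg_v > 128 else 0.6

def adjust_hs(pixels_hsv, preset_h=0, m1=1.2, m2=1.05, max_adjustments=10):
    # Average brightness (V channel) of the pixels to be processed.
    avg_v = sum(p[2] for p in pixels_hsv) / len(pixels_hsv)
    s_param = lookup_correction_param(avg_v)
    if s_param == 0:
        return pixels_hsv  # a correction parameter of 0 means no adjustment
    for p in pixels_hsv:
        p[0] = preset_h  # adjust H to the preset hue value
        # While S is below the correction parameter, scale it by m1,
        # up to a preset number of adjustments.
        for _ in range(max_adjustments):
            if p[1] >= s_param:
                break
            p[1] *= m1
        # Once S is no longer below the correction parameter, scale it by m2.
        if p[1] >= s_param:
            p[1] *= m2
    return pixels_hsv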
In a second aspect, an embodiment of the present invention further provides an apparatus for traffic light image processing, where the apparatus includes: a processor and a memory, wherein the memory stores program code that, when executed by the processor, causes the apparatus to perform the following:
for any first rectangular frame containing traffic lights in the current frame of the traffic light image, determining, through a neural network model, a first position area of each traffic light contained in the first rectangular frame and the color state of the traffic light at the time the traffic light image was acquired;
determining a second position area for the traffic light whose color state is red, based on the position areas output by the neural network model for that traffic light in the previous N frames of traffic light images in which its color state was red;
comparing the first position area with the second position area of the same traffic light;
if the distance between the first position area and the second position area of the traffic light does not exceed a second preset threshold value, determining that the first position area is a target position area containing pixels to be processed in the current traffic light image;
and if the distance between the first position area and the second position area of the traffic light exceeds a second preset threshold, determining that the second position area is a target position area containing pixels to be processed in the current traffic light image.
In a possible implementation manner, the first rectangular frame includes at least one second rectangular frame, the distance between any two second rectangular frames included in the same first rectangular frame is smaller than a first preset threshold, and each second rectangular frame contains at least one traffic light in the traffic light image.
In one possible implementation, the processor determines the first rectangular box by:
determining the distance between two second rectangular frames according to a point at the same position on any two second rectangular frames in the traffic light image;
dividing second rectangular frames whose points are separated by a distance not exceeding a first preset threshold into the same first rectangular frame; wherein each second rectangular frame is located in only one first rectangular frame.
In one possible implementation, the processor is specifically configured to:
determining the first position area of each traffic light in the first rectangular frame according to the coordinates, output by the neural network model, of the center point of the traffic light's first position area within the first rectangular frame, together with the size information of the first position area.
In one possible implementation, the processor is specifically configured to:
performing a weighted calculation on the first position areas output by the neural network model for the traffic light in the previous N frames of traffic light images in which its color state was red, so as to obtain the second position area corresponding to the traffic light.
In one possible implementation, the processor is further configured to:
determining the pixels contained in the target position area whose brightness exceeds a third preset threshold as the pixels to be processed;
and adjusting the HS values of the pixels to be processed so that the adjusted color displays the actual color state of the traffic light, as determined by the neural network model, at the time the traffic light image was acquired.
In a possible implementation manner, the processor adjusts the HS values of the pixels to be processed contained in the target position area by:
determining the average brightness value of the pixel to be processed;
determining a color correction parameter corresponding to the pixel to be processed according to the determined average brightness value;
if the color correction parameter is not 0, adjusting the H value of the pixel to be processed to be a preset H value; judging whether the S value of the pixel to be processed is smaller than the color correction parameter or not;
if the S value of the pixel to be processed is determined to be smaller than the color correction parameter, adjusting the S value of the pixel to be processed to m1 times the current S value, and returning to the step of judging whether the S value is smaller than the color correction parameter, until the S value is not smaller than the color correction parameter or a preset number of adjustments of the S value has been reached;
if the S value of the pixel to be processed is determined to be not smaller than the color correction parameter, adjusting the S value of the pixel to be processed to be m2 times of the current S value.
In a third aspect, an embodiment of the present invention further provides an apparatus for traffic light image processing, where the apparatus includes:
a first determination module, configured to: for any first rectangular frame containing traffic lights in the current frame of the traffic light image, determine, through a neural network model, a first position area of each traffic light contained in the first rectangular frame and the color state of the traffic light at the time the traffic light image was acquired;
a second determination module, configured to determine a second position area for the traffic light whose color state is red, based on the position areas output by the neural network model for that traffic light in the previous N frames of traffic light images in which its color state was red;
a comparison module, configured to compare the first position area and the second position area of the same traffic light; if the distance between the first position area and the second position area of the traffic light does not exceed a second preset threshold, determine that the first position area is the target position area containing pixels to be processed in the current traffic light image; and if the distance exceeds the second preset threshold, determine that the second position area is the target position area containing pixels to be processed in the current traffic light image.
In a fourth aspect, the present application also provides a computer storage medium having a computer program stored thereon, which when executed by a processor, performs the steps of the method of the first aspect.
In addition, for technical effects brought by any one implementation manner of the second aspect to the fourth aspect, reference may be made to technical effects brought by different implementation manners of the first aspect, and details are not described here.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
Fig. 1 is a schematic diagram of a method for processing traffic light images according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a traffic light image collected in accordance with an embodiment of the present invention;
FIG. 3 is a schematic diagram of a traffic light candidate box configured for each traffic light in a traffic light image according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a first rectangular frame corresponding to a traffic light image determined according to a clustering algorithm according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the related process flow operations of a method for detecting the color status and position of a traffic light through a neural network model according to an embodiment of the present invention;
FIG. 6 is a schematic processing flow diagram of a neural network model detection process according to an embodiment of the present invention;
FIG. 7 is a schematic diagram illustrating position information of a traffic light detected by a neural network model according to an embodiment of the present invention;
fig. 8 is a flowchart illustrating a method for determining second location information corresponding to a traffic light according to an embodiment of the present invention;
fig. 9 is a schematic view of a scenario for determining second position information corresponding to a traffic light according to an embodiment of the present invention;
fig. 10 is a schematic view of a scene for determining a first location area corresponding to a traffic light group according to an embodiment of the present invention;
fig. 11 is a schematic view of a scene for determining a first position area corresponding to a traffic light set with interference according to an embodiment of the present invention;
fig. 12 is a schematic view illustrating a comparison between a green light and a red light of a traffic light group according to an embodiment of the present invention;
fig. 13 is a schematic view of a scene for removing an interfering object of a straight red light in a traffic light group according to an embodiment of the present invention;
FIG. 14 is a schematic structural diagram of a first traffic light image processing apparatus according to an embodiment of the present invention;
fig. 15 is a schematic structural diagram of a second traffic light image processing device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Some of the words that appear in the text are explained below:
1. The term "and/or" in the embodiments of the present invention describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate that A exists alone, that A and B exist simultaneously, or that B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
2. In the embodiments of the present application, the term "plurality" means two or more, and other terms are similar thereto.
The application scenarios described in the embodiments of the present invention are intended to illustrate the technical solutions of the embodiments more clearly and do not limit the technical solutions provided; a person skilled in the art will appreciate that, as new application scenarios emerge, the technical solutions provided in the embodiments of the present invention are equally applicable to similar technical problems.
At present, in a traffic monitoring scene, in order to clearly distinguish information such as vehicle type, body color, license plate number and faces inside the vehicle, the exposure time of the camera needs to be increased when video streams and snapshot images are collected. The increased exposure time can oversaturate the pixels in the traffic light area, so that the red light in the snapshot image appears yellow or white. A red light displayed as yellow or white cannot serve as evidence for penalizing red-light running, so the color state of the traffic light in the snapshot image needs to be detected; if the actual color of the traffic light is detected to be red but the color displayed in the snapshot image is not red, the color of the traffic light needs to be corrected to red, so that the image can serve as a basis for penalizing red-light running.
At present, the actual color of a traffic light in a snapshot image can be detected through a traffic light state identification algorithm, and the position of the detected red light is forcibly painted red. However, traffic monitoring scenes are very complicated, and various interference regions exist around the lamp-head region; such interference sources easily make the red light position detected by the traffic light state identification algorithm inaccurate, causing phenomena such as painting errors or painting the wrong position, which directly affects the reliability of the snapshot image.
The embodiment of the invention detects the actual color state and position area of the traffic light through a neural network model, and performs weighted fusion of the position areas determined in the previous N frames of snapshot images in which the traffic light was red. If the neural network model detects that the actual color state of the traffic light is red, the red light position area output by the model is compared with the reference position of the traffic light determined from the previous N frames of snapshot images. If the difference exceeds a preset threshold, the red light position detected in the current snapshot image is in error, and the corresponding position in the current snapshot image can be processed according to the red light position determined from the previous N frames; if the difference does not exceed the preset threshold, the current frame can be processed directly according to the red light position detected by the neural network model. The accuracy of position detection is thereby improved and the number of false paintings is reduced.
The embodiments of the present invention will be described in further detail with reference to the drawings attached hereto.
As shown in fig. 1, an embodiment of the present invention provides a method for processing a traffic light image, which specifically includes the following steps:
step 100: aiming at any one first rectangular frame containing traffic lights in the current frame traffic light image, determining a first position area of each traffic light contained in the first rectangular frame through a neural network model and acquiring the color state of the traffic light when the traffic light image is acquired; the first rectangular frame comprises at least one second rectangular frame, the distance between any two second rectangular frames included in the same first rectangular frame is smaller than a first preset threshold value, and the second rectangular frames comprise traffic lights in at least one traffic light image.
Step 101: determining a second position area for the traffic light whose color state is red, based on the position areas output by the neural network model for that traffic light in the previous N frames of traffic light images in which its color state was red;
step 102: comparing the first position area with the second position area of the same traffic light; if the distance between the first position area and the second position area of the traffic light does not exceed a second preset threshold value, determining that the first position area is a target position area containing pixels to be processed in the current traffic light image; and if the distance between the first position area and the second position area of the traffic light exceeds a second preset threshold, determining that the second position area is a target position area containing pixels to be processed in the current traffic light image.
Through the scheme, the positions of the traffic lights in the current frame of the traffic light image are roughly screened out through the second rectangular frames, and the first rectangular frames corresponding to the current frame are determined according to the second rectangular frames of the traffic lights. The accurate position (namely the first position area) and the corresponding color state of each traffic light in a first rectangular frame are determined through the neural network model, where the color state is not the color displayed by the traffic light in the current frame but the actual color of the traffic light when the frame was shot. If a traffic light is determined to be red, the first position area detected by the neural network model is compared with the second position area corresponding to that traffic light in the previous N frames of traffic light images in which its color state was red. If the distance between the first position area and the second position area is too large, namely exceeds the second preset threshold, the pixels to be processed are determined through the second position area; if it does not exceed the threshold, the pixels to be processed are determined through the first position area. This improves image processing precision and avoids erroneous image processing caused by wrong position information output by the neural network model.
Embodiments of the invention may be presented in four aspects. The first aspect: determining the first rectangular frames in the current frame of the traffic light image. The second aspect: detecting, through a neural network model, the color state of each traffic light in the current traffic light image and its position in the first rectangular frame. The third aspect: judging the detected position of a traffic light whose color state is red. The fourth aspect: performing color correction on the pixels to be processed.
In a first aspect: determining a first rectangular frame in the current frame traffic light image;
as shown in fig. 2, the traffic light image collected by the traffic camera is shown, wherein each circle represents a traffic light, and each traffic light is assumed to have three states, that is, the same traffic light can display three states of red, yellow and green. As shown in fig. 3, the rectangular frame outside the traffic light is a traffic light candidate frame, i.e., the second rectangular frame in the embodiment of the present invention. The second rectangular frame includes at least one traffic light, and the following description will describe a specific manner of determining the first rectangular frame by taking the case of including one traffic light as an example:
Before position detection and state identification are performed on the current traffic light image, the position of each traffic light is roughly framed by a traffic light candidate frame. Generally, each traffic camera shoots the traffic lights at a fixed intersection, so the positions of the traffic lights in the captured images are basically the same. Therefore, the candidate area of each traffic light in the images collected by a camera can be configured through the web interface, for example by dragging a rectangular frame to cover the whole traffic light area. Because shaking of the surveillance camera may cause the candidate frame to drift away from the traffic light area, the candidate frame is made larger than the actual traffic light area. Referring to fig. 3, frames 1, 2, 3, 4 and 5 are traffic light candidate frames (i.e., second rectangular frames), each larger than the corresponding traffic light area.
The traffic light candidate frames are divided through a clustering algorithm to obtain N first rectangular frames of a preset specification, i.e., rectangles with the same length and width; each first rectangular frame contains at least one traffic light candidate frame, and a traffic light candidate frame can exist in only one first rectangular frame. The process of dividing the traffic light candidate frames through the clustering algorithm to obtain the N first rectangular frames mainly comprises the following steps:
Step 1: input the information of each traffic light candidate frame, including the index value ID of the candidate frame, the position coordinates of the candidate frame, and the number of candidate frames. The position coordinates of a candidate frame can be expressed as (x_top, y_top, x_bot, y_bot), where (x_top, y_top) are the coordinates of the upper left corner of the candidate frame and (x_bot, y_bot) are the coordinates of the lower right corner; alternatively, (x_top, y_top) may be the upper right corner and (x_bot, y_bot) the lower left corner. Other corner points may equally be used; these examples are merely illustrative.
For example, with reference to fig. 3, the information of the traffic light candidate frames of the current frame of the traffic light image is input as follows: the number of candidate frames is 5; the index value ID of frame 1 is 1 and the position coordinates of frame 1 are (x_top1, y_top1, x_bot1, y_bot1); the index value ID of frame 2 is 2 and the position coordinates of frame 2 are (x_top2, y_top2, x_bot2, y_bot2); the index value ID of frame 3 is 3 and the position coordinates of frame 3 are (x_top3, y_top3, x_bot3, y_bot3); and so on for the information of frame 4 and frame 5.
Step 2: and clustering the input traffic light candidate frames under the condition that the distance between any two traffic light candidate frames is calculated, and dividing the two traffic light candidate frames with the distance not exceeding a first preset threshold value into the same first rectangular frame. That is, a point at the same position on the two traffic light candidate frames is selected, and the distance between the two points is the distance between the two traffic light candidate frames. Such as: the distance between two traffic light candidate frames is calculated from the coordinates (x _ top, y _ top) of the upper left corners of any two traffic light candidate frames.
The specific clustering algorithm is as follows:
d = sqrt((x_top1 - x_top2)^2 + (y_top1 - y_top2)^2)
If the distance d between two traffic light candidate frames is less than the first preset threshold, the two candidate frames are divided into the same first rectangular frame. Fig. 4 is a schematic diagram of the first rectangular frames obtained after clustering frames 1, 2, 3, 4 and 5 in the current traffic light image. According to the clustering algorithm, the distance between any two of frames 1, 2 and 3 is less than the first preset threshold, and the distance between frame 4 and frame 5 is also less than the first preset threshold, but the distances between frames 1, 2, 3 and frames 4, 5 are all greater than the first preset threshold. Since a candidate frame can exist in only one first rectangular frame, frames 1, 2 and 3 are divided into first rectangular frame 1, and frames 4 and 5 are divided into first rectangular frame 2. First rectangular frame 1 and first rectangular frame 2 have the same specification, both being rectangles of length H and width W; it can thus be understood that first rectangular frame 1 can be made to coincide with first rectangular frame 2 through rotation and/or translation.
The position of a first rectangular frame is determined according to the positions of the second rectangular frames it contains. For example, with reference to fig. 4, first rectangular frame 1 contains frames 1, 2 and 3, and the vertex coordinates of the upper left corner of first rectangular frame 1 coincide with the vertex coordinates of the upper left corner of frame 1.
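For illustration, the following Python sketch clusters candidate frames by the upper-left-corner distance described in step 2; the frame format and the value of the first preset threshold are assumptions, not values fixed by the embodiment.

```python
import math

# A sketch of the candidate-frame clustering described above. Each candidate
# frame is (x_top, y_top, x_bot, y_bot); the value of the first preset
# threshold is an assumption for illustration.

def cluster_candidate_frames(frames, first_threshold=100.0):
    """Group candidate frames whose upper-left corners are closer than the
    threshold into the same first rectangular frame (one group per frame)."""
    groups = []  # each group is a list of candidate-frame indices
    for i, (x1, y1, _, _) in enumerate(frames):
        placed = False
        for group in groups:
            # Join a group if close enough to any member already in it.
            if any(math.hypot(x1 - frames[j][0], y1 - frames[j][1]) < first_threshold
                   for j in group):
                group.append(i)
                placed = True
                break
        if not placed:
            groups.append([i])  # a candidate frame belongs to exactly one group
    return groups

boxes = [(10, 10, 40, 90), (55, 12, 85, 92), (100, 10, 130, 90), (400, 300, 430, 380)]
print(cluster_candidate_frames(boxes))  # -> [[0, 1, 2], [3]]
```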
In a second aspect: detecting the color state of the traffic light in the current traffic light image and the position of the traffic light in the first rectangular frame through a neural network model;
The first rectangular frames contained in the current frame of the traffic light image are determined through the above steps. The position information of each first rectangular frame in the current frame image is input into the neural network model, which detects in turn whether the color state of each traffic light in each first rectangular frame is red, green or yellow, and detects a more accurate position of the traffic light within the first rectangular frame.
As shown in fig. 5, the main process of determining the color state and position information of each traffic light in the first rectangular frame through the neural network model includes the following steps:
step 500: inputting the position information of the first rectangular frame into a neural network model;
Since the first rectangular frames input to the neural network model have a preset fixed specification, the position of a first rectangular frame can be represented by the coordinates of its upper left vertex when its position information is input. For example, with reference to fig. 4, the coordinates of the upper left vertices of first rectangular frame 1 and first rectangular frame 2 are input into the neural network model; the model detects first rectangular frame 1 and first rectangular frame 2 in turn and determines the actual color state and position of each traffic light in the first rectangular frame at the time the traffic light image was collected. For color state detection, for example, the color displayed by the traffic light in the image may be yellow while its actual color state is red. For position detection, for example, the minimum bounding rectangle of the traffic light is determined, and the coordinates of the center point of the minimum bounding rectangle within the first rectangular frame and the length and width of the minimum bounding rectangle are output.
Step 501: performing data conversion on the image corresponding to the first rectangular frame, and inputting the converted image data into a neural network model;
The YUV image corresponding to the first rectangular frame is converted into an RGB image, and the converted RGB image is input into the neural network model. The YUV data of the image corresponding to the first rectangular frame can be converted into RGB through the following formulas:
R=Y+1.402*V
G=Y-0.344*U-0.714*V
B=Y+1.772*U
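The conversion above is a per-pixel linear transform; a NumPy sketch, assuming the Y, U and V planes are float arrays of the same shape with U and V already zero-centered:

```python
import numpy as np

# Sketch of the YUV -> RGB conversion formulas above; assumes Y, U, V are
# float arrays of identical shape with U and V zero-centered.
def yuv_to_rgb(y, u, v):
    r = y + 1.402 * v
    g = y - 0.344 * u - 0.714 * v
    b = y + 1.772 * u
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)
```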
step 502: the neural network model detects the color state of the traffic light in the first rectangular frame and the position information of the traffic light in the first rectangular frame according to the position information of the first rectangular frame and the RGB data;
The neural network model in the embodiment of the present invention is composed of a plurality of convolutional layers, pooling layers, upsampling layers and cascade layers. As shown in fig. 6, the structure of the neural network model for detecting the color state and position of a traffic light provided in the embodiment of the present invention mainly operates as follows:
1) First, a first-layer convolution operation is performed on the input RGB image data. The filter size is r × r, where r is a number between 1 and 7, and c1 channel feature maps are output, where c1 is a number between 8 and 1024. The output of the layer-1 convolution is:
F_1,j = Σ_(i=1..3) ( I_i ⊗ w_1,i,j ) + b_1,j

where ⊗ represents the convolution operation, I_i denotes the i-th of the R, G, B channels, w_1,i,j represents the weight of the i-th filter of the j-th channel of the layer-1 convolution, b_1,j denotes the j-th offset of the layer-1 convolution, and F_1,j is the output result of the j-th channel of the layer-1 convolution;
2) Activation processing is then performed on the result of the first-layer convolution operation. The activation function is:

f(x) = x, if x > 0;  f(x) = α*x, if x ≤ 0

and the result of F_1,j after activation is:

F'_1,j = f(F_1,j)

where α represents a gain factor controlling the portion of F_1,j that is less than 0, α is between 0 and 1, and F_1,j is the output result of the j-th channel of the layer-1 convolution. (This activation is sketched in code after step 7).)
3) Pooling is performed on the activated result of the first-layer convolution operation;
4) The layer-2 convolution, activation of the layer-2 convolution result, and pooling of the activated layer-2 result are executed in sequence, and so on, until the n-th convolution processing is completed.
5) Performing up-sampling processing on the result of the nth convolution processing;
Suppose the size of the feature map after the n-th layer convolution processing is W_n × H_n × C_n. Then the output size after upsampling is (W_n·N) × (H_n·N) × C_n, where N is the upsampling multiple;
The upsampling process can be seen as the inverse of pooling and is effectively image magnification: for example, if the upsampling input is a 6×6 image and the upsampling multiple is 2, the upsampled output is a 12×12 image.
6) Performing cascade operation after the up-sampling treatment;
Cascading performs channel splicing of the result of a previous t-th layer convolution with the upsampling result. Assume the result after the t-th layer convolution is W_t × H_t × C_t and the upsampling output is W_t × H_t × C_up1; the concatenated result is then W_t × H_t × (C_t + C_up1). The upsampling and cascade operations are executed cyclically a set number of times, and finally convolution processing is executed. (The upsampling and cascade operations are sketched in code after step 7).)
7) According to the result output by the last convolution layer, a non-maximum suppression algorithm is adopted to calculate the position coordinates and score of the traffic light with the highest confidence, and the color state of the traffic light is obtained from the score, judging whether the current traffic light is red, yellow or green. The position coordinates of the minimum circumscribed rectangular frame of each traffic light within the first rectangular frame are obtained, and the color state of each traffic light in the first rectangular frame, together with the position coordinates and the length and width of the corresponding minimum circumscribed rectangular frame, are output.
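As a concrete illustration of steps 2), 5) and 6), the following sketch renders the activation (read here as a leaky-ReLU-style function, per the formula above), nearest-neighbour upsampling and channel cascading in NumPy; the H × W × C array layout and all constants are assumptions, not values fixed by the embodiment.

```python
import numpy as np

# Sketch of building blocks from steps 2), 5) and 6); feature maps are
# H x W x C arrays, and the layout and constants are assumptions.

def activate(f, alpha=0.1):
    # The activation above: positive values pass through, negatives scale by alpha.
    return np.where(f > 0, f, alpha * f)

def upsample(feat, n=2):
    # Nearest-neighbour magnification: W_n x H_n x C_n -> (W_n*n) x (H_n*n) x C_n.
    return feat.repeat(n, axis=0).repeat(n, axis=1)

def cascade(conv_out, upsampled):
    # Channel splicing: W_t x H_t x C_t and W_t x H_t x C_up1
    # become W_t x H_t x (C_t + C_up1).
    return np.concatenate([conv_out, upsampled], axis=-1)

x = np.random.randn(6, 6, 8).astype(np.float32)
print(upsample(x).shape)                                          # (12, 12, 8)
print(cascade(np.ones((12, 12, 16)), upsample(activate(x))).shape)  # (12, 12, 24)
```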
Through the above processing, the color state corresponding to each traffic light in the first rectangular frame can be detected, together with a position area more accurate than the traffic light's candidate area within the first rectangular frame. To check whether the position output by the neural network model is accurate, further detection is needed; the specific process of checking the traffic light position area output by the neural network model is introduced in the third aspect.
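Step 7) selects the highest-confidence detection with non-maximum suppression. A compact sketch, assuming boxes are given in the center-plus-size form the model outputs and using a hypothetical IoU threshold:

```python
# A compact sketch of non-maximum suppression as used in step 7); boxes are
# (x_mid, y_mid, width, height) and the IoU threshold is an assumption.
def iou(b1, b2):
    def corners(b):
        x, y, w, h = b
        return x - w / 2, y - h / 2, x + w / 2, y + h / 2
    ax1, ay1, ax2, ay2 = corners(b1)
    bx1, by1, bx2, by2 = corners(b2)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    # Keep the highest-scoring box, drop boxes overlapping it too much, repeat.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```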
In a third aspect: judging the position of the traffic light with the detected red color state;
Fig. 7 shows the position information of traffic light 1 in first rectangular frame 1 as detected by the neural network model, where (x_mid, y_mid) are the coordinates of the center point of the minimum circumscribed rectangular frame of traffic light 1 relative to first rectangular frame 1, and a and b are the length and width of the minimum circumscribed rectangular frame.
If the neural network model detects that the color state of traffic light 1 is red, the position area of traffic light 1 in first rectangular frame 1 can be determined from the position information output by the model; that is, the minimum circumscribed rectangular frame of traffic light 1 shown in fig. 7 is the position area (i.e., the first position area) that the neural network model determines traffic light 1 to occupy within first rectangular frame 1.
A position area corresponding to traffic light 1 (namely the second position area of traffic light 1) is determined according to the positions of traffic light 1 in the previous N frames of traffic light images in which its color state was red, and the first position area and the second position area of traffic light 1 are compared to judge whether the first position area output by the neural network model is accurate.
There are various ways of determining the second position area of traffic light 1, as exemplified below:
Determination mode one: updating the second position area of the traffic light in real time;
Fig. 8 shows a way of updating the second position area of the traffic light in real time, comprising the following steps:
step 800: determining the color state and the position of the traffic light 1 in the first rectangular frame corresponding to the mth frame of traffic light image;
step 801: determining the color state and the position of the traffic light 1 in the first rectangular frame corresponding to the (m + 1) th frame of traffic light image;
step 802: performing weighted fusion according to the position information of the traffic light 1 corresponding to the mth frame of traffic light image and the position information of the traffic light 1 corresponding to the (m + 1) th frame of traffic light image to determine a second position area i of the traffic light 1;
step 803: determining the color state and the position of the traffic light 1 in the first rectangular frame corresponding to the (m + 2) th frame of traffic light image;
step 804: and performing weighted fusion according to the position information of the traffic light 1 corresponding to the (m + 2) th frame of traffic light image and the second position area i to obtain the second position area i +1 corresponding to the traffic light 1.
For example: assume that in the first frame of traffic light image collected by the traffic camera, the color state and position information of each traffic light in first rectangular frame 1 are detected through the neural network model; if the color state of traffic light 1 is detected to be red, position information 1 of traffic light 1 is stored;
detecting a first rectangular frame 1 in a second frame of traffic light image collected by the same traffic camera through a neural network model, and if the color state of the traffic light 1 in the second frame of traffic light image is red, storing position information 2 of the traffic light 1 detected by the neural network model;
Position area 1 determined by position information 1 is compared with position area 2 determined by position information 2, and the distance between them is calculated. If the distance does not exceed the second preset threshold, position area 1 and position area 2 are weighted and fused to obtain the second position area corresponding to traffic light 1 (hereinafter referred to as reference position area 1). Fig. 9(a) shows position information 1 of traffic light 1 corresponding to the first frame as determined by the neural network model; fig. 9(b) shows position information 2 of traffic light 1 corresponding to the second frame; and fig. 9(c) shows the second position information of traffic light 1 determined from position information 1 and position information 2.
If the distance between the position areas 1 and 2 exceeds the second preset threshold, the update process of the second position area is not performed.
If the traffic camera collects a third frame of traffic light image at the moment, detecting a first rectangular frame 1 in the third frame of traffic light image through a neural network model, and if the color state of the traffic light 1 in the third frame of traffic light image is red, storing position information 3 of the traffic light 1 detected by the neural network model;
comparing the position area 3 determined by the position information 3 with the reference position area 1, calculating the distance between the position area 3 and the reference position area 1, if the distance does not exceed a second preset threshold, performing weighted fusion on the position area 3 and the reference position area 1 to obtain a reference position area 2 corresponding to the traffic light 1, and updating the second position area corresponding to the traffic light 1.
By analogy, whenever a qualifying frame in which the traffic light's color state is red is detected, the position information of traffic light 1 determined from the first rectangular frame of that frame is weighted and fused with the current second position area of traffic light 1, thereby updating the second position area corresponding to traffic light 1.
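Reading steps 800-804 as code, the reference (second) position area is essentially a running weighted fusion of red-light detections. In the sketch below a position area is an (x_mid, y_mid, a, b) tuple; the fusion weight and the value of the second preset threshold are assumptions.

```python
# Sketch of the real-time second-position-area update (steps 800-804).
# A position area is (x_mid, y_mid, a, b); the fusion weight and the
# second preset threshold value are assumptions for illustration.

def fuse(reference, detected, weight=0.9):
    """Weighted fusion of the stored reference area with a newly detected one."""
    return tuple(weight * r + (1 - weight) * d for r, d in zip(reference, detected))

def update_reference(reference, detected, second_threshold=8.0):
    """Update the reference (second) position area only when the newly
    detected red-light area is close enough to it, as described above."""
    if reference is None:
        return detected  # the first red-light detection initialises the reference
    dx = detected[0] - reference[0]
    dy = detected[1] - reference[1]
    if (dx * dx + dy * dy) ** 0.5 > second_threshold:
        return reference  # too far apart: skip this update
    return fuse(reference, detected)
```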
Determination mode two: performing weighted fusion of the positions of traffic light 1 determined from N frames of traffic light images;
The position areas corresponding to traffic light 1 in the previous N frames of traffic light images in which its color state was red are determined, and these position areas are weighted and fused to obtain the second position area corresponding to traffic light 1.
For example: assuming N is 3, the current frame of the traffic light image is the 6th frame, and the neural network model determines that the color state of traffic light 1 in the 5th, 4th and 3rd frames is red, with position area a in the 5th frame, position area b in the 4th frame and position area c in the 3rd frame, then a weighted calculation is performed on position areas a, b and c to determine the second position area corresponding to traffic light 1.
If the number of frames before the current frame in which the color state of traffic light 1 is red is less than N, the comparison of the position information output by the neural network model against the second position area is not executed, and the position area determined by the position information output by the neural network model is directly used as the target position area.
If the number of frames before the current frame in which the color state of traffic light 1 is red is not less than N, the first position area determined from the position information output by the neural network model is compared with the second position area determined from the previous N frames of traffic light images, and it is judged whether the distance between them exceeds the second preset threshold. If not, the first position area is the target position area containing the pixels to be processed in the current frame; otherwise, the second position area is the target position area containing the pixels to be processed in the current frame.
The distance is calculated, for example, from two reference points: the center point of the minimum circumscribed rectangular frame of the traffic light corresponding to the first position area and the center point of the minimum circumscribed rectangular frame corresponding to the second position area are taken, and the distance between these two center points is the distance between the first position area and the second position area.
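A minimal sketch of this target-area decision, assuming position areas as (x_mid, y_mid, a, b) tuples and an illustrative value for the second preset threshold:

```python
# Sketch of the target-area decision described above: position areas are
# (x_mid, y_mid, a, b), and the second preset threshold is an assumption.
def choose_target_area(first_area, second_area, second_threshold=8.0):
    dx = first_area[0] - second_area[0]
    dy = first_area[1] - second_area[1]
    distance = (dx * dx + dy * dy) ** 0.5
    # Within the threshold the network output is trusted; otherwise fall
    # back on the second position area fused from the previous N frames.
    return first_area if distance <= second_threshold else second_area
```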
After the target position area is determined, the pixels to be processed within it need to be determined, that is, the indicators in the target position area such as a circular lamp, a digit or an arrow. The following describes the manner, provided by the embodiment of the present invention, of determining the pixels to be processed according to the brightness of each pixel in the target position area, taking as an example a traffic light group composed of traffic lights that each have only one color state:
as shown in fig. 10, in which fig. 10(a) is a schematic view of a first rectangular frame determined according to a second rectangular frame corresponding to a traffic light group including three traffic lights, wherein the traffic light group is a straight red light, a straight yellow light and a straight green light from the left, respectively; the black area is an unlighted area, and the white area is a highlight area corresponding to the straight red arrow mark. And 10(b) a first position area corresponding to the traffic light group in the first rectangular frame, which is obtained by inputting the first rectangular frame shown in fig. 10(a) into the neural network model, if the first position area is determined to be a target position area containing pixels to be processed after comparison is performed according to a second position area corresponding to the traffic light, the first position area is determined to be a target position area containing the pixels to be processed according to the brightness of each pixel in the target position area, the pixels with the brightness exceeding a third preset threshold are the pixels to be processed, and a white area shown in fig. 10(b) is an area corresponding to a straight red arrow, namely an area formed by the pixels to be red.
Whether the brightness of each pixel in the target position area exceeds the third preset threshold is judged, and pixels whose brightness exceeds the threshold are determined to be pixels to be processed. Further, the gray value of each pixel to be processed may be set to 255 and the gray value of each pixel whose brightness does not exceed the third preset threshold set to 0, generating a binary image corresponding to the target position area; interference points in the binary image are then removed through morphological processing, yielding more accurate pixels to be processed. Various morphological operations may be used, such as dilation in image processing.
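A minimal OpenCV sketch of this binarization and clean-up step, assuming the target position area is available as a grayscale image; the concrete threshold value and kernel size are illustrative assumptions (the original names dilation as one usable morphological operation, an opening is used here to remove isolated points):

```python
import cv2
import numpy as np

def pixels_to_process(region_gray, third_threshold=200):
    """Binarize the target position area by brightness and remove small
    interference points with a morphological operation."""
    # Pixels brighter than the third preset threshold become 255, the rest 0
    _, binary = cv2.threshold(region_gray, third_threshold, 255, cv2.THRESH_BINARY)
    # An opening (erosion followed by dilation) removes isolated noise points
    kernel = np.ones((3, 3), np.uint8)
    return cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
```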
Further, an embodiment of the present invention provides a method for removing interference points in a traffic light group. As shown in fig. 11, in a first rectangular frame 1 determined from a traffic light image acquired by a traffic camera, an obstruction resembling a tree branch is attached to the straight-ahead red arrow light of the traffic light group. At night the obstruction reflects light, so the pixels to be processed determined from the target position area of that frame include both the pixels of the straight-ahead red arrow mark and the pixels of the branch, making the determined pixels to be processed inaccurate and imprecise; after the red-coloring operation is performed, the branch would also be colored red, distorting the processed image. The embodiment of the invention therefore provides a way of removing such interference: the first rectangular frame 1 corresponding to the current traffic light image is input into the neural network model, and if the traffic light group is detected to be showing a green light, as shown in fig. 12(a), a background model is established and updated from the pixels of the traffic light group while the straight-ahead green light is on. The background model is updated by the following formula:
background(t)=(1-w)*background(t-1)+w*pixel(t)
wherein background (t-1) is a background model of a previous frame, background (t) is a background model of a current frame, pixel (t) is a current pixel point value, w is a learning rate, and the larger w is, the faster the updating speed of the background model is.
As shown in fig. 12(b), when the neural network model detects that the red light of the traffic light group in the first rectangular frame 1 is lit, the minimum circumscribed rectangular frame of the red light in fig. 12(b) and the corresponding minimum circumscribed rectangular frame in the background model (updated while the straight-ahead green light was on) are cropped out, as shown in fig. 13. Each pixel inside the red light's minimum circumscribed rectangular frame in the target position area is compared with the corresponding pixel in the background model; if the brightness difference exceeds a third preset threshold, the pixel is determined to be a pixel to be processed, that is, a pixel of the white area in fig. 13(c). The interference of the night-time obstruction is thereby removed.
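This comparison against the background model can be sketched as follows, assuming both crops are 8-bit BGR images of the same size; the difference threshold value is an illustrative assumption:

```python
import cv2
import numpy as np

def interference_free_pixels(red_crop, background_crop, third_threshold=60):
    """Keep only pixels whose brightness differs from the background model by
    more than the threshold; an obstruction that reflects light in both images
    yields a small difference and is therefore discarded."""
    cur = cv2.cvtColor(red_crop, cv2.COLOR_BGR2GRAY).astype(np.int16)
    bg = cv2.cvtColor(background_crop, cv2.COLOR_BGR2GRAY).astype(np.int16)
    return (np.abs(cur - bg) > third_threshold).astype(np.uint8) * 255
```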
After the pixels to be processed have been determined, color correction is performed on them, as described in detail in the fourth aspect below.
In a fourth aspect: and carrying out color correction on the pixel to be processed.
The average brightness of the pixel area to be processed, denoted y_avg, is calculated, and the color correction level corresponding to y_avg is determined through the following formula:
[Equation image in the original (not reproduced here): a piecewise formula mapping the average brightness y_avg to the color correction level, defined in terms of the constants below.]
where x1, x2, y1, y2, D1, D2 and D3 are constants;
Then the detected pixel points in the red light region are traversed. When the R and G values of a pixel point are greater than the thresholds thr and thg respectively, it is determined that the pixel to be processed needs to be colored red: its RGB values are converted to HSV, and its H and S values are adjusted as follows (a code sketch covering both steps follows the conversion step below):
1) adjusting the H value;
if the color correction level is 0, the H value of the pixel is kept unchanged;
if the level is not 0, the H value of the pixel point is adjusted to a preset H value.
2) Adjusting the S value, where m1 = a and m2 = 1/b;
if the color correction level is 0, the S value of the pixel is kept unchanged;
if the level is greater than 0, S is adjusted as follows:
if S is less than the level (degree), S is adjusted to a*S, and the step of judging whether S is less than degree is returned to, until S is not less than degree or the set number of S adjustments is reached;
if S is not less than degree, S is adjusted to S/b.
After the H and S values of the pixel to be processed have been adjusted, the HSV values are converted back to RGB.
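The following sketch ties steps 1) and 2) together for the pixels to be processed. Because the piecewise formula mapping y_avg to the correction level appears only as an equation image in the original, the level (degree) is taken here as a precomputed input; the constants a, b, thr, thg, the preset hue and the iteration limit are all illustrative assumptions:

```python
import cv2
import numpy as np

def correct_red_pixels(img_bgr, mask, degree, a=1.2, b=1.2,
                       preset_h=0, thr=120, thg=80, max_adjust=5):
    """Recolor the pixels to be processed toward red in HSV space.
    `degree` is the color correction level derived from the average
    brightness y_avg; a, b, preset_h, thr and thg are illustrative."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    h, s, v = cv2.split(hsv)
    for y, x in zip(*np.nonzero(mask)):
        b_ch, g_ch, r_ch = img_bgr[y, x].astype(int)
        # only washed-out pixels (high R and G) need re-coloring; level 0 skips
        if not (r_ch > thr and g_ch > thg) or degree == 0:
            continue
        h[y, x] = preset_h              # 1) pull the hue to the preset red value
        if s[y, x] < degree:            # 2) raise saturation step by step
            for _ in range(max_adjust):
                s[y, x] *= a            # S <- a * S   (m1 = a)
                if s[y, x] >= degree:
                    break
        else:
            s[y, x] /= b                # S <- S / b   (m2 = 1/b)
    out = cv2.merge([h, np.clip(s, 0, 255), v]).astype(np.uint8)
    return cv2.cvtColor(out, cv2.COLOR_HSV2BGR)
```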
The above image processing flow, described for a traffic light in the first rectangular frame 1 whose color state is red, may be applied by analogy to traffic lights of the other color states in the first rectangular frame 1, which is not repeated here. Correspondingly, the processing of the first rectangular frame 1 may be applied by analogy to the other first rectangular frames corresponding to the current frame traffic light image; when all first rectangular frames in the current frame traffic light image have been processed, the processing of the current frame traffic light image ends.
Based on the same inventive concept, an embodiment of the present invention further provides an apparatus for traffic light image processing, as shown in fig. 14, the apparatus including: at least one processing unit 1400 and at least one storage unit 1401, wherein said storage unit 1401 stores program code, which when executed by said processing unit 1400, causes said processing unit 1400 to perform the following:
for any first rectangular frame containing traffic lights in the current frame traffic light image, determining, through a neural network model, a first position area of each traffic light contained in the first rectangular frame and the color state of each traffic light at the time the traffic light image was acquired;
determining, for a traffic light whose color state is red, a second position area according to the position areas output by the neural network model for that traffic light in the previous N frames of traffic light images in which its color state was red;
comparing the first position area with the second position area of the same traffic light;
if the distance between the first position area and the second position area of the traffic light does not exceed a second preset threshold value, determining that the first position area is a target position area containing pixels to be processed in the current traffic light image;
and if the distance between the first position area and the second position area of the traffic light exceeds a second preset threshold, determining that the second position area is a target position area containing pixels to be processed in the current traffic light image.
Optionally, the first rectangular frame includes at least one second rectangular frame, a distance between any two second rectangular frames included in the same first rectangular frame is smaller than a first preset threshold, and the second rectangular frame includes at least one traffic light in the traffic light image.
Optionally, the processing unit 1400 determines the first rectangular frame by:
determining the distance between two second rectangular frames according to a point at the same position on any two second rectangular frames in the traffic light image;
dividing a second rectangular frame where points with the distance not exceeding a first preset threshold are located into the same first rectangular frame; wherein the second rectangular frame is located in one of the first rectangular frames.
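To make the grouping concrete, the following sketch merges second rectangular frames whose same-position points (here the top-left vertices) lie within the first preset threshold of one another, using a small union-find; the (x, y, w, h) data format and names are assumptions:

```python
import math

def group_into_first_frames(second_frames, first_threshold):
    """Merge second rectangular frames (x, y, w, h) whose same-position points
    are no more than the first preset threshold apart, and return one
    enclosing first rectangular frame per group."""
    n = len(second_frames)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            (xi, yi, _, _), (xj, yj, _, _) = second_frames[i], second_frames[j]
            if math.hypot(xi - xj, yi - yj) <= first_threshold:
                parent[find(i)] = find(j)  # same first rectangular frame

    groups = {}
    for i, box in enumerate(second_frames):
        groups.setdefault(find(i), []).append(box)

    firsts = []
    for boxes in groups.values():
        x0 = min(bx for bx, _, _, _ in boxes)
        y0 = min(by for _, by, _, _ in boxes)
        x1 = max(bx + bw for bx, _, bw, _ in boxes)
        y1 = max(by + bh for _, by, _, bh in boxes)
        firsts.append((x0, y0, x1 - x0, y1 - y0))  # minimal enclosing box
    return firsts
```

Each returned first rectangular frame is the minimal box enclosing its group of second rectangular frames.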
Optionally, the processing unit 1400 is specifically configured to:
and determining the first position area of the traffic light in the first rectangular frame according to the coordinates of the central point of the first position area corresponding to each traffic light output by the neural network model in the first rectangular frame and the size information of the first position area.
Optionally, the processing unit 1400 is specifically configured to:
performing a weighted calculation on the first position areas output by the neural network model for the traffic light whose color state is red in the previous N frames of traffic light images, so as to obtain the second position area corresponding to the traffic light.
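One way to realize this weighted calculation is a normalized weighted mean of the detections from the previous N frames, weighting recent frames more heavily; the linear weighting scheme below is an assumption made for illustration:

```python
import numpy as np

def weighted_second_area(first_areas):
    """Fuse the first position areas (x, y, w, h) detected for a red light in
    the previous N frames into one second position area; recent frames get a
    linearly larger weight."""
    areas = np.asarray(first_areas, dtype=np.float32)        # shape (N, 4)
    weights = np.arange(1, len(areas) + 1, dtype=np.float32)
    weights /= weights.sum()
    return tuple(weights @ areas)  # weighted mean of each box coordinate

# Example with N = 3 frames of a slowly drifting detection
print(weighted_second_area([(100, 50, 30, 30), (102, 50, 30, 30), (104, 50, 30, 30)]))
```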
Optionally, the processing unit 1400 is further configured to:
determining pixels, the brightness of which exceeds a third preset threshold, contained in the target position area as pixels to be processed;
and adjusting the H and S values of the pixel to be processed so that the adjusted color is displayed as the actual color state of the traffic light, as determined by the neural network model, at the time the traffic light image was acquired.
Optionally, the processing unit 1400 adjusts the H and S values of the pixels to be processed contained in the target position area in the following manner:
determining the average brightness value of the pixel to be processed;
determining a color correction parameter corresponding to the pixel to be processed according to the determined average brightness value;
if the color correction parameter is not 0, adjusting the H value of the pixel to be processed to be a preset H value; judging whether the S value of the pixel to be processed is smaller than the color correction parameter or not;
if the S value of the pixel to be processed is determined to be smaller than the color correction parameter, adjusting the S value of the pixel to be processed to be m1 times of the current S value, and returning to the step of judging whether the S value of the pixel to be processed is smaller than the color correction parameter or not until the S value of the pixel to be processed is not smaller than the color correction parameter or the preset number of times of adjusting the S value of the pixel to be processed is reached;
if the S value of the pixel to be processed is determined to be not smaller than the color correction parameter, adjusting the S value of the pixel to be processed to be m2 times of the current S value.
Based on the same concept, fig. 15 is a schematic structural diagram of another traffic light image processing apparatus according to an embodiment of the present invention, and the apparatus includes:
a first determination module 1500, configured to: for any first rectangular frame containing traffic lights in the current frame traffic light image, determine, through a neural network model, a first position area of each traffic light contained in the first rectangular frame and the color state of each traffic light at the time the traffic light image was acquired;
a second determination module 1501, configured to determine, for a traffic light whose color state is red, a second position area according to the position areas output by the neural network model for that traffic light in the previous N frames of traffic light images in which its color state was red;
a comparison module 1502, configured to compare the first position area with the second position area of the same traffic light; if the distance between the first position area and the second position area of the traffic light does not exceed a second preset threshold, determine that the first position area is the target position area containing the pixels to be processed in the current traffic light image; and if the distance between the first position area and the second position area of the traffic light exceeds the second preset threshold, determine that the second position area is the target position area containing the pixels to be processed in the current traffic light image.
Optionally, the first rectangular frame includes at least one second rectangular frame, a distance between any two second rectangular frames included in the same first rectangular frame is smaller than a first preset threshold, and the second rectangular frame includes at least one traffic light in the traffic light image.
Optionally, the first determining module 1500 determines the first rectangular frame by:
determining the distance between two second rectangular frames according to a point at the same position on any two second rectangular frames in the traffic light image;
dividing a second rectangular frame where points with the distance not exceeding a first preset threshold are located into the same first rectangular frame; wherein the second rectangular frame is located in one of the first rectangular frames.
Optionally, the first determining module 1500 is specifically configured to:
and determining the first position area of the traffic light in the first rectangular frame according to the coordinates of the central point of the first position area corresponding to each traffic light output by the neural network model in the first rectangular frame and the size information of the first position area.
Optionally, the second determining module 1501 is specifically configured to:
performing a weighted calculation on the first position areas output by the neural network model for the traffic light whose color state is red in the previous N frames of traffic light images, so as to obtain the second position area corresponding to the traffic light.
Optionally, the comparison module 1502 is further configured to:
after the first position area is determined to be a target position area containing pixels to be processed, determining pixels, of which the brightness exceeds a third preset threshold value, contained in the target position area as the pixels to be processed;
and adjusting the H and S values of the pixel to be processed so that the adjusted color is displayed as the actual color state of the traffic light, as determined by the neural network model, at the time the traffic light image was acquired.
Optionally, the comparison module 1502 adjusts the H and S values of the pixels to be processed contained in the target position area in the following manner:
determining the average brightness value of the pixel to be processed;
determining a color correction parameter corresponding to the pixel to be processed according to the determined average brightness value;
if the color correction parameter is not 0, adjusting the H value of the pixel to be processed to be a preset H value; judging whether the S value of the pixel to be processed is smaller than the color correction parameter or not;
if the S value of the pixel to be processed is determined to be smaller than the color correction parameter, adjusting the S value of the pixel to be processed to be m1 times of the current S value, and returning to the step of judging whether the S value of the pixel to be processed is smaller than the color correction parameter or not until the S value of the pixel to be processed is not smaller than the color correction parameter or the preset number of times of adjusting the S value of the pixel to be processed is reached;
if the S value of the pixel to be processed is determined to be not smaller than the color correction parameter, adjusting the S value of the pixel to be processed to be m2 times of the current S value.
Based on the same technical concept, the embodiment of the application also provides a computer readable storage medium. The computer-readable storage medium stores computer-executable instructions for causing the computer to execute the flow in the method of traffic light image processing in the foregoing embodiment.
The present application is described above with reference to block diagrams and/or flowchart illustrations of methods, apparatus (systems) and/or computer program products according to embodiments of the application. It will be understood that one block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, and/or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks.
Accordingly, the subject application may also be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.). Furthermore, the present application may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system. In the context of this application, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (12)

1. A method of traffic light image processing, the method comprising:
for any first rectangular frame containing traffic lights in the current frame traffic light image, determining, through a neural network model, a first position area of each traffic light contained in the first rectangular frame and the color state of each traffic light at the time the traffic light image was acquired;
determining, for a traffic light whose color state is red, a second position area according to the position areas output by the neural network model for that traffic light in the previous N frames of traffic light images in which its color state was red;
comparing the first position area with the second position area of the same traffic light;
if the distance between the first position area and the second position area of the traffic light does not exceed a second preset threshold value, determining that the first position area is a target position area containing pixels to be processed in the current traffic light image;
if the distance between the first position area and the second position area of the traffic light exceeds a second preset threshold value, determining that the second position area is a target position area containing pixels to be processed in the current traffic light image;
the method further comprises the following steps: the method comprises the following steps:
and performing weighted calculation on a first position area corresponding to the traffic light with the color state of red light output by the neural network model when the traffic light with the color state of red light is positioned in the first N frames of traffic light images, so as to obtain a second position area corresponding to the traffic light.
2. The method of claim 1, wherein the first rectangular frame comprises at least one second rectangular frame, a distance between any two second rectangular frames comprised by the same first rectangular frame is smaller than a first preset threshold, and the second rectangular frames comprise at least one traffic light in the traffic light image.
3. The method of claim 2, wherein the first rectangular box is determined by:
determining the distance between two second rectangular frames according to a point at the same position on any two second rectangular frames in the traffic light image;
dividing the second rectangular frames whose points are separated by no more than the first preset threshold into the same first rectangular frame; wherein each second rectangular frame is located in one first rectangular frame, and a vertex of the first rectangular frame coincides with the same-orientation vertex of one of the second rectangular frames it contains.
4. The method of claim 1, wherein said determining, by a neural network model, a first location area of each traffic light contained within the first rectangular box comprises:
and determining the first position area of the traffic light in the first rectangular frame according to the coordinates of the central point of the first position area corresponding to each traffic light output by the neural network model in the first rectangular frame and the size information of the first position area.
5. The method of claim 1, wherein, after the first position area is determined to be a target position area containing pixels to be processed, the method further comprises:
determining pixels, the brightness of which exceeds a third preset threshold, contained in the target position area as pixels to be processed;
and adjusting the H and S values of the pixel to be processed so that the adjusted color is displayed as the actual color state of the traffic light, as determined by the neural network model, at the time the traffic light image was acquired.
6. The method of claim 5, wherein the H and S values of the pixels to be processed contained in the target position area are adjusted by:
determining the average brightness value of the pixel to be processed;
determining a color correction parameter corresponding to the pixel to be processed according to the determined average brightness value;
if the color correction parameter is not 0, adjusting the H value of the pixel to be processed to be a preset H value; judging whether the S value of the pixel to be processed is smaller than the color correction parameter or not;
if the S value of the pixel to be processed is determined to be smaller than the color correction parameter, adjusting the S value of the pixel to be processed to be m1 times of the current S value, and returning to the step of judging whether the S value of the pixel to be processed is smaller than the color correction parameter or not until the S value of the pixel to be processed is not smaller than the color correction parameter or the preset number of times of adjusting the S value of the pixel to be processed is reached;
if the S value of the pixel to be processed is determined to be not smaller than the color correction parameter, adjusting the S value of the pixel to be processed to be m2 times of the current S value.
7. An apparatus for traffic light image processing, the apparatus comprising: at least one processor and at least one memory, wherein the memory stores program code which, when executed by the processor, causes the apparatus to perform the following process:
for any first rectangular frame containing traffic lights in the current frame traffic light image, determining, through a neural network model, a first position area of each traffic light contained in the first rectangular frame and the color state of each traffic light at the time the traffic light image was acquired;
determining, for a traffic light whose color state is red, a second position area according to the position areas output by the neural network model for that traffic light in the previous N frames of traffic light images in which its color state was red;
comparing the first position area with the second position area of the same traffic light;
if the distance between the first position area and the second position area of the traffic light does not exceed a second preset threshold value, determining that the first position area is a target position area containing pixels to be processed in the current traffic light image;
if the distance between the first position area and the second position area of the traffic light exceeds a second preset threshold value, determining that the second position area is a target position area containing pixels to be processed in the current traffic light image;
the processor is specifically configured to:
performing a weighted calculation on the first position areas output by the neural network model for the traffic light whose color state is red in the previous N frames of traffic light images, so as to obtain the second position area corresponding to the traffic light.
8. The apparatus of claim 7, wherein the first rectangular frame comprises at least one second rectangular frame, a distance between any two second rectangular frames included in the same first rectangular frame is smaller than a first preset threshold, and the second rectangular frames include at least one traffic light in the traffic light image.
9. The device of claim 7, wherein the processor determines the first rectangular box by:
determining the distance between two second rectangular frames according to a point at the same position on any two second rectangular frames in the traffic light image;
dividing a second rectangular frame where points with the distance not exceeding a first preset threshold are located into the same first rectangular frame; wherein the second rectangular frame is located in one of the first rectangular frames.
10. The device of claim 7, wherein the processor is specifically configured to:
and determining the first position area of the traffic light in the first rectangular frame according to the coordinates of the central point of the first position area corresponding to each traffic light output by the neural network model in the first rectangular frame and the size information of the first position area.
11. The device of claim 7, wherein the processor is further configured to:
determining pixels, the brightness of which exceeds a third preset threshold, contained in the target position area as pixels to be processed;
and adjusting the H and S values of the pixel to be processed so that the adjusted color is displayed as the actual color state of the traffic light, as determined by the neural network model, at the time the traffic light image was acquired.
12. The apparatus of claim 11, wherein the processor adjusts the H and S values of the pixels to be processed contained in the target position area by:
determining the average brightness value of the pixel to be processed;
determining a color correction parameter corresponding to the pixel to be processed according to the determined average brightness value;
if the color correction parameter is not 0, adjusting the H value of the pixel to be processed to be a preset H value; judging whether the S value of the pixel to be processed is smaller than the color correction parameter or not;
if the S value of the pixel to be processed is determined to be smaller than the color correction parameter, adjusting the S value of the pixel to be processed to be m1 times of the current S value, and returning to the step of judging whether the S value of the pixel to be processed is smaller than the color correction parameter or not until the S value of the pixel to be processed is not smaller than the color correction parameter or the preset number of times of adjusting the S value of the pixel to be processed is reached;
if the S value of the pixel to be processed is determined to be not smaller than the color correction parameter, adjusting the S value of the pixel to be processed to be m2 times of the current S value.
CN201910741204.3A 2019-08-12 2019-08-12 Traffic light image processing method and equipment Active CN110532903B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910741204.3A CN110532903B (en) 2019-08-12 2019-08-12 Traffic light image processing method and equipment

Publications (2)

Publication Number Publication Date
CN110532903A (en) 2019-12-03
CN110532903B (en) 2022-02-22

Family

ID=68663091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910741204.3A Active CN110532903B (en) 2019-08-12 2019-08-12 Traffic light image processing method and equipment

Country Status (1)

Country Link
CN (1) CN110532903B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111009142A (en) * 2019-12-12 2020-04-14 四川天邑康和通信股份有限公司 Internet of vehicles traffic signal prompting device and system
CN111402610B (en) * 2020-03-23 2022-05-10 东软睿驰汽车技术(沈阳)有限公司 Method, device, equipment and storage medium for identifying lighting state of traffic light
CN112084905B (en) * 2020-08-27 2024-03-12 深圳市森国科科技股份有限公司 Traffic light state identification method, system, equipment and storage medium
CN111950536A (en) * 2020-09-23 2020-11-17 北京百度网讯科技有限公司 Signal lamp image processing method and device, computer system and road side equipment
CN114821451B (en) * 2022-06-28 2022-09-20 南开大学 Offline target detection method and system for traffic signal lamp video

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101470960A (en) * 2007-12-26 2009-07-01 奥城同立科技开发(北京)有限公司 Intelligent multi-directional snapping system for rule-breaking vehicle
CN103020613A (en) * 2013-01-07 2013-04-03 信帧电子技术(北京)有限公司 Method and device for identifying signal lamps on basis of videos
CN103345766A (en) * 2013-06-21 2013-10-09 东软集团股份有限公司 Method and device for identifying signal light
CN105160924A (en) * 2015-08-25 2015-12-16 公安部第三研究所 Video processing-based intelligent signal lamp state detection method and detection system
CN105608417A (en) * 2015-12-15 2016-05-25 福州华鹰重工机械有限公司 Traffic signal lamp detection method and device
CN107707730A (en) * 2017-08-14 2018-02-16 深圳天珑无线科技有限公司 Display methods, intelligent terminal and the device with store function of intelligent terminal
CN109508580A (en) * 2017-09-15 2019-03-22 百度在线网络技术(北京)有限公司 Traffic lights recognition methods and device
CN107730481A (en) * 2017-09-19 2018-02-23 浙江大华技术股份有限公司 A kind of traffic lights image processing method and traffic lights image processing apparatus
CN108108761A (en) * 2017-12-21 2018-06-01 西北工业大学 A kind of rapid transit signal lamp detection method based on depth characteristic study
CN108960079A (en) * 2018-06-14 2018-12-07 多伦科技股份有限公司 A kind of image-recognizing method and device
CN109949579A (en) * 2018-12-31 2019-06-28 上海眼控科技股份有限公司 A kind of illegal automatic auditing method that makes a dash across the red light based on deep learning
CN110069986A (en) * 2019-03-13 2019-07-30 北京联合大学 A kind of traffic lights recognition methods and system based on mixed model

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Detecting Traffic Lights by Single Shot Detection; Julian Müller et al.; 2018 21st International Conference on Intelligent Transportation Systems (ITSC); 20181210; 266-273 *
Traffic Light Recognition With High Dynamic Range Imaging and Deep Learning; Jian-Gang Wang et al.; IEEE Transactions on Intelligent Transportation Systems; 20180725; 1341-1352 *
Traffic Sign Recognition for Autonomous Driving Robot; Tiago Moura et al.; 2014 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC); 20140710; 303-308 *
Research on an Intelligent Traffic Signal Light Control System Based on Image Processing; Lin Bin; Wanfang Dissertations; 20130320; 1-65 *
Research on Traffic Sign Detection and Recognition Based on Saliency and Convolutional Neural Networks; Wang Jiaojiao; China Master's Theses Full-text Database, Information Science and Technology; 20180315; vol. 2018, no. 3; I138-1438 *
Research on Object Detection Algorithms for Urban Traffic Scenes Based on Deep Learning; Huang Gang; China Master's Theses Full-text Database, Information Science and Technology; 20180615; vol. 2018, no. 6; I138-1893 *

Also Published As

Publication number Publication date
CN110532903A (en) 2019-12-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant