CN107729801B

CN107729801B - Vehicle color recognition system based on multitask deep convolution neural network

Info

Publication number: CN107729801B
Application number: CN201710558817.4A
Authority: CN
Inventors: 汤一平; 王辉; 吴越; 温晓岳; 柳展
Original assignee: Enjoyor Co Ltd
Current assignee: Yinjiang Technology Co.,Ltd.
Priority date: 2017-07-11
Filing date: 2017-07-11
Publication date: 2020-12-18
Anticipated expiration: 2037-07-11
Also published as: CN107729801A

Abstract

A vehicle color recognition system based on a multitask deep convolutional neural network comprises a high-definition camera arranged above a road traffic line, a traffic cloud server and a vehicle color vision detection subsystem; the visual detection subsystem of the vehicle color comprises a vehicle positioning detection module, a license plate background color identification module, a color difference calculation module, a vehicle color correction module and a vehicle color identification module, wherein the vehicle positioning detection module, the license plate positioning detection module and the vehicle color identification module share a deep convolutional neural network of the same Faster R-CNN, vehicles on a road are quickly segmented by adopting the deep convolutional neural network, license plates on the road are quickly segmented by further adopting a deep convolutional neural network by using a vehicle image, and then space position information occupied by the vehicles and the license plates in the road image is given. The invention provides a system for detecting the color vision of a vehicle by using a multitask deep convolution neural network, which has higher detection precision and higher robustness.

Description

Vehicle color recognition system based on multitask deep convolution neural network

Technical Field

The invention relates to application of artificial intelligence, digital image processing, a convolutional neural network and computer vision in the aspect of vehicle color recognition, and belongs to the field of intelligent traffic.

Background

Color is an important appearance characteristic of a vehicle. In the real world, due to the influence of various uncertain factors such as the color temperature of a light source, the intensity of light, the shooting angle, the setting of a camera and the like, the color of a vehicle can have color cast to a certain degree when the finally obtained result is compared with the ideal condition; the existing vehicle color recognition method is very sensitive to the change of the vehicle posture and the change of the illumination environment where the vehicle is located, when the illumination environment where the vehicle is located changes, the color recognition accuracy of the existing vehicle color recognition method is sharply reduced, and the color of the vehicle cannot be recognized accurately.

The chinese patent application No. 200810041097.5 discloses a method for identifying a color of a vehicle, and a method for identifying a color of a vehicle body. The method comprises the following steps:

1. according to texture features and structural features of an image, a complex energy function is constructed, and a point with the maximum energy is searched;

2. positioning the identification areas of the vehicle color and the vehicle color depth according to the point with the maximum energy;

3. and identifying the color and the color depth of the pixel points in the area, and counting to finally obtain the color and the color depth of the identified area.

However, in the early stage of sample collection, the color identification of the vehicle under different illumination conditions is not processed; obtaining different characteristic attributes through a plurality of color spaces when selecting the characteristic vector; when the model is trained, a plurality of types of classifiers are used for training; when the identification area is positioned, only the area of the front cover of the vehicle is selected, and possible light reflection phenomena are not processed, so that certain deviation is generated between final vehicle color identification and vehicle color depth identification.

The Chinese patent application with the application number of 200810240292.0 discloses a method and a system for recognizing the color of a vehicle body in a vehicle video image, and provides a method for recognizing the color of the vehicle body in the vehicle video image. This patent has adopted the step-by-step training when training the model, includes the following step:

1. roughly dividing the vehicle body sample by clustering according to the color template to obtain a sample with a certain color or a mixed sample with a plurality of similar colors;

2. subdividing the mixed sample by adopting a nearest neighbor classification method;

3. carrying out coarse identification on the color of the vehicle according to the model obtained by training;

4. and carrying out detailed identification by adopting a nearest neighbor classification method.

However, the patent also does not take into account the variation in vehicle color under different lighting conditions; when the feature vector is selected, the feature vector is used by three color spaces of HSV, YIQ and YCbCr in steps; when the model is trained, the model is trained by combining clustering and nearest neighbor classification technology; the vehicle color identification stage does not explain how to adopt what strategy to process the color of each pixel point in the identification area; moreover, the patent only describes a vehicle color recognition method, and does not describe the recognition of the vehicle color depth.

At present, the identification of the body color generally comprises two main modules: one is detection and positioning of the area to be identified, determination of the vehicle body color reference area, and the other is color classification and identification of the image of the reference area.

The detection and the positioning of the region to be identified have a plurality of modes, and in a vehicle color identification method 103544480A, a vehicle body color identification method 105160691A based on a color histogram and a vehicle body color identification method and device 105354530A, a license plate is firstly detected and positioned, and then a vehicle color identification reference region is determined according to the position information of the license plate. In the vehicle color recognition method and apparatus 102737221B, a reference region for vehicle color recognition is located based on texture and structure information of an image, and then a main recognition region and an auxiliary recognition region are located. In a vehicle body color recognition method 105005766a, a bounding rectangle of a moving object is determined as a reference region for color recognition based on a method for detecting the moving object in a video. In the method for automatically identifying the colors of vehicles in road intersection videos and pictures 104680195a, the positioning mode of color candidate regions is not clearly described, and only a plurality of candidate regions are clearly defined and mainly concentrated on an engine cover.

The visual detection technology belongs to the visual detection technology in the early deep learning era, and has the problems of low detection precision and detection robustness, and particularly, the key problems of illumination change, camera imaging conditions and the like are not solved well. In addition, the above patent contents only disclose some technical summaries, and there are many technical details and key problems in practical application, especially the solution to various detailed problems of the road traffic safety law.

In recent years, the technology of deep learning in the field of computer vision is rapidly developed, and the deep learning can utilize a large number of training samples and hidden layers to deeply learn abstract information of an image layer by layer so as to more comprehensively and directly acquire image characteristics. The digital image is described by a matrix, and the convolutional neural network better starts from a local information block to further describe the overall structure of the image, so that the convolutional neural network is mostly adopted to solve the problem in the field of computer vision and deep learning methods. The deep convolutional neural network technology is from R-CNN, fast R-CNN to Fasterer R-CNN around improving the detection precision and the detection time. The method is characterized by further precision improvement, acceleration, end-to-end and more practicability, and almost covers all fields from classification to detection, segmentation and positioning. The application of the deep learning technology to the vehicle color visual detection is a research field with practical application value.

The human visual system has color constancy and can obtain the invariant characteristics of the surface color of an object under certain changed lighting environments and imaging conditions. However, the bayonet monitoring imaging device does not have such a "regulation" function, and different illumination environments will cause a certain deviation between the color of the acquired image and the real color of the object. Such deviations will affect the accuracy and robustness of subsequent vehicle color analysis. Therefore, it has become a current research focus to seek a suitable color correction algorithm to eliminate the influence of the illumination environment and the like on the color appearance, so that the processed image can correctly reflect the real color of the object.

The national standard GA 36-2014 specifies various details of motor vehicle number plates, among which large civil automobiles: black characters with yellow background; small-sized civil automobiles: blue background white characters; special armed police cars: white background red "WJ", black; other foreign vehicles: white characters on black background; messenger, forecourt and foreign automobile: black background white characters and hollow 'enable' character marks; testing license plate: white bottom and red characters, wherein a 'trying' character mark is arranged in front of the number; temporary license plate: white bottom and red characters, and temporary characters are arranged in front of the numbers; repairing the license plate of the automobile: white background and black characters. The interval between the characters of the license plate is 12 mm. These regulations regarding license plates, in particular the regulations regarding color, provide a reference standard for vehicle color recognition. Under the same illumination condition, the color difference of the same stratification degree can occur in the colors of the vehicle and the license plate; the color of the vehicle is corrected by detecting the color difference of the color of the license plate, which has very important significance for improving the recognition rate of the color of the vehicle.

Disclosure of Invention

In order to overcome the defects of low detection precision and low detection robustness of the existing visual detection mode of vehicle color, the invention provides a multitask deep convolution neural network system with high detection precision and high robustness for vehicle color visual detection.

The technical scheme adopted by the invention for solving the technical problems is as follows:

a vehicle color recognition system based on a multitask deep convolutional neural network comprises a high-definition camera arranged above a road traffic line, a traffic cloud server and a vehicle color vision detection subsystem;

the high-definition camera is used for acquiring video data on a road, is configured above a driving line and transmits the video image data on the road to the traffic cloud server through a network;

the traffic cloud server is used for receiving the video data on the road obtained from the high-definition camera and submitting the video data to the vehicle color visual detection system for vehicle color identification;

the visual detection subsystem of the vehicle color comprises a vehicle positioning detection module, a license plate background color identification module, a color difference calculation module, a vehicle color correction module and a vehicle color identification module, wherein the vehicle positioning detection module, the license plate positioning detection module and the vehicle color identification module share a deep convolutional neural network of the same Faster R-CNN, vehicles on a road are quickly segmented by adopting the deep convolutional neural network, license plates on the road are quickly segmented by using a vehicle image and further adopting the deep convolutional neural network, and then space position information occupied by the vehicles and the license plates in the road image is given.

Further, the vehicle and license plate segmentation and positioning are composed of two models, one model is a selective search network for generating the RoI; the other model is a fast R-CNN vehicle and a license plate target detection network; after the two-classification identification network, a multi-task learning network with multi-level, multi-label and multi-feature fusion and progressive layer by layer is realized;

the selective searching network, i.e. RPN; the RPN takes an image with any scale as input, and outputs a set of rectangular target suggestion boxes, wherein each box comprises 4 position coordinate variables and a score; the targets of the target suggestion frame refer to vehicle objects and license plate objects;

the estimated probability that each suggestion box is a target/non-target is a classification layer realized by two classified softmax layers; the k suggestion boxes are parameterized by the corresponding k suggestion boxes called anchors;

each anchor is centered at the center of the current sliding window and corresponds to a scale and an aspect ratio, and 3 scales and 3 aspect ratios are used, so that k is 9 anchors at each sliding position;

training an RPN network, and assigning a binary label to each anchor so as to mark whether the anchor is a target or not; positive labels are then assigned to both types of anchors: (I) the ratio of intersection-over-Union, overlapping anchor, with a real target bounding box, i.e. Ground Truth, GT, has the highest IoU; (II) an anchor with IoU overlap of greater than 0.7 with any GT bounding box; note that one GT bounding box may assign positive labels to multiple anchors; assigning negative labels to anchors whose IoU ratio to all GT bounding boxes is below 0.3; if the non-positive and non-negative anchors have no effect on the training target, abandoning the anchors;

following the multitask loss in Faster R-CNN, minimizing the objective function; the loss function for an image is defined as:

where i is the index of an anchor, p_iIs the predicted probability that anchor is the ith target, and if anchor is positive, GT label

That is, 1, if anchor is negative,

is 0; t is t_iIs a vector, representing the 4 parameterized coordinates of the predicted bounding box,

is the coordinate vector of the GT bounding box corresponding to the positive anchor; λ is a balance weight, where λ is 10, N_clsIs the normalized value of the cls term, where N is the size of the mini-batch_cls＝256，N_regIs the number of anchor positions normalized by the reg term, N_reg2,400, class penalty function L_clsAre three categories, vehicle target object, license plate target object vs. log loss of road background:

in the formula, L_clsAs a function of classification loss, P_iThe predicted probability that anchor is the ith target; p_i ^*The predicted probability of the ith target being the true target bounding box;

for the regression loss function L_regDefined by the following function:

in the formula, L_regAs a function of the regression lossR is a robust loss function, and smooth L is calculated by formula (4)₁；

In the formula, smooth_L1Is smooth L₁A loss function, x being a variable;

the fast R-CNN network obtains a feature map after an input image passes through a deep convolutional neural network, corresponding RoIs can be obtained according to the feature map and the RPN network, and finally the corresponding RoIs pass through a RoI pooling layer; the RoI is an area of interest, which refers to a vehicle target object and a license plate target object;

for the Faster R-CNN network, the inputs are N feature maps and R RoIs; n feature maps are from the last convolutional layer, and the size of each feature map is w × h × c;

each RoI is a tuple (N, r, c, h, w), where N is the index of the feature map, N is an element (0,1,2,.., N-1), r, c is the upper left-hand corner coordinate, and h, w are height and width, respectively;

outputting the feature mapping obtained by the maximum pooling; corresponding the RoI in the original image with the blocks in the feature map; the feature map is down-sampled to a fixed size and then passed into the full connection.

Furthermore, the selective search network and the Faster R-CNN are trained independently, and a 4-step training algorithm is used for learning shared characteristics through alternate optimization; firstly, according to the training RPN, the network is initialized by an ImageNet pre-trained model, and end-to-end fine tuning is used for a regional proposal task; secondly, training a single detection network by fast R-CNN by using the suggestion frame generated by the RPN in the first step, wherein the detection network is initialized by a model pre-trained by ImageNet, and the two networks do not share a convolutional layer; thirdly, initializing RPN training by using a detection network, but fixing a shared convolution layer, and only finely adjusting a layer unique to the RPN, wherein the two networks share the convolution layer; fourthly, keeping the shared convolution layer fixed, and finely adjusting fc of the Faster R-CNN, namely a full connection layer; thus, two networks share the same convolution layer to form a unified network;

through the processing of the two networks, detecting a vehicle target object and a license plate target object in a frame of video image, and framing the size and the spatial position of the vehicle target object and the license plate target object to obtain the size and the spatial position of the vehicle target object and the license plate target object; r of which_v,c_vIs the upper left corner coordinate of the vehicle target object in the image, h_v,w_vThe projection sizes, i.e., height and width, of the vehicle target object on the image plane, respectively; r of which_p,c_pIs the upper left corner coordinate of the license plate in the image, h_p,w_pThe projection sizes of the license plate on the image plane are respectively, namely the height and the width;

the progressive cascade relation among tasks is utilized in a Faster R-CNN network, namely, vehicle accurate positioning, vehicle type, brand and vehicle series identification, license plate accurate positioning, license plate identification and license plate color identification, chromatic aberration detection, vehicle color correction and vehicle color identification are sequentially carried out.

Furthermore, the license plate background color recognition module is used for processing the license plate image to obtain the license plate background color under the current environmental condition; carrying out gray level histogram processing on the license plate image, wherein the peak valley position in the gray level histogram is the interval between characters in the license plate, namely the background color of the license plate; and averaging the RGB color components of the pixels in the interval to finally obtain the background color of the license plate under the current environmental condition.

The color difference calculation module is used for comparing and calculating the license plate background color specified by national standard with the license plate background color under the current environmental condition to obtain the color difference under the current environmental condition; firstly, comparing the background color of the license plate under the current environmental condition with several types of the background color of the license plate specified by the national standard to obtain the background color of the license plate specified by the closest national standard, and taking the background color as the background color of the license plate under standard light; the calculation of the color difference is performed on the CIE1976Lab color space; in order to realize the conversion from the RGB color space to the Lab color space quickly, a quick conversion mode is adopted, as shown in formula (5);

wherein R, G, B are color components in the RGB color space, L is the lightness component of CIE1976Lab color space, and a and b are the chroma components of CIE1976Lab color space, respectively;

calculating the license plate background color specified by the national standard and the license plate background color under the current environmental condition according to a formula (5) to obtain respective values of L, a and b; wherein L is_NPAnd L_RPRespectively the brightness values a of the license plate background color and the license plate background color under the current environmental condition specified by the national standard_NPAnd a_RP、b_NPAnd b_RPRespectively the color of the license plate background color specified by the national standard and the chroma of the license plate background color under the current environmental condition, and the color difference delta E between the two_abCalculating CIE1976Lab color difference by using formula (6);

wherein Δ L ═ L_NP-L_RPIs lightness difference, Δ a ═ a_NP-a_RP、Δb＝b_NP-b_RPIs the difference in chroma, Δ E_abIs color difference, in NBS.

The vehicle color correction module is used for correcting the vehicle color under the current environmental condition according to the detected color difference to obtain a vehicle color image under an ideal environment specified by the national standard; first, the color difference Δ E is judged_abWhether or not a threshold value is exceeded

If the threshold value is exceeded, the vehicle color correction is carried out, and the correction calculation is shown as an equation (7);

in the formula, L_NMFor specifying the color of the vehicle under ideal environment for national standardValue of L_RPIs the lightness value of the vehicle color under the current environmental condition, Delta L is the lightness difference between the license plate background color specified by the national standard and the license plate background color under the current environmental condition, a_NMAnd b_NMThe color value of the vehicle color in an ideal environment, a, is specified for the national standard_RPAnd b_RPThe chroma value of the vehicle color under the current environmental condition, and the chroma difference of the license plate background color specified by the national standards of delta a and delta b and the license plate background color under the current environmental condition;

further, the vehicle color after correction is inversely transformed from the Lab color space to the RGB color space, as shown in formula (8);

the equation set of the formula (8) is an optimized formula, floating point operation is converted into a mode of constant integer multiplication and shift, and the shift in the formula is written as div2^23, which means that the shift is 23 bits to the right; in the formula, the value ranges of RGB and Lab are both [0,255], and the RGB value of the vehicle color under the ideal environment specified by the national standard is obtained through an inverse Gamma function.

The vehicle color identification module is used for identifying the corrected vehicle color, and marking the vehicle image after color correction with a corresponding color label for training in order to effectively share a fast R-CNN deep convolution neural network; when the vehicle color is identified, after the steps of accurate vehicle positioning, accurate license plate positioning, license plate background color identification, chromatic aberration detection and vehicle color correction processing, finally the vehicle color is identified through a fast R-CNN deep convolution neural network; the color of the vehicle under the standard illumination condition can be obtained by the identification of the vehicle image after the color correction through the fast R-CNN deep convolution neural network.

The invention has the following beneficial effects: the chromatic aberration caused by illumination and camera setting is eliminated fundamentally, and the detection robustness of the vehicle color is effectively improved; the multi-task deep learning convolutional neural network is adopted, a progressive cascade mode among tasks is utilized, vehicles are accurately positioned, vehicle types, brands and vehicle systems are identified, license plates are accurately positioned, license plates are identified, license plate colors are identified, the same Faster R-CNN deep convolutional neural network is shared by the vehicle color identification, the positioning identification precision of each task is improved, meanwhile, the overall identification time is effectively shortened, and the real-time performance of detection identification is improved.

Drawings

FIG. 1 is a structural diagram of fast R-CNN;

FIG. 2 is a diagram of a selective search network;

FIG. 3 is a diagram of a multitask Faster R-CNN progressive cascade relationship;

FIG. 4 is a diagram of a multitask Faster R-CNN vehicle color vision inspection network architecture;

FIG. 5 is an illustration of a method for extracting background colors of a license plate under current environmental conditions using a histogram of gray levels;

FIG. 6 is a flow chart of a vehicle color identification process for a multitask Faster R-CNN deep convolutional network.

Detailed Description

The invention is further described below with reference to the accompanying drawings.

Referring to fig. 1 to 6, a vehicle color recognition system based on a multitask depth convolution neural network comprises a high-definition camera installed above a road traffic line, a traffic cloud server and a vehicle color visual detection subsystem;

the traffic cloud server is used for receiving the video data on the road obtained from the high-definition camera and submitting the video data to the vehicle color visual detection system for vehicle color identification; as shown in FIG. 6, the process flow first divides and positions the video image, extracts the vehicle map in the imageAn image; secondly, segmenting and positioning the license plate image in the vehicle image to extract the license plate image; then, processing the license plate image by using the gray histogram to obtain the gray histogram of the license plate; extracting a background color area of the license plate in the gray level histogram according to the character distribution characteristics of the license plate; further, matching the license plate background color in the closest national standard according to the extracted license plate background color; further, converting the national standard license plate background color and the RGB color space of the license plate background color obtained by detection into Lab color space respectively; then calculating the color difference between the two, if the color difference exceeds the threshold value T_ΔEabConverting the vehicle image of the RGB color space to the Lab color space, carrying out chromatic aberration correction on the vehicle image by using the chromatic aberration to obtain a vehicle image without chromatic aberration, and then converting the vehicle image after the color correction to the RGB color space; and finally, identifying the vehicle type, brand, series and vehicle body color of the vehicle image after color correction to finally obtain accurate external feature description.

The China's republic of China public safety industry standard GA 36-2014 China's republic of China motor vehicle number plate, hereinafter referred to as the national standard for short, has the following detailed stipulations on the color of the number plate, and the national standard stipulates that the chromaticity coordinate of the metal material number plate under the irradiation of an A light source should accord with the stipulation of 4.4.1 in GA 666-; the color difference between the chromaticity coordinate of the reflecting surface of the blue bottom color number plate under the irradiation of a D65 light source and the standard color plate is regulated not to exceed 8.0 NBS; the luminance factor should comply with the specification of 4.4.2 in GA 666-. The above specification provides a standard color reference for the present invention.

TABLE 1

TABLE 2

The visual detection subsystem of the vehicle color comprises a vehicle positioning detection module, a license plate background color identification module, a color difference calculation module, a vehicle color correction module and a vehicle color identification module; the vehicle positioning detection module, the license plate positioning detection module and the vehicle color identification module share the same depth convolution neural network of Faster R-CNN, vehicles on a road are quickly segmented by adopting the depth convolution neural network, license plates on the road are quickly segmented by using a vehicle image further by adopting the depth convolution neural network, and then space position information occupied by the vehicles and the license plates in the road image is given.

The vehicle and license plate segmentation and positioning are composed of two models, one model is a selective search network for generating RoI; the other model is a fast R-CNN vehicle and license plate target detection network, and the structure diagram of a detection unit is shown in FIG. 1; in the invention, the model is further modified in the Faster R-CNN network; after the two-classification identification network, a multi-layer, multi-label and multi-feature fusion layer-by-layer progressive multi-task learning network is realized, as shown in fig. 4;

selectively searching for a network, i.e., an RPN; the RPN network takes an image of any scale as an input and outputs a set of rectangular target suggestion boxes, wherein each box comprises 4 position coordinate variables and a score. To generate the region suggestion box, a small net is slid over the convolution signature output by the last shared convolution layer, this net being fully connected to the n spatial window of the input convolution signature. Each sliding window is mapped to a low-dimensional vector, and one sliding window of each feature map corresponds to a numerical value. This vector is output to the fully connected layers of the two siblings.

At the position of each sliding window, k proposed regions are predicted simultaneously, so the position regression layer has 4k outputs, i.e. coordinates encoding of k bounding boxes. The classification layer outputs scores for 2k bounding boxes, i.e., the estimated probability of each suggestion box being a target/non-target, is a classification layer implemented with a softmax layer of binary classifications, and can also use logistic regression to generate k scores. The k suggestion boxes are parameterized by the corresponding k suggestion boxes called anchors. Each anchor is centered at the current sliding window center and corresponds to one scale and aspect ratio, using 3 scales and 3 aspect ratios, so that at each sliding position there are k-9 anchors. For example, for a convolutional eigenmap of size w × h, there are a total of w × h × k anchors. The RPN network architecture is shown in fig. 2.

To train the RPN network, each anchor is assigned a binary label to mark whether the anchor is a target. Positive labels are then assigned to both types of anchors: (I) the ratio of intersection-over-Union, overlapping anchor, with a real target bounding box, i.e. Ground Truth, GT, has the highest IoU; (II) anchors with IoU overlap of greater than 0.7 with any GT bounding box. Note that one GT bounding box may assign positive labels to multiple anchors. Assigning negative labels to anchors for which the IoU ratio to all GT bounding boxes is below 0.3. And the anchors which are not positive or negative have no effect on the training target and are discarded.

With these definitions, the objective function is minimized following the multitasking loss in the Faster R-CNN. The loss function for an image is defined as:

That is, 1, if anchor is negative,

is the coordinate vector of the GT bounding box corresponding to the positive anchor; λ is a balance weight, where λ is 10, N_clsIs the normalized value of the cls term mini-Size of batch, where N_cls＝256，N_regIs the number of anchor positions normalized by the reg term, N_reg2,400, class penalty function L_clsIs 3 categories, i.e. vehicle target, license plate target vs. logarithmic loss of road background:

for the regression loss function L_regDefined by the following function:

in the formula, L_regFor the regression loss function, R is a robust loss function, and smooth L is calculated by equation (4)₁；

In the formula, smooth_L1Is smooth L₁A loss function, x being a variable;

the fast R-CNN network structure is shown in fig. 3, and a feature map can be obtained after an input image passes through a deep convolutional neural network, and corresponding RoIs can be obtained according to the feature map and the RPN network, and finally, the corresponding RoIs pass through a RoI pooling layer. This layer is a process with only one level of spatial "pyramid" pooling. The inputs are N feature maps and R rois. The N feature maps are from the last convolutional layer, each having a size of w × h × c. Each RoI is a tuple (N, r, c, h, w), where N is the index of the feature map, N ∈ (0,1,2,.., N-1), r, c are the top left corner coordinates, and h, w are height and width, respectively. The output is the feature map resulting from the maximum pooling. The layer has two main functions, namely, the RoI in the original image is corresponding to the blocks in the feature map; another is to down-sample the feature map to a fixed size and then pass it into the full connection.

And (3) weight sharing of the selective search network and the detection network: the selective search network and the Faster R-CNN are both independently trained, modifying their convolutional layers in different ways. It is therefore desirable to employ a technique that allows the convolutional layer to be shared between two networks, rather than learning the two networks separately. A practical 4-step training algorithm is used in the invention to learn shared features through alternate optimization. First, the network is initialized with ImageNet pre-trained models and fine-tuned end-to-end for regional proposal tasks, according to the above training RPN. And secondly, training a single detection network by the Faster R-CNN by using the suggestion box generated by the RPN in the first step, wherein the detection network is also initialized by a model pre-trained by ImageNet, and at the moment, the two networks do not share the convolutional layer. Third, the RPN training is initialized with the detection network, but the shared convolutional layer is fixed, and only the layer unique to the RPN is fine-tuned, now both networks share the convolutional layer. And fourthly, keeping the shared convolution layer fixed, and finely adjusting fc of the Faster R-CNN, namely the full connection layer. Thus, the two networks share the same convolutional layer to form a unified network.

Considering the problem of multi-scale of the object, three simple scales are adopted for each feature point on the feature map, the areas of the bounding boxes are respectively 128 × 128, 256 × 256 and 512 × 512, and the aspect ratios are respectively 1:1, 1:2 and 2: 1. By the design, a large area can be predicted without multi-scale features or multi-scale sliding windows, and a large amount of running time can be saved.

Through the processing of the two networks, the vehicle in one frame of video image is detected and the size and the spatial position of the vehicle are framed, namely the size and the spatial position of the vehicle, and the r of the vehicle is obtained_v,c_vIs the upper left corner coordinate of the vehicle in the image, h_v,w_vThe projection sizes of the vehicle on the image plane, i.e. height and width, respectively;

because the objects concerned in the invention are vehicles and license plates, namely interested objects, hereinafter referred to as RoI, in order to locate and segment various RoIs on the road, when learning and training a convolutional neural network, various vehicles, license plates and road background images are respectively marked with corresponding labels for training; therefore, the vehicles and the license plates can be automatically segmented and positioned through the fast R-CNN deep convolution neural network. In order to improve the positioning accuracy of the license plate, the license plate of the vehicle is segmented and positioned by the segmented and positioned vehicle image through the Faster R-CNN deep convolution neural network;

the invention adopts a multi-task deep learning convolutional neural network, because the image recognition of the multi-task deep learning network is often superior to that of a single-task deep learning network, the multi-task has relevance among tasks in the learning process, namely, information sharing exists among the tasks, which is also a necessary condition of the multi-task deep learning; when a plurality of tasks are trained simultaneously, the network utilizes shared information among the tasks to enhance the induction bias capability of the system and the generalization capability of the classifier; as shown in fig. 3, the present invention fully utilizes the progressive cascade relationship among tasks, namely, vehicle accurate positioning, vehicle type, brand and vehicle series identification, license plate accurate positioning, license plate identification and license plate color identification, chromatic aberration detection, vehicle color correction, and vehicle color identification in sequence;

the vehicle positioning detection module is used for segmenting and positioning a vehicle object image on a road, and processing a video image on the road by adopting an Faster R-CNN deep convolution neural network to obtain the size and the spatial position of a vehicle, wherein R is the size and the spatial position of the vehicle_v,c_vIs the upper left corner coordinate of the vehicle in the image, h_v,w_vRespectively the projection size of the vehicle on the image plane;

the license plate positioning detection module is used for segmenting and positioning a license plate image in a vehicle object image, and processing the vehicle image by adopting an Faster R-CNN deep convolution neural network to obtain the size and the spatial position of the license plate, wherein R is the size and the spatial position of the license plate_p,c_pIs the upper left corner coordinate of the license plate in the image, h_p,w_pThe projection sizes of the license plate on the image plane are respectively;

the license plate background color recognition module is used for processing the license plate image to obtain the license plate background color under the current environmental condition; specifically, the license plate image is subjected to gray level histogram processing, as shown in fig. 5, the peak valley in the gray level histogram is the interval between characters in the license plate, that is, the background color of the license plate; averaging RGB color components of pixels in the interval to finally obtain the background color of the license plate under the current environmental condition;

the color difference calculation module is used for comparing and calculating the license plate background color specified by national standard with the license plate background color under the current environmental condition to obtain the color difference under the current environmental condition; firstly, comparing the background color of the license plate under the current environmental condition with several types of the background color of the license plate specified by the national standard to obtain the background color of the license plate specified by the closest national standard, and taking the background color as the background color of the license plate under standard light; the calculation of the chromatic aberration is carried out on a CIE1976Lab color space, and generally, the conversion from the RGB color space to the Lab color space needs to be realized in two steps; the first step is to convert the RGB color space of 24bit true color into XYZ color space, and the second step is to convert the XYZ color space into Lab color space; in order to realize the conversion from the RGB color space to the Lab color space quickly, the invention adopts a quick conversion mode, as shown in a formula (5);

calculating the license plate background color specified by the national standard and the license plate background color under the current environmental condition according to a formula (5) to obtain respective values of L, a and b; wherein L is_NPAnd L_RPRespectively the brightness values a of the license plate background color and the license plate background color under the current environmental condition specified by the national standard_NPAnd a_RP、b_NPAnd b_RPRespectively the color of the license plate background color specified by the national standard and the chroma of the license plate background color under the current environmental condition, and the color difference delta E between the two_abCIE1976Lab color difference can be calculated by formula (6);

wherein Δ L ═ L_NP-L_RPIs lightness difference, Δ a ═ a_NP-a_RP、Δb＝b_NP-b_RPIs the difference in chroma, Δ E_abIs color difference, in NBS;

in the formula, L_NMThe lightness value, L, of the vehicle color in an ideal environment is specified for the national standard_RPIs the lightness value of the vehicle color under the current environmental condition, Delta L is the lightness difference between the license plate background color specified by the national standard and the license plate background color under the current environmental condition, a_NMAnd b_NMThe color value of the vehicle color in an ideal environment, a, is specified for the national standard_RPAnd b_RPThe chroma value of the vehicle color under the current environmental condition, and the chroma difference of the license plate background color specified by the national standards of delta a and delta b and the license plate background color under the current environmental condition;

the equation set of the formula (8) is an optimized formula, floating point operation is converted into a mode of constant integer multiplication and shift, and the shift in the formula is written as div2^23, which means that the shift is 23 bits to the right; in the formula, the value ranges of RGB and Lab are both [0,255], and the RGB value of the vehicle color under the ideal environment specified by the national standard is obtained through an inverse Gamma function;

the vehicle color identification module is used for identifying the corrected vehicle color, and marking the vehicle image after color correction with a corresponding color label for training in order to effectively share the fast R-CNN deep convolution neural network; when the vehicle color is identified, after the steps of accurate vehicle positioning, accurate license plate positioning, license plate background color identification, chromatic aberration detection and vehicle color correction processing, finally the vehicle color is identified through a fast R-CNN deep convolution neural network; the vehicle color under the standard illumination condition can be obtained by the identification of the vehicle image corrected by the color through the fast R-CNN deep convolution neural network.

The above description is only exemplary of the preferred embodiments of the present invention, and is not intended to limit the present invention, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A vehicle color identification system based on a multitask deep convolutional neural network is characterized in that: the system comprises a high-definition camera arranged above a road traffic line, a traffic cloud server and a visual detection subsystem of vehicle colors;

the traffic cloud server is used for receiving the video data on the road obtained from the high-definition camera and submitting the video data to the visual detection subsystem of the vehicle color for vehicle color identification;

the visual detection subsystem of the vehicle color comprises a vehicle positioning detection module, a license plate background color identification module, a color difference calculation module, a vehicle color correction module and a vehicle color identification module, wherein the vehicle positioning detection module, the license plate positioning detection module and the vehicle color identification module share a deep convolutional neural network of the same Faster R-CNN, vehicles on a road are quickly segmented by adopting the deep convolutional neural network, license plates on the road are quickly segmented by using a vehicle image and further adopting the deep convolutional neural network, and then space position information occupied by the vehicles and the license plates in the road image is given;

the license plate background color recognition module is used for processing the license plate image to obtain the license plate background color under the current environmental condition; carrying out gray level histogram processing on the license plate image, wherein the peak valley position in the gray level histogram is the interval between characters in the license plate, namely the background color of the license plate; averaging RGB color components of pixels in the interval to finally obtain the background color of the license plate under the current environmental condition;

calculating the license plate background color specified by the national standard and the license plate background color under the current environmental condition according to a formula (5) to obtain respective values of L, a and b; wherein L is_NPAnd L_RPNumber plate background respectively specified by national standardLightness value of color and background color of license plate under present environmental conditions, a_NPAnd a_RP、b_NPAnd b_RPRespectively the color of the license plate background color specified by the national standard and the chroma of the license plate background color under the current environmental condition, and the color difference delta E between the two_abCalculating CIE1976Lab color difference by using formula (6);

in the formula, L_NMThe lightness value, L, of the vehicle color in an ideal environment is specified for the national standard_RPIs the lightness value of the vehicle color under the current environmental condition, Delta L is the lightness difference between the license plate background color specified by the national standard and the license plate background color under the current environmental condition, a_NMAnd b_NMThe color value of the vehicle color in an ideal environment, a, is specified for the national standard_RPAnd b_RPChroma value of vehicle color under present environment, delta a and delta b license plate regulated by national standardThe chroma difference of the background color and the license plate background color under the current environmental condition;

2. The multitasking deep convolutional neural network-based vehicle color recognition system of claim 1, wherein:

the vehicle and license plate segmentation and positioning are composed of two models, one model is a selective search network for generating RoI; the other model is a fast R-CNN vehicle and a license plate target detection network; after the two-classification identification network, a multi-task learning network with multi-level, multi-label and multi-feature fusion and progressive layer by layer is realized;

training an RPN network, and assigning a binary label to each anchor so as to mark whether the anchor is a target or not; positive labels are then assigned to both types of anchors: (I) the ratio of intersection-over-Union, overlapping anchor, with a real target bounding box, i.e. Ground Truth, GT, has the highest IoU; (II) an anchor with IoU overlap of greater than 0.7 with any GT bounding box; note that one GT bounding box assigns positive labels to multiple anchors; assigning negative labels to anchors whose IoU ratio to all GT bounding boxes is below 0.3; if the non-positive and non-negative anchors have no effect on the training target, abandoning the anchors;

That is, 1, if anchor is negative,

is the coordinate vector of the GT bounding box corresponding to the positive anchor; λ is a balance weight, where λ is 10, N_clsIs the normalized value of the cls term, where N is the size of the mini-batch_cls＝256，N_regIs the number of anchor positions normalized by the reg term, N_reg2,400, minClass loss function L_clsThere are three categories, namely vehicle target object, license plate target object and logarithmic loss of road background:

for the regression loss function L_regDefined by the following function:

In the formula, smooth_L1Is smooth L₁A loss function, x being a variable;

for the Faster R-CNN network, the inputs are N feature maps and R1 RoIs; n feature maps are from the last convolutional layer, and the size of each feature map is w × h × c;

3. The multitasking deep convolutional neural network-based vehicle color recognition system of claim 2, wherein: the selective search network and the Faster R-CNN are independently trained, and a 4-step training algorithm is used for learning shared characteristics through alternate optimization; firstly, according to the training RPN, the network is initialized by an ImageNet pre-trained model, and end-to-end fine tuning is used for a regional proposal task; secondly, training a single detection network by fast R-CNN by using the suggestion frame generated by the RPN in the first step, wherein the detection network is initialized by a model pre-trained by ImageNet, and the two networks do not share a convolutional layer; thirdly, initializing RPN training by using a detection network, but fixing a shared convolution layer, and only finely adjusting a layer unique to the RPN, wherein the two networks share the convolution layer; fourthly, keeping the shared convolution layer fixed, and finely adjusting fc of Fast R-CNN, namely a full connection layer; thus, two networks share the same convolution layer to form a unified network;

4. The vehicle color identification system based on the multitask deep convolutional neural network according to one of claims 1-3, wherein: the vehicle color identification module is used for identifying the corrected vehicle color, and marking the vehicle image after color correction with a corresponding color label for training in order to effectively share a fast R-CNN deep convolution neural network; when the vehicle color is identified, after the steps of accurate vehicle positioning, accurate license plate positioning, license plate background color identification, chromatic aberration detection and vehicle color correction processing, finally the vehicle color is identified through a fast R-CNN deep convolution neural network; the color of the vehicle under the standard illumination condition can be obtained by the identification of the vehicle image after the color correction through the fast R-CNN deep convolution neural network.