Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart illustrating the operation of the intelligent traffic light control method according to a preferred embodiment of the present invention.
In step S1, road vehicle video is acquired in real time by cameras mounted on the traffic signal lamps, and video image information of the road vehicles is obtained. The acquisition targets are the road and the vehicles within a certain distance in each of the four directions of the intersection, which provides analysis data for the subsequent identification. Specifically, the method comprises the following steps:
First, a camera is installed on each traffic signal lamp to configure the video image information acquisition device. The crossroad of this embodiment is provided with four traffic signal lamps, and video is collected through the monitoring camera arranged on each of them. Alternatively, to save cost, only two traffic signal lamps may be arranged at a crossing with light traffic; the camera installation strategy then differs, and a camera is mounted on both the front and the back of each traffic signal lamp so that each camera collects the video image information of the vehicles on its respective road.
Then, the video image information acquisition angle of the camera is adjusted so that the m lanes of oncoming traffic are presented within the video acquisition range to the greatest extent.
Finally, the video image information acquisition range of the camera is set. During acquisition, the range of each camera is limited to n meters along the lane, and the vehicles on the road within this distance are identified.
Because the acquisition range of each camera is limited during image acquisition, only the image information within the m lanes is collected. This preprocessing reduces the workload of the subsequent identification process and improves identification accuracy and speed. In Fig. 2, m is 2. An illustrative acquisition configuration is sketched below.
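Purely as an illustration of the acquisition parameters above, the region of interest of one camera could be recorded as follows; the field names and values are assumptions for this sketch, not part of the embodiment:

```python
from dataclasses import dataclass

@dataclass
class CameraROI:
    """Acquisition settings for one signal-lamp camera (illustrative only)."""
    lanes_covered: int = 2       # m: number of oncoming lanes in view (m = 2 in Fig. 2)
    range_meters: float = 50.0   # n: detection range along the lane (assumed value)
    # Pixel rectangle, derived from the camera mounting geometry, that bounds
    # the m lanes up to n meters; frames are cropped to this region.
    roi: tuple = (0, 120, 640, 480)  # (x0, y0, x1, y1), assumed geometry
```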
In step S2, the obtained video image information is detected in order to identify the specific number of vehicles in each direction of the intersection, providing parameters for control of the intelligent traffic signal lamp. That is, the vehicles in each video frame image are first detected accurately in real time by the improved convolutional neural network, which draws a vehicle identification frame around each one; the identification frames in the video frame image are then counted. The method specifically comprises the following steps:
In step S21, the acquired video image information is preprocessed so that only the lane and vehicle information is retained, and the processed video image information is used for vehicle detection. Because the acquisition range of the video image information is limited, preprocessing the video image by methods such as cropping and background removal on this basis can further improve identification accuracy. After cropping, the background color defaults to the road surface color, which improves the detection accuracy of the improved convolutional neural network. A minimal preprocessing sketch follows.
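A possible implementation in Python with OpenCV, assuming a fixed polygonal lane region obtained from the camera setup and an assumed road-surface color; it is a sketch of the cropping and background-removal idea, not the embodiment's exact procedure:

```python
import cv2
import numpy as np

def preprocess_frame(frame, lane_polygon, road_color=(90, 90, 90)):
    """Keep only the lane region; fill everything else with the road color.

    frame        : BGR image from the signal-lamp camera
    lane_polygon : N x 2 int array of pixel vertices bounding the m lanes
    road_color   : assumed road-surface BGR value used as the default background
    """
    mask = np.zeros(frame.shape[:2], dtype=np.uint8)
    cv2.fillPoly(mask, [lane_polygon.astype(np.int32)], 255)
    out = np.full_like(frame, road_color, dtype=np.uint8)
    out[mask == 255] = frame[mask == 255]
    # Crop to the bounding rectangle of the lane region to shrink the input.
    x, y, w, h = cv2.boundingRect(lane_polygon.astype(np.int32))
    return out[y:y + h, x:x + w]
```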
In step S22, a pre-training network model is constructed, a training network is built on the basis of the pre-training network model, the pre-training network is pre-trained, the training network is initialized with the pre-trained network parameters, and the training network is then trained with the constructed data set until the number of vehicles in the corresponding lanes of a picture can be detected in real time. In this embodiment, the improved convolutional neural network performs real-time vehicle detection on the preprocessed video image information, i.e., vehicle target detection and identification. The principle of the improved convolutional neural network is to detect video frame images within a deep network framework. The specific steps are as follows:
In step S221, pictures of the intersection traffic conditions collected by the monitoring cameras are selected and a data set is constructed.
Related data sets are currently lacking: although data sets such as ImageNet contain some vehicle pictures, their scenes and viewing angles differ greatly from the present application. To improve the system's effectiveness, this embodiment therefore constructs a small data set itself.
Intersection traffic condition pictures are acquired through the monitoring cameras deployed at various intersections. An appropriate number of pictures is selected (1000 to 10000 in this embodiment), chosen as uniformly as possible so as to cover various scenes and traffic densities and improve the generalization capability of the system. The selected pictures are then labeled manually, marking the vehicle information on the corresponding lanes. An example of the resulting annotation format is sketched below.
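Purely as an illustration, one manually labeled picture could be recorded as below; the field names and the normalized-coordinate convention are assumptions, chosen to match the (x, y, w, h) output convention described in step S224:

```python
# One labeled picture with its vehicles marked on the lanes (hypothetical format).
# Coordinates are normalized to [0, 1].
label = {
    "image": "intersection_0001.jpg",  # assumed file name
    "boxes": [
        {"x": 0.42, "y": 0.63, "w": 0.08, "h": 0.11},  # one vehicle
        {"x": 0.71, "y": 0.58, "w": 0.07, "h": 0.10},  # another vehicle
    ],
}
```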
In step S222, the pre-training network is constructed, and the training network is built on the basis of the pre-training network model.
(1) Pre-training network
In this embodiment, the pre-training network is a two-class network model in which N convolutional layers are followed by one fully connected layer. It only judges whether the picture contains a vehicle, outputting the probability that a vehicle is present and the probability that it is not. Taking the network in Fig. 3 as an example, N convolutional layers are constructed and then fully connected to an output layer with 2 nodes, forming the pre-training network model.
(2) Training network
A convolutional layer block A and a fully connected layer block B are added on top of the N pre-trained convolutional layers. To preserve the spatial structure of the data and reduce the number of network parameters, Tucker mode decomposition is applied to the fully connected layer computation, so the corresponding fully connected layer parameters become 3 factor matrices denoted U, V, W. This forms the detection network model shown in Fig. 4, which outputs an S x S x (5a) tensor corresponding to S x S grid cells in the picture: for each cell, the 4 coordinate parameters of each of its a bounding boxes and a confidence indicating whether a vehicle is present. A sketch of the pre-training network follows this list; the factorized full connection is sketched under step S224.
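A minimal PyTorch sketch of item (1), the backbone and the two-class pre-training model; the layer count, channel widths, and input size are assumptions, not values fixed by the embodiment:

```python
import torch
import torch.nn as nn

N_CONV = 4  # N: number of shared convolutional layers (assumed)

def make_backbone():
    """The N convolutional layers shared by pre-training and detection."""
    layers, c_in = [], 3
    for c_out in (16, 32, 64, 128):
        layers += [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                   nn.ReLU(),
                   nn.MaxPool2d(2)]
        c_in = c_out
    return nn.Sequential(*layers)

class PreTrainNet(nn.Module):
    """Two-class model: outputs logits for (vehicle, no vehicle) per picture."""
    def __init__(self, img_size=112):
        super().__init__()
        self.backbone = make_backbone()
        feat = 128 * (img_size // 2 ** N_CONV) ** 2
        self.fc = nn.Linear(feat, 2)  # the single fully connected output layer
    def forward(self, x):
        return self.fc(self.backbone(x).flatten(1))

# Per item (1) of step S224, the trained backbone weights would be copied into
# the training network, and the added layers A and B initialized randomly.
```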
In step S223, the constructed pre-training network is trained.
Because deep-learning models have high-dimensional network parameters, a sufficient amount of input data is usually required to avoid overfitting. However, building a data set from scratch and collecting and labeling pictures manually is time-consuming and costly. For reasons of economy, applicability and feasibility, this embodiment therefore first pre-trains on the related pictures in the existing ImageNet and Pascal VOC data sets, and then fine-tunes on the constructed traffic intersection vehicle data set, obtaining good results in the specific scene at a lower cost.
In step S224, the training network is initialized with part of the trained pre-training network parameters and is trained with the constructed data set (a code sketch of items (4) to (6) follows this list):
(1) The training network parameters are initialized: the first N convolutional layers adopt the corresponding parameters obtained in pre-training, and the network parameters of the added convolutional layers A and fully connected layers B are then initialized randomly.
(2) The self-built traffic intersection vehicle data set is used as the training set of the network; the pictures in the data set are unified to one size specification and input into the model for training.
(3) The traffic intersection vehicle pictures are repeatedly convolved and pooled, and the last convolutional layer outputs an S x S x m tensor A. That is, the original picture is divided into an S x S grid, each grid cell corresponds to one part of the original traffic intersection picture, and the picture features in each cell correspond to one m-dimensional vector of the tensor.
(4) The 3 factor matrices U, V, W are multiplied, each along its own mode, with the convolutional layer output tensor A to obtain the core tensor B:
B = A ×1 U ×2 V ×3 W
(5) The core tensor B is input into a nonlinear activation function to find the corresponding latent vehicle features in the hidden nodes, outputting the feature tensor Z:
Z=h(B)
The activation function h(·) may be a sigmoid function, a hyperbolic tangent function, or a ReLU.
(6) The B factorized fully connected layers described above (i.e., the previous two steps repeated B times, with different parameters in each layer) output an S x S x (5a) tensor: for each grid cell, the coordinates (x, y, w, h) of each vehicle detection bounding box and the confidence that a vehicle is detected in that identification frame. Here x and y are the coordinates of the center point of the vehicle identification frame, w and h are its width and height respectively, and the coordinates are normalized to lie between 0 and 1.
(7) According to the loss function L formed by the error between the output predicted values and the true vehicle label values in the original image (a sum-of-squares error loss, described below), the network parameters are adjusted with the back propagation algorithm until a specified precision is reached, i.e., until pictures are correctly classified as containing a vehicle or not, and the network parameters are then saved. Here:
The loss function is a sum-of-squares error loss comprising 3 parts: a coordinate prediction term, a confidence prediction term for identification frames containing a vehicle, and a confidence prediction term for identification frames not containing a vehicle:

$$L = \lambda_{coord} \sum_{i=0}^{S^2}\sum_{j=0}^{a} \mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2+(w_i-\hat{w}_i)^2+(h_i-\hat{h}_i)^2\right] + \sum_{i=0}^{S^2}\sum_{j=0}^{a} \mathbb{1}_{ij}^{obj}(C_i-\hat{C}_i)^2 + \lambda_{noobj} \sum_{i=0}^{S^2}\sum_{j=0}^{a} \mathbb{1}_{ij}^{noobj}(C_i-\hat{C}_i)^2$$

where x, y are the coordinates of the center of the vehicle identification frame and w, h are its width and height; \mathbb{1}_{ij}^{obj} indicates whether the j-th identification frame in the i-th grid cell is responsible for the detection; \mathbb{1}_{i}^{obj} indicates whether a vehicle center falls within grid cell i; \lambda_{coord} is the coordinate prediction weight; and \lambda_{noobj} is the confidence weight for identification frames that do not contain a vehicle.
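The sketch referenced above for items (4) to (6): one Tucker-factorized full connection implemented with torch.einsum for the three mode products. All tensor sizes are assumptions for illustration, not values fixed by the embodiment:

```python
import torch

def tucker_fc(A, U, V, W, h=torch.relu):
    """One factorized fully connected layer: Z = h(A x1 U x2 V x3 W).

    A       : (S, S, m) feature tensor from the last convolutional layer
    U, V, W : factor matrices replacing the dense fully connected weights;
              the mode products preserve the spatial grid structure.
    """
    B = torch.einsum('pi,qj,rk,ijk->pqr', U, V, W, A)  # core tensor B
    return h(B)                                        # feature tensor Z

# Assumed sizes: S = 7 grid, m = 128 features, a = 2 boxes per cell.
S, m, a = 7, 128, 2
A = torch.randn(S, S, m)
U1, V1, W1 = torch.randn(S, S), torch.randn(S, S), torch.randn(64, m)
U2, V2, W2 = torch.randn(S, S), torch.randn(S, S), torch.randn(5 * a, 64)
Z = tucker_fc(A, U1, V1, W1)                     # first factorized layer
out = tucker_fc(Z, U2, V2, W2, h=torch.sigmoid)  # second layer: S x S x (5a)
```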
In step S225, the trained network is tested.
Vehicle detection is then carried out: target detection is performed on the vehicles in the video image by inputting the acquired intersection vehicle images into the trained detection network model, which outputs the coordinates of the detected vehicles in the image and the probability that each detection is a vehicle. Different thresholds can be set to tune the recognition precision according to actual requirements.
The resulting bounding boxes are then counted to obtain the number of vehicles on the lane, which serves as the control parameter of the traffic signal lamp at the crossroad; a counting sketch follows.
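A minimal post-processing sketch, assuming the raw output tensor of the detection network from step S224; the box memory layout and threshold value are assumptions:

```python
import torch

def count_vehicles(pred, conf_threshold=0.5, S=7, A=2):
    """Count detections above a confidence threshold.

    pred : tensor of shape (S, S, 5 * A), each box stored as (x, y, w, h, conf)
           with coordinates normalized to [0, 1], as in step S224.
    """
    boxes = pred.reshape(S * S * A, 5)
    keep = boxes[:, 4] > conf_threshold  # raise or lower to tune precision
    return int(keep.sum()), boxes[keep, :4]  # vehicle count and their boxes
```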
In step S3, according to the obtained control parameters, the number of vehicles in each direction of the intersection is substituted as a parameter into a pre-established intersection traffic signal timing standard on the basis of the original passing time, and the passing time of the intersection traffic signal lamps is adjusted in real time. This realizes real-time control of the traffic signal lamps and improves traffic efficiency while saving energy and reducing emissions. Specifically:
First, a standard for adjusting the timing of the traffic signal lamps at the crossroad is established.
Then, the number of vehicles on the lanes in each direction, as identified at the intersection, is substituted into the standard system, which finally outputs the required control result, namely the traffic light timing adjusted in real time according to the current number of vehicles. One possible form of such a rule is sketched below.
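The embodiment does not specify the timing standard itself; purely as an illustration, a proportional rule bounded by minimum and maximum green times might look as follows, with all constants assumed:

```python
def adjust_green_time(vehicle_counts, base_green=30.0,
                      per_vehicle=1.5, g_min=15.0, g_max=60.0):
    """Scale each approach's green time with its detected vehicle count.

    vehicle_counts : dict mapping approach name -> vehicles detected (step S2)
    base_green     : original passing time in seconds (assumed)
    per_vehicle    : extra seconds granted per queued vehicle (assumed)
    """
    mean = sum(vehicle_counts.values()) / len(vehicle_counts)
    return {
        approach: max(g_min, min(g_max, base_green + per_vehicle * (n - mean)))
        for approach, n in vehicle_counts.items()
    }

# Example: heavier north-south traffic receives longer green phases.
print(adjust_green_time({"N": 12, "S": 10, "E": 3, "W": 2}))
```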
Referring to Fig. 5, a system architecture diagram of the intelligent traffic signal control system 10 of the present invention is shown. The system comprises an image acquisition module 101, an identification module 102 and a control module 103.
The image acquisition module 101 is used to acquire road vehicle video in real time through cameras mounted on the traffic signal lamps, obtaining video image information of the road vehicles. The acquisition targets are the road and the vehicles within a certain distance in each of the four directions of the intersection, which provides analysis data for the subsequent identification. Specifically:
First, a camera is installed on each traffic signal lamp to configure the video image information acquisition device. The crossroad of this embodiment is provided with four traffic signal lamps, and video is collected through the monitoring camera arranged on each of them. Alternatively, to save cost, only two traffic signal lamps may be arranged at a crossing with light traffic; the camera installation strategy then differs, and a camera is mounted on both the front and the back of each traffic signal lamp so that each camera collects the video image information of the vehicles on its respective road.
Then, the video image information acquisition angle of the camera is adjusted so that the m lanes of oncoming traffic are presented within the video acquisition range to the greatest extent.
Finally, the video image information acquisition range of the camera is set. During acquisition, the range of each camera is limited to n meters along the lane, and the vehicles on the road within this distance are identified.
Because the acquisition range of each camera is limited during image acquisition, only the image information within the m lanes is collected. This preprocessing reduces the workload of the subsequent identification process and improves identification accuracy and speed. In Fig. 2, m is 2.
The identification module 102 is configured to detect the acquired video image information so as to identify the specific number of vehicles in each direction of the intersection, providing parameters for control of the intelligent traffic signal lamp. That is, the vehicles in each video frame image are first detected accurately in real time by the improved convolutional neural network, which draws a vehicle identification frame around each one; the identification frames in the video frame image are then counted. The specific steps are as follows:
the recognition module 102 preprocesses the acquired video image information, only reserves the information of the lane and the vehicle thereof, and uses the processed video image information for vehicle detection. The acquisition range of video image information is limited, and on the basis, the video image is preprocessed by methods such as cutting and background removal, so that the identification accuracy can be improved. The ground color after cutting is defaulted as the road ground color so as to improve the detection accuracy of the improved convolutional neural network.
The identification module 102 constructs a pre-training network model, builds a training network on the basis of the pre-training network model, pre-trains the pre-training network, initializes the training network with the pre-trained network parameters, and trains the training network with the constructed data set until the number of vehicles in the corresponding lanes of a picture can be detected in real time. In this embodiment, the improved convolutional neural network performs real-time vehicle detection on the preprocessed video image information, i.e., vehicle target detection and identification. The principle of the improved convolutional neural network is to detect video frame images within a deep network framework. The specific steps are as follows:
and selecting pictures of the intersection traffic conditions collected by the monitoring camera to construct a data set.
Related data sets are currently lacking: although data sets such as ImageNet contain some vehicle pictures, their scenes and viewing angles differ greatly from the present application. To improve the system's effectiveness, this embodiment therefore constructs a small data set itself.
Intersection traffic condition pictures are acquired through the monitoring cameras deployed at various intersections. An appropriate number of pictures is selected (1000 to 10000 in this embodiment), chosen as uniformly as possible so as to cover various scenes and traffic densities and improve the generalization capability of the system. The selected pictures are then labeled manually, marking the vehicle information on the corresponding lanes.
The pre-training network is constructed, and the training network is built on the basis of the pre-training network model.
(1) Pre-training network
In this embodiment, the pre-training network is a two-class network model in which N convolutional layers are followed by one fully connected layer. It only judges whether the picture contains a vehicle, outputting the probability that a vehicle is present and the probability that it is not. Taking the network in Fig. 3 as an example, N convolutional layers are constructed and then fully connected to an output layer with 2 nodes, forming the pre-training network model.
(2) Training network
A convolutional layer block A and a fully connected layer block B are added on top of the N pre-trained convolutional layers. To preserve the spatial structure of the data and reduce the number of network parameters, Tucker mode decomposition is applied to the fully connected layer computation, so the corresponding fully connected layer parameters become 3 factor matrices denoted U, V, W. This forms the detection network model shown in Fig. 4, which outputs an S x S x (5a) tensor corresponding to S x S grid cells in the picture: for each cell, the 4 coordinate parameters of each of its a bounding boxes and a confidence indicating whether a vehicle is present.
The pre-training network constructed above is trained.
Because deep-learning models have high-dimensional network parameters, a sufficient amount of input data is usually required to avoid overfitting. However, building a data set from scratch and collecting and labeling pictures manually is time-consuming and costly. For reasons of economy, applicability and feasibility, this embodiment therefore first pre-trains on the related pictures in the existing ImageNet and Pascal VOC data sets, and then fine-tunes on the constructed traffic intersection vehicle data set, obtaining good results in the specific scene at a lower cost.
The training network is initialized with part of the trained pre-training network parameters and is trained with the constructed data set.
(1) The training network parameters are initialized: the first N convolutional layers adopt the corresponding parameters obtained in pre-training, and the network parameters of the added convolutional layers A and fully connected layers B are then initialized randomly.
(2) The self-built traffic intersection vehicle data set is used as the training set of the network; the pictures in the data set are unified to one size specification and input into the model for training.
(3) The traffic intersection vehicle pictures are repeatedly convolved and pooled, and the last convolutional layer outputs an S x S x m tensor A. That is, the original picture is divided into an S x S grid, each grid cell corresponds to one part of the original traffic intersection picture, and the picture features in each cell correspond to one m-dimensional vector of the tensor.
(4) The 3 factor matrices U, V, W are multiplied, each along its own mode, with the convolutional layer output tensor A to obtain the core tensor B:
B = A ×1 U ×2 V ×3 W
(5) The core tensor B is input into a nonlinear activation function to find the corresponding latent vehicle features in the hidden nodes, outputting the feature tensor Z:
Z=h(B)
The activation function h(·) may be a sigmoid function, a hyperbolic tangent function, or a ReLU.
(6) The B factorized fully connected layers described above (i.e., the previous two steps repeated B times, with different parameters in each layer) output an S x S x (5a) tensor: for each grid cell, the coordinates (x, y, w, h) of each vehicle detection bounding box and the confidence that a vehicle is detected in that identification frame. Here x and y are the coordinates of the center point of the vehicle identification frame, w and h are its width and height respectively, and the coordinates are normalized to lie between 0 and 1.
(7) According to the loss function L formed by the error between the output predicted values and the true vehicle label values in the original image (a sum-of-squares error loss, described below), the network parameters are adjusted with the back propagation algorithm until a specified precision is reached, and the network parameters are then saved. Here:
The loss function is a sum-of-squares error loss comprising 3 parts: a coordinate prediction term, a confidence prediction term for identification frames containing a vehicle, and a confidence prediction term for identification frames not containing a vehicle:

$$L = \lambda_{coord} \sum_{i=0}^{S^2}\sum_{j=0}^{a} \mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2+(w_i-\hat{w}_i)^2+(h_i-\hat{h}_i)^2\right] + \sum_{i=0}^{S^2}\sum_{j=0}^{a} \mathbb{1}_{ij}^{obj}(C_i-\hat{C}_i)^2 + \lambda_{noobj} \sum_{i=0}^{S^2}\sum_{j=0}^{a} \mathbb{1}_{ij}^{noobj}(C_i-\hat{C}_i)^2$$

where x, y are the coordinates of the center of the vehicle identification frame and w, h are its width and height; \mathbb{1}_{ij}^{obj} indicates whether the j-th identification frame in the i-th grid cell is responsible for the detection; \mathbb{1}_{i}^{obj} indicates whether a vehicle center falls within grid cell i; \lambda_{coord} is the coordinate prediction weight; and \lambda_{noobj} is the confidence weight for identification frames that do not contain a vehicle.
The trained network is then tested.
Vehicle detection is carried out: target detection is performed on the vehicles in the video image by inputting the acquired intersection vehicle images into the trained detection network model, which outputs the coordinates of the detected vehicles in the image and the probability that each detection is a vehicle. Different thresholds can be set to tune the recognition precision according to actual requirements.
The resulting bounding boxes are then counted to obtain the number of vehicles on the lane, which serves as the control parameter of the traffic signal lamp at the crossroad.
The control module 103 is used to substitute, according to the obtained control parameters and on the basis of the original passing time, the number of vehicles in each direction of the intersection as a parameter into the pre-established intersection traffic signal timing standard, and to adjust the passing time of the intersection traffic signal lamps in real time, realizing real-time control of the traffic signal lamps and improving traffic efficiency while saving energy and reducing emissions. Specifically:
First, a standard for adjusting the timing of the traffic signal lamps at the crossroad is established.
Then, the number of vehicles on the lanes in each direction, as identified at the intersection, is substituted into the standard system, which finally outputs the required control result, namely the traffic light timing adjusted in real time according to the current number of vehicles.
Compared with traditional detection control methods, the intelligent traffic signal lamp control method and system provided by the invention have the following advantages:
(1) The invention is flexible to install and configure, low in cost, and highly applicable and easy to popularize.
(2) The improved convolutional neural network greatly improves the analysis speed and efficiency of vehicle identification and can process real-time pictures of traffic intersections. The improvement introduces Tucker decomposition into the fully connected layers of a CNN and designs the network structure for the target detection task; it can efficiently detect the vehicles in a picture in real time, distinguishes detected targets from the background well, and has the advantage of high recognition speed.
(3) The method is highly robust in practical application. The basis for adjusting the signal timing at the crossroad is the number of vehicles within a certain distance on the lanes in each direction. Because the intersection traffic signal lamps operate as a system, a certain false detection rate in the recognition algorithm is tolerable, so the system is robust.
(4) The invention simplifies complex traffic control models. With the number of vehicles in each direction of the intersection identified in real time, statistics on the vehicle identification results are added on top of vehicle identification, simplifying the traffic control model of the intersection. If the detection data of all intersections equipped with the system are further aggregated, urban traffic can easily be optimized and scheduled, simplifying the complex problem of smart urban traffic.
Although the present invention has been described with reference to the presently preferred embodiments, it will be understood by those skilled in the art that the foregoing description is illustrative only and is not intended to limit the scope of the invention, as claimed.