CN112101117A - Expressway congestion identification model construction method and device and identification method - Google Patents
- Publication number
- CN112101117A (application number CN202010831009.2A)
- Authority
- CN
- China
- Prior art keywords
- layer
- feature
- stage
- weight
- highway
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/54—Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
Abstract
The invention discloses a method and a device for constructing an expressway congestion identification model, and an identification method. The construction method comprises: step 1, acquiring expressway video frame images to obtain an initial image set, and labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category: the congested, saturated, or unblocked state; step 2, training a deep convolutional neural network with the initial image set as input and the label set as output, wherein the network structure, on the basis of VGG-16, fuses in the classic Bottleneck Layer and Squeeze-and-Excitation (SE) Block structures used in the ResNet family and introduces an attention mechanism, constructing the SE-VGG16 classification network. The method accurately identifies the traffic congestion state of an expressway, applies to congestion identification under various traffic scenes and camera viewing angles, is end-to-end, and is simple to implement with high identification accuracy.
Description
Technical Field
The invention belongs to the technical field of intelligent traffic, and particularly relates to a method and a device for constructing a highway congestion identification model and an identification method.
Background
Traffic congestion detection is important for monitoring traffic conditions and optimizing road network performance. Early traffic monitoring systems used inductive loop detectors to count passing vehicles and collect traffic flow occupancy, but could provide only limited and coarse traffic information. To improve accuracy, researchers have detected traffic congestion by combining route maps with GPS data collected from GPS trackers or smartphones. However, loop detectors damage the road surface, these approaches depend on specially built and deployed resources, and the data are difficult to obtain.
As the installation cost of surveillance cameras keeps falling, the large number of cameras in a road network generates massive monitoring data every day. Extracting the required traffic flow parameters by analyzing this video data is of great practical significance for detecting the congestion state, and it is non-destructive. Traditional methods detect and track vehicles in video with image processing techniques and then compute traffic flow parameters; they are not robust to occlusion, easily miscompute the parameters, and thus degrade the congestion recognition result. In recent years, convolutional neural networks have achieved excellent results in image classification and recognition: they learn image features automatically and are robust to translation, scaling, and rotation. Some researchers have therefore applied convolutional neural networks to traffic congestion identification. One approach estimates traffic flow density with a network, but it is sensitive to changes in camera viewing angle; another computes the congestion degree directly with a classification network, but the data behind current methods cover only specific cities and camera viewing angles, so their generality is limited.
Disclosure of Invention
Aiming at the defects and shortcomings in the prior art, the invention provides a method and a device for constructing a highway congestion identification model, and an identification method, which realize end-to-end identification and overcome the shortcomings of the prior art, such as low efficiency.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for constructing a highway congestion identification model comprises the following steps:
step 1, acquiring highway video frame images to obtain an initial image set; labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category: the congested, saturated, or unblocked state;
step 2, training a deep convolutional neural network by taking the initial image set as input and the label set as output;
the deep convolutional neural network comprises a plurality of feature extraction layers and classification layers which are sequentially arranged; the feature extraction layer comprises 5 feature extraction blocks, and the classification layer comprises 3 full-connection layers;
the feature extraction Block comprises a Bottleneck Layer, a SE Block and a pooling Layer which are connected in series;
the Bottleneck Layer comprises 3 convolution layers with the sizes of 1 × 1, 3 × 3 and 1 × 1 which are sequentially connected in series;
the SE Block comprises 1 convolutional layer, which further extracts features from the feature map output by the Bottleneck Layer, and 1 global average pooling layer;
the global average pooling layer corresponds to the Squeeze, Excitation, and Reweight stages, wherein the Squeeze and Excitation stages generate the weight parameters required by the attention mechanism, and the Reweight stage applies those weights to the feature map produced by the convolutional layer of the SE Block, generating feature maps that reflect the different importance of each channel;
the 5 feature extraction blocks of the feature extraction layer are connected in series and then are connected with the 3 full connection layers of the classification layer for final traffic class classification;
and obtaining a highway congestion identification model.
The invention also comprises the following technical characteristics:
specifically, the loss function in the highway congestion identification model adopts a cross entropy loss function.
Specifically, a deep convolutional neural network is constructed by selecting a deep learning frame Caffe, a random gradient descent method is selected to optimize the highway congestion identification model, the learning rate is set to be 0.01, the batch-size is set to be 32, and the iteration epoch is set to be 80 times.
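The cross-entropy loss named above can be sketched for a single sample as follows (a minimal numpy illustration; the patent itself trains with Caffe, and the function names here are only illustrative):

```python
import numpy as np

def softmax(logits):
    # Shift by the max for numerical stability, then normalize to probabilities.
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def cross_entropy(logits, true_class):
    # Negative log-likelihood of the labeled class
    # (e.g. 0 = congested, 1 = saturated, 2 = unblocked).
    return float(-np.log(softmax(logits)[true_class]))
```

A confident, correct prediction yields a loss near zero while a confident wrong one is penalized heavily; a stochastic gradient descent step with learning rate 0.01 then moves the weights against the gradient of this loss averaged over a batch of 32.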
Specifically, the global average pooling layer corresponds to the Squeeze, Excitation, and Reweight stages. The Squeeze stage uses global average pooling to convert each feature channel into one real number with a global receptive field; the Excitation stage generates a weight $w$ for each feature channel; the Reweight stage multiplies each feature channel from the Squeeze stage by the weight generated in the Excitation stage, obtaining feature maps that reflect different degrees of importance. The computation is as follows:
First a standard convolution, i.e. the conversion operation, is performed: $F_{tr}: X \to U$, where $X$ denotes the original input feature map, $U$ the output feature map, and $F_{tr}$ the conversion from the original input feature map to the output feature map;

wherein $X \in \mathbb{R}^{W' \times H' \times C'}$ and $U \in \mathbb{R}^{W \times H \times C}$, $\mathbb{R}$ denotes the real space, $W'$, $H'$, $C'$ denote the width, height, and number of channels of the feature map $X$, and $W$, $H$, $C$ those of the feature map $U$; the specific formula is:

$$u_c = v_c * X = \sum_{s=1}^{C'} v_c^s * x_s \qquad (1)$$

wherein $*$ denotes the convolution operation; $c$ and $s$ are the serial numbers of the feature map channels, with ranges $1 \le c \le C$ and $1 \le s \le C'$; $x_s$ is the $s$-th feature layer in the feature map $X$; $v_c$ is the $c$-th convolution kernel, with the same number of channels as the feature map $X$; $v_c^s$ is the $s$-th layer of the $c$-th convolution kernel; and $u_c$ denotes the $c$-th feature layer in the feature map $U$;

the Squeeze stage performs global average pooling, taking the feature map $U$ as a new input and converting it into a new output feature map $Z \in \mathbb{R}^{1 \times 1 \times C}$, where $1$, $1$, and $C$ denote the width, height, and number of channels of $Z$; the specific formula is:

$$z_c = F_{sq}(u_c) = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} u_c(i, j) \qquad (2)$$

wherein $F_{sq}$ denotes the global average pooling operation, $u_c(i, j)$ denotes the element at row $i$, column $j$ of the $c$-th feature layer in the feature map $U$, and $z_c$ denotes the $c$-th feature layer in the feature map $Z$;

in the Excitation stage, weights are computed from the feature map $Z$ output by the global average pooling of the Squeeze stage, obtaining the weight vector $w$ corresponding to each feature layer in $Z$; the formula is:

$$w = F_{ex}(Z, W) = \sigma(W_2 \, \delta(W_1 Z)) \qquad (3)$$

wherein $F_{ex}$ denotes the weight calculation operation; $W_1$ and $W_2$ both denote fully connected layers, with reduced dimension $C/r$, and $W$ denotes the group formed by $W_1$ and $W_2$ together; $W_1 Z$ is the first fully connected operation, whose output vector has dimension $1 \times C/r$, the scaling parameter $r$ serving mainly to reduce the dimension and the amount of calculation; $\delta$ is the ReLU activation function; $W_2 \, \delta(W_1 Z)$ is the second fully connected operation, whose output vector has dimension $1 \times C$; $\sigma$ is the Sigmoid activation function, through which the weight vector $w$ corresponding to each feature layer in $Z$ is finally obtained;

the specific formula of the Reweight stage is:

$$\tilde{x}_c = F_{scale}(u_c, w_c) = w_c \cdot u_c \qquad (4)$$

wherein $F_{scale}$ denotes the weight assignment operation, $u_c$ denotes the $c$-th feature layer in the feature map $U$, $w_c$ denotes the element at the $c$-th position of the weight vector $w$, and $\cdot$ denotes multiplication, i.e. $w_c$ is multiplied with each element of the feature layer $u_c$.
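The three stages above can be sketched end-to-end in a few lines (a hedged numpy sketch; `se_block`, `w1`, and `w2` are illustrative names, and biases are omitted for brevity):

```python
import numpy as np

def se_block(u, w1, w2):
    """Squeeze-and-Excitation over a feature map u of shape (C, H, W).

    w1: (C//r, C) weights of the first FC layer (dimension reduction);
    w2: (C, C//r) weights of the second FC layer (dimension restoration).
    Returns the reweighted feature map with the same shape as u.
    """
    # Squeeze: global average pooling turns each channel into one real number.
    z = u.mean(axis=(1, 2))                      # shape (C,)
    # Excitation: FC -> ReLU -> FC -> Sigmoid gives one weight per channel.
    hidden = np.maximum(w1 @ z, 0.0)             # ReLU, shape (C//r,)
    w = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # Sigmoid, shape (C,)
    # Reweight: scale every element of channel c by its weight w_c.
    return u * w[:, None, None]
```

Each output channel is the corresponding input channel scaled by a weight in (0, 1), which is exactly formula (4) applied after formulas (2) and (3).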
A highway congestion identification model construction device comprises:
the data set acquisition and labeling module is used for acquiring a highway traffic monitoring video and storing video frame images as an initial image set, and for labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category: the congested, saturated, or unblocked state;
the network training module is used for training the deep convolutional neural network by taking the initial image set as input and the label set as output;
the deep convolutional neural network comprises a plurality of feature extraction layers and classification layers which are sequentially arranged; the feature extraction layer comprises 5 feature extraction blocks, and the classification layer comprises 3 full-connection layers;
the feature extraction Block comprises a Bottleneck Layer, a SE Block and a pooling Layer which are connected in series;
the Bottleneck Layer comprises 3 convolution layers with the sizes of 1 × 1, 3 × 3 and 1 × 1 which are sequentially connected in series;
the SE Block comprises 1 convolutional layer, which further extracts features from the feature map output by the Bottleneck Layer, and 1 global average pooling layer;
the global average pooling layer corresponds to the Squeeze, Excitation, and Reweight stages, wherein the Squeeze and Excitation stages generate the weight parameters required by the attention mechanism, and the Reweight stage applies those weights to the feature map produced by the convolutional layer of the SE Block, generating feature maps that reflect the different importance of each channel;
the 5 feature extraction blocks of the feature extraction layer are connected in series and then are connected with the 3 full connection layers of the classification layer for final traffic class classification;
and obtaining a highway congestion identification model.
A highway congestion identification method comprises the steps of inputting a highway video frame image to be identified into a highway congestion identification model constructed by the highway congestion identification model construction method, and obtaining an identification result.
Compared with the prior art, the invention has the beneficial technical effects that:
the method is simple to implement, has the end-to-end characteristic, can be applied to traffic jam recognition under the conditions of various traffic scenes and various camera viewing angles, ensures the universality under the road monitoring environment, and meets the requirement of accurately judging the road state in an intelligent traffic monitoring system. The method using deep learning has wide application because of high stability and precision.
Drawings
FIG. 1 is an overall flow chart of the method of the present invention;
FIG. 2 is a partial scene graph of a data set collected in accordance with the present invention;
FIG. 3 is a block diagram of bottleeck layers and SEs in the network of the present invention;
FIG. 4 is an overall block diagram of the network SE-VGG16 according to the present invention;
FIG. 5 is a graph of loss variation and accuracy during network training according to an embodiment of the present invention;
FIG. 6 is a graph of congestion/saturation/clear portion test results for an embodiment of the present invention;
Detailed Description
The following describes in detail specific embodiments of the present invention. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
The invention produces an expressway traffic congestion data set from a large number of collected expressway monitoring video frames and designs a classification network whose structure, on the basis of VGG-16, fuses in the classic Bottleneck and Squeeze-and-Excitation (SE) Block structures used in ResNet, so that the expressway congestion state can be identified effectively.
Example 1:
as shown in fig. 1 to 6, the invention discloses a method for constructing a highway congestion identification model, which comprises the following steps:
step 1, acquiring expressway video frame images to obtain an initial image set; labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category: the congested, saturated, or unblocked state;
Specifically, in this embodiment the expressway video frame image data come from the transportation departments of Hangzhou and of the Xi'an ring area, mainly covering the Hangzhou Jinqu Expressway, the Xi'ang Expressway, and the Xi'an Ring Expressway. A total of 900 videos were shot with fixed cameras, 674 of them daytime scenes. Each video is 1–10 minutes long; one picture is cut from every 200 frames of the original video and put into the data set. Each picture in the data set is labeled as one of the three states congested/saturated/unblocked; for borderline cases whose traffic class is hard to determine, the saturated class is preferred. In the finally generated data set, the training set contains 21209 samples (7269 congested in 371 scenes, 6559 saturated in 278 scenes, 6921 unblocked in 407 scenes), the validation set contains 3661 samples (1283 congested in 87 scenes, 1157 saturated in 56 scenes, 1221 unblocked in 90 scenes), and the test set contains 6101 samples (2137 congested in 158 scenes, 1929 saturated in 132 scenes, 2035 unblocked in 107 scenes). Fig. 2 shows partial scene images from the data set.
Step 2, training a deep convolutional neural network by taking the initial image set as input and the label set as output;
the deep convolutional neural network comprises a plurality of feature extraction layers and classification layers which are sequentially arranged; the feature extraction layer comprises 5 feature extraction blocks, and the classification layer comprises 3 full-connection layers;
the feature extraction Block comprises a Bottleneck Layer, an SE Block, and a pooling layer connected in series, wherein the Bottleneck Layer comprises 3 convolutional layers of sizes 1 × 1, 3 × 3, and 1 × 1 connected in series in sequence; the SE Block comprises 1 convolutional layer, which further extracts features from the feature map output by the Bottleneck Layer, and 1 global average pooling layer; the global average pooling layer corresponds to the Squeeze, Excitation, and Reweight stages, wherein the Squeeze and Excitation stages generate the weight parameters required by the attention mechanism, and the Reweight stage applies those weights to the feature map produced by the convolutional layer of the SE Block, generating feature maps that reflect the different importance of each channel; the 3 fully connected layers are connected behind the 5 serial feature extraction blocks for the final congestion state classification;
and obtaining a highway congestion identification model.
Specifically, a deep convolutional neural network is built whose input is an RGB image of arbitrary size. The network structure contains 13 convolutional layers, 8 SE Block modules, 5 pooling layers, 3 fully connected layers, and a final Softmax layer, and finally outputs a 1 × 3 vector containing the probabilities of the 3 states congested/saturated/unblocked.
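The final 1 × 3 probability vector maps to a traffic state by taking the most probable class (a pure-Python sketch; the class ordering here is an assumption, as the text does not fix one):

```python
CLASSES = ("congested", "saturated", "unblocked")  # assumed output order

def classify(probs):
    # Return the traffic state with the highest softmax probability.
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CLASSES[best]
```
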
The loss function adopts a cross entropy loss function which is most commonly used by classification tasks.
The Bottleneck Layer uses 1 × 1 convolution kernels mainly to reduce the number of model parameters and the amount of computation: across the two 1 × 1 convolutions, the dimensionality of the feature layers is first reduced and then restored, which allows features to be extracted more effectively. Fig. 3(a) shows the original convolutional layer and the Bottleneck Layer structure, from which it can be seen that each Bottleneck Layer is a three-layer structure, where 1 × 1 and 3 × 3 denote the convolution kernel sizes and 16 and 64 denote the channel numbers.
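The parameter saving can be checked with a little arithmetic, using the 64 → 16 → 64 channel widths of Fig. 3(a) (bias terms ignored; the helper names are illustrative):

```python
def conv_params(c_in, c_out, k):
    # Weight count of a k x k convolutional layer, biases ignored.
    return c_in * c_out * k * k

def bottleneck_params(c_in, c_mid, c_out):
    # 1x1 reduce, 3x3 convolve at the reduced width, 1x1 expand.
    return (conv_params(c_in, c_mid, 1)
            + conv_params(c_mid, c_mid, 3)
            + conv_params(c_mid, c_out, 1))
```

With 64 input and output channels reduced to 16 in the middle, the bottleneck needs 4352 weights against 36864 for a single plain 3 × 3 convolution, roughly an 8.5× reduction.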
The Squeeze-and-Excitation (SE) Block stands for compression and excitation. The traditional convolution process does not fully exploit channel dependencies: each convolution kernel learns within a local receptive field and therefore cannot use information outside a specific region of each feature map. The SE Block adds an attention mechanism where the network's lower layers have small receptive fields, enlarging the weight of the effective regions of each feature layer and compressing the weight of ineffective or weak regions, so that the model can exploit global information during training and achieve a better training effect.
The SE Block can be divided into 3 stages: Squeeze, Excitation, and Reweight (weight adjustment). Fig. 3(b) shows the structure of the SE Block. The Squeeze stage uses global average pooling to convert each two-dimensional feature channel into a real number with a global receptive field; the Excitation stage generates a weight $w$ for each feature channel; the Reweight stage multiplies each feature channel from the Squeeze stage by the weight generated in the Excitation stage, obtaining feature maps that reflect different degrees of importance.
As shown in Fig. 3(b), first a standard convolution, i.e. the conversion operation, is performed: $F_{tr}: X \to U$, where $X$ denotes the original input feature map, $U$ the output feature map, and $F_{tr}$ the conversion from the original input feature map to the output feature map;

wherein $X \in \mathbb{R}^{W' \times H' \times C'}$ and $U \in \mathbb{R}^{W \times H \times C}$, $\mathbb{R}$ denotes the real space, $W'$, $H'$, $C'$ denote the width, height, and number of channels of the feature map $X$, and $W$, $H$, $C$ those of the feature map $U$; the specific formula is:

$$u_c = v_c * X = \sum_{s=1}^{C'} v_c^s * x_s \qquad (1)$$

wherein $*$ denotes the convolution operation; $c$ and $s$ are the serial numbers of the feature map channels, with ranges $1 \le c \le C$ and $1 \le s \le C'$; $x_s$ is the $s$-th feature layer in the feature map $X$; $v_c$ is the $c$-th convolution kernel, with the same number of channels as the feature map $X$; $v_c^s$ is the $s$-th layer of the $c$-th convolution kernel; and $u_c$ denotes the $c$-th feature layer in the feature map $U$;

the Squeeze stage performs global average pooling, taking the feature map $U$ as a new input and converting it into a new output feature map $Z \in \mathbb{R}^{1 \times 1 \times C}$, where $1$, $1$, and $C$ denote the width, height, and number of channels of $Z$; the specific formula is:

$$z_c = F_{sq}(u_c) = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} u_c(i, j) \qquad (2)$$

wherein $F_{sq}$ denotes the global average pooling operation, $u_c(i, j)$ denotes the element at row $i$, column $j$ of the $c$-th feature layer in the feature map $U$, and $z_c$ denotes the $c$-th feature layer in the feature map $Z$;

in the Excitation stage, weights are computed from the feature map $Z$ output by the global average pooling of the Squeeze stage, obtaining the weight vector $w$ corresponding to each feature layer in $Z$; the formula is:

$$w = F_{ex}(Z, W) = \sigma(W_2 \, \delta(W_1 Z)) \qquad (3)$$

wherein $F_{ex}$ denotes the weight calculation operation; $W_1$ and $W_2$ both denote fully connected layers, with reduced dimension $C/r$, and $W$ denotes the group formed by $W_1$ and $W_2$ together; $W_1 Z$ is the first fully connected operation, whose output vector has dimension $1 \times C/r$, the scaling parameter $r$ serving mainly to reduce the dimension and the amount of calculation; $\delta$ is the ReLU activation function; $W_2 \, \delta(W_1 Z)$ is the second fully connected operation, whose output vector has dimension $1 \times C$; $\sigma$ is the Sigmoid activation function, through which the weight vector $w$ corresponding to each feature layer in $Z$ is finally obtained;

the specific formula of the Reweight stage is:

$$\tilde{x}_c = F_{scale}(u_c, w_c) = w_c \cdot u_c \qquad (4)$$

wherein $F_{scale}$ denotes the weight assignment operation, $u_c$ denotes the $c$-th feature layer in the feature map $U$, $w_c$ denotes the element at the $c$-th position of the weight vector $w$, and $\cdot$ denotes multiplication, i.e. $w_c$ is multiplied with each element of the feature layer $u_c$.
SE-VGG16 network architecture: the two structures above are effectively combined and added into the original VGG16 network to form the new convolutional neural network model SE-VGG16. The detailed model structure is shown in Fig. 4: the leftmost column gives the names of the convolutional layers, pooling layers, fully connected layers, and classifier, with the SE-Block module labeled behind each convolutional layer to which it is added; the middle column gives the size and number of convolution kernels; and the rightmost column gives the output size and channel number of the feature map. The final network structure contains 13 convolutional layers, 8 SE Block modules, 5 pooling layers, 3 fully connected layers, and a final Softmax layer.
Setting hyper-parameters, training the network: a deep learning framework Caffe is selected to build a network, a random gradient descent method is selected to optimize a network model, the learning rate is set to be 0.01, the batch-size is set to be 32, and the iteration epoch is set to be 80 times. Fig. 5 shows a graph of loss variation and accuracy during network training.
Testing the network performance: the test set of the produced data set is input into the network in batches for prediction, the network predictions are compared with the ground-truth labels, and the network's recognition performance on expressway congestion is calculated. Fig. 6 shows test results for the three states congested/saturated/unblocked.
Example 2:
the embodiment provides a device for constructing a highway congestion identification model, which comprises:
the data set acquisition and labeling module is used for acquiring a highway traffic monitoring video and storing video frame images as an initial image set, and for labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category: the congested, saturated, or unblocked state;
the network training module is used for training the deep convolutional neural network by taking the initial image set as input and the label set as output;
the deep convolutional neural network comprises a plurality of feature extraction layers and classification layers which are sequentially arranged; the feature extraction layer comprises 5 feature extraction blocks, and the classification layer comprises 3 full-connection layers;
the feature extraction Block comprises a Bottleneck Layer, a SE Block and a pooling Layer which are connected in series;
the Bottleneck Layer comprises 3 convolution layers with the sizes of 1 × 1, 3 × 3 and 1 × 1 which are sequentially connected in series;
the SE Block comprises 1 convolutional layer, which further extracts features from the feature map output by the Bottleneck Layer, and 1 global average pooling layer;
the global average pooling layer corresponds to the Squeeze, Excitation, and Reweight stages, wherein the Squeeze and Excitation stages generate the weight parameters required by the attention mechanism, and the Reweight stage applies those weights to the feature map produced by the convolutional layer of the SE Block, generating feature maps that reflect the different importance of each channel;
the 5 feature extraction blocks of the feature extraction layer are connected in series and then are connected with the 3 full connection layers of the classification layer for final traffic class classification;
and obtaining a highway congestion identification model.
Example 3:
the embodiment provides a highway congestion identification method, which includes inputting a to-be-identified highway video frame image into a highway congestion identification model constructed by a highway congestion identification model construction method, and obtaining an identification result.
Example 4:
in order to verify the effectiveness of the method provided by the invention, the network model is trained and tested by using the self-labeling data set.
During training, the improved SE-VGG16 model is first pre-trained on the ImageNet data set and then trained on the self-labeled data set. Fig. 5(a) shows the loss curve of the validation set and Fig. 5(b) its accuracy curve, where the abscissa is the epoch and the ordinate the corresponding value. Before training on the self-labeled data set started, the validation accuracy was 81.36% and the loss was 0.4762. After 30 training epochs the validation accuracy of the model begins to rise slowly; after 60 epochs both the loss and the validation accuracy stabilize; after 80 epochs the validation accuracy reaches 98.19%, showing that the model has converged well.
As shown in fig. 6, partial test results are given for the congested/saturated/unblocked classes. Table 1 shows the overall experimental test results, and table 2 shows the test results for each type of congestion condition.
Table 1 Overall experimental test results
Table 2 Test results for each type of congestion condition
Table 3 compares the recognition results of the different algorithms: SE-VGG16 improves accuracy by 2.35% over the fine-tuned VGG16. The Bottleneck Layer reduces the number of parameters and extracts features more effectively, and the SE Block module exploits the interdependencies among channels to further improve network performance.
Table 3 Comparison of recognition results of different algorithms
The experimental results show that the expressway congestion identification performed by this method achieves high accuracy, which demonstrates the effectiveness of the proposed method to a certain extent.
Claims (6)
1. A method for constructing a highway congestion identification model is characterized by comprising the following steps:
step 1, acquiring a video frame image of a highway to obtain an initial image set;
labeling each image in the initial image set to obtain a label set, wherein each label comprises a traffic category, the traffic categories comprising a congested, saturated or unblocked state;
step 2, training a deep convolutional neural network by taking the initial image set as input and the label set as output;
the deep convolutional neural network comprises a feature extraction layer and a classification layer which are arranged in sequence; the feature extraction layer comprises 5 feature extraction blocks, and the classification layer comprises 3 fully connected layers;
each feature extraction block comprises a Bottleneck Layer, an SE Block and a pooling layer which are connected in series;
the Bottleneck Layer comprises 3 convolution layers with kernel sizes of 1 × 1, 3 × 3 and 1 × 1 connected in series;
the SE Block comprises 1 convolution layer, for further extracting features from the feature map output by the Bottleneck Layer, and 1 global average pooling layer;
the global average pooling layer corresponds to the Squeeze stage; together, the Squeeze stage and the Excitation stage generate the weight parameters required by the attention mechanism, and the Reweight stage applies these weights to the feature map produced by the convolution layer of the SE Block to generate feature maps that reflect the different importance of each channel;
the 5 feature extraction blocks of the feature extraction layer are connected in series and are followed by the 3 fully connected layers of the classification layer, which perform the final traffic category classification;
and obtaining a highway congestion identification model.
2. The method for constructing the highway congestion identification model according to claim 1, wherein the loss function of the highway congestion identification model is a cross entropy loss function.
3. The method for constructing the highway congestion identification model according to claim 1, wherein the deep learning framework Caffe is used to construct the deep convolutional neural network, the stochastic gradient descent method is used to optimize the highway congestion identification model, the learning rate is set to 0.01, the batch size is set to 32, and the number of training epochs is set to 80.
4. The method according to claim 1, wherein the global average pooling layer corresponds to the Squeeze stage of the Squeeze, Excitation and Reweight stages; in the Squeeze stage, global average pooling converts each feature channel into a real number with a global receptive field; in the Excitation stage, a weight w is generated for each feature channel; in the Reweight stage, each feature channel of the feature map is multiplied by the weight generated in the Excitation stage to obtain feature maps reflecting different degrees of importance; the method specifically comprises the following steps:
first, a standard convolution operation, i.e. a transformation operation F_tr : X → U, is performed, where X denotes the original input feature map, U denotes the output feature map, and F_tr denotes the transformation from the original input feature map to the output feature map;

wherein X ∈ R^(W'×H'×C') and U ∈ R^(W×H×C), R denotes the real number space, W', H' and C' respectively denote the width, height and number of channels of the feature map X, and W, H and C respectively denote the width, height and number of channels of the feature map U; the specific formula is as follows:

u_c = v_c * X = Σ_{s=1}^{C'} v_c^s * x^s    (1)

wherein * denotes the convolution operation; c and s are serial numbers of feature map channels, with value ranges 1 to C and 1 to C' respectively; x^s is the s-th feature layer in the feature map X; v_c is the c-th convolution kernel, whose number of channels is the same as that of the feature map X; v_c^s is the s-th layer of the c-th convolution kernel; and u_c denotes the c-th feature layer in the feature map U;
in the Squeeze stage, global average pooling is performed: the feature map U is taken as a new input and converted into a new output feature map Z, where Z ∈ R^(1×1×C), and 1, 1 and C respectively denote the width, height and number of channels of the feature map Z; the specific formula is as follows:

z_c = F_sq(u_c) = (1/(W × H)) Σ_{i=1}^{W} Σ_{j=1}^{H} u_c(i, j)    (2)

wherein F_sq denotes the global average pooling operation, u_c(i, j) denotes the element at row i, column j of the c-th feature layer in the feature map U, and z_c denotes the c-th feature layer in the feature map Z;
in the Excitation stage, a weight calculation is performed on the feature map Z output by the global average pooling of the Squeeze stage to obtain the weight vector w corresponding to each feature layer in the feature map Z, with the formula:

w = F_ex(Z, W) = σ(W_2 δ(W_1 Z))    (3)

wherein F_ex denotes the weight calculation operation; W_1 ∈ R^((C/r)×C) and W_2 ∈ R^(C×(C/r)) are two fully connected layers, and W denotes the fully connected layer group formed by W_1 and W_2; W_1 Z is the first fully connected operation, whose output vector has dimension 1 × 1 × C/r, where r is a scaling parameter used mainly to reduce the dimension and the amount of calculation; δ is the ReLU activation function; W_2 δ(W_1 Z) is the second fully connected operation, whose output vector has dimension 1 × 1 × C; σ is the Sigmoid activation function, and applying σ finally yields the weight vector w corresponding to each feature layer in the feature map Z;
the specific formula of the Reweight stage is as follows:

ũ_c = F_scale(u_c, w_c) = w_c · u_c    (4)

wherein F_scale denotes the weight assignment operation, u_c denotes the c-th feature layer in the feature map U, w_c denotes the c-th element of the weight vector w, and · denotes multiplication, i.e. w_c is multiplied by every element of the feature layer u_c.
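As an illustration (not part of the claims), the Squeeze, Excitation and Reweight computations described above can be evaluated directly in NumPy. The matrices W1 and W2 below are random placeholders standing in for the trained fully connected layers, and the preceding convolution step is omitted:

```python
import numpy as np

def se_reweight(U, W1, W2):
    """Squeeze-Excitation-Reweight on a feature map U of shape (W, H, C).

    W1 has shape (C//r, C) and W2 has shape (C, C//r); here they are
    illustrative random values, not trained parameters.
    """
    # Squeeze: global average pooling over spatial dims -> z of shape (C,)
    z = U.mean(axis=(0, 1))
    # Excitation: w = sigmoid(W2 @ relu(W1 @ z)), one weight per channel
    relu = lambda v: np.maximum(v, 0.0)
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    w = sigmoid(W2 @ relu(W1 @ z))
    # Reweight: scale each feature layer u_c by its weight w_c
    return U * w  # broadcasts w over the channel axis

rng = np.random.default_rng(0)
C, r = 8, 4                                # channels and scaling parameter
U = rng.standard_normal((6, 6, C))         # toy feature map
W1 = rng.standard_normal((C // r, C))      # dimension-reducing FC layer
W2 = rng.standard_normal((C, C // r))      # dimension-restoring FC layer
V = se_reweight(U, W1, W2)
```

Because the Sigmoid output lies in (0, 1), each channel of the result is the original channel scaled by a per-channel importance weight, which is exactly the channel-attention effect the claim describes.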
5. A device for constructing a highway congestion identification model is characterized by comprising the following steps:
the data set acquisition and labeling module is used for acquiring a highway traffic monitoring video and storing video frame images as an initial image set, and for labeling each image in the initial image set to obtain a label set, wherein each label comprises a traffic category, the traffic categories comprising a congested, saturated or unblocked state;
the network training module is used for training the deep convolutional neural network by taking the initial image set as input and the label set as output;
the deep convolutional neural network comprises a feature extraction layer and a classification layer which are arranged in sequence; the feature extraction layer comprises 5 feature extraction blocks, and the classification layer comprises 3 fully connected layers;
each feature extraction block comprises a Bottleneck Layer, an SE Block and a pooling layer which are connected in series;
the Bottleneck Layer comprises 3 convolution layers with kernel sizes of 1 × 1, 3 × 3 and 1 × 1 connected in series;
the SE Block comprises 1 convolution layer, for further extracting features from the feature map output by the Bottleneck Layer, and 1 global average pooling layer;
the global average pooling layer corresponds to the Squeeze stage; together, the Squeeze stage and the Excitation stage generate the weight parameters required by the attention mechanism, and the Reweight stage applies these weights to the feature map produced by the convolution layer of the SE Block to generate feature maps that reflect the different importance of each channel;
the 5 feature extraction blocks of the feature extraction layer are connected in series and are followed by the 3 fully connected layers of the classification layer, which perform the final traffic category classification;
and obtaining a highway congestion identification model.
6. A highway congestion identification method, characterized in that a highway video frame image to be identified is input into a highway congestion identification model constructed by the highway congestion identification model construction method according to any one of claims 1 to 4, and an identification result is obtained.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010831009.2A CN112101117A (en) | 2020-08-18 | 2020-08-18 | Expressway congestion identification model construction method and device and identification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010831009.2A CN112101117A (en) | 2020-08-18 | 2020-08-18 | Expressway congestion identification model construction method and device and identification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112101117A true CN112101117A (en) | 2020-12-18 |
Family
ID=73753918
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010831009.2A Pending CN112101117A (en) | 2020-08-18 | 2020-08-18 | Expressway congestion identification model construction method and device and identification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112101117A (en) |
2020-08-18: CN202010831009.2A patent/CN112101117A/en active Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106297297A (en) * | 2016-11-03 | 2017-01-04 | 成都通甲优博科技有限责任公司 | Traffic jam judging method based on degree of depth study |
US20190355128A1 (en) * | 2017-01-06 | 2019-11-21 | Board Of Regents, The University Of Texas System | Segmenting generic foreground objects in images and videos |
CN109447962A (en) * | 2018-10-22 | 2019-03-08 | 天津工业大学 | A kind of eye fundus image hard exudate lesion detection method based on convolutional neural networks |
AU2018102037A4 (en) * | 2018-12-09 | 2019-01-17 | Ge, Jiahao Mr | A method of recognition of vehicle type based on deep learning |
CN109858495A (en) * | 2019-01-16 | 2019-06-07 | 五邑大学 | A kind of feature extracting method, device and its storage medium based on improvement convolution block |
CN110796177A (en) * | 2019-10-10 | 2020-02-14 | 温州大学 | Method for effectively reducing neural network overfitting in image classification task |
Non-Patent Citations (2)
Title |
---|
JIE HU ET AL.: "Squeeze-and-Excitation Networks", arXiv [cs.CV], 16 May 2019 (2019-05-16), pages 1 - 13 *
ZHAO Ming et al.: "Intelligent Systems and Technology Series: Natural Language Processing Based on Deep Learning", China Machine Press, pages: 71 - 72 *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112991719A (en) * | 2021-01-28 | 2021-06-18 | 北京奥泽尔科技发展有限公司 | Traffic congestion prediction method and system based on congestion portrait |
CN112991719B (en) * | 2021-01-28 | 2022-05-24 | 北京奥泽尔科技发展有限公司 | Traffic congestion prediction method and system based on congestion portrait |
CN113011500A (en) * | 2021-03-22 | 2021-06-22 | 华南理工大学 | Virtual reality scene data set classification method, system, device and medium |
CN113011500B (en) * | 2021-03-22 | 2023-08-22 | 华南理工大学 | Classification method, system, equipment and medium for virtual reality scene data set |
CN113191283A (en) * | 2021-05-08 | 2021-07-30 | 河北工业大学 | Driving path decision method based on emotion change of on-road travelers |
CN113569734A (en) * | 2021-07-28 | 2021-10-29 | 山东力聚机器人科技股份有限公司 | Image identification and classification method and device based on feature recalibration |
CN114429618A (en) * | 2022-01-06 | 2022-05-03 | 电子科技大学 | Congestion identification method based on improved AlexNet network model |
CN116612388A (en) * | 2023-07-17 | 2023-08-18 | 新疆华屹能源发展有限公司 | Blocking removing method and system for oil production well |
CN116612388B (en) * | 2023-07-17 | 2023-09-19 | 新疆华屹能源发展有限公司 | Blocking removing method and system for oil production well |
CN117152973A (en) * | 2023-10-27 | 2023-12-01 | 贵州宏信达高新科技有限责任公司 | Expressway real-time flow monitoring method and system based on ETC portal data |
CN117152973B (en) * | 2023-10-27 | 2024-01-05 | 贵州宏信达高新科技有限责任公司 | Expressway real-time flow monitoring method and system based on ETC portal data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112101117A (en) | Expressway congestion identification model construction method and device and identification method | |
CN110188705B (en) | Remote traffic sign detection and identification method suitable for vehicle-mounted system | |
CN110111335B (en) | Urban traffic scene semantic segmentation method and system for adaptive countermeasure learning | |
CN111563508B (en) | Semantic segmentation method based on spatial information fusion | |
WO2022083784A1 (en) | Road detection method based on internet of vehicles | |
CN114202672A (en) | Small target detection method based on attention mechanism | |
CN111461083A (en) | Rapid vehicle detection method based on deep learning | |
CN108537824B (en) | Feature map enhanced network structure optimization method based on alternating deconvolution and convolution | |
CN109784183B (en) | Video saliency target detection method based on cascade convolution network and optical flow | |
CN112785848B (en) | Traffic data prediction method and system | |
CN113255589B (en) | Target detection method and system based on multi-convolution fusion network | |
CN112581409B (en) | Image defogging method based on end-to-end multiple information distillation network | |
CN111461129B (en) | Context prior-based scene segmentation method and system | |
CN115731533A (en) | Vehicle-mounted target detection method based on improved YOLOv5 | |
CN112364855A (en) | Video target detection method and system based on multi-scale feature fusion | |
CN113780132A (en) | Lane line detection method based on convolutional neural network | |
CN111832453A (en) | Unmanned scene real-time semantic segmentation method based on double-path deep neural network | |
CN112819000A (en) | Streetscape image semantic segmentation system, streetscape image semantic segmentation method, electronic equipment and computer readable medium | |
CN114267025A (en) | Traffic sign detection method based on high-resolution network and light-weight attention mechanism | |
CN116206306A (en) | Inter-category characterization contrast driven graph roll point cloud semantic annotation method | |
CN112766378A (en) | Cross-domain small sample image classification model method focusing on fine-grained identification | |
CN113011308A (en) | Pedestrian detection method introducing attention mechanism | |
CN112084897A (en) | Rapid traffic large-scene vehicle target detection method of GS-SSD | |
CN113361528B (en) | Multi-scale target detection method and system | |
CN114639067A (en) | Multi-scale full-scene monitoring target detection method based on attention mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20201218 |