CN112101117A - Expressway congestion identification model construction method and device and identification method - Google Patents

Info

Publication number
CN112101117A
Authority
CN
China
Prior art keywords
layer
feature
stage
weight
highway
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010831009.2A
Other languages
Chinese (zh)
Inventor
刘妮
唐心瑶
崔华
袁鸽鸽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changan University
Original Assignee
Changan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changan University filed Critical Changan University
Priority to CN202010831009.2A priority Critical patent/CN112101117A/en
Publication of CN112101117A publication Critical patent/CN112101117A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/54: Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a device for constructing a highway congestion identification model, and an identification method. The construction method comprises: step 1, acquiring highway video frame images to obtain an initial image set, and labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category, namely congested, saturated or free-flowing; step 2, training a deep convolutional neural network with the initial image set as input and the label set as output, wherein the network structure builds on VGG-16, incorporates the Bottleneck Layer (a classic structure from the ResNet network) and the Squeeze and Excitation (SE) Block, and thereby introduces an attention mechanism to form the SE-VGG16 classification network. The method accurately identifies the traffic congestion state of a highway, is applicable to congestion identification across a variety of traffic scenes and camera viewing angles, is end-to-end, and is simpler to implement and higher in identification accuracy.

Description

Expressway congestion identification model construction method and device and identification method
Technical Field
The invention belongs to the technical field of intelligent traffic, and particularly relates to a method and a device for constructing a highway congestion identification model and an identification method.
Background
Traffic congestion detection is important for monitoring traffic conditions and optimizing road network performance. Early traffic monitoring systems used loop detectors to count passing vehicles and measure traffic-flow occupancy, but they could provide only limited traffic information. To improve accuracy, researchers have detected traffic jams by combining route maps with GPS data collected from GPS trackers or smartphones. However, these methods either damage the road surface or rely on specially built and deployed resources, and their data is difficult to obtain.
As the installation cost of surveillance cameras keeps falling, the large number of cameras in a road network generates massive amounts of monitoring data every day. Extracting the required traffic flow parameters from this video data is of practical significance for detecting traffic congestion, and it causes no damage to the road. Traditional methods detect and track vehicles in the video by image processing and then compute traffic flow parameters; they are not robust to occlusion, which easily leads to miscalculated parameters and degrades congestion recognition. In recent years, convolutional neural networks have achieved excellent results in image classification and recognition; they learn image features automatically and are robust to translation, scaling and rotation. Some researchers have therefore applied convolutional neural networks to traffic congestion identification. One approach estimates traffic flow density with a network, but it is sensitive to changes in camera viewing angle. Another computes the congestion level directly with a classification network, but the data used so far covers only specific cities and camera viewing angles, so its generalization is limited.
Disclosure of Invention
In view of the defects and shortcomings of the prior art, the invention provides a method and a device for constructing a highway congestion identification model, and an identification method, which achieve end-to-end identification and overcome shortcomings of the prior art such as low efficiency.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for constructing a highway congestion identification model comprises the following steps:
step 1, acquiring a video frame image of a highway to obtain an initial image set;
labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category, namely congested, saturated or free-flowing;
step 2, training a deep convolutional neural network by taking the initial image set as input and the label set as output;
the deep convolutional neural network comprises feature extraction layers and classification layers arranged in sequence; the feature extraction layers comprise 5 feature extraction blocks, and the classification layers comprise 3 fully connected layers;
each feature extraction block comprises a Bottleneck Layer, an SE Block and a pooling layer connected in series;
the Bottleneck Layer comprises 3 convolutional layers with kernel sizes of 1 × 1, 3 × 3 and 1 × 1 connected in series in that order;
the SE Block comprises 1 convolutional layer, which performs further feature extraction on the feature map output by the Bottleneck Layer, and 1 global average pooling layer;
the global average pooling layer corresponds to the Squeeze stage, the Excitation stage and the Reweight stage, wherein the Squeeze stage and the Excitation stage generate the weight parameters required by the attention mechanism, and the Reweight stage applies those weights to the feature map produced by the convolutional layer of the SE Block, yielding feature maps that reflect the differing importance of the channels;
the 5 feature extraction blocks of the feature extraction layers are connected in series and followed by the 3 fully connected layers of the classification layers for the final traffic category classification;
and obtaining a highway congestion identification model.
The invention also comprises the following technical characteristics:
specifically, the loss function in the highway congestion identification model adopts a cross entropy loss function.
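As a minimal sketch of the cross-entropy loss named above, the following assumes the network has already produced a probability for each of the three traffic categories. The class order, the function names and the example probabilities are illustrative assumptions, not values from the patent.

```python
import math

# Assumed class order; the patent does not fix an index for each category.
CLASSES = ("congested", "saturated", "free-flowing")


def cross_entropy(probs, true_index, eps=1e-12):
    """Cross-entropy loss for one sample: -log of the probability of the true class."""
    return -math.log(max(probs[true_index], eps))


def mean_cross_entropy(batch_probs, batch_labels):
    """Average the per-sample losses over a batch."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, batch_labels)]
    return sum(losses) / len(losses)


# Made-up predictions for two images whose true classes are 0 and 1.
batch_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
batch_labels = [0, 1]
loss = mean_cross_entropy(batch_probs, batch_labels)
```

The loss falls toward zero as the probability assigned to the labeled class approaches 1, which is why it suits a three-way classifier of this kind.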
Specifically, the deep convolutional neural network is built with the deep learning framework Caffe, and the highway congestion identification model is optimized by stochastic gradient descent, with the learning rate set to 0.01, the batch size set to 32, and the number of training epochs set to 80.
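The update rule behind these settings can be sketched as follows. This is plain stochastic gradient descent in pure Python, not Caffe code; the patent does not state whether momentum or weight decay is used, so neither is modeled, and the weights and gradients are made-up numbers.

```python
# Hyperparameters as stated in the text.
LEARNING_RATE = 0.01
BATCH_SIZE = 32
EPOCHS = 80


def sgd_step(weights, grads, lr=LEARNING_RATE):
    """One plain SGD update: move each weight against its mini-batch gradient."""
    return [w - lr * g for w, g in zip(weights, grads)]


# Hypothetical weights and gradients averaged over one mini-batch of 32 images.
weights = [0.5, -0.3]
grads = [1.0, -2.0]
updated = sgd_step(weights, grads)  # approximately [0.49, -0.28]
```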
Specifically, the global average pooling layer corresponds to the Squeeze stage, the Excitation stage and the Reweight stage: in the Squeeze stage, global average pooling converts each feature channel into a single real number with a global receptive field; in the Excitation stage, a weight w is generated for each feature channel; in the Reweight stage, each feature channel is multiplied by the weight generated for it in the Excitation stage, yielding feature maps that reflect the differing importance of the channels. The specific steps are as follows:
first, a standard convolution operation, i.e. a transformation $F_{tr}: X \to U$, is performed, where $X$ denotes the original input feature map, $U$ denotes the output feature map, and $F_{tr}$ denotes the transformation from the original input feature map to the output feature map;
wherein $X \in \mathbb{R}^{W' \times H' \times C'}$ and $U \in \mathbb{R}^{W \times H \times C}$; $\mathbb{R}$ denotes the real number space, $W'$, $H'$, $C'$ denote the width, height and number of channels of feature map $X$, and $W$, $H$, $C$ denote the width, height and number of channels of feature map $U$; the specific formula is:
$$u_c = v_c * X = \sum_{s=1}^{C'} v_c^s * x_s \quad (1)$$
wherein $*$ denotes the convolution operation; $c$ and $s$ are feature map channel indices ranging from 1 to $C$ and from 1 to $C'$ respectively; $x_s$ is the $s$-th feature layer of feature map $X$; $v_c$ is the $c$-th convolution kernel, which has the same number of channels as feature map $X$; $v_c^s$ is the $s$-th layer of the $c$-th convolution kernel; and $u_c$ denotes the $c$-th feature layer of feature map $U$;
in the Squeeze stage, global average pooling is performed: the feature map $U$ is taken as the new input and converted into a new output feature map $Z$, where $Z \in \mathbb{R}^{1 \times 1 \times C}$ and 1, 1 and $C$ denote the width, height and number of channels of feature map $Z$; the specific formula is:
$$z_c = F_{sq}(u_c) = \frac{1}{W \times H} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j) \quad (2)$$
wherein $F_{sq}$ denotes the global average pooling operation, $u_c(i, j)$ denotes the element in row $i$ and column $j$ of the $c$-th feature layer of feature map $U$, and $z_c$ denotes the $c$-th feature layer of feature map $Z$;
in the Excitation stage, weights are calculated from the feature map $Z$ output by the global average pooling of the Squeeze stage, giving a weight vector $w$ with one entry for each feature layer of $Z$; the formula is:
$$w = F_{ex}(Z, W) = \sigma(W_2(W_1 Z)) \quad (3)$$
wherein $F_{ex}$ denotes the weight calculation operation; $W_1$ and $W_2$ both denote fully connected layers, with dimension $C/r$, and $W$ denotes the group of fully connected layers formed by $W_1$ and $W_2$ together; $W_1 Z$ is the first fully connected operation, whose output vector has dimension $1 \times 1 \times C/r$ and passes through a ReLU activation function, $r$ being a scaling parameter used mainly to reduce the dimension and hence the amount of computation; $W_2(W_1 Z)$ is the second fully connected operation, whose output vector has dimension $1 \times 1 \times C$; $\sigma$ is the Sigmoid activation function, and applying $\sigma$ finally gives the weight vector $w$ corresponding to the feature layers of feature map $Z$;
the specific formula of the Reweight stage is:
$$F_{scale}(u_c, w_c) = w_c \cdot u_c \quad (4)$$
wherein $F_{scale}$ denotes the weight assignment operation, $u_c$ denotes the $c$-th feature layer of feature map $U$, $w_c$ denotes the $c$-th element of the weight vector $w$, and $\cdot$ denotes multiplication, i.e. $w_c$ is multiplied by every element of feature layer $u_c$.
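The Squeeze, Excitation and Reweight stages can be sketched in a few lines of pure Python. This is an illustrative sketch under assumptions, not the patented implementation: the feature map is a nested list of C channels, the fully connected layers have no bias, the ReLU is placed after the first fully connected layer as the text describes (formula (3) does not write it out), and all numbers are made up.

```python
import math


def squeeze(U):
    """Formula (2): global average pooling, one real number per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in U]


def fully_connected(W, v):
    """Matrix-vector product; W is a list of rows (bias omitted by assumption)."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def excitation(z, W1, W2):
    """Formula (3): FC to C/r values, ReLU, FC back to C values, then Sigmoid."""
    hidden = [max(0.0, h) for h in fully_connected(W1, z)]
    return [1.0 / (1.0 + math.exp(-s)) for s in fully_connected(W2, hidden)]


def reweight(U, w):
    """Formula (4): scale every element of channel c by its weight w_c."""
    return [[[w_c * value for value in row] for row in ch] for ch, w_c in zip(U, w)]


def se_block(U, W1, W2):
    w = excitation(squeeze(U), W1, W2)
    return reweight(U, w), w


# Toy input: C = 2 channels of size 2 x 2, scaling parameter r = 2 (hidden size 1).
U = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 0.0], [0.0, 8.0]]]
W1 = [[0.5, 0.5]]    # maps C = 2 values to C / r = 1 value
W2 = [[0.0], [1.0]]  # maps C / r = 1 value back to C = 2 values
z = squeeze(U)
scaled, w = se_block(U, W1, W2)
```

Because the Sigmoid keeps every weight between 0 and 1, each channel is either largely preserved or damped, and the output keeps the shape of the input feature map.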
A highway congestion identification model construction device comprises:
a data set acquisition and labeling module, used for acquiring highway traffic surveillance video and storing video frame images as an initial image set, and for labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category, namely congested, saturated or free-flowing;
a network training module, used for training the deep convolutional neural network by taking the initial image set as input and the label set as output;
the deep convolutional neural network comprises feature extraction layers and classification layers arranged in sequence; the feature extraction layers comprise 5 feature extraction blocks, and the classification layers comprise 3 fully connected layers;
each feature extraction block comprises a Bottleneck Layer, an SE Block and a pooling layer connected in series;
the Bottleneck Layer comprises 3 convolutional layers with kernel sizes of 1 × 1, 3 × 3 and 1 × 1 connected in series in that order;
the SE Block comprises 1 convolutional layer, which performs further feature extraction on the feature map output by the Bottleneck Layer, and 1 global average pooling layer;
the global average pooling layer corresponds to the Squeeze stage, the Excitation stage and the Reweight stage, wherein the Squeeze stage and the Excitation stage generate the weight parameters required by the attention mechanism, and the Reweight stage applies those weights to the feature map produced by the convolutional layer of the SE Block, yielding feature maps that reflect the differing importance of the channels;
the 5 feature extraction blocks of the feature extraction layers are connected in series and followed by the 3 fully connected layers of the classification layers for the final traffic category classification;
and obtaining a highway congestion identification model.
A highway congestion identification method comprises the steps of inputting a highway video frame image to be identified into a highway congestion identification model constructed by the highway congestion identification model construction method, and obtaining an identification result.
Compared with the prior art, the invention has the beneficial technical effects that:
The method is simple to implement, is end-to-end, and can be applied to traffic congestion recognition across a variety of traffic scenes and camera viewing angles, which ensures generality in road monitoring environments and meets the need for accurate judgment of road state in intelligent traffic monitoring systems. Because deep learning methods offer high stability and accuracy, the method has broad applicability.
Drawings
FIG. 1 is an overall flow chart of the method of the present invention;
FIG. 2 is a partial scene graph of a data set collected in accordance with the present invention;
FIG. 3 is a structure diagram of the Bottleneck Layer and the SE Block in the network of the present invention;
FIG. 4 is an overall block diagram of the network SE-VGG16 according to the present invention;
FIG. 5 is a graph of loss variation and accuracy during network training according to an embodiment of the present invention;
FIG. 6 shows partial test results for the congested/saturated/free-flowing states according to an embodiment of the present invention.
Detailed Description
The following describes in detail specific embodiments of the present invention. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
The invention builds a highway traffic congestion data set from a large number of collected highway surveillance video frames and designs a classification network whose structure builds on VGG-16 and incorporates the Bottleneck structure (a classic structure from the ResNet network) and the Squeeze and Excitation (SE) block, so that the highway congestion state can be identified effectively.
Example 1:
As shown in FIGS. 1 to 6, this embodiment discloses a method for constructing a highway congestion identification model, which comprises the following steps:
step 1, acquiring a video frame image of a highway to obtain an initial image set;
labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category, namely congested, saturated or free-flowing;
specifically, in this embodiment, the highway video frame image data comes from the Hangzhou transportation department and the Xi'an Ring Expressway branch, and mainly covers the Hangzhou Jinqu Expressway, the Xi'ang Expressway and the Xi'an Ring Expressway; a total of 900 videos and 674 daytime scenes were shot with fixed cameras. Each video is 1 to 10 minutes long. One picture is taken from every 200 frames of the original video and put into the data set, and each picture in the data set is labeled with one of three states: congested, saturated or free-flowing. Borderline cases whose traffic category is hard to determine are preferentially labeled as saturated. In the final data set, the training set contains 21209 samples (7269 congested from 371 scenes, 6559 saturated from 278 scenes, 6921 free-flowing from 407 scenes), the validation set contains 3661 samples (1283 congested from 87 scenes, 1157 saturated from 56 scenes, 1221 free-flowing from 90 scenes), and the test set contains 6101 samples (2137 congested from 158 scenes, 1929 saturated from 132 scenes, 2035 free-flowing from 107 scenes). FIG. 2 shows some of the scenes in the data set.
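The sampling rule and the split sizes above can be checked with a short sketch. The starting frame (index 0), the 25 fps frame rate of the example clip and all function names are assumptions for illustration; the per-class counts are copied as printed. Note that the printed training-set class counts sum to 20749 rather than the stated 21209; the sketch only reports that gap and does not resolve it.

```python
def sampled_frame_indices(total_frames, step=200):
    """Indices of the frames kept when one frame is taken from every `step` frames."""
    return list(range(0, total_frames, step))


# Per-class image counts (congested, saturated, free-flowing) as printed.
SPLITS = {
    "training": {"stated_total": 21209, "classes": (7269, 6559, 6921)},
    "validation": {"stated_total": 3661, "classes": (1283, 1157, 1221)},
    "test": {"stated_total": 6101, "classes": (2137, 1929, 2035)},
}


def tally(splits):
    """Return (sum of class counts, stated total) for each split."""
    return {name: (sum(s["classes"]), s["stated_total"]) for name, s in splits.items()}


# Hypothetical 1-minute clip at an assumed 25 frames per second (1500 frames).
frames = sampled_frame_indices(1500)
report = tally(SPLITS)
```

For the assumed clip this keeps 8 frames (indices 0, 200, ..., 1400), so a 1 to 10 minute video contributes on the order of ten to a hundred images at that frame rate.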
Step 2, training a deep convolutional neural network by taking the initial image set as input and the label set as output;
the deep convolutional neural network comprises feature extraction layers and classification layers arranged in sequence; the feature extraction layers comprise 5 feature extraction blocks, and the classification layers comprise 3 fully connected layers;
each feature extraction block comprises a Bottleneck Layer, an SE Block and a pooling layer connected in series; the Bottleneck Layer comprises 3 convolutional layers with kernel sizes of 1 × 1, 3 × 3 and 1 × 1 connected in series in that order; the SE Block comprises 1 convolutional layer, which performs further feature extraction on the feature map output by the Bottleneck Layer, and 1 global average pooling layer; the global average pooling layer corresponds to the Squeeze, Excitation and Reweight stages, wherein the Squeeze and Excitation stages generate the weight parameters required by the attention mechanism, and the Reweight stage applies those weights to the feature map produced by the convolutional layer of the SE Block, yielding feature maps that reflect the differing importance of the channels; the 5 feature extraction blocks connected in series are followed by 3 fully connected layers for the final congestion state classification;
and obtaining a highway congestion identification model.
Specifically, a deep convolutional neural network is built whose input is an RGB image of any size. The network structure comprises 13 convolutional layers, 8 SE Block modules, 5 pooling layers, 3 fully connected layers and a final Softmax layer, and it outputs a 1 × 3 vector containing the probabilities of the three states: congested, saturated and free-flowing.
The loss function is the cross-entropy loss, the one most commonly used for classification tasks.
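The final Softmax layer and the 1 × 3 output can be sketched as follows. The class order and the example scores are assumptions made for illustration; only the three-way output and the Softmax step come from the text.

```python
import math

# Assumed class order for the 1 x 3 output vector.
CLASSES = ("congested", "saturated", "free-flowing")


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (max-shifted for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the most probable traffic category and the full probability vector."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs


# Hypothetical outputs of the last fully connected layer for one frame.
label, probs = predict([2.0, 1.0, 0.1])
```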
The Bottleneck Layer uses 1 × 1 convolution kernels mainly to reduce the number of model parameters and the amount of computation: the two 1 × 1 convolutions first reduce and then restore the dimensionality of the feature layer, so that feature extraction can be performed more effectively. FIG. 3(a) shows the original convolutional layer alongside the Bottleneck Layer structure; each Bottleneck Layer is a three-layer structure, where 1 × 1 and 3 × 3 denote convolution kernel sizes and 16 and 64 denote numbers of channels.
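A rough weight count illustrates the saving. The sketch assumes a 64 → 16 → 16 → 64 channel layout for the Bottleneck Layer and a single 3 × 3 convolution from 64 to 64 channels as the baseline; the text only says that 16 and 64 are channel counts in FIG. 3(a), so this exact layout is an assumption, and biases are ignored.

```python
def conv_params(kernel, in_ch, out_ch):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_ch * out_ch


# Baseline: one 3 x 3 convolution keeping 64 channels.
plain = conv_params(3, 64, 64)

# Assumed Bottleneck Layer: 1 x 1 reduces 64 -> 16, 3 x 3 keeps 16, 1 x 1 restores 16 -> 64.
bottleneck = (
    conv_params(1, 64, 16)
    + conv_params(3, 16, 16)
    + conv_params(1, 16, 64)
)

saving_ratio = plain / bottleneck
```

Under these assumptions the plain layer has 36864 weights and the bottleneck 4352, roughly an 8.5-fold reduction.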
In the Squeeze and Excitation (SE) Block, Squeeze means compression and Excitation means excitation. Conventional convolution does not fully exploit the dependencies between channels: each convolution kernel learns within a local receptive field, so information outside that region of a feature map cannot be used. The SE Block adds an attention mechanism where the receptive field of the lower network layers is small, increasing the weights of the effective parts of each feature layer and suppressing the weights of ineffective or low-value parts, so that the model can use global information during training and train to a better result.
The SE Block can be divided into three stages: Squeeze, Excitation and Reweight (weight adjustment). FIG. 3(b) shows the structure of the SE Block. In the Squeeze stage, global average pooling converts each two-dimensional feature channel into a single real number with a global receptive field; in the Excitation stage, a weight w is generated for each feature channel; in the Reweight stage, each feature channel is multiplied by the weight generated for it in the Excitation stage, yielding feature maps that reflect the differing importance of the channels.
As shown in FIG. 3(b), first, a standard convolution operation, i.e. a transformation $F_{tr}: X \to U$, is performed, where $X$ denotes the original input feature map, $U$ denotes the output feature map, and $F_{tr}$ denotes the transformation from the original input feature map to the output feature map;
wherein $X \in \mathbb{R}^{W' \times H' \times C'}$ and $U \in \mathbb{R}^{W \times H \times C}$; $\mathbb{R}$ denotes the real number space, $W'$, $H'$, $C'$ denote the width, height and number of channels of feature map $X$, and $W$, $H$, $C$ denote the width, height and number of channels of feature map $U$; the specific formula is:
$$u_c = v_c * X = \sum_{s=1}^{C'} v_c^s * x_s \quad (1)$$
wherein $*$ denotes the convolution operation; $c$ and $s$ are feature map channel indices ranging from 1 to $C$ and from 1 to $C'$ respectively; $x_s$ is the $s$-th feature layer of feature map $X$; $v_c$ is the $c$-th convolution kernel, which has the same number of channels as feature map $X$; $v_c^s$ is the $s$-th layer of the $c$-th convolution kernel; and $u_c$ denotes the $c$-th feature layer of feature map $U$;
in the Squeeze stage, global average pooling is performed: the feature map $U$ is taken as the new input and converted into a new output feature map $Z$, where $Z \in \mathbb{R}^{1 \times 1 \times C}$ and 1, 1 and $C$ denote the width, height and number of channels of feature map $Z$; the specific formula is:
$$z_c = F_{sq}(u_c) = \frac{1}{W \times H} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j) \quad (2)$$
wherein $F_{sq}$ denotes the global average pooling operation, $u_c(i, j)$ denotes the element in row $i$ and column $j$ of the $c$-th feature layer of feature map $U$, and $z_c$ denotes the $c$-th feature layer of feature map $Z$;
in the Excitation stage, weights are calculated from the feature map $Z$ output by the global average pooling of the Squeeze stage, giving a weight vector $w$ with one entry for each feature layer of $Z$; the formula is:
$$w = F_{ex}(Z, W) = \sigma(W_2(W_1 Z)) \quad (3)$$
wherein $F_{ex}$ denotes the weight calculation operation; $W_1$ and $W_2$ both denote fully connected layers, with dimension $C/r$, and $W$ denotes the group of fully connected layers formed by $W_1$ and $W_2$ together; $W_1 Z$ is the first fully connected operation, whose output vector has dimension $1 \times 1 \times C/r$ and passes through a ReLU activation function, $r$ being a scaling parameter used mainly to reduce the dimension and hence the amount of computation; $W_2(W_1 Z)$ is the second fully connected operation, whose output vector has dimension $1 \times 1 \times C$; $\sigma$ is the Sigmoid activation function, and applying $\sigma$ finally gives the weight vector $w$ corresponding to the feature layers of feature map $Z$;
the specific formula of the Reweight stage is:
$$F_{scale}(u_c, w_c) = w_c \cdot u_c \quad (4)$$
wherein $F_{scale}$ denotes the weight assignment operation, $u_c$ denotes the $c$-th feature layer of feature map $U$, $w_c$ denotes the $c$-th element of the weight vector $w$, and $\cdot$ denotes multiplication, i.e. $w_c$ is multiplied by every element of feature layer $u_c$.
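To keep track of the tensor sizes through the three stages, the following sketch lists the shape at each step and counts the weights of the two fully connected layers. The patent does not give a value for the scaling parameter r or the feature map size, so r = 16 and a 56 × 56 × 64 feature map are purely hypothetical; biases are ignored.

```python
def se_shapes(W, H, C, r):
    """Shapes (width, height, channels) at each stage of the SE Block."""
    if C % r != 0:
        raise ValueError("C must be divisible by r in this sketch")
    return {
        "U": (W, H, C),
        "Z (Squeeze)": (1, 1, C),
        "W1 Z (first FC, ReLU)": (1, 1, C // r),
        "w (second FC, Sigmoid)": (1, 1, C),
        "Reweight output": (W, H, C),
    }


def excitation_weight_count(C, r):
    """Weights in the two fully connected layers: C*(C/r) + (C/r)*C."""
    hidden = C // r
    return C * hidden + hidden * C


# Hypothetical feature map and scaling parameter.
shapes = se_shapes(W=56, H=56, C=64, r=16)
extra_weights = excitation_weight_count(C=64, r=16)
```

With these assumed values the first fully connected layer compresses 64 channel statistics to 4, and the whole Excitation stage adds 512 weights, which is small next to the convolutional layers around it.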
SE-VGG16 network architecture: the two structures above are combined and added to the original VGG16 network to form a new convolutional neural network model, SE-VGG16. The detailed model structure is shown in FIG. 4: the leftmost column gives the names of the convolutional layers, pooling layers, fully connected layers and classifier, with the added SE Block modules marked after the convolutional layers; the middle column gives the size and number of convolution kernels; the rightmost column gives the output size and number of channels of the feature map. The final network structure contains 13 convolutional layers, 8 SE Block modules, 5 pooling layers, 3 fully connected layers and a final Softmax layer.
Setting hyper-parameters and training the network: the network is built with the deep learning framework Caffe and the network model is optimized by stochastic gradient descent, with the learning rate set to 0.01, the batch size set to 32, and the number of training epochs set to 80. FIG. 5 shows the loss and accuracy curves during network training.
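Caffe solvers are usually configured by iteration count (for example through `max_iter`) rather than by epochs, so the stated settings can be converted as in the sketch below. It uses the stated training-set size of 21209 and assumes the last partial batch of each epoch counts as one iteration; the patent does not give the actual solver file.

```python
import math

# Values stated in the text.
TRAIN_SAMPLES = 21209
BATCH_SIZE = 32
EPOCHS = 80

# Assumption: a final partial batch still counts as one iteration.
iterations_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_iterations = iterations_per_epoch * EPOCHS
```

Under these assumptions one epoch is 663 iterations and the full run is 53040 iterations.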
Testing the network performance: the test set of the constructed data set is fed into the network in batches for prediction, the network predictions are compared with the true labels, and the recognition performance of the network on highway congestion is calculated. FIG. 6 shows test results for the three states: congested, saturated and free-flowing.
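Comparing predictions with true labels can be sketched with a confusion matrix, from which overall accuracy and per-class recall follow. The labels below are a made-up toy example and not the patent's test results; the class indices (0 congested, 1 saturated, 2 free-flowing) are an assumed encoding.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes=3):
    """m[t][p] counts samples of true class t that were predicted as class p."""
    m = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        m[t][p] += 1
    return m


def accuracy(m):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


def per_class_recall(m):
    """For each true class, the fraction of its samples predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(m)]


# Toy example: six images, one congested image misread as saturated.
true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 1, 1, 1, 2, 2]
matrix = confusion_matrix(true_labels, predicted_labels)
overall = accuracy(matrix)
recalls = per_class_recall(matrix)
```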
Example 2:
the embodiment provides a device for constructing a highway congestion identification model, which comprises:
a data set acquisition and labeling module, used for acquiring highway traffic surveillance video and storing video frame images as an initial image set, and for labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category, namely congested, saturated or free-flowing;
a network training module, used for training the deep convolutional neural network by taking the initial image set as input and the label set as output;
the deep convolutional neural network comprises feature extraction layers and classification layers arranged in sequence; the feature extraction layers comprise 5 feature extraction blocks, and the classification layers comprise 3 fully connected layers;
each feature extraction block comprises a Bottleneck Layer, an SE Block and a pooling layer connected in series;
the Bottleneck Layer comprises 3 convolutional layers with kernel sizes of 1 × 1, 3 × 3 and 1 × 1 connected in series in that order;
the SE Block comprises 1 convolutional layer, which performs further feature extraction on the feature map output by the Bottleneck Layer, and 1 global average pooling layer;
the global average pooling layer corresponds to the Squeeze stage, the Excitation stage and the Reweight stage, wherein the Squeeze stage and the Excitation stage generate the weight parameters required by the attention mechanism, and the Reweight stage applies those weights to the feature map produced by the convolutional layer of the SE Block, yielding feature maps that reflect the differing importance of the channels;
the 5 feature extraction blocks of the feature extraction layers are connected in series and followed by the 3 fully connected layers of the classification layers for the final traffic category classification;
and obtaining a highway congestion identification model.
Example 3:
the embodiment provides a highway congestion identification method, which includes inputting a to-be-identified highway video frame image into a highway congestion identification model constructed by a highway congestion identification model construction method, and obtaining an identification result.
Example 4:
To verify the effectiveness of the method provided by the invention, the network model is trained and tested on the self-labeled data set.
During training, the improved SE-VGG16 model is first pre-trained on the ImageNet data set and then trained on the self-labeled data set. FIG. 5(a) is the loss curve on the validation set and FIG. 5(b) is the accuracy curve on the validation set; the abscissa is the epoch and the ordinate is the corresponding value. Before model training starts, the validation accuracy is 81.36% and the loss is 0.4762. After 30 epochs, the validation accuracy of the model increases only slowly; after 60 epochs, both the loss and the validation accuracy level off; at 80 epochs, the validation accuracy reaches 98.19%, indicating that the model has converged well.
As shown in fig. 6, the partial test result is a graph of congestion/saturation/clear. Table 1 shows the overall experimental test results, and table 2 shows the test results of each type of congestion condition.
TABLE 1 test results of the experiments
Figure BDA0002637975580000091
Table 2 test results of each type of congestion
Figure BDA0002637975580000092
Table 3 identifies the comparison results for the different algorithms, and the SE-VGG16 increased the accuracy by 2.35% over the fine tuning VGG 16. The Bottleneck Layer can reduce the number of parameters, can more effectively extract the characteristics, and the SEBlock module can also obviously improve the network performance by utilizing the interdependency among the channels.
Table 3. Comparison of recognition results for different algorithms (table image not reproduced)
The experimental results show that the method identifies expressway congestion with high accuracy, which supports the effectiveness of the proposed method to a certain extent.
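The statement that the Bottleneck Layer reduces the number of parameters can be checked with simple arithmetic. The sketch below compares a single plain 3 × 3 convolution with a 1 × 1, 3 × 3, 1 × 1 bottleneck. The channel counts (256 in and out, 64 inside the bottleneck) are assumptions chosen for illustration, not values taken from the source, and biases are ignored.

```python
def conv_params(k, c_in, c_out):
    """Number of weights in a k x k convolution, ignoring biases."""
    return k * k * c_in * c_out

def bottleneck_params(c_in, c_mid, c_out):
    """1x1 reduce -> 3x3 -> 1x1 expand, as in the Bottleneck Layer."""
    return (conv_params(1, c_in, c_mid)
            + conv_params(3, c_mid, c_mid)
            + conv_params(1, c_mid, c_out))

# Assumed sizes: 256 channels in and out, reduced to 64 inside the bottleneck.
plain = conv_params(3, 256, 256)              # single 3x3 convolution: 589824
bottleneck = bottleneck_params(256, 64, 256)  # three-layer bottleneck: 69632
ratio = plain / bottleneck                    # roughly 8.5x fewer weights
```

Under these assumed sizes the bottleneck needs about one eighth of the weights, because the expensive 3 × 3 convolution operates on the reduced channel count.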

Claims (6)

1. A method for constructing a highway congestion identification model, characterized by comprising the following steps:
step 1, acquiring video frame images of a highway to obtain an initial image set;
labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category, the traffic category being congested, saturated or unblocked;
step 2, training a deep convolutional neural network with the initial image set as input and the label set as output;
the deep convolutional neural network comprises a plurality of feature extraction layers and a classification layer arranged in sequence; the feature extraction layer comprises 5 feature extraction blocks, and the classification layer comprises 3 fully connected layers;
each feature extraction block comprises a Bottleneck Layer, an SE Block and a pooling layer connected in series;
the Bottleneck Layer comprises 3 convolutional layers with kernel sizes of 1 × 1, 3 × 3 and 1 × 1 connected in series in that order;
the SE Block comprises 1 convolutional layer, which performs further feature extraction on the feature map output by the Bottleneck Layer, and 1 global average pooling layer;
the global average pooling layer corresponds to the Squeeze stage, the Excitation stage and the Reweight stage, wherein the Squeeze and Excitation stages generate the weight parameters required by the attention mechanism, and the Reweight stage applies these weights to the feature map produced by the convolutional layer of the SE Block, generating feature maps that reflect the different importance of the channels;
the 5 feature extraction blocks of the feature extraction layer are connected in series and then connected to the 3 fully connected layers of the classification layer for the final traffic category classification;
and obtaining the highway congestion identification model.
2. The method for constructing the highway congestion identification model according to claim 1, wherein the loss function of the highway congestion identification model is a cross-entropy loss function.
3. The method for constructing the highway congestion identification model according to claim 1, wherein the deep learning framework Caffe is used to build the deep convolutional neural network, stochastic gradient descent is used to optimize the highway congestion identification model, the learning rate is set to 0.01, the batch size is set to 32, and the number of training epochs is set to 80.
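For illustration, the cross-entropy loss of claim 2 and a gradient-descent update with the learning rate of claim 3 can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the Caffe configuration used by the authors: it handles a single sample and treats the three class scores (logits) as the trainable parameters, whereas the real network backpropagates the gradient to its layer weights and averages it over a mini-batch of 32. The function names are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Cross-entropy loss for one sample whose true class index is `target`."""
    return -math.log(softmax(logits)[target])

def sgd_step(logits, target, lr=0.01):
    """One gradient-descent update; d(loss)/d(logit_k) = p_k - y_k."""
    probs = softmax(logits)
    grads = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
    return [v - lr * g for v, g in zip(logits, grads)]

logits = [0.0, 0.0, 0.0]                  # three traffic categories
before = cross_entropy(logits, target=0)  # ln(3) for uniform scores
after = cross_entropy(sgd_step(logits, target=0), target=0)
```

After one step the loss for the true class is slightly lower than before, which is the behaviour the optimizer repeats over all mini-batches for 80 epochs.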
4. The method according to claim 1, wherein the global average pooling layer corresponds to the Squeeze, Excitation and Reweight stages: in the Squeeze stage, global average pooling converts each feature channel into a real number with a global receptive field; in the Excitation stage, a weight w is generated for each feature channel; in the Reweight stage, each feature channel corresponding to the Squeeze stage is multiplied by the weight generated in the Excitation stage to obtain feature maps that reflect different degrees of importance; the method specifically comprises the following steps:
first, a standard convolution operation, i.e. a transformation $F_{tr}: X \to U$, is performed, where X denotes the original input feature map, U denotes the output feature map, and $F_{tr}$ denotes the transformation from the original input feature map to the output feature map;
wherein $X \in \mathbb{R}^{W' \times H' \times C'}$ and $U \in \mathbb{R}^{W \times H \times C}$, $\mathbb{R}$ denotes the real number space, W', H' and C' denote the width, height and number of channels of the feature map X, and W, H and C denote the width, height and number of channels of the feature map U; the specific formula is:
$$u_c = v_c * X = \sum_{s=1}^{C'} v_c^s * x^s \qquad (1)$$
wherein * denotes the convolution operation; c and s are feature map channel indices ranging over 1 to C and 1 to C' respectively; $x^s$ is the s-th feature layer of the feature map X; $v_c$ is the c-th convolution kernel, which has the same number of channels as the feature map X; $v_c^s$ is the s-th layer of the c-th convolution kernel; and $u_c$ denotes the c-th feature layer of the feature map U;
global average pooling is performed in the Squeeze stage: the feature map U is taken as a new input and converted into a new output feature map Z, wherein $Z \in \mathbb{R}^{1 \times 1 \times C}$, with 1, 1 and C denoting the width, height and number of channels of the feature map Z; the specific formula is:
$$z_c = F(u_c) = \frac{1}{W \times H} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j) \qquad (2)$$
wherein F denotes the global average pooling operation, $u_c(i, j)$ denotes the element in row i and column j of the c-th feature layer of the feature map U, and $z_c$ denotes the c-th feature layer of the feature map Z;
in the Excitation stage, weights are computed from the feature map Z output by the global average pooling of the Squeeze stage, giving the weight vector w corresponding to each feature layer of the feature map Z; the formula is:
$$w = F_{ex}(Z, W) = \sigma(W_2\,\delta(W_1 Z)) \qquad (3)$$
wherein $F_{ex}$ denotes the weight calculation operation; $W_1$ and $W_2$ both denote fully connected layers with reduced dimension C/r, and W denotes the fully connected layer group formed by $W_1$ and $W_2$ together; $W_1 Z$ is the first fully connected operation, whose output vector has dimension 1 × C/r, where r is a scaling parameter used mainly to reduce the dimension and thereby the amount of computation; δ is the ReLU activation function; $W_2\,\delta(W_1 Z)$ is the second fully connected operation, whose output vector has dimension 1 × C; σ is the Sigmoid activation function, and the σ operation finally yields the weight vector w corresponding to each feature layer of the feature map Z;
the specific formula of the Reweight stage is:
$$F_{scale}(u_c, w_c) = w_c \cdot u_c \qquad (4)$$
wherein $F_{scale}$ denotes the weight assignment operation, $u_c$ denotes the c-th feature layer of the feature map U, $w_c$ denotes the element at the c-th position of the weight vector w, and · denotes multiplication, i.e. every element of the feature layer $u_c$ is multiplied by $w_c$.
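The Squeeze, Excitation and Reweight stages described in claim 4 can be traced on a tiny example. The following is a minimal sketch in plain Python, assuming a ReLU between the two fully connected layers (the claim text mentions a ReLU activation) and omitting biases. The feature map `U`, the weight matrices `W1` and `W2`, and the sizes C = 2, H = W = 2, r = 2 are made-up values for illustration only.

```python
import math

def squeeze(U):
    """Global average pooling: one real number per channel of U (C x H x W)."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in U]

def matvec(M, v):
    """Plain matrix-vector product, used for the fully connected layers."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def excitation(z, W1, W2):
    """w = sigmoid(W2 . relu(W1 . z)); W1 is (C/r) x C, W2 is C x (C/r)."""
    hidden = [max(0.0, h) for h in matvec(W1, z)]
    return [1.0 / (1.0 + math.exp(-o)) for o in matvec(W2, hidden)]

def reweight(U, w):
    """Multiply every element of channel c by its weight w[c]."""
    return [[[w_c * x for x in row] for row in ch] for ch, w_c in zip(U, w)]

# Toy input: C = 2 channels, each 2 x 2; reduction ratio r = 2, so C/r = 1.
U = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 0.0], [0.0, 8.0]]]
W1 = [[0.5, 0.5]]        # hypothetical weights, shape (C/r) x C
W2 = [[1.0], [-1.0]]     # hypothetical weights, shape C x (C/r)

z = squeeze(U)           # channel means: [2.5, 2.0]
w = excitation(z, W1, W2)
U_scaled = reweight(U, w)
```

With these made-up weights the first channel receives a weight close to 0.9 and the second a weight close to 0.1, so the rescaled feature map emphasises the first channel. The output has the same shape as `U`; only the per-channel scale changes.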
5. A device for constructing a highway congestion identification model, characterized by comprising:
a data set acquisition and labeling module, used for acquiring a highway traffic monitoring video and storing video frame images as an initial image set, and for labeling each image in the initial image set to obtain a label set, wherein each label is a traffic category, the traffic category being congested, saturated or unblocked;
a network training module, used for training a deep convolutional neural network with the initial image set as input and the label set as output;
the deep convolutional neural network comprises a plurality of feature extraction layers and a classification layer arranged in sequence; the feature extraction layer comprises 5 feature extraction blocks, and the classification layer comprises 3 fully connected layers;
each feature extraction block comprises a Bottleneck Layer, an SE Block and a pooling layer connected in series;
the Bottleneck Layer comprises 3 convolutional layers with kernel sizes of 1 × 1, 3 × 3 and 1 × 1 connected in series in that order;
the SE Block comprises 1 convolutional layer, which performs further feature extraction on the feature map output by the Bottleneck Layer, and 1 global average pooling layer;
the global average pooling layer corresponds to the Squeeze stage, the Excitation stage and the Reweight stage, wherein the Squeeze and Excitation stages generate the weight parameters required by the attention mechanism, and the Reweight stage applies these weights to the feature map produced by the convolutional layer of the SE Block, generating feature maps that reflect the different importance of the channels;
the 5 feature extraction blocks of the feature extraction layer are connected in series and then connected to the 3 fully connected layers of the classification layer for the final traffic category classification;
and obtaining the highway congestion identification model.
6. A highway congestion identification method, characterized in that a highway video frame image to be identified is input into a highway congestion identification model constructed by the highway congestion identification model construction method according to any one of claims 1 to 4, and an identification result is obtained.
CN202010831009.2A 2020-08-18 2020-08-18 Expressway congestion identification model construction method and device and identification method Pending CN112101117A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010831009.2A CN112101117A (en) 2020-08-18 2020-08-18 Expressway congestion identification model construction method and device and identification method

Publications (1)

Publication Number Publication Date
CN112101117A true CN112101117A (en) 2020-12-18

Family

ID=73753918

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010831009.2A Pending CN112101117A (en) 2020-08-18 2020-08-18 Expressway congestion identification model construction method and device and identification method

Country Status (1)

Country Link
CN (1) CN112101117A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106297297A (en) * 2016-11-03 2017-01-04 成都通甲优博科技有限责任公司 Traffic jam judging method based on degree of depth study
AU2018102037A4 (en) * 2018-12-09 2019-01-17 Ge, Jiahao Mr A method of recognition of vehicle type based on deep learning
CN109447962A (en) * 2018-10-22 2019-03-08 天津工业大学 A kind of eye fundus image hard exudate lesion detection method based on convolutional neural networks
CN109858495A (en) * 2019-01-16 2019-06-07 五邑大学 A kind of feature extracting method, device and its storage medium based on improvement convolution block
US20190355128A1 (en) * 2017-01-06 2019-11-21 Board Of Regents, The University Of Texas System Segmenting generic foreground objects in images and videos
CN110796177A (en) * 2019-10-10 2020-02-14 温州大学 Method for effectively reducing neural network overfitting in image classification task

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIE HU ET AL.: "Squeeze-and-Excitation Networks", arXiv [cs.CV], 16 May 2019 (2019-05-16), pages 1 - 13 *
ZHAO MING ET AL.: "Intelligent Systems and Technology Series: Natural Language Processing Based on Deep Learning", China Machine Press, pages 71 - 72 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112991719A (en) * 2021-01-28 2021-06-18 北京奥泽尔科技发展有限公司 Traffic congestion prediction method and system based on congestion portrait
CN112991719B (en) * 2021-01-28 2022-05-24 北京奥泽尔科技发展有限公司 Traffic congestion prediction method and system based on congestion portrait
CN113011500A (en) * 2021-03-22 2021-06-22 华南理工大学 Virtual reality scene data set classification method, system, device and medium
CN113011500B (en) * 2021-03-22 2023-08-22 华南理工大学 Classification method, system, equipment and medium for virtual reality scene data set
CN113191283A (en) * 2021-05-08 2021-07-30 河北工业大学 Driving path decision method based on emotion change of on-road travelers
CN113569734A (en) * 2021-07-28 2021-10-29 山东力聚机器人科技股份有限公司 Image identification and classification method and device based on feature recalibration
CN114429618A (en) * 2022-01-06 2022-05-03 电子科技大学 Congestion identification method based on improved AlexNet network model
CN116612388A (en) * 2023-07-17 2023-08-18 新疆华屹能源发展有限公司 Blocking removing method and system for oil production well
CN116612388B (en) * 2023-07-17 2023-09-19 新疆华屹能源发展有限公司 Blocking removing method and system for oil production well
CN117152973A (en) * 2023-10-27 2023-12-01 贵州宏信达高新科技有限责任公司 Expressway real-time flow monitoring method and system based on ETC portal data
CN117152973B (en) * 2023-10-27 2024-01-05 贵州宏信达高新科技有限责任公司 Expressway real-time flow monitoring method and system based on ETC portal data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201218