CN112446300A - Method, system and computer storage medium for analyzing traffic density - Google Patents


Info

Publication number
CN112446300A
Authority
CN
China
Prior art keywords
prediction
frames
traffic
density
traffic flow
Prior art date
Legal status
Granted
Application number
CN202011221447.3A
Other languages
Chinese (zh)
Other versions
CN112446300B (en)
Inventor
周文略
黎繁胜
翟懿奎
蒋润锦
张俊亮
刘始匡
李汶睿
陈乐轩
黄俊威
詹英培
梁汝桐
Current Assignee
Wuyi University
Original Assignee
Wuyi University
Priority date
Filing date
Publication date
Application filed by Wuyi University filed Critical Wuyi University
Priority to CN202011221447.3A priority Critical patent/CN112446300B/en
Publication of CN112446300A publication Critical patent/CN112446300A/en
Application granted granted Critical
Publication of CN112446300B publication Critical patent/CN112446300B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V 20/47 Detecting features for summarising video content
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/2431 Multiple classes
    • G06F 18/25 Fusion techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, a system and a computer storage medium for analyzing traffic flow density, and relates to the field of traffic. The traffic flow density analysis method comprises the following steps: collecting traffic flow video of a road section to be analyzed; extracting key frames from the traffic flow video to obtain a traffic flow density image; inputting the traffic flow density image into a trained target detection network and outputting three feature maps of different scales extracted by Darknet-53; performing target prediction on the feature maps of the three different scales through a target detection algorithm and outputting a plurality of prediction boxes, wherein each prediction box comprises prediction box coordinate information and a confidence; and outputting the traffic flow density according to the plurality of prediction boxes and the road length of the road section to be analyzed. The method can improve the accuracy of traffic flow density analysis.

Description

Method, system and computer storage medium for analyzing traffic density
Technical Field
The invention relates to the field of traffic, in particular to a method and a system for analyzing traffic flow density and a computer storage medium.
Background
With the rapid economic development of China, people's quality of life has greatly improved, and the number of people travelling has gradually increased to match it. As travel increases, many roads, scenic spots, large-scale events and meetings are located in sensitive areas, and the rapid growth of traffic flow can lead to unpredictable consequences, including loss of life and property and traffic accidents. Against the background of growing emphasis on security, controlling traffic flow in sensitive areas has become an important subject of the new era. Because the scenes and environments in which vehicle detection technology is applied are complex, particularly when small vehicles are present in the scene or vehicles occlude one another, the accuracy of vehicle density analysis is low.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. Therefore, the invention provides a method and a system for analyzing the density of the traffic flow and a computer storage medium, which can improve the accuracy of the density analysis of the traffic flow.
According to an embodiment of the first aspect of the invention, the method for analyzing the traffic density comprises the following steps:
collecting traffic flow videos of a road section to be analyzed;
extracting key frames of the traffic stream video to obtain a traffic stream density image;
inputting the traffic flow density image into a trained target detection network, and outputting three feature maps with different scales extracted by Darknet-53;
performing target prediction on the feature maps with three different scales through a target detection algorithm, and outputting a plurality of prediction boxes, wherein each prediction box comprises prediction box coordinate information and confidence;
and outputting the traffic flow density according to the plurality of prediction frames and the road length of the road section to be analyzed.
The above embodiments of the present invention provide at least the following advantages: by establishing three feature maps of different scales through Darknet-53, targets of different sizes in the traffic flow density image can be accurately detected, so that the number of vehicles passing through the road section to be analyzed, and hence the traffic flow density information, can be obtained from the number of accurately detected targets. Because the targets can be accurately detected, the accuracy of traffic flow density analysis can be improved.
According to some embodiments of the first aspect of the present invention, the performing, by an object detection algorithm, object prediction on the feature maps at three different scales and outputting a plurality of prediction boxes, where each of the prediction boxes includes prediction box coordinate information and confidence, further includes the following steps:
obtaining a plurality of corresponding prior preselected frames according to the size of the feature map;
respectively carrying out sliding window sampling on pixel points on each feature map through the corresponding prior preselected frame to obtain a prediction frame set, wherein each prediction frame in the prediction frame set comprises prediction frame coordinate information and confidence;
outputting all prediction frames in the prediction frame set.
Therefore, by performing sliding window sampling over each feature map with the prior preselected frames, every pixel point on the feature map can be traversed, yielding for each framed region a score (namely, the confidence) that it contains a target, together with the coordinate information of the prediction frame.
According to some embodiments of the first aspect of the present invention, the performing, by an object detection algorithm, object prediction on the feature maps at three different scales and outputting a plurality of prediction boxes, where each of the prediction boxes includes prediction box coordinate information and a confidence level, includes the following steps:
acquiring a confidence value of each prediction frame in the prediction frame set, and comparing the confidence value with a reference confidence value;
and deleting, according to the comparison result, the prediction frames in the prediction frame set whose confidence does not meet the reference confidence value.
Therefore, by setting the reference confidence value, a part of invalid prior preselected frames can be removed, so that the influence of the invalid prior preselected frames on the traffic density analysis is reduced.
According to some embodiments of the first aspect of the present invention, the outputting the traffic density according to the road length of the road segment to be analyzed and the number of the prediction frames comprises:
fusing the prediction frames through a non-maximum suppression algorithm to obtain target detection frames;
and outputting the traffic density according to the number of the target detection frames and the road length of the road section to be analyzed.
Therefore, by locally searching for the prediction frame with the maximum confidence value, redundant prediction frames of the same target can be removed, improving the accuracy of the traffic flow density estimate.
According to some embodiments of the first aspect of the present invention, the fusing of the prediction frames through the non-maximum suppression algorithm to obtain the target detection frames includes the following steps:
acquiring a target category corresponding to each prediction frame;
classifying all the prediction frames according to the target categories to obtain a prediction frame set corresponding to each category;
sorting each prediction frame set according to the confidence coefficient value of the prediction frames in the prediction frame set;
and respectively performing intersection-over-union (IOU) iteration processing on the prediction frames in each prediction frame set to obtain a plurality of target detection frames.
Therefore, by adopting the idea of the non-maximum suppression algorithm and iterating over the confidence values and intersection-over-union ratios of targets of the same class, the multiple prediction frames belonging to one target can be identified, and the target detection frame obtained.
According to some embodiments of the first aspect of the present invention, the size of the a priori preselected box is obtained by:
acquiring a plurality of traffic flow sample density images and the number of preset clustering central points;
and performing iterative processing on each traffic flow sample density image through a K-means clustering algorithm to obtain a plurality of prior preselected frames, wherein the objective function of the K-means clustering algorithm is set to the ratio of the non-overlapping area between a prior preselected frame and its corresponding ground-truth frame to their union (that is, 1 - IOU).
According to some embodiments of the first aspect of the present invention, the inputting the traffic density image into a trained target detection network and outputting feature maps of three different scales extracted by a Darknet-53 includes the following steps:
carrying out image whitening processing on the traffic flow density image to obtain a traffic flow density whitened image, wherein the pixels of the traffic flow density whitened image have a mean of 0 and a variance of 1;
and inputting the traffic flow density whitened image into the trained target detection network, and outputting three feature maps of different scales obtained by feature extraction through Darknet-53, wherein the input of each neural network layer of the target detection network is processed through a batch normalization algorithm.
According to a second aspect of the present invention, there is provided a traffic density analysis system, comprising:
a video image acquisition terminal, wherein the video image acquisition terminal is used for collecting traffic flow video of a road section to be analyzed;
the image processing module is used for extracting key frames of the traffic stream video to obtain a traffic stream density image;
the target detection network module is used for extracting feature maps from the input traffic flow density image and performing target prediction on the feature maps through a target detection algorithm to obtain a plurality of prediction frames;
and the traffic flow density processing module is used for outputting the traffic flow density according to the plurality of prediction frames and the road length of the road section to be analyzed.
The traffic density analysis system according to the second aspect has all the advantages of the first aspect of the present invention, because it applies the traffic density analysis method according to any one of the first aspect.
According to some embodiments of the second aspect of the present invention, the video image acquisition terminal is disposed on an aircraft. By arranging the video image acquisition terminal on an aircraft, traffic flow video with a wider viewing angle can be acquired.
According to a third aspect of the present invention, there is provided a computer storage medium comprising computer-executable instructions stored thereon for performing the method of traffic density analysis of the embodiments of the first aspect.
The computer storage medium of the third aspect has all the advantages of the first aspect of the present invention, since it applies the method of traffic density analysis of any one of the first aspect.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a diagram of the main steps of a method of traffic density analysis according to an embodiment of the present invention;
FIG. 2 is a diagram of the main steps of target prediction according to an embodiment of the present invention;
FIG. 3 is a step chart of traffic density acquisition according to an embodiment of the present invention;
FIG. 4 is a diagram of the steps of target detection box generation according to an embodiment of the present invention;
FIG. 5 is a diagram of the steps of a priori preselection block acquisition of an embodiment of the invention;
fig. 6 is a system configuration diagram of traffic density analysis according to the embodiment of the present invention.
Reference numerals:
the system comprises a video image acquisition terminal 100, an image processing module 200, a target detection network module 300 and a traffic density processing module 400.
Detailed Description
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation and connection should be understood in a broad sense, and those skilled in the art can reasonably determine their specific meanings in combination with the specific content of the technical solutions. "Several" means one or more; "a plurality" means two or more; terms such as "greater than", "less than" and "exceeding" are understood to exclude the stated number, while terms such as "above", "below" and "within" are understood to include it.
A method for traffic density analysis according to an embodiment of the first aspect of the present invention, as shown in fig. 1, includes the following steps:
and S100, collecting traffic flow videos of the road section to be analyzed.
It should be understood that the traffic flow video may be obtained by a camera on an aircraft, such as an unmanned aerial vehicle, or by a camera provided on the road section to be analyzed.
And S200, extracting key frames of the traffic flow video to obtain a traffic flow density image.
It should be understood that a key frame is a traffic image at a specified time, representing the traffic flow density on the road section to be analyzed at that time.
And step S300, inputting the traffic flow density image into the trained target detection network, and outputting three feature maps with different scales extracted by Darknet-53.
It will be appreciated that Darknet-53 introduces a ResNet-like residual structure, combining high-level convolution features, which have low resolution but rich semantic information, with low-level convolution features, which have high resolution but weak semantic information, so that the features at all scales are semantically rich. The three feature maps have sizes of 13 × 13, 26 × 26 and 52 × 52. In the 13 × 13 feature map, one feature point represents a 32 × 32-pixel region and is responsible for detecting large-scale targets; in the 26 × 26 map, one feature point represents a 16 × 16-pixel region and is responsible for detecting medium-scale targets; in the 52 × 52 map, one feature point represents an 8 × 8-pixel region and is responsible for detecting small-scale targets.
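The scale-to-stride relation above can be sketched numerically. A 416 × 416 input is assumed here, which is the conventional YOLOv3 input size and not a value stated in the patent:

```python
# Sketch: relation between feature-map size and the pixel stride each
# feature point covers, assuming a 416x416 input image (an assumption;
# 416 is the conventional YOLOv3 input size).

INPUT_SIZE = 416

def stride_for(feature_map_size: int) -> int:
    """Pixel stride represented by one feature point at this scale."""
    return INPUT_SIZE // feature_map_size

scales = {13: "large targets", 26: "medium targets", 52: "small targets"}

for size, role in scales.items():
    s = stride_for(size)
    print(f"{size}x{size} map: one point covers {s}x{s} px ({role})")
```

The coarser the feature map, the larger the image region each feature point summarises, which is why the 13 × 13 map handles large vehicles and the 52 × 52 map handles small ones.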
And S400, performing target prediction on the three feature maps with different scales through a target detection algorithm, and outputting a plurality of prediction boxes, wherein each prediction box comprises prediction box coordinate information and confidence.
It should be understood that by performing target prediction on feature maps of different scales, targets of different scales can be detected.
And S500, outputting the traffic density according to the plurality of prediction frames and the road length of the road section to be analyzed.
Therefore, by establishing three feature maps of different scales through Darknet-53, targets of different sizes in the traffic flow density image can be accurately detected, and the number of vehicles passing through the road section to be analyzed, and hence the traffic flow density information, can be obtained from the number of accurately detected targets. Because the targets can be accurately detected, the accuracy of traffic flow density analysis can be improved.
In some embodiments of the first aspect of the present invention, as shown in fig. 2, step S400 comprises the steps of:
and S410, acquiring a plurality of corresponding prior pre-selection frames according to the size of the feature map.
It will be appreciated that the use of a priori preselected boxes of different sizes may make the confidence values of the targets selected by the a priori preselected boxes more trustworthy.
It should be understood that for feature maps of the same size, a plurality of a priori preselected blocks of different sizes may be used for target detection to improve the accuracy of the prediction.
Step S420, performing sliding window sampling on the pixel points on each feature map through the corresponding prior preselected frame to obtain a prediction frame set, where each prediction frame in the prediction frame set includes the coordinate information of the prediction frame and the confidence.
It should be understood that, after the prior preselected frame selects a group of pixel points on the feature map, whether the framed pixel points contain a target can be judged from the corresponding feature-map information, so the confidence that the prior preselected frame contains a target can be obtained.
And step S430, outputting all the prediction boxes in the prediction box set.
Therefore, by performing sliding window sampling over each feature map with the prior preselected frames, every pixel point on the feature map can be traversed, yielding for each framed region a score (namely, the confidence) that it contains a target, together with the coordinate information of the prediction frame.
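As a rough sketch of the sliding-window sampling above: assuming three prior preselected frames per feature point (the YOLOv3 convention; the patent does not fix this number), traversing every point of the 13 × 13, 26 × 26 and 52 × 52 feature maps yields the following number of candidate prediction frames:

```python
# Sketch: enumerating prior preselected frames over every feature-map
# cell, as in the sliding-window sampling described above. Three priors
# per cell is an assumed YOLOv3-style value, not stated in the patent.

def enumerate_priors(feature_sizes=(13, 26, 52), priors_per_cell=3):
    boxes = []
    for fsize in feature_sizes:
        for row in range(fsize):
            for col in range(fsize):
                for prior in range(priors_per_cell):
                    # Each entry identifies one candidate prediction frame.
                    boxes.append((fsize, row, col, prior))
    return boxes

priors = enumerate_priors()
print(len(priors))  # 3 * (13*13 + 26*26 + 52*52)
```

Every one of these candidates receives coordinate information and a confidence from the network; the later thresholding and suppression steps exist precisely because this raw set is so large.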
In some embodiments of the first aspect of the present invention, after step S420, the following steps are further included:
and acquiring the confidence value of each prediction frame in the prediction frame set, and comparing the confidence value with the reference confidence value.
And deleting, according to the comparison result, the prediction frames in the prediction frame set whose confidence does not meet the reference confidence value.
Therefore, by setting the reference confidence value, a part of invalid prior preselected frames can be removed, so that the influence of the invalid prior preselected frames on the traffic density analysis is reduced.
It should be understood that a low target confidence for a prior preselected frame indicates that, although the centroid of the target lies within the frame, the frame covers little of the target's area; such frames may be deleted so that the pixels of the remaining prior preselected frames lie closer to the real target.
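The confidence-threshold step can be sketched as follows; the 0.5 reference confidence value is an assumed example, not a value given in the patent:

```python
# Sketch: discarding prediction frames whose confidence falls below the
# reference confidence value. The 0.5 threshold is an assumed example.

def filter_by_confidence(boxes, threshold=0.5):
    """boxes: list of (x, y, w, h, confidence) tuples."""
    return [b for b in boxes if b[4] >= threshold]

# Two strong detections and one weak one (illustrative values):
preds = [(10, 10, 40, 30, 0.92), (12, 11, 38, 29, 0.31), (200, 80, 50, 40, 0.77)]
kept = filter_by_confidence(preds)
print(len(kept))  # 2
```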
In some embodiments of the first aspect of the present invention, as shown in fig. 3, step S500 comprises the following steps:
and step S510, carrying out fusion processing on the prediction frames through a maximum suppression algorithm to obtain a plurality of target detection frames.
It should be understood that, because sliding-window sampling produces multiple prior preselected frames for the same target, non-maximum suppression processing is required to retain, for each target, the preselected frame that frames it best.
And S520, outputting the traffic density according to the number of the target detection frames and the road length of the road section to be analyzed.
Therefore, by locally searching for the prediction frame with the maximum confidence value, redundant prediction frames of the same target can be removed, improving the accuracy of the traffic flow density estimate.
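Once the target detection frames are counted, step S520 reduces to a simple ratio. Expressing the density in vehicles per kilometre is an assumption here; the patent only specifies "the number of target detection frames and the road length":

```python
# Sketch: the final density figure as vehicles per unit road length.
# Vehicles-per-kilometre units are an assumption; the patent states only
# that density follows from the frame count and the road length.

def traffic_density(num_vehicles: int, road_length_km: float) -> float:
    """Vehicles per kilometre on the analyzed road section."""
    return num_vehicles / road_length_km

print(traffic_density(42, 1.5))  # 28.0 vehicles/km
```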
In some embodiments of the first aspect of the present invention, as shown in fig. 4, step S510 includes the following steps:
step S511, a target category corresponding to each prediction box is obtained.
It should be understood that the feature map contains category information of the target of the prediction box selection.
And S512, classifying all the prediction frames according to the target classes to obtain a prediction frame set corresponding to each class.
It should be understood that, assuming the target classes are class A and class B, all prediction frames are {A1, A2, A3, A4, B1, B2, B3, B4, B5}. According to the target class information, the class-A prediction frame set is {A1, A2, A3, A4}, where the class-A confidences satisfy C4 > C1 > C3 > C2 (C4 is the confidence of A4, C1 that of A1, C3 that of A3, and C2 that of A2). The class-B prediction frame set is {B1, B2, B3, B4, B5}.
And step S513, sorting each prediction frame set according to the confidence coefficient value of the prediction frames in the prediction frame set.
It should be understood that, taking the class-A prediction frame set as an example, the sorted prediction frame set corresponding to class A is A′ = {A4, A1, A3, A2}.
And step S514, respectively carrying out intersection comparison iteration processing on the prediction frames in each prediction frame set to obtain a plurality of target detection frames.
It should be understood that the intersection-over-union iteration proceeds as follows:
selecting the prediction frame A with the maximum confidence coefficient from the prediction frames in the set A4As the target detection frame, the coordinate information of the other prediction frames in the set A' is compared with the coordinate information of the prediction frame A4And carrying out intersection comparison processing on the coordinate information of the prediction frame to obtain an IOU value. When A is1And A4When IOU is larger than the preset threshold value, delete A in the set A1At this time, a' ═ a3、A2}. At this time, the largest prediction box A is selected again from the set A3And to A3Execution A4The same operation is performed until there is no data in set A'. At this time, the prediction frame with the highest confidence level selected from the set a' is the target detection frame.
Therefore, by adopting the idea of the non-maximum suppression algorithm and iterating over the confidence values and intersection-over-union ratios of targets of the same class, the multiple prediction frames belonging to one target can be identified, and the target detection frame obtained.
In some embodiments of the first aspect of the present invention, as shown in fig. 5, the size of the a priori preselected box in step S410 is obtained by:
and S411, acquiring a plurality of traffic flow sample density images and the number of preset clustering center points.
It should be understood that the number of cluster center points may be set according to empirical values.
And S412, carrying out iterative processing on each traffic flow sample density image through a K-means clustering algorithm to obtain a plurality of prior preselected frames, wherein an objective function of the K-means clustering algorithm is set as the ratio of the non-overlapping part between the prior preselected frames and the corresponding real frames to the union.
It should be understood that, for a traffic flow sample density image, the ground-truth frame information corresponding to each target is known. The intersection-over-union of a prior preselected frame and a ground-truth frame is defined as:
IOU(BB(gt), BB(dt)) = |BB(gt) ∩ BB(dt)| / |BB(gt) ∪ BB(dt)|
wherein BB(gt) denotes the ground-truth frame and BB(dt) denotes the prediction frame.
The objective function is then:
d(box, centroid) = 1 - IOU(box, centroid)
wherein the non-overlapping portion between the prior preselected frame and the corresponding ground-truth frame is BB(gt) ∪ BB(dt) - BB(gt) ∩ BB(dt), and the union between the prior preselected frame and the corresponding ground-truth frame is BB(gt) ∪ BB(dt).
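The 1 - IOU distance can be sketched directly. Comparing candidate frames by width and height only, aligned at a common corner, is the usual convention for anchor clustering and is an assumption here, since centre positions are irrelevant when clustering frame sizes:

```python
# Sketch: the 1 - IOU distance used as the K-means objective above.
# Boxes are compared by (width, height) only, aligned at a common corner,
# which is the usual anchor-clustering convention (an assumption here).

def iou_wh(box, centroid):
    """IOU of two (w, h) boxes sharing a corner."""
    w, h = box
    cw, ch = centroid
    inter = min(w, cw) * min(h, ch)
    union = w * h + cw * ch - inter
    return inter / union

def distance(box, centroid):
    """K-means distance: small when the frame matches the centroid."""
    return 1.0 - iou_wh(box, centroid)

print(distance((4, 4), (4, 4)))  # identical boxes: 0.0
print(distance((2, 2), (4, 4)))  # inter 4, union 16: 0.75
```

Using 1 - IOU instead of Euclidean distance keeps the objective scale-invariant, so small-vehicle frames are not dominated by large ones during clustering.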
In some embodiments of the first aspect of the present invention, step S300, comprises the steps of:
and carrying out image whitening processing on the traffic flow density image to obtain a traffic flow density whitened image, wherein the average value of pixels of the traffic flow density whitened image is 0, and the expected value is 1.
And inputting the flow density whitened image into a trained target detection network, and outputting three feature maps with different scales obtained by feature extraction through Darknet-53, wherein the input of each neural network layer of the target detection network is processed through a batch normalization algorithm.
It should be understood that the batch normalization algorithm changes the distribution of the input values of any neuron of each layer of target detection network into the standard normal distribution with the mean value of 0 and the variance of 1, so as to avoid the problem of gradient disappearance, and at the same time, the learning convergence speed of the target detection network can be increased, and the training speed can be greatly increased.
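A minimal sketch of the per-image whitening step, which is distinct from the per-layer batch normalization applied inside the network:

```python
# Sketch: per-image whitening, shifting pixel values to zero mean and
# unit variance before they enter the network. Batch normalization is a
# separate, per-layer operation applied inside the network itself.
import statistics

def whiten(pixels):
    """Return pixels rescaled to mean 0 and (population) variance 1."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels) or 1.0  # guard against constant images
    return [(p - mean) / std for p in pixels]

img = [10.0, 20.0, 30.0, 40.0]  # illustrative pixel values
white = whiten(img)
print(round(statistics.fmean(white), 6))   # 0.0
print(round(statistics.pstdev(white), 6))  # 1.0
```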
According to a second aspect of the present invention, there is provided a traffic density analyzing system, as shown in fig. 6, including:
the video image acquisition terminal 100, wherein the video image acquisition terminal 100 is used for collecting traffic flow video of a road section to be analyzed;
the image processing module 200, the image processing module 200 is used for extracting key frames from the traffic stream video to obtain a traffic stream density image;
the target detection network module 300, wherein the target detection network module 300 is used for extracting feature maps from the input traffic flow density image and performing target prediction on the feature maps through a target detection algorithm to obtain a plurality of prediction frames;
and the traffic flow density processing module 400, wherein the traffic flow density processing module 400 is used for outputting the traffic flow density according to the plurality of prediction frames and the road length of the road section to be analyzed.
The traffic density analysis system according to the second aspect has all the advantages of the first aspect of the present invention, because it applies the traffic density analysis method according to any one of the first aspect.
According to some embodiments of the second aspect of the invention, the video image acquisition terminal is disposed on an aircraft. By arranging the video image acquisition terminal on an aircraft, traffic flow video with a wider viewing angle can be acquired.
It should be understood that the above system may be implemented as software, firmware, hardware, or a suitable combination thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. The terminal 100 may be, without limitation, a PC, a mobile phone, or another communication device.
According to a third aspect of the present invention, there is provided a computer storage medium comprising computer-executable instructions stored thereon for performing the method of traffic density analysis of the first aspect.
The computer storage medium of the third aspect has all the advantages of the first aspect of the present invention, since it applies the method of traffic density analysis of any one of the first aspect.
It should be understood that the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
A system for applying traffic density analysis according to an embodiment of the present invention will be described in detail in a specific embodiment with reference to fig. 1 to 6. It is to be understood that the following description is only exemplary, and not a specific limitation of the invention.
As shown in fig. 6, the video image capturing terminal 100 is disposed on an aircraft, and the image processing module 200 is in communication connection with the video image capturing terminal 100; the target detection network module 300 is electrically connected with the image processing module 200; the traffic density processing module 400 is electrically connected to the target detection network module 300.
As shown in fig. 1, the system for analyzing the density of the traffic flow performs the following steps:
in step S100, the video image capturing terminal 100 provided on the aircraft captures a traffic video of a road section to be analyzed. And transmits the acquired traffic video to the image processing module 200.
Further, in step S200, the image processing module 200 performs key frame extraction on the traffic stream video to obtain a traffic stream density image.
Further, in step S300, the traffic density image is input into the trained target detection network, and three feature maps of different scales extracted by the Darknet-53 are output.
Specifically, the three feature maps have sizes of 13 × 13, 26 × 26, and 52 × 52. In the 13 × 13 feature map, one feature point represents a 32 × 32-pixel region and is responsible for detecting large-scale targets; in the 26 × 26 feature map, one feature point represents a 16 × 16-pixel region and is responsible for detecting medium-scale targets; in the 52 × 52 feature map, one feature point represents an 8 × 8-pixel region and is responsible for detecting small-scale targets.
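The 32/16/8-pixel regions follow directly from dividing the network input resolution by the grid size. The snippet below assumes the standard 416 × 416 YOLOv3 input resolution (the description does not state the input size, so this is an assumption):

```python
# Assuming a 416x416 input (the usual YOLOv3/Darknet-53 resolution),
# the pixel "stride" covered by one feature point on each map is the
# input size divided by the grid size: 32, 16, and 8 pixels.
input_size = 416
grid_sizes = [13, 26, 52]
strides = [input_size // g for g in grid_sizes]
```

Larger strides mean coarser grids, which is why the 13 × 13 map handles the largest targets and the 52 × 52 map the smallest.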
Specifically, the target detection network module 300 performs image whitening processing on the traffic flow density image to obtain a traffic flow density whitened image, where the pixel mean of the traffic flow density whitened image is 0 and the variance is 1.
Further, the target detection network module 300 inputs the traffic flow density whitened image into the trained target detection network and outputs the feature maps of three different scales obtained through Darknet-53, wherein the input of each neural network layer of the target detection network is processed through the batch normalization algorithm.
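The whitening step above amounts to per-image standardization: shift all pixels to mean 0 and scale to unit variance. A minimal sketch (the random test frame is purely illustrative):

```python
import numpy as np

def whiten_image(img):
    """Per-image whitening: shift pixel values to mean 0 and scale to
    unit variance, as described for the traffic flow density image."""
    img = img.astype(np.float64)
    return (img - img.mean()) / (img.std() + 1e-8)  # eps guards flat images

# Illustrative 64x64 RGB frame with random pixel values
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64, 3))
white = whiten_image(frame)
```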
Further, in step S400, a target is predicted on the feature maps of three different scales by a target detection algorithm, and a plurality of prediction boxes are output, where each prediction box includes coordinate information of the prediction box and a confidence level.
Specifically, as shown in step S410 in fig. 2, the target detection network module 300 obtains a plurality of corresponding prior preselected frames according to the size of each feature map.
Specifically, in step S420, the target detection network module 300 performs sliding window sampling on the pixel points on each feature map through the corresponding prior preselected frame to obtain a prediction frame set, where each prediction frame in the prediction frame set includes the coordinate information of the prediction frame and the confidence level.
Further, the target detection network module 300 obtains a confidence value of each prediction box of the prediction box set, and compares the confidence value with a reference confidence value.
Further, the target detection network module 300 deletes, according to the comparison result, the prediction frames in the prediction frame set whose confidence does not satisfy the reference confidence value.
Specifically, in step S430, the target detection network module 300 outputs all the prediction frames in the prediction frame set to the traffic density processing module 400.
At this time, in step S500, the traffic density processing module 400 outputs the traffic density according to the prediction frames and the road length of the road segment to be analyzed.
Specifically, as shown in step S510 in fig. 3, the traffic density processing module 400 performs fusion processing on the prediction frames through a non-maximum suppression (NMS) algorithm to obtain a plurality of target detection frames. Specifically, the traffic density processing module 400 proceeds as follows:
in step S511 of fig. 4, the target category corresponding to each prediction box is obtained.
In step S512, all the prediction frames are classified according to the target category, so as to obtain a prediction frame set corresponding to each category.
Assume the target classes are class A and class B, and all prediction frames are {A1, A2, A3, A4, B1, B2, B3, B4, B5}. According to the target category information, the class A prediction frame set is {A1, A2, A3, A4}, where the confidence relationship within class A is C4 > C1 > C3 > C2; C4 is the confidence of A4, C1 is the confidence of A1, C3 is the confidence of A3, and C2 is the confidence of A2. The class B prediction frame set is {B1, B2, B3, B4, B5}.
In step S513, each prediction box set is sorted according to the confidence value of the prediction box in the prediction box set.
Specifically, taking the class A prediction frame set as an example, the sorted prediction frame set corresponding to class A is: A' = {A4, A1, A3, A2}.
Further, in step S514, intersection comparison iteration processing is performed on the prediction frames in each prediction frame set, so as to obtain a plurality of target detection frames.
The intersection-over-union (IoU) iteration process is as follows:
The prediction frame A4 with the maximum confidence is selected from the set A' as a target detection frame, and the coordinate information of each of the other prediction frames in the set A' is subjected to intersection-over-union processing with the coordinate information of the prediction frame A4 to obtain an IoU value. When the IoU of A1 and A4 is larger than a preset threshold, A1 is deleted from the set A', at which point A' = {A3, A2}. The prediction frame with the largest remaining confidence, A3, is then selected from the set A', and the same operation performed for A4 is performed for A3, until no data remain in the set A'. Each prediction frame selected from the set A' as the one with the highest confidence is a target detection frame.
At this time, the processing of steps S512 to S514 is performed for each classified prediction frame, and a plurality of target detection frames are obtained. And sends the target detection frame to the traffic density processing module 400.
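The per-class sorting and IoU iteration of steps S513 to S514 can be sketched as follows. This is a minimal illustration: the corner-coordinate box format `(x1, y1, x2, y2)`, the 0.5 IoU threshold, and the sample boxes are all assumptions, not values fixed by the description:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Non-maximum suppression as in steps S513-S514: sort by confidence,
    keep the best box, delete boxes whose IoU with it exceeds the
    threshold, then repeat on the remainder until the set is empty."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)           # highest-confidence box -> detection
        keep.append(best)
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

# An A4-like high-confidence box, an overlapping duplicate, and a distant box
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the duplicate box overlapping the highest-confidence one is suppressed, while the distant box survives as its own detection.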
At this time, as shown in step S520 of fig. 3, the traffic density processing module 400 outputs the traffic density according to the number of target detection frames and the road length of the road segment to be analyzed.
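Step S520 reduces to dividing the count of target detection frames by the road length. A one-line sketch; the vehicles-per-kilometer unit and the sample values are illustrative assumptions, since the description specifies only "number of frames divided by road length":

```python
def traffic_density(num_detection_frames, road_length_km):
    """Traffic flow density as detected vehicles per unit road length
    (vehicles/km assumed here)."""
    return num_detection_frames / road_length_km

# e.g. 24 target detection frames on a 0.5 km road section
density = traffic_density(24, 0.5)
```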
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples" or the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A method for analyzing traffic flow density is characterized by comprising the following steps:
collecting traffic flow videos of a road section to be analyzed;
extracting key frames of the traffic stream video to obtain a traffic stream density image;
inputting the traffic flow density image into a trained target detection network, and outputting three feature maps with different scales extracted by Darknet-53;
performing target prediction on the feature maps with three different scales through a target detection algorithm, and outputting a plurality of prediction boxes, wherein each prediction box comprises prediction box coordinate information and confidence;
and outputting the traffic flow density according to the plurality of prediction frames and the road length of the road section to be analyzed.
2. The method of traffic density analysis according to claim 1,
wherein performing target prediction on the feature maps of three different scales through the target detection algorithm and outputting a plurality of prediction frames, each prediction frame comprising prediction frame coordinate information and a confidence, further comprises the following steps:
obtaining a plurality of corresponding prior preselection frames according to the size of the characteristic diagram;
respectively carrying out sliding window sampling on pixel points on each feature map through the corresponding prior preselected frame to obtain a prediction frame set, wherein each prediction frame in the prediction frame set comprises prediction frame coordinate information and confidence;
outputting all prediction blocks in the set of prediction blocks.
3. The method of traffic density analysis according to claim 2,
wherein performing target prediction on the feature maps of three different scales through the target detection algorithm and outputting a plurality of prediction frames, each prediction frame comprising prediction frame coordinate information and a confidence, comprises the following steps:
acquiring a confidence value of each prediction frame in the prediction frame set, and comparing the confidence value with a reference confidence value;
and deleting the prediction frames which do not meet the value of the reference confidence degree in the prediction frame set according to the comparison result.
4. The method of traffic density analysis according to claim 3,
the method for outputting the traffic density according to the plurality of prediction frames and the road length of the road section to be analyzed comprises the following steps:
fusing the plurality of prediction frames through a non-maximum suppression algorithm to obtain a plurality of target detection frames;
and outputting the traffic density according to the number of the target detection frames and the road length of the road section to be analyzed.
5. The method of traffic density analysis according to claim 4,
wherein performing fusion processing on the plurality of prediction frames through the non-maximum suppression algorithm to obtain the plurality of target detection frames comprises the following steps:
acquiring a target category corresponding to each prediction frame;
classifying all the prediction frames according to the target categories to obtain a prediction frame set corresponding to each category;
sorting each prediction frame set according to the confidence coefficient value of the prediction frames in the prediction frame set;
and respectively carrying out intersection comparison iteration processing on the prediction frames in each prediction frame set to obtain a plurality of target detection frames.
6. The method of traffic density analysis according to claim 2,
the size of the prior preselected frame is obtained by the following steps:
acquiring a plurality of traffic flow sample density images and the number of preset clustering central points;
and carrying out iterative processing on each traffic flow sample density image through a K-means clustering algorithm to obtain a plurality of prior preselected frames, wherein an objective function of the K-means clustering algorithm is set as the ratio of the non-overlapping part between the prior preselected frames and the corresponding real frames to the union.
7. The method of traffic density analysis according to claim 1,
the method for inputting the traffic flow density image into the trained target detection network and outputting the three feature maps with different scales extracted by Darknet-53 comprises the following steps:
carrying out image whitening processing on the traffic flow density image to obtain a traffic flow density whitened image, wherein the pixel mean of the traffic flow density whitened image is 0 and the variance is 1;
and inputting the traffic flow density whitened image into the trained target detection network, and outputting three feature maps of different scales obtained by feature extraction through Darknet-53, wherein the input of each neural network layer of the target detection network is processed through a batch normalization algorithm.
8. A system for traffic density analysis, comprising:
the system comprises a video image acquisition terminal, a video analysis terminal and a video analysis terminal, wherein the video image acquisition terminal is used for acquiring traffic flow videos of a road section to be analyzed;
the image processing module is used for extracting key frames of the traffic stream video to obtain a traffic stream density image;
the target detection network module is used for carrying out feature map extraction on the input traffic flow density image and carrying out target prediction on the feature maps through a target detection algorithm to obtain a plurality of prediction frames;
and the traffic flow density processing module is used for outputting the traffic flow density according to the plurality of prediction frames and the road length of the road section to be analyzed.
9. The system for traffic density analysis according to claim 8,
the video image acquisition terminal is arranged on the aircraft.
10. A computer storage medium comprising computer-executable instructions stored thereon for performing the method of traffic density analysis of any of claims 1-7.
CN202011221447.3A 2020-11-05 2020-11-05 Method, system and computer storage medium for traffic density analysis Active CN112446300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011221447.3A CN112446300B (en) 2020-11-05 2020-11-05 Method, system and computer storage medium for traffic density analysis

Publications (2)

Publication Number Publication Date
CN112446300A true CN112446300A (en) 2021-03-05
CN112446300B CN112446300B (en) 2024-01-12

Family

ID=74735961

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011221447.3A Active CN112446300B (en) 2020-11-05 2020-11-05 Method, system and computer storage medium for traffic density analysis

Country Status (1)

Country Link
CN (1) CN112446300B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190057507A1 (en) * 2017-08-18 2019-02-21 Samsung Electronics Co., Ltd. System and method for semantic segmentation of images
CN109815886A (en) * 2019-01-21 2019-05-28 南京邮电大学 A kind of pedestrian and vehicle checking method and system based on improvement YOLOv3
US20200118423A1 (en) * 2017-04-05 2020-04-16 Carnegie Mellon University Deep Learning Methods For Estimating Density and/or Flow of Objects, and Related Methods and Software
CN111091095A (en) * 2019-12-19 2020-05-01 华中科技大学 Method for detecting ship target in remote sensing image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WU YUZHI; WU ZHIHONG; XIONG YUNYU: "Small-sample vehicle detection and recognition based on convolutional neural networks", Computer Technology and Development, no. 06, pages 7-12 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant