CN110633678A - Rapid and efficient traffic flow calculation method based on video images - Google Patents

Rapid and efficient traffic flow calculation method based on video images

Info

Publication number
CN110633678A
Authority
CN
China
Prior art keywords
data
image
time
road surface
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910883883.8A
Other languages
Chinese (zh)
Other versions
CN110633678B (en)
Inventor
王亚涛
江龙
赵英
魏世安
邓佳
邓家勇
郑全新
张磊
孟祥松
高志成
黄志举
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tongfang Software Co Ltd
Tongfang Co Ltd
Original Assignee
Beijing Tongfang Software Co Ltd
Tongfang Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tongfang Software Co Ltd, Tongfang Co Ltd filed Critical Beijing Tongfang Software Co Ltd
Priority to CN201910883883.8A
Publication of CN110633678A
Application granted
Publication of CN110633678B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F 18/214: Pattern recognition; analysing; design or setup of recognition systems; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods
    • G06V 20/46: Image or video recognition or understanding; scene-specific elements in video content; extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G08G 1/0125: Traffic control systems for road vehicles; measuring and analysing of parameters relative to traffic conditions; traffic data processing
    • G08G 1/0137: Traffic control systems for road vehicles; measuring and analysing of parameters relative to traffic conditions for specific applications
    • G06V 2201/08: Indexing scheme relating to image or video recognition or understanding; detecting or categorising vehicles
    • Y02T 10/40: Climate change mitigation technologies related to transportation; internal combustion engine [ICE] based vehicles; engine management systems


Abstract

A fast and efficient traffic flow calculation method based on video images relates to target detection based on video images, and to intelligent event analysis and traffic parameter calculation systems applied to video surveillance data in traffic scenes. The invention collects vehicle video images through a camera or image capture card installed on the road and processes and analyses them. The method comprises the following steps: first, generating the road surface area and the camera position; second, generating an original sampling image; third, perspective transformation; fourth, training a lightweight sparse convolutional neural network algorithm model; fifth, uplink and downlink statistics; and sixth, model prediction. Compared with the prior art, the method achieves fast traffic flow calculation in multiple scenes, under different weather conditions and different road surface states, and has high detection accuracy, good real-time performance and high efficiency.

Description

Rapid and efficient traffic flow calculation method based on video images
Technical Field
The invention relates to target detection based on video images, and to intelligent event analysis and traffic parameter calculation systems applied to video surveillance data in traffic scenes.
Background
At present, the main schemes for calculating traffic flow are ground induction coil detectors, microwave detectors and intelligent video detection methods. Ground induction coil detection is a passive contact detection technology; it is highly accurate for traffic flow calculation and traffic occupancy and is little affected by weather conditions. However, its construction is complex: the coil must be buried under the road during installation, so the road has to be excavated and traffic interrupted, normal traffic is affected during construction, the road surface is damaged, and equipment maintenance costs are high. Microwave detectors use special equipment such as infrared, ultrasonic or microwave devices to detect vehicles by transmitting electromagnetic waves and receiving the returned information. This scheme is insensitive to changes in climate conditions and the equipment is relatively simple to install; however, its sensitivity is not high enough and a certain false detection rate exists.
The intelligent video image detection method is a non-contact detection technology: vehicle video images are acquired through a camera or image capture card installed on the road. When a vehicle passes through the monitored scene, the vehicle target is detected and tracked, and the traffic count is incremented when the target crosses a given position. Compared with the other schemes, the video image detection method has the following advantages:
1. the hardware is simple to install and maintain, and the normal traffic of the road surface is not influenced;
2. the traffic condition can be monitored in real time through the video device, and the traffic condition can be intuitively mastered in real time;
3. the collected vehicle information is rich, and the management of traffic managers is facilitated;
4. signals from adjacent monitoring points do not interfere with each other;
5. the monitoring range can be adjusted and expanded.
The traffic flow refers to the number of vehicles passing a given point of a road within a given time, whereas the number of vehicles refers to the count in a single static picture; since the same object may persist for some time in a continuous video, the traffic flow cannot be obtained by simply accumulating single-frame detection results. In current image-processing-based traffic flow detection schemes, the main pipeline is vehicle detection followed by target tracking. The main vehicle detection methods include the background difference method, the inter-frame difference method and the ViBe algorithm; the main tracking algorithms include TLD tracking, particle filtering and KCF tracking.
The target detection and tracking methods based on video images mainly include the following:
1. The background difference method is a general approach to motion segmentation in a static scene: the currently acquired frame is differenced against a background image to obtain a grey-level image of the moving regions, which is then thresholded to extract the motion area; to avoid the influence of environmental illumination changes, the background image is updated from the currently acquired frame. 2. The inter-frame difference method subtracts the pixel values of two adjacent frames (or two frames several frames apart) in the video stream and thresholds the difference image to extract the motion regions. (A minimal sketch of both differencing approaches is given after this list of methods.)
3. The ViBe algorithm stores a sample set for every pixel; the samples in the set are past values of that pixel and of its neighbouring pixels. Each new pixel value in subsequent frames is compared with the historical samples in the set to decide whether it belongs to the background.
4. TLD tracking works as follows: a detection module and a tracking module run in parallel and complement each other. The tracking module estimates the motion of the object under the assumptions that the motion between adjacent video frames is limited and that the tracked object is visible; if the target leaves the camera's field of view, tracking fails. The detection module assumes that the video frames are independent of each other and performs a full-image search on each frame to locate regions where the object may appear, based on the previously detected and learned object model. As with other target detection methods, the detection module in TLD can make errors, which fall into two cases: false negative samples and false positive samples. The learning module evaluates these two kinds of errors using the result of the tracking module, generates training samples from the evaluation, updates the target model of the detection module and updates the key feature points of the tracking module, so that similar errors are avoided in the future.
5. Particle filtering is a nonlinear filtering method based on Monte Carlo simulation; its core idea is to represent the probability density distribution by randomly sampled particles. The three main steps are: 1) particle sampling, drawing a set of particles from the proposal distribution; 2) particle weighting, computing the weight of each particle from the observation probability distribution, the importance distribution and Bayes' formula; 3) output estimation, outputting the mean, covariance and other statistics of the system state. In addition, strategies such as resampling are used to cope with particle degeneracy.
6. KCF is a discriminative tracking method: a target detector is trained during tracking, the detector is used to check whether the predicted position in the next frame contains the target, and the new detection result is then used to update the training set and hence the detector. When training the detector, the target region is generally taken as a positive sample and the regions around the target as negative samples, with regions closer to the target more likely to be treated as positive samples.
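By way of illustration only, the following minimal Python/OpenCV sketch shows the two differencing approaches described in items 1 and 2 above; the video file name, the thresholds and the background update rate are assumptions for the sketch, not values taken from the invention:

```python
import cv2

cap = cv2.VideoCapture("traffic.mp4")                 # assumed input video
ok, background = cap.read()                           # use the first frame as the initial background
bg_gray = cv2.cvtColor(background, cv2.COLOR_BGR2GRAY).astype("float32")
prev_gray = bg_gray.copy()

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype("float32")

    # Background difference: |current - background|, thresholded to a binary motion mask.
    bg_diff = cv2.absdiff(gray, bg_gray)
    _, bg_mask = cv2.threshold(bg_diff, 30, 255, cv2.THRESH_BINARY)

    # Inter-frame difference: |current - previous frame|, thresholded the same way.
    fr_diff = cv2.absdiff(gray, prev_gray)
    _, fr_mask = cv2.threshold(fr_diff, 30, 255, cv2.THRESH_BINARY)

    # Slowly update the background to absorb illumination changes (assumed rate 0.05).
    cv2.accumulateWeighted(gray, bg_gray, 0.05)
    prev_gray = gray
```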
The prior art has the following defects:
1. Traditional target detection methods are particularly sensitive to image quality, lighting and camera shake; they also cannot separate single targets when vehicle targets overlap, and their detection performance in large scenes is poor, with many false and missed detections.
2. For tracking, although many multi-target tracking methods exist, crossing and occlusion remain difficult problems; in particular, for many vehicle targets in a large scene with heavy occlusion, the tracking effect is not ideal.
3. Target detection methods based on deep learning achieve good detection results, but detection in CPU mode is time-consuming and cannot run in real time.
4. Most current traffic flow calculation schemes adopt this detection-plus-tracking pipeline, so the detection and tracking performance directly determines the accuracy of the traffic statistics, and combining a detection module with a tracking module has inherent problems: because detection and tracking are complementary, a long detection interval may miss some targets, while a short interval defeats the purpose of tracking and lowers the algorithm's efficiency.
Disclosure of Invention
In order to overcome the defects of the prior art, the object of the invention is to provide a fast and efficient traffic flow calculation method based on video images. The method achieves fast traffic flow calculation in multiple scenes, under different weather conditions and different road surface states, and has high detection accuracy, good real-time performance and high efficiency.
In order to achieve the above object, the technical solution of the present invention is implemented as follows:
a fast and efficient traffic flow calculation method based on video images collects vehicle video images through a camera or image capture card installed on a road and processes and analyses them; the method comprises the following steps:
firstly, generating a road surface area and a camera position:
A video image is input, and the road surface area information and the camera position information are trained as two tasks of a single network. Image data of different road surfaces are collected, and 5 boundary key points of the road surface area are selected in the image data for annotation; the position of the camera relative to the road surface is also annotated.
For this single network task, the loss terms of the road surface area information and the camera position are weighted when designing the loss function, so that the data dimensionalities of the two outputs are balanced, e.g.
Loss = λ_area · Loss_area + λ_pos · Loss_pos.
A virtual trip line is then calculated from the generated road surface area.
Secondly, generating an original sampling image:
Each piece of trip-line data is arranged in sequence along the inclination direction of the virtual trip line to form the original sampling image:
2.1 Virtual trip line point-set data:
The generated virtual trip line is represented by two endpoints and its width is denoted SW. The data of all points covered by the trip line are calculated and recorded as [(x1, y1), (x2, y2), …, (xn, yn)].
2.2 Creating the sampling image:
According to the width of the virtual trip line, an original sampling image is created and denoted SrcSample; an image far larger than the actual number of frames is created.
2.3 Filling the sampling image:
for the 1 st frame data, putting the corresponding point set data on the frame data into [ (M + x1, y1), (M + x2, y2) ] of the SrcSample graph according to the position information; two endpoints (X _ S1, Y _ S1), (X _ S2, Y _ S2) recording the piece of data; for the 2 nd frame data, putting the frame data into the corresponding position of the 2 nd line of the Sample graph according to the position information of the camera, if the camera is on the right side, putting the frame data according to [ (M + x1-SW (time-1), y1 + SW (time-1)), (M + x2-SW (time-1), y2 + SW (time-1)).. the. (M + xn-SW (time-1), yn + SW (time-1)) ]; if the camera is on the left, it is placed as [ (M + x1 + SW (time-1), y1 + SW (time-1)), (M + x2 + SW (time-1), y2 + SW (time-1)).. the. (M + xn + SW (time-1), yn + SW (time-1)) ]; ... Two end points (X _ E1, Y _ E1), (X _ E2, Y _ E2) of the last piece of data are recorded, wherein SW is the width 1 of the sampling line and time is the nth piece of image data.
Thirdly, perspective transformation:
The original sampling image is transformed by a perspective transformation, whose general formula is
[x', y', w'] = [u, v, 1] · A,  A = [[a11, a12, a13], [a21, a22, a23], [a31, a32, a33]],
where the sub-matrix [[a11, a12], [a21, a22]] realises the linear transformation, [a31, a32] realises the translation, [a13, a23]^T realises the perspective distortion, and a33 realises the overall scaling.
The mathematical expression of the perspective transformation is
x = x'/w' = (a11·u + a21·v + a31) / (a13·u + a23·v + a33),
y = y'/w' = (a12·u + a22·v + a32) / (a13·u + a23·v + a33).
The obtained original sampling image records 4 pairs of coordinate points, namely the original coordinates (X_S1, Y_S1), (X_S2, Y_S2) and (X_E1, Y_E1), (X_E2, Y_E2); setting the width and height after transformation to W and H, the 4 transformed coordinate points are (0, 0), (0, H) and (W, H), (W, 0). A perspective transformation matrix is established according to the above formula, completing the transformation from the original sampling image to the bird's-eye view.
Fourthly, training a lightweight sparse convolutional neural network algorithm model:
data is collected from a fixed location on the raw image where the up and down pixels change to within k pixels when no vehicle passes. If this value is exceeded, it is noted that a transition between two rows has occurred.
The lightweight sparse convolutional neural network algorithm proceeds as follows:
4.1 The number of vehicles in the flow time-series images is marked; assuming there are T labelled sample images in total, normally T ≥ M.
4.2 According to the characteristic pixel jumps of the flow time-series image, and combining the idea of a convolutional neural network, the algorithm formula is
N = Σ_i P_i · W_i
where P_i denotes the proportion of pixel jumps in the i-th column, W_i is the feature weight of that column (the feature vector to be trained) and N denotes the number of marked vehicles.
P_i = (1/H) · Σ_{j=1}^{H-1} 1(|V_j - V_{j+1}| > k)
where V_j is the pixel value of the j-th row, k is the jump threshold and H is the height of the image.
4.3 model training:
4.3.1 Select W images from the T labelled images, initially images 1 to W; the initial minimum error value ErrorMin is set to a large number.
4.3.2 Test the remaining T - W images: compute the predicted result from the corresponding P_i and W_i, and record the difference from the actual labelled data as Error. If Error is smaller than the minimum error value ErrorMin, update ErrorMin to Error and record the corresponding W_i values.
4.3.3 Select the 2nd group, images 2 to W + 1, solve W_i from these W images, and then perform step 4.3.2.
4.3.4 Loop over 4.3.2 and 4.3.3 until ErrorMin no longer changes.
4.3.5 Calculate the proportion of the absolute error, which is used as a tuning parameter in later model prediction:
Ratio = ErrorMin / N_i
where ErrorMin is the minimum error value obtained during training and N_i is the labelled count of the corresponding image.
Fifthly, uplink and downlink statistics:
5.1 Count the jump proportion P_i of each column.
5.2 Loop over a fixed number of statistics rounds and accumulate the per-round results into A_i, i.e. A_i = Σ_t P_i(t).
5.3 From the left lane to the right lane, the columns whose A_i is smaller than a certain threshold are recorded in an array J_i: if A_i of a column is smaller than the threshold, J_i is 0; otherwise it is 1.
5.4 From the array J_i, using the prior knowledge that the region lies within 0.3 to 0.8 of the image width, select the longest segment of consecutive zeros as the central green belt or guardrail region, and obtain the central uplink/downlink boundary from the starting position of this segment.
Sixthly, model prediction:
Step one only needs to be completed once; the operation of step two is performed on every subsequent frame; when the statistics period is reached, the operation of step three is performed; the traffic flow statistics are then completed from the flow time-series image generated in step three using the model parameters trained above, and the uplink and downlink flow statistics are completed by combining the uplink/downlink boundary information.
In the traffic flow calculation method, the data labels are annotated in counter-clockwise order, the data of the two upper points are standardized, the height position of the two upper points is set dynamically according to the image size, the height being 16% of the image height, and the intersection points of this line with the road surface edges are used as the two upper boundary key points.
In the traffic flow calculation method, in the data labeling process, data normalization is performed by using relative coordinate position information, namely, the ratio of the coordinate position to the height and width of the image.
In the above-described traffic flow calculation method, the generated virtual trip line is substantially perpendicular to the road direction, has the same width as the road surface area, and is located at 1/2 of the road surface height.
Compared with the prior art, the method has the following advantages:
1. the calculation is fast, and the average time from input to result output of a single picture is 0.005 s;
2. the method is efficient, calculation is not needed in the generation process of the sampling graph, and the efficiency is higher.
3. The reliability is high and stable; the sampling image can be collected and predicted normally whether vehicles are present or not.
4. The whole process is simple, and target detection and tracking are not required;
5. the result is comprehensive and high in precision, and the sampling graph can completely reflect the actual running condition of the traffic flow in a period of time.
6. The adaptability is wide, and the method is suitable for scenes in the day and at night; the method is suitable for high-speed and urban traffic and tunnel scenes; can be suitable for different weather conditions.
The invention is further described with reference to the following figures and detailed description.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
Referring to fig. 1, the fast and efficient traffic flow calculation method based on video images of the present invention collects video images of vehicles through a camera or an image capture card device installed on a road and processes and analyzes the video images, and the method comprises the following steps:
firstly, generating a road surface area and a camera position:
A video image is input, and the road surface area information and the camera position information are trained as two tasks of a single network. Image data of different road surfaces are collected, and 5 boundary key points of the road surface area are selected in each image and annotated in counter-clockwise order. In practice, the three points at the lower part of the image are easy to place on the road surface boundary, whereas the two points at the upper part are inconsistent because the far end of the road differs between scenes and the boundary features of the upper road surface area are not obvious. Therefore, at annotation time the data of the two upper points are standardized: the height position of the two upper points is set dynamically according to the image size, at 16% of the image height, and the intersection points of this horizontal line with the road surface edges are used as the two upper boundary key points. This labelling scheme standardizes the two points while ignoring the influence of the road surface beyond this line, and in practical tests it gives higher accuracy than labelling without the height line.
In the labelling process, relative coordinate positions are used for normalization, i.e. the ratio of the coordinate position to the image height and width. This makes the network structure easier to design and helps improve model accuracy.
The position of the camera relative to the road surface is also annotated: the value is 0.9999 if the camera is on the left side of the road, 0.5 if the camera is in the middle of the road, and 0.0001 if the camera is on the right side.
For this single network task, both the road surface area and the camera position need to be computed; the road surface area has 5 points, i.e. 10-dimensional information, while the camera position has only 1 dimension. To solve this data-dimension imbalance, the two loss terms are weighted when designing the loss function, so that the road surface area information and the camera position are balanced, e.g.
Loss = λ_area · Loss_area + λ_pos · Loss_pos.
The algorithm can quickly and accurately generate the road surface area, from which a virtual trip line is then calculated; the virtual trip line is essentially perpendicular to the road direction, has the same width as the road surface area, and is located at roughly 1/2 of the road surface height. The height, width and direction of the virtual trip line ensure that the sampling image accurately reflects the passing of vehicles.
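By way of illustration, a minimal Python sketch of deriving such a trip line from the five labelled key points follows; for simplicity it places a horizontal line at half the road height and intersects it with the road polygon, whereas the trip line of the invention generally follows the road's perpendicular direction and is therefore inclined (the function name, the point ordering and the horizontal-line simplification are assumptions):

```python
import numpy as np

def virtual_trip_line(road_pts, img_h, img_w):
    """road_pts: five (x, y) road-surface key points in relative coordinates [0, 1],
    assumed to be in counter-clockwise order."""
    pts = np.array(road_pts, dtype=np.float32) * [img_w, img_h]   # back to pixel coordinates
    top, bottom = pts[:, 1].min(), pts[:, 1].max()
    y_line = top + 0.5 * (bottom - top)            # place the line at 1/2 of the road height

    # Intersect the horizontal line y = y_line with the polygon edges to get its endpoints.
    xs = []
    for (x1, y1), (x2, y2) in zip(pts, np.roll(pts, -1, axis=0)):
        if (y1 - y_line) * (y2 - y_line) < 0:      # this edge crosses the line
            t = (y_line - y1) / (y2 - y1)
            xs.append(x1 + t * (x2 - x1))
    p_left, p_right = (min(xs), y_line), (max(xs), y_line)
    return p_left, p_right                         # the two points representing the trip line
```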
Secondly, generating an original sampling image:
In practical applications the virtual trip line is rarely horizontal, so if each piece of trip-line data were placed directly and horizontally on the sampling image, the targets in the sampling image would be strongly deformed and stretched, which is unfavourable for the subsequent target counting. Therefore, each piece of data is arranged in sequence along the inclination direction of the virtual trip line to form the original sampling image.
2.1 Virtual trip line point-set data:
A virtual trip line is generated from the road surface area; it is represented by two endpoints, and its width is denoted SW (1 in the actual system). The data of all points covered by the virtual trip line are calculated and recorded as [(x1, y1), (x2, y2), …, (xn, yn)]; this point set represents the data to be acquired from each frame of video data.
2.2 Creating the sampling image:
According to the width of the virtual trip line, an original sampling image is created and denoted SrcSample. In practical applications, the traffic flow is counted over 2-minute periods, which in the system corresponds to about 1200 frames; the original sampling image must accommodate this, and since the virtual trip line has a certain inclination angle, an image far larger than the actual number of frames is created; in the actual system a 5000 × 5000 original image is created.
2.3 Filling the sampling image:
For the 1st frame, the corresponding point-set data of the frame are placed at [(M + x1, y1), (M + x2, y2), …] of the SrcSample image according to the position information; M is taken as 2500 in the actual system. The two endpoints (X_S1, Y_S1), (X_S2, Y_S2) of this row are recorded in preparation for the subsequent perspective transformation. For the 2nd frame, the data are placed at the corresponding position of the 2nd row of the SrcSample image according to the camera position information: if the camera is on the right side, the slope of the virtual trip line is negative, so the data are placed at [(M + x1 - SW·(time-1), y1 + SW·(time-1)), (M + x2 - SW·(time-1), y2 + SW·(time-1)), …, (M + xn - SW·(time-1), yn + SW·(time-1))]; if the camera is on the left side, the slope is positive, and the data are placed at [(M + x1 + SW·(time-1), y1 + SW·(time-1)), (M + x2 + SW·(time-1), y2 + SW·(time-1)), …, (M + xn + SW·(time-1), yn + SW·(time-1))]. This is executed in sequence until the 1200th frame, completing the generation of the original sampling image, and the two endpoints (X_E1, Y_E1), (X_E2, Y_E2) of the last row are recorded. Here SW is the width of the sampling line (1) and time is the index of the current image frame.
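A minimal Python sketch of this filling step is shown below; the SrcSample size, M = 2500 and SW = 1 follow the embodiment, while the helper name and the integer rounding are assumptions:

```python
import numpy as np

SW, M = 1, 2500                                            # sampling-line width and offset (embodiment values)
src_sample = np.zeros((5000, 5000, 3), dtype=np.uint8)     # original sampling image SrcSample

def fill_sampling_row(frame, trip_pts, time, camera_on_right):
    """frame: current video frame; trip_pts: (x, y) pixels covered by the trip line;
    time: 1-based frame index within the statistics period."""
    shift = SW * (time - 1)
    sign = -1 if camera_on_right else 1                    # the trip-line slope sign depends on the camera side
    for (x, y) in trip_pts:
        dst_x = int(round(M + x + sign * shift))
        dst_y = int(round(y + shift))
        src_sample[dst_y, dst_x] = frame[int(y), int(x)]   # copy the road pixel into SrcSample
```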
Thirdly, perspective transformation:
In the generated original sampling image, the targets are tilted because of the camera angle, and the original sampling image is stretched and deformed, so it is necessary to convert it into an overhead bird's-eye view. The general formula for the perspective transformation is
[x', y', w'] = [u, v, 1] · A,  A = [[a11, a12, a13], [a21, a22, a23], [a31, a32, a33]],
where the sub-matrix [[a11, a12], [a21, a22]] realises the linear transformation, [a31, a32] realises the translation, [a13, a23]^T realises the perspective distortion, and a33 realises the overall scaling.
The mathematical expression of the perspective transformation is
x = x'/w' = (a11·u + a21·v + a31) / (a13·u + a23·v + a33),
y = y'/w' = (a12·u + a22·v + a32) / (a13·u + a23·v + a33).
A perspective transformation is determined by four pairs of vertices: given the coordinates of four pixel points in the original image and their transformed coordinates, the perspective transformation matrix can be obtained, and with this matrix the perspective transformation of the image can be completed.
The obtained original sampling image records 4 pairs of coordinate points, namely the original coordinates (X_S1, Y_S1), (X_S2, Y_S2) and (X_E1, Y_E1), (X_E2, Y_E2); the width and height after transformation are set to W and H, so the 4 transformed coordinate points are (0, 0), (0, H) and (W, H), (W, 0). In practice W = 500 and H = 1200. A perspective transformation matrix is established according to the above formula, completing the transformation from the original sampling image to the bird's-eye view. The transformed image removes the influence of the camera angle, height and position on the acquired targets and provides effective image data for the subsequent target counting; it is recorded as the flow time-series image.
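A minimal OpenCV sketch of this rectification step, using the four recorded endpoints and the embodiment's W = 500, H = 1200 (the function and variable names are illustrative):

```python
import cv2
import numpy as np

def rectify_sampling_image(src_sample, s1, s2, e1, e2, W=500, H=1200):
    """s1, s2: endpoints (X_S1, Y_S1), (X_S2, Y_S2) of the first row;
    e1, e2: endpoints (X_E1, Y_E1), (X_E2, Y_E2) of the last row."""
    src_pts = np.float32([s1, s2, e1, e2])
    dst_pts = np.float32([[0, 0], [0, H], [W, H], [W, 0]])   # same pairing as in the text above
    matrix = cv2.getPerspectiveTransform(src_pts, dst_pts)
    # dsize is (width, height); the result is the flow time-series image (bird's-eye view).
    return cv2.warpPerspective(src_sample, matrix, (W, H))
```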
Fourthly, training a lightweight sparse convolutional neural network algorithm model:
Analysis of the flow time-series image shows that it differs in certain ways from an RGB image acquired directly by the camera and has some unique characteristics. Because the data are collected from a fixed position on the original image, after the perspective transformation the pixels in the same column of the flow time-series image correspond to the same spatial position but to different times. If no vehicle passes that position, the pixel values of adjacent rows should be essentially identical; only when a vehicle passes do the pixel values of adjacent rows change. Considering lighting changes, camera shake and other factors in the actual operating environment, when no vehicle passes, the change between adjacent rows stays within k grey levels; in practice k = 10. If this value is exceeded, a jump between the two rows is recorded.
Based on these characteristics of the flow time-series image, a lightweight sparse convolutional neural network algorithm is designed. The algorithm is proposed specifically for the flow time-series image in order to count the vehicle flow quickly and efficiently: inspired by the idea of convolutional neural networks and combined with sparse coding, the target count is expressed as the accumulated sum of products of different features and different weights. The algorithm is similar in form to sparse coding, but the variables have different meanings; at the same time, the effective image features are selected manually and only the feature weights are trained, which also distinguishes it from an ordinary convolutional neural network. In the actual system an algorithm model is formed by training, and during application the number of targets is predicted directly by the model. The algorithm requires few training samples, is simple to train, and is fast and efficient in prediction.
The lightweight sparse convolutional neural network algorithm proceeds as follows:
4.1 The number of vehicles in the flow time-series images is marked; assuming there are T labelled sample images in total, normally T ≥ M.
4.2 According to the characteristic pixel jumps of the flow time-series image, and combining the idea of a convolutional neural network, the algorithm formula is
N = Σ_i P_i · W_i
where P_i denotes the proportion of pixel jumps in the i-th column, W_i is the feature weight of that column (the feature vector to be trained) and N denotes the number of marked vehicles.
P_i = (1/H) · Σ_{j=1}^{H-1} 1(|V_j - V_{j+1}| > k)
where V_j is the pixel value of the j-th row, k is the jump threshold (10 in practice) and H is the height of the image.
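A minimal sketch of computing this per-column jump proportion on the flow time-series image (k = 10 as in the embodiment; the function name is illustrative):

```python
import numpy as np

def column_jump_proportion(flow_img, k=10):
    """flow_img: H x W greyscale flow time-series image. Returns one jump proportion per column."""
    img = flow_img.astype(np.int32)
    jumps = np.abs(np.diff(img, axis=0)) > k               # True where adjacent rows differ by more than k
    return jumps.sum(axis=0) / float(img.shape[0])         # P_i = (1/H) * number of jumps in column i
```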
4.3 model training:
4.3.1 Select W images from the T labelled images, initially images 1 to W; the initial minimum error value ErrorMin is a large number, taken as ErrorMin = 10000 in practice; solve the weights W_i from these W images.
4.3.2 Test the remaining T - W images: compute the predicted result from the corresponding P_i and W_i, and record the difference from the actual labelled data as Error. If Error is smaller than the minimum error value ErrorMin, update ErrorMin to Error and record the corresponding W_i values.
4.3.3 Select the 2nd group, images 2 to W + 1, solve W_i from these W images, and then perform step 4.3.2.
4.3.4 Loop over 4.3.2 and 4.3.3 until ErrorMin no longer changes.
4.3.5 Calculate the proportion of the absolute error, which is used as a tuning parameter in later model prediction:
Ratio = ErrorMin / N_i
where ErrorMin is the minimum error value obtained during training and N_i is the labelled count of the corresponding image.
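A minimal sketch of the sliding-window training of step 4.3 is given below; the least-squares solver for the weights and the normalisation of the final ratio by the total labelled count are assumptions, since the text only requires that the weights be solved from each group of W images and that ErrorMin and the error proportion be tracked:

```python
import numpy as np

def train_light_sparse_model(P, N, W):
    """P: T x C matrix of per-column jump proportions (one row per labelled image);
    N: length-T array of labelled vehicle counts; W: window size (W <= T)."""
    T = len(N)
    best_w, error_min = None, 10000.0                      # ErrorMin initialised to a large number (10000)
    for start in range(T - W + 1):                         # groups 1..W, 2..W+1, ...
        idx = slice(start, start + W)
        w, *_ = np.linalg.lstsq(P[idx], N[idx], rcond=None)  # solve weights from the W samples (assumed solver)
        mask = np.ones(T, dtype=bool)
        mask[idx] = False                                  # test on the remaining T - W samples
        error = float(np.abs(P[mask] @ w - N[mask]).sum())
        if error < error_min:
            error_min, best_w = error, w
    ratio = error_min / float(np.sum(N))                   # tuning parameter (assumed normalisation)
    return best_w, error_min, ratio
```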
Fifthly, uplink and downlink statistics:
Based on the above characteristics of the flow time-series image, the boundary between the uplink and downlink directions (the two directions of travel) is located from the jump statistics. In the columns on the left side vehicles pass and cause jumps, and likewise in the columns on the right side, whereas the central green belt or guardrail region is never crossed by vehicles, so its jump count remains 0. The uplink/downlink boundary is therefore located by a statistical method; considering that a lane may carry no vehicle for a short time, the jump statistics are accumulated over a period of time (10 rounds of data) before the boundary is finally determined.
5.1 Count the jump proportion P_i of each column.
5.2 Loop over 10 rounds of statistics and accumulate the 10 results into A_i, i.e. A_i = Σ_t P_i(t), t = 1, …, 10.
5.3 From the left lane to the right lane, the columns whose A_i is smaller than a certain threshold are recorded in an array J_i: if A_i of a column is smaller than the threshold, J_i is 0; otherwise it is 1.
5.4 From the array J_i, using the prior knowledge that the region lies within 0.3 to 0.8 of the image width, select the longest segment of consecutive zeros as the central green belt or guardrail region, and obtain the central uplink/downlink boundary from the starting position of this segment.
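A minimal sketch of steps 5.1 to 5.4, locating the central divider as the longest run of zero-jump columns within 0.3 to 0.8 of the image width (the jump threshold value here is illustrative):

```python
import numpy as np

def find_updown_boundary(A, img_w, thresh=0.5):
    """A: per-column jump statistics accumulated over 10 rounds (the A_i of step 5.2)."""
    J = (np.asarray(A) >= thresh).astype(int)      # step 5.3: J_i = 0 for near-zero-jump columns, else 1
    lo, hi = int(0.3 * img_w), int(0.8 * img_w)    # step 5.4: prior knowledge of the divider's range

    best_start, best_len, run_start = None, 0, None
    for i in range(lo, hi):
        if J[i] == 0:
            if run_start is None:
                run_start = i
            if i - run_start + 1 > best_len:
                best_start, best_len = run_start, i - run_start + 1
        else:
            run_start = None
    return best_start                              # start of the longest zero run = uplink/downlink boundary
```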
Sixthly, model prediction:
Step one only needs to be completed once; the operation of step two is performed on every subsequent frame; when the statistics period is reached, the operation of step three is performed; the traffic flow statistics are then completed from the flow time-series image generated in step three using the model parameters trained above, and the uplink and downlink flow statistics are completed by combining the uplink/downlink boundary information.
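Tying the illustrative sketches above together, a per-period driver could look as follows; this is purely a sketch, all helper names come from the earlier illustrations (not from the patent text), and trip_pts, s1, s2, e1, e2, A and best_w are assumed to be produced by the earlier steps:

```python
# Illustrative driver for one 2-minute statistics period (about 1200 frames).
for time, frame in enumerate(frames, start=1):                  # frames: iterable of video frames
    fill_sampling_row(frame, trip_pts, time, camera_on_right)   # step two: fill SrcSample

flow_img = rectify_sampling_image(src_sample, s1, s2, e1, e2)   # step three: bird's-eye flow image
gray = flow_img.mean(axis=2)                                    # single-channel view for jump statistics
P = column_jump_proportion(gray)                                # per-column jump proportions
boundary = find_updown_boundary(A, gray.shape[1])               # uplink/downlink split (A from 10 rounds)
count = float(P @ best_w)                                       # model prediction: N = sum_i P_i * W_i
```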
In summary, the innovation points of the present invention different from the prior art are:
1. the idea of converting the original sampling image into a bird's-eye view through perspective transformation;
2. the algorithm for locating the uplink and downlink boundary by statistics;
3. the algorithmic idea of the lightweight sparse convolutional neural network algorithm;
4. the training and optimization process of the lightweight sparse convolutional neural network algorithm.

Claims (4)

1. A fast and efficient traffic flow calculation method based on video images, which collects vehicle video images through a camera or image capture card installed on a road and processes and analyses them, the method comprising the following steps:
firstly, generating a road surface area and a camera position:
inputting a video image, and training the road surface area information and the camera position information as two tasks of a single network; collecting image data of different road surfaces, and selecting 5 boundary key points of the road surface area in the image data for annotation; annotating the position of the camera relative to the road surface;
for this single network task, weighting the loss terms of the road surface area information and the camera position when designing the loss function, so that the data dimensionalities of the two are balanced, e.g.
Loss = λ_area · Loss_area + λ_pos · Loss_pos;
calculating a virtual trip line from the generated road surface area;
secondly, generating an original sampling image:
arranging each piece of trip-line data in sequence along the inclination direction of the virtual trip line to form the original sampling image:
2.1 virtual trip line point-set data:
the generated virtual trip line is represented by two endpoints and its width is denoted SW; the data of all points covered by the trip line are calculated and recorded as [(x1, y1), (x2, y2), …, (xn, yn)], representing the data to be acquired from each frame of video data;
2.2 creating the sampling image:
according to the width of the virtual trip line, creating an original sampling image, denoted SrcSample; creating an image far larger than the actual number of frames;
2.3 filling the sampling image:
for the 1st frame, placing the corresponding point-set data of the frame at [(M + x1, y1), (M + x2, y2), …] of the SrcSample image according to the position information, and recording the two endpoints (X_S1, Y_S1), (X_S2, Y_S2) of this row; for the 2nd frame, placing the data at the corresponding position of the 2nd row of the SrcSample image according to the camera position information: if the camera is on the right side, placing them at [(M + x1 - SW·(time-1), y1 + SW·(time-1)), (M + x2 - SW·(time-1), y2 + SW·(time-1)), …, (M + xn - SW·(time-1), yn + SW·(time-1))]; if the camera is on the left side, placing them at [(M + x1 + SW·(time-1), y1 + SW·(time-1)), (M + x2 + SW·(time-1), y2 + SW·(time-1)), …, (M + xn + SW·(time-1), yn + SW·(time-1))]; executing this in sequence until the 1200th frame, completing the generation of the original sampling image, and recording the two endpoints (X_E1, Y_E1), (X_E2, Y_E2) of the last row, wherein SW is the width of the sampling line (taken as 1) and time is the index of the current image frame;
thirdly, perspective transformation:
carrying out perspective transformation on the original sampling image, the general formula of the perspective transformation being
[x', y', w'] = [u, v, 1] · A,  A = [[a11, a12, a13], [a21, a22, a23], [a31, a32, a33]],
wherein the sub-matrix [[a11, a12], [a21, a22]] realises the linear transformation, [a31, a32] realises the translation, [a13, a23]^T realises the perspective distortion, and a33 realises the overall scaling;
the mathematical expression of the perspective transformation being
x = x'/w' = (a11·u + a21·v + a31) / (a13·u + a23·v + a33),
y = y'/w' = (a12·u + a22·v + a32) / (a13·u + a23·v + a33);
the obtained original sampling image records 4 pairs of coordinate points, namely the original coordinates (X_S1, Y_S1), (X_S2, Y_S2) and (X_E1, Y_E1), (X_E2, Y_E2); setting the width and height after transformation to W and H, the 4 transformed coordinate points are (0, 0), (0, H) and (W, H), (W, 0); a perspective transformation matrix is established according to the above formula, completing the transformation from the original sampling image to the bird's-eye view;
fourthly, training a lightweight sparse convolutional neural network algorithm model:
collecting data from a fixed position on the original image, wherein when no vehicle passes the position, the pixel change between adjacent rows stays within k pixels; if this value is exceeded, recording that a jump has occurred between the two rows;
the lightweight sparse convolutional neural network algorithm comprising the following specific steps:
4.1 marking the number of vehicles in the flow time-series images; assuming the labelled sample data total T images, normally T ≥ M;
4.2 according to the characteristic pixel jumps of the flow time-series image, and combining the idea of a convolutional neural network, the algorithm formula being
N = Σ_i P_i · W_i
wherein P_i denotes the proportion of pixel jumps in the i-th column, W_i is the feature weight of that column (the feature vector to be trained) and N denotes the number of marked vehicles;
P_i = (1/H) · Σ_{j=1}^{H-1} 1(|V_j - V_{j+1}| > k)
wherein V_j denotes the pixel value of the j-th row, k is the threshold and H is the height of the image;
4.3 model training:
4.3.1 selecting W images from the T labelled images, initially images 1 to W, the initial minimum error value ErrorMin being a large number; solving the weights W_i from these W images;
4.3.2 testing the remaining T - W images: computing the predicted result from the corresponding P_i and W_i, and recording the difference from the actual labelled data as Error; if Error is smaller than the minimum error value ErrorMin, updating ErrorMin to Error and recording the corresponding W_i values;
4.3.3 selecting the 2nd group, images 2 to W + 1, solving W_i from these W images, and then performing step 4.3.2;
4.3.4 looping over 4.3.2 and 4.3.3 until ErrorMin no longer changes;
4.3.5 calculating the proportion of the absolute error, Ratio = ErrorMin / N_i, which is used as a tuning parameter in later model prediction, wherein ErrorMin is the minimum error value during training and N_i is the labelled data of the corresponding image;
fifthly, uplink and downlink statistics:
5.1 counting the jump proportion P_i of each column;
5.2 looping over a certain number of statistics rounds and accumulating the results of each round into A_i, i.e. A_i = Σ_t P_i(t);
5.3 counting, from the left lane to the right lane, the columns whose A_i is smaller than a certain threshold and storing the result in an array J_i: if A_i of a column is smaller than the threshold, J_i is 0; otherwise it is 1;
5.4 counting the array J_i and, using the prior knowledge that the region lies within 0.3 to 0.8 of the image width, selecting the longest segment of consecutive zeros as the central green belt or guardrail region, and obtaining the central uplink/downlink boundary from the starting position of this segment;
sixthly, model prediction:
wherein step one only needs to be completed once, the operation of step two is performed on every subsequent frame, the operation of step three is performed when the statistics period is reached, the traffic flow statistics are completed from the flow time-series image generated in step three using the model parameters trained above, and the uplink and downlink flow statistics are completed by combining the uplink/downlink boundary information.
2. The fast and efficient traffic flow calculation method based on video images according to claim 1, wherein the data labels are annotated in counter-clockwise order, the data of the two upper points are standardized, the height position of the two upper points is set dynamically according to the image size, the height being 16% of the image height, and the intersection points of this line with the road surface edges are used as the two upper boundary key points.
3. The fast and efficient traffic flow calculation method based on video images according to claim 1 or 2, wherein in the data annotation process, the data are normalized using relative coordinate positions, namely the ratio of the coordinate position to the height and the width of the image.
4. The fast and efficient traffic flow calculation method according to claim 3, wherein the generated virtual trip line is substantially perpendicular to the road direction, has the same width as the road surface area, and is located at 1/2 of the road surface height.
CN201910883883.8A 2019-09-19 2019-09-19 Quick and efficient vehicle flow calculation method based on video image Active CN110633678B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910883883.8A CN110633678B (en) 2019-09-19 2019-09-19 Quick and efficient vehicle flow calculation method based on video image


Publications (2)

Publication Number Publication Date
CN110633678A true CN110633678A (en) 2019-12-31
CN110633678B CN110633678B (en) 2023-12-22

Family

ID=68971520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910883883.8A Active CN110633678B (en) 2019-09-19 2019-09-19 Quick and efficient vehicle flow calculation method based on video image

Country Status (1)

Country Link
CN (1) CN110633678B (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000222669A (en) * 1999-01-28 2000-08-11 Mitsubishi Electric Corp Traffic flow estimating device and traffic flow estimating method
US20130088600A1 (en) * 2011-10-05 2013-04-11 Xerox Corporation Multi-resolution video analysis and key feature preserving video reduction strategy for (real-time) vehicle tracking and speed enforcement systems
CN102768804A (en) * 2012-07-30 2012-11-07 江苏物联网研究发展中心 Video-based traffic information acquisition method
CN103366581A (en) * 2013-06-28 2013-10-23 南京云创存储科技有限公司 Traffic flow counting device and counting method
CN104658272A (en) * 2015-03-18 2015-05-27 哈尔滨工程大学 Street traffic volume statistics and sped measurement method based on binocular stereo vision
CN105261034A (en) * 2015-09-15 2016-01-20 杭州中威电子股份有限公司 Method and device for calculating traffic flow on highway
CN106127137A (en) * 2016-06-21 2016-11-16 长安大学 A kind of target detection recognizer based on 3D trajectory analysis
CN106878674A (en) * 2017-01-10 2017-06-20 哈尔滨工业大学深圳研究生院 A kind of parking detection method and device based on monitor video
CN109584558A (en) * 2018-12-17 2019-04-05 长安大学 A kind of traffic flow statistics method towards Optimization Control for Urban Traffic Signals
CN110021174A (en) * 2019-04-02 2019-07-16 北京同方软件有限公司 A kind of vehicle flowrate calculation method for being applicable in more scenes based on video image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
程瑶; 卜方玲; 严宏海: "Real-time traffic flow estimation method based on network surveillance video streams" (基于网络监控视频流的车流量实时估计方法), Science Technology and Engineering (科学技术与工程), no. 28 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113034915A (en) * 2021-03-29 2021-06-25 北京卓视智通科技有限责任公司 Double-spectrum traffic incident detection method and device
CN113034916A (en) * 2021-03-31 2021-06-25 北京同方软件有限公司 Multitask traffic event and traffic parameter calculation method
CN113034916B (en) * 2021-03-31 2022-07-01 北京同方软件有限公司 Multitask traffic event and traffic parameter calculation method
CN114120650A (en) * 2021-12-15 2022-03-01 阿波罗智联(北京)科技有限公司 Method and device for generating test result
CN114120650B (en) * 2021-12-15 2023-08-08 阿波罗智联(北京)科技有限公司 Method and device for generating test results
CN115240429A (en) * 2022-08-11 2022-10-25 深圳市城市交通规划设计研究中心股份有限公司 Pedestrian and vehicle flow statistical method, electronic equipment and storage medium
CN115240429B (en) * 2022-08-11 2023-02-14 深圳市城市交通规划设计研究中心股份有限公司 Pedestrian and vehicle flow statistical method, electronic equipment and storage medium
CN115497303A (en) * 2022-08-19 2022-12-20 招商新智科技有限公司 Expressway vehicle speed detection method and system under complex detection condition

Also Published As

Publication number Publication date
CN110633678B (en) 2023-12-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant