CN112991399B - Bus passenger number detection system based on RFS - Google Patents


Info

Publication number
CN112991399B
CN112991399B
Authority
CN
China
Prior art keywords
getting
module
video data
bus
rfs
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110308023.9A
Other languages
Chinese (zh)
Other versions
CN112991399A (en)
Inventor
汪景
吕军威
刘志钢
彭威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai University of Engineering Science
Original Assignee
Shanghai University of Engineering Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai University of Engineering Science filed Critical Shanghai University of Engineering Science
Priority to CN202110308023.9A priority Critical patent/CN112991399B/en
Publication of CN112991399A publication Critical patent/CN112991399A/en
Application granted granted Critical
Publication of CN112991399B publication Critical patent/CN112991399B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/223 Analysis of motion using block-matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30232 Surveillance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30242 Counting objects in image

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)
  • Air-Conditioning For Vehicles (AREA)

Abstract

The invention relates to an RFS-based bus passenger number detection system, comprising: a video data acquisition and preprocessing module, which films the boarding and alighting door areas with a vehicle-mounted camera, acquires video data of passengers getting on and off, and preprocesses the video data; a human head detection module, which performs head detection on the preprocessed video data with an SSD deep convolutional neural network algorithm; an RFS-based GM-PHD head tracking module, which tracks heads in the video data with an RFS-based GM-PHD filtering algorithm; and a people counting output module, which derives motion trajectory information from the detected head information and counts passengers from the trajectories. Compared with the prior art, the invention provides stable and accurate detection and requires no modification of the buses.

Description

Bus passenger number detection system based on RFS
Technical Field
The invention relates to the field of electronic information, and in particular to an RFS-based system for detecting the number of passengers on a bus.
Background
With rapid economic growth, deepening urbanization and the continuing increase in the number of motor vehicles in China, traffic congestion has become increasingly serious and significantly impedes sustainable urban development. Public transport is an important component of the urban passenger transport system, touches many areas of social life and production, and carries most residents' travel. The best way to relieve severe urban traffic is to develop public transport preferentially, design the bus network scientifically and reasonably, and improve the comfort and attractiveness of public transport travel.
Real-time data on the number of passengers on a bus is an important part of bus system data. For bus planning and operation, it provides data support for bus network optimization; on the passenger demand side, publishing the real-time passenger count lets passengers avoid unnecessary waiting by choosing other lines. Existing bus passenger counting methods fall into three categories. The first is pressure sensing, which infers the number of passengers by statistical analysis of the total vehicle weight and the weight changes at the boarding and alighting steps. The second is infrared detection, which installs infrared devices at the front and rear doors to count passengers getting on and off. The third is image processing, which detects the number of people directly from omnidirectional video of the carriage interior. The first two methods can only be realized after modifying the bus; the third suffers large detection errors in regions covered repeatedly by multiple in-carriage cameras, lowering the accuracy of the statistics.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide an RFS-based bus passenger number detection system.
The purpose of the invention is achieved by the following technical scheme:
An RFS-based bus passenger number detection system, comprising:
a video data acquisition and preprocessing module, for filming the boarding and alighting door areas with a vehicle-mounted camera, acquiring video data of passengers getting on and off, and preprocessing the video data;
a human head detection module, for performing passenger head detection on the preprocessed video data with an SSD deep convolutional neural network algorithm;
an RFS-based GM-PHD head tracking module, for tracking passenger heads in the video data with an RFS-based GM-PHD filtering algorithm;
and a people counting output module, for deriving motion trajectory information from the detected passenger head information and counting passengers from the trajectories.
Specifically, the video data acquisition and preprocessing module includes:
a boarding and alighting video acquisition sub-module, for filming the boarding door area and the alighting door area with the vehicle-mounted camera to acquire video data, from which the numbers of passengers getting on and off when the bus arrives at a stop are detected, and for wirelessly transmitting the video data to the video data preprocessing sub-module;
and a video data preprocessing sub-module, for cutting the video into frames and standardizing the resulting data.
The human head detection module includes:
a training sub-module, for annotating the picture sequences produced by the video data acquisition and preprocessing module and training the SSD deep convolutional neural network with the annotated picture sequences;
and a prediction sub-module, for feeding the video data to be detected into the trained SSD deep convolutional neural network, detecting passenger heads in the picture sequence to be detected, and passing the centroid positions of the detected passenger heads to the RFS-based GM-PHD head tracking module.
The RFS-based GM-PHD head tracking module includes:
a prediction sub-module, for performing birth-target prediction, spawned-target prediction and surviving-target prediction in turn from the centroid positions of the passenger heads;
an update sub-module, for updating the Gaussian component parameters with the measurement set, the observation matrix and the measurement noise;
a pruning sub-module, for pruning the updated Gaussian components: merging similar components and deleting those with the smallest weights;
a state extraction sub-module, for extracting the expected values of Gaussian components whose weights exceed a threshold;
and a track identification sub-module, for detecting head movement with an auction-based track identification algorithm and outputting the trajectory information of the passenger heads.
Furthermore, the people counting output module counts boarding passengers with a line-crossing counting method and alighting passengers with a region-crossing counting method. Boarding passengers are counted by the line-crossing method as follows:
first a decision line is set; whether a detected target moves downward across the line is judged from the trajectory information passed on by the RFS-based GM-PHD head tracking module; if the target crosses the line, the boarding count is incremented by one; if it does not cross the line, it is judged not to have boarded and is not counted.
Alighting passengers are counted by the region-crossing method as follows:
1) the video frame is first divided, from top to bottom, into three regions of interest I, II and III to accommodate passengers of different heights; the region a passenger is in is judged from the passenger's current position, and the passenger's movement displacement is obtained;
2) if the passenger is in region I, the movement displacement is read directly from the trajectory information as N pixels; if in region II, the M pixels moved within region II are obtained from the trajectory information, and it is then checked whether region I has been reached: if so, the movement displacement is updated to the N pixels moved within region I; if not, the movement displacement is M pixels; if the target is in region III, the counting judgment is made only after it enters region II. Finally, false detections are filtered: if the movement displacement (N or M) exceeds 60 pixels, the alighting count is incremented; if it is 60 pixels or less, the passenger is judged not to have alighted and is not counted.
Further, the training sub-module includes:
a prior box matching unit, for finding, for each real target, the prior box with the largest IOU (Intersection over Union), so that every real target corresponds to at least one prior box, and then matching each remaining unmatched prior box to any real target whose IOU with it exceeds a threshold, the real target being a passenger head;
a loss function selection unit, which computes a weighted sum of the position error and the confidence error;
a data augmentation unit, which enlarges the data with data augmentation methods;
a fine-tuning unit, which fine-tunes the model trained by the training sub-module with the atrous (hole) algorithm, changing the network structure to obtain a denser score map;
and a filtering unit, which deletes wrong, overlapping and inaccurate bounding boxes with the NMS algorithm.
Further, the prediction sub-module includes:
a prediction box filtering unit, which determines the class of each prediction box from the largest class confidence, filters out boxes classified as background by class confidence, and then removes boxes whose confidence falls below a confidence threshold;
a prediction box decoding unit, which decodes the remaining prediction boxes, recovers their true position parameters from the prior boxes, sorts them by confidence in descending order and keeps the top k;
and a filtering unit, which removes heavily overlapping prediction boxes with the NMS algorithm and takes the remaining boxes as the detection result.
Further, the track identification sub-module detects head movement with an auction-based track identification algorithm optimized by an added correction mechanism.
Compared with the prior art, the RFS-based bus passenger number detection system obtains head detections with the SSD deep convolutional neural network algorithm, providing more stable and accurate measurements to the tracking algorithm and improving counting accuracy. Compared with other bus counting methods, detecting passengers as they get on and off makes the counting results more reliable and accurate, is convenient to implement, and requires no modification of the bus. The resulting real-time passenger count data can support bus network planning, vehicle dispatching, stop-level passenger flow studies, evacuation, and the like.
Drawings
FIG. 1 is a schematic structural diagram of a bus people number detection system based on RFS in an embodiment;
FIG. 2 is a schematic flow chart of the auction-based track identification algorithm in the embodiment;
FIG. 3 is a flow chart of the statistics of the number of people getting on and off the vehicle in the embodiment.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. It should be apparent that the described embodiments are only some of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
Examples
As shown in FIG. 1, the invention relates to a bus people number detection system based on RFS, which comprises the following modules:
A. a video data acquisition and preprocessing module;
B. a human head detection module;
C. a GM-PHD head tracking module based on RFS;
D. and a people number counting output module.
The video data acquisition and preprocessing module is connected to the human head detection module and passes it the standardized picture sequence; the human head detection module is connected to the RFS-based GM-PHD head tracking module and passes it the centroid positions of the detected heads; the RFS-based GM-PHD head tracking module is connected to the people counting output module and passes it the motion trajectories of the detected heads.
The video data acquisition and preprocessing module specifically comprises two sub-modules:
A1, boarding and alighting video acquisition sub-module: the boarding and alighting videos are obtained by filming the door areas with the vehicle-mounted camera, and the video data are transmitted to the next sub-module over 4G wireless communication, providing raw video data for the following modules.
A2, video data preprocessing submodule: the method is used for performing frame cutting processing on the video and standardizing the processed data. The module can be remotely completed by a computer. As a preferred embodiment, the computer may use OpenCV to perform video framing and then normalize the input picture size to 300 x 300 with tensflow backup.
In the human head detection module, the invention detects targets through a computer implementation of the SSD deep convolutional neural network algorithm, which has good detection performance. Whole-body detection suffers from severe occlusion; considering the camera angle and the comparatively light occlusion of heads, the passenger's head is chosen as the detection target. The module mainly comprises two sub-modules:
B1, training sub-module: the data processed by the video data preprocessing sub-module are sent remotely to the training sub-module; an annotator manually labels the picture sequences of existing video data with the VoTT annotation tool, and the manually labelled sequences are used to train the neural network, yielding a head detection model better suited to this scenario.
B2, prediction submodule: and inputting the video data to be detected into a neural network, and performing human head detection on the picture sequence to be detected through the trained neural network. And transmitting the information of the centroid position of the detected head to the next module.
The training submodule specifically comprises the following steps:
B11, prior box matching: prior boxes are rectangular boxes of various sizes and aspect ratios predefined at every position of the feature map, used for matching real objects. The first training step matches each real target, i.e. a passenger head, with prior boxes; the target is then predicted through the bounding boxes corresponding to its matched priors. Matching in SSD proceeds as follows: first, each real target is matched with the prior box having the largest Intersection over Union (IOU) with it, so that every real target corresponds to at least one prior box; then every remaining unmatched prior box is matched with any real target whose IOU with it exceeds a threshold (0.5 for SSD300). Prior boxes matched to a real target are called positive samples, and those matched to background are negative samples. Because negatives greatly outnumber positives, SSD selects negatives with the largest confidence errors first (hard negative mining), keeping the ratio of positive to negative samples at about 1:3. The IOU measures the overlap of two rectangles and equals the ratio of their intersection area to their union area.
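To make the matching rule concrete, here is a small Python sketch (NumPy assumed); boxes are (x1, y1, x2, y2) corner tuples, and all names are illustrative rather than from the patent:

    import numpy as np

    def iou(a, b):
        """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
        ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    def match_priors(priors, truths, threshold=0.5):
        """Two-stage SSD matching: each ground truth first claims its
        best-IOU prior, then the remaining priors match any truth whose
        IOU with them exceeds the threshold. Returns, per prior, the
        matched truth index (-1 means background/negative sample)."""
        matches = np.full(len(priors), -1, dtype=int)
        overlaps = np.array([[iou(p, t) for t in truths] for p in priors])
        for j in range(len(truths)):                 # stage 1: best prior per truth
            matches[overlaps[:, j].argmax()] = j
        for i in range(len(priors)):                 # stage 2: threshold matching
            if matches[i] == -1 and overlaps[i].max() > threshold:
                matches[i] = overlaps[i].argmax()
        return matches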
The feature map is first divided into S x S cells; when the centre point of a potential target falls in a cell, that cell generates B bounding prediction boxes to predict the target, each with a confidence. Loss function selection then follows:
B12, loss function selection: the loss function is a weighted sum of the position error and the confidence error:

$$L(x,c,l,g)=\frac{1}{N}\Big(L_{conf}(x,c)+\alpha L_{loc}(x,l,g)\Big)$$

where N is the number of positive prior boxes (if N = 0 the loss is 0); c is the predicted class confidence; l is the predicted position of the bounding box corresponding to the prior box; g is the position parameter of the real target; and the weight coefficient α is set to 1 by cross-validation.

The position error is computed with the Smooth L1 loss:

$$L_{loc}(x,l,g)=\sum_{i\in Pos}^{N}\sum_{m\in\{cx,cy,w,h\}}x_{ij}^{k}\,\mathrm{smooth}_{L1}\!\left(l_{i}^{m}-\hat{g}_{j}^{m}\right)$$

$$\hat{g}_{j}^{cx}=\frac{g_{j}^{cx}-d_{i}^{cx}}{d_{i}^{w}},\qquad \hat{g}_{j}^{cy}=\frac{g_{j}^{cy}-d_{i}^{cy}}{d_{i}^{h}}$$

$$\hat{g}_{j}^{w}=\log\frac{g_{j}^{w}}{d_{i}^{w}},\qquad \hat{g}_{j}^{h}=\log\frac{g_{j}^{h}}{d_{i}^{h}}$$

where $x_{ij}\in\{0,1\}$, and $x_{ij}=1$ means the i-th prior box is matched with the j-th real target; (cx, cy) is the centre of the prior box d; w is the width and h the height of the prior box; $g_{j}^{cx}$ is the central abscissa of the j-th real target; and $d_{i}^{cx}$ is the central abscissa of the i-th prior box.

The confidence error is computed with the softmax loss:

$$L_{conf}(x,c)=-\sum_{i\in Pos}^{N}x_{ij}^{p}\log\hat{c}_{i}^{p}-\sum_{i\in Neg}\log\hat{c}_{i}^{0},\qquad \hat{c}_{i}^{p}=\frac{\exp(c_{i}^{p})}{\sum_{k}\exp(c_{i}^{k})}$$

where p is the true target class, and $x_{ij}^{p}=1$ indicates that the i-th prior box matches the j-th real target, whose class is p.
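A small numeric sketch of the two ingredients above, the Smooth L1 loss and the offset encoding $\hat{g}$, assuming NumPy and the (cx, cy, w, h) box parameterization; the function names are illustrative:

    import numpy as np

    def smooth_l1(x):
        """Smooth L1 loss: 0.5 * x^2 for |x| < 1, |x| - 0.5 otherwise."""
        a = np.abs(x)
        return np.where(a < 1.0, 0.5 * a ** 2, a - 0.5)

    def encode_offsets(truth, prior):
        """Encode a ground-truth box against its matched prior box, both
        given as (cx, cy, w, h), following the g-hat formulas above."""
        return np.array([
            (truth[0] - prior[0]) / prior[2],   # g_cx
            (truth[1] - prior[1]) / prior[3],   # g_cy
            np.log(truth[2] / prior[2]),        # g_w
            np.log(truth[3] / prior[3]),        # g_h
        ])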
B13, data augmentation: the number of training samples is increased with augmentation methods such as horizontal flipping, cropping, and scaling up and down, improving the robustness of the algorithm to input targets of different sizes and shapes.
B14, atrous algorithm: the model trained by the B1 training sub-module is fine-tuned with the atrous (hole) algorithm, changing the network structure to obtain a denser score map.
B15, NMS filtering: the multi-scale feature maps of SSD generate many bounding boxes, and even after the IOU-based optimization many wrong, overlapping or inaccurate boxes may remain. Non-maximum suppression is therefore applied to improve the speed and precision of target detection. Its principle: when several bounding boxes contain the same real target and their mutual IOU is high, the box with the highest score is kept and the rest are deleted.
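A greedy NMS sketch matching the principle just described; the 0.45 overlap threshold is an illustrative default, not specified in the patent:

    import numpy as np

    def box_iou(a, b):
        """IOU of two (x1, y1, x2, y2) boxes."""
        ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union

    def nms(boxes, scores, iou_threshold=0.45):
        """Greedy non-maximum suppression: keep the highest-scoring box,
        drop boxes overlapping it above the threshold, then repeat."""
        order = np.asarray(scores).argsort()[::-1]
        keep = []
        while order.size > 0:
            best = order[0]
            keep.append(int(best))
            rest = order[1:]
            ious = np.array([box_iou(boxes[best], boxes[r]) for r in rest])
            order = rest[ious <= iou_threshold]
        return keep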
In the prediction submodule B2, the specific implementation steps are as follows:
B21, prediction box filtering: for each prediction box, the class with the maximum confidence is taken as its class; boxes classified as background are filtered out by class confidence, and boxes whose confidence falls below the confidence threshold (0.5) are then removed;
B22, prediction box decoding: the remaining prediction boxes are decoded and their true position parameters recovered from the prior boxes; they are then sorted by confidence in descending order and only the top n (e.g. 400) are kept;
B23, NMS filtering: as in B15, heavily overlapping prediction boxes are filtered out, and the remaining prediction boxes form the detection result.
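The decoding in B22 inverts the offset encoding shown under B12. A sketch, assuming the same (cx, cy, w, h) parameterization and no prior-box variance scaling:

    import numpy as np

    def decode_offsets(pred, prior):
        """Invert the offset encoding: recover an absolute (cx, cy, w, h)
        box from a predicted offset vector and its prior box."""
        return np.array([
            pred[0] * prior[2] + prior[0],      # cx
            pred[1] * prior[3] + prior[1],      # cy
            prior[2] * np.exp(pred[2]),         # w
            prior[3] * np.exp(pred[3]),         # h
        ])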
In module C, the RFS-based GM-PHD head tracking module, a Gaussian Mixture Probability Hypothesis Density (GM-PHD) filter, a filtering algorithm based on Random Finite Sets (RFS), performs head tracking on the video data. The sub-modules implement the following recursion at time k+1.
the Gaussian element parameter of the posterior PHD at the k moment is assumed to be
Figure BDA0002988664720000071
The Gaussian element parameter of the PFS posterior PHD of the new target at the k +1 moment is
Figure BDA0002988664720000072
The Gaussian element parameter of the incubation target PFS posterior PHD is
Figure BDA0002988664720000073
Wherein,
Figure BDA0002988664720000074
is the mean of the ith Gaussian at time k;
Figure BDA0002988664720000075
is the weight of the ith Gaussian element at the moment k;
Figure BDA0002988664720000076
is the covariance of the ith gaussian at time k; j. the design is a square k Is the number of gaussian elements of the check PHD after time k.
C1, predictor sub-module:
(1) predicting a new target: directly taking the Gaussian parameter of the newly-generated targets PHD at the moment of k +1 as the predicted PHD parameter, wherein the newly-generated targets are J γ,k+1 When J is 1, …, J γ,k+1 ,i=1,…,J γ,k+1 When there is
Figure BDA0002988664720000077
(2) Predicting hatching targets: the number of incubation targets at the k +1 moment is J β,k+1 When J is 1, …, J β,k+1 ,l=1,…,J k ,i=J γ,k+1 ,…,(J γ,k+1 +J k J β,k+1 ) In time, there are:
Figure BDA0002988664720000078
Figure BDA0002988664720000079
Figure BDA00029886647200000710
wherein,
Figure BDA00029886647200000711
is a state transition matrix;
Figure BDA00029886647200000712
is the incubation target state noise covariance;
Figure BDA00029886647200000713
is the noise weight of the jth Gaussian element at the moment k;
Figure BDA0002988664720000081
is the mean of the ith Gaussian component at the moment k + 1;
Figure BDA0002988664720000082
is the covariance of the ith Gaussian component at the time when k shifts to k + 1;
(3) predicting surviving targets: let the survival probability be p S ,j=1,…,J k
i=(J γ,k+1 +J k J β,k+1 ),…,(J γ,k+1 +J k J β,k+1 +J k ) In time, there are:
Figure BDA0002988664720000083
Figure BDA0002988664720000084
Figure BDA0002988664720000085
C2, update sub-module: the predicted components are updated with the measurement random set $Z_{k+1}$, the observation matrix H and the measurement noise covariance R. Let the detection probability be $p_{D}$ and $J_{k+1|k}=J_{\gamma,k+1}+J_{k}J_{\beta,k+1}+J_{k}$.

Undetected head targets are updated by, for $j=1,\dots,J_{k+1|k}$:

$$w_{k+1}^{(j)}=(1-p_{D})\,w_{k+1|k}^{(j)},\qquad m_{k+1}^{(j)}=m_{k+1|k}^{(j)},\qquad P_{k+1}^{(j)}=P_{k+1|k}^{(j)}$$

For detected head targets, the centroid coordinates produced by the head detection module serve as the measurement random set $Z_{k+1}$; for each $z \in Z_{k+1}$ and $j=1,\dots,J_{k+1|k}$:

$$w_{k+1}^{(j)}(z)=\frac{p_{D}\,w_{k+1|k}^{(j)}\,q_{k+1}^{(j)}(z)}{\kappa_{k+1}(z)+p_{D}\sum_{l=1}^{J_{k+1|k}}w_{k+1|k}^{(l)}\,q_{k+1}^{(l)}(z)}$$

$$m_{k+1}^{(j)}(z)=m_{k+1|k}^{(j)}+K_{k+1}^{(j)}\left(z-H\,m_{k+1|k}^{(j)}\right)$$

$$P_{k+1}^{(j)}=\left(I-K_{k+1}^{(j)}H\right)P_{k+1|k}^{(j)}$$

with

$$q_{k+1}^{(j)}(z)=N\!\left(z;\,H m_{k+1|k}^{(j)},\,H P_{k+1|k}^{(j)} H^{T}+R\right),\qquad K_{k+1}^{(j)}=P_{k+1|k}^{(j)}H^{T}\left(H P_{k+1|k}^{(j)} H^{T}+R\right)^{-1}$$

where $\kappa_{k+1}(z)$ is the intensity of the Poisson-distributed clutter RFS, and $N(\cdot;\mu,\Sigma)$ denotes the Gaussian density with mean μ and covariance Σ.
C3, pruning sub-module: to reduce clutter and speed up the algorithm, the updated Gaussian components are pruned: similar components are merged and the components with the smallest weights are deleted.
C4, state extraction sub-module: the expected values of Gaussian components whose weights exceed a threshold are extracted.
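The C1-C4 recursion can be summarized in a compact sketch. This is a simplified GM-PHD step assuming constant survival and detection probabilities, a uniform clutter intensity κ, and it omits the birth/spawn components and the merging of close components; every name and default value is illustrative:

    import numpy as np

    def gmphd_step(comps, Z, F, Q, H, R, p_s=0.99, p_d=0.95, kappa=1e-4,
                   prune_w=1e-3, extract_w=0.5):
        """One GM-PHD recursion over a list of (w, m, P) Gaussian components:
        survival prediction, Kalman-style update against measurement set Z,
        pruning of low-weight components, and state extraction.
        (Birth/spawn terms would simply be appended to `pred`.)"""
        # predict surviving components: m <- F m, P <- F P F^T + Q
        pred = [(p_s * w, F @ m, F @ P @ F.T + Q) for (w, m, P) in comps]

        # missed-detection terms keep the prediction, down-weighted by (1 - p_d)
        upd = [((1 - p_d) * w, m, P) for (w, m, P) in pred]

        # one weighted Kalman update per (measurement, component) pair
        for z in Z:
            terms = []
            for (w, m, P) in pred:
                S = H @ P @ H.T + R                      # innovation covariance
                K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
                resid = z - H @ m
                q = np.exp(-0.5 * resid @ np.linalg.solve(S, resid)) / \
                    np.sqrt(np.linalg.det(2 * np.pi * S))   # N(z; Hm, S)
                terms.append((p_d * w * q, m + K @ resid,
                              (np.eye(len(m)) - K @ H) @ P))
            norm = kappa + sum(t[0] for t in terms)      # clutter + detections
            upd += [(w / norm, m, P) for (w, m, P) in terms]

        # prune low-weight components, then extract high-weight states
        upd = [(w, m, P) for (w, m, P) in upd if w > prune_w]
        states = [m for (w, m, P) in upd if w > extract_w]
        return upd, states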
C5, track identification sub-module: as shown in FIG. 2, the auction-based track identification algorithm runs as follows. (1) Colour histograms of the confirmed target states of the previous frame and of the state-estimate regions of the current frame are extracted; a cost function matching each confirmed state with each current estimated state region is computed to build an association matrix. (2) Track prices are initialized for unassigned state estimates (price set to 0). (3) "Best track" matching is carried out. (4) The result is checked: if all matches succeed, the tracks are output; if not, the matches of the last cycle are discarded, the track prices are updated, and best-track matching is repeated until all matches succeed. In the invention, the colour histogram describes the proportion of each colour in the whole image; the cost function computes the association cost of two states; the association matrix is a two-dimensional matrix of association costs, computed by traversing every current-frame state estimate against every previous-frame confirmed state; and the "best track" is the motion trajectory of the target.
The auction-based track identification algorithm applies only to state-estimate random sets in which the number of targets stays constant. When tracking head targets in boarding and alighting videos, a correction mechanism must therefore be added to optimize the algorithm and obtain correct tracks. The principle of the correction mechanism is as follows: let the current frame's state-estimate random set contain N(k+1) states and the previous frame have L(k) confirmed estimates. If N(k+1) ≥ L(k), tracks for the L(k) head targets are generated with the track identification algorithm above; the remaining N(k+1) - L(k) state estimates are matched against the known new-target states by similarity, using the Bhattacharyya distance between colour histogram features; estimates that match successfully become tracks of new targets in the current frame, and the rest are deleted as clutter. If N(k+1) < L(k), N(k+1) tracks are generated with the track identification algorithm, representing the disappearance of head targets.
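As a sketch of the appearance cost used by this track identification step, the following computes the Bhattacharyya distance between normalized colour histograms and builds the association matrix; the auction iteration itself is omitted (scipy.optimize.linear_sum_assignment would be one stand-in for the price-based matching), and all names are illustrative:

    import numpy as np

    def bhattacharyya(h1, h2):
        """Bhattacharyya distance between two normalized colour histograms;
        smaller distance means more similar appearance."""
        bc = np.sum(np.sqrt(h1 * h2))            # Bhattacharyya coefficient
        return np.sqrt(max(0.0, 1.0 - bc))

    def association_matrix(confirmed_hists, estimate_hists):
        """Cost of associating every confirmed track of the previous frame
        with every state estimate of the current frame."""
        return np.array([[bhattacharyya(c, e) for e in estimate_hists]
                         for c in confirmed_hists])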
The prediction sub-module is connected to the update sub-module and passes it the predicted Gaussian parameters; the update sub-module is connected to the pruning sub-module and passes it the updated Gaussian parameters; the pruning sub-module is connected to the state extraction sub-module and passes it the merged and pruned Gaussian parameters; the state extraction sub-module is connected to the track identification sub-module and passes it the random sets of target state estimates and target number estimates; and the track identification sub-module is connected to the people counting output module and passes it the motion trajectories of the detected heads.
In module D, the people counting output module works as shown in the flow chart of FIG. 3. For boarding, a simpler line-crossing counting method is used to reduce system complexity and improve efficiency; for alighting, a region-crossing counting method with three regions of interest accommodates passengers of different heights and improves counting accuracy.
Counting boarding passengers: first a decision line is set along the head-motion boundary of boarding passengers observed in historical video data; whether a detected target moves downward across the line is judged from the trajectory information passed on by the RFS-based GM-PHD head tracking module. If the target crosses the line, the boarding count is incremented by one; if it does not, it has not boarded and is not counted.
Counting alighting passengers: two horizontal cuts first divide the video frame into three regions of interest, named regions I, II and III from top to bottom, to ease the judgment for passengers of different heights. The region a passenger occupies is judged from the current position, and the movement displacement is obtained: in region I, the displacement is read directly from the trajectory information as N pixels; in region II, the M pixels moved within region II are obtained from the trajectory information and it is checked whether region I has been reached: if so, the displacement is updated to the N pixels moved within region I; if not, the displacement is M pixels; in region III, the counting judgment waits until the target enters region II. Finally, false detections are filtered: if the movement displacement (N or M) exceeds 60 pixels, the alighting count is incremented; otherwise the passenger has not alighted and is not counted.
The stops are numbered in sequence, and the operations above yield the boarding count n(t) and the alighting count m(t) at stop t. Let X(t) denote the number of passengers on board after the bus leaves stop t; then:
X(t) = X(t-1) + n(t) - m(t),  t = 1, 2, 3, …,  with X(0) = 0
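One possible reading of the counting rules and the occupancy recursion above, sketched in Python. The region boundaries, the 60-pixel threshold and the displacement measure (pixel travel within a region) follow the description, while the function names and the trajectory representation (a list of head-centroid y coordinates, with y growing downward in the image) are illustrative assumptions:

    def count_boarding(track_y, line_y):
        """Line-crossing rule: count a boarding passenger when the head
        moves downward in the image and crosses the decision line."""
        return track_y[0] < line_y <= track_y[-1]

    def count_alighting(track_y, line_I, line_II, min_disp=60):
        """Region-crossing rule: regions I/II/III lie above line_I, between
        the two cuts, and below line_II respectively. Use the displacement N
        accumulated in region I if it was reached, else the displacement M
        in region II; region III alone never triggers a count."""
        in_I = [y for y in track_y if y < line_I]
        in_II = [y for y in track_y if line_I <= y < line_II]
        if in_I:
            disp = max(in_I) - min(in_I)      # N pixels moved within region I
        elif in_II:
            disp = max(in_II) - min(in_II)    # M pixels moved within region II
        else:
            return False                      # still in region III: wait
        return disp > min_disp                # filter false detections

    def occupancy_after(stops):
        """X(t) = X(t-1) + n(t) - m(t), X(0) = 0, over (n, m) pairs per stop."""
        x = 0
        for n_t, m_t in stops:
            x += n_t - m_t
        return x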
while the invention has been described with reference to specific embodiments, the invention is not limited thereto, and those skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. An RFS-based bus passenger number detection system, characterized by comprising:
a video data acquisition and preprocessing module, for filming the boarding and alighting door areas with a vehicle-mounted camera, acquiring video data of passengers getting on and off, and preprocessing the video data;
a human head detection module, for performing passenger head detection on the preprocessed video data with an SSD deep convolutional neural network algorithm;
an RFS-based GM-PHD head tracking module, for tracking passenger heads in the video data with an RFS-based GM-PHD filtering algorithm;
a people counting output module, for deriving motion trajectory information from the detected passenger head information and counting passengers from the trajectories;
the people counting output module counting boarding passengers with a line-crossing counting method and alighting passengers with a region-crossing counting method;
the people counting output module counting alighting passengers with the region-crossing counting method as follows:
1) the video frame is first divided, from top to bottom, into three regions of interest I, II and III to accommodate passengers of different heights; the region a passenger is in is judged from the passenger's current position, and the passenger's movement displacement is obtained;
2) if the passenger is in region I, the movement displacement is read directly from the trajectory information as N pixels; if in region II, the M pixels moved within region II are obtained from the trajectory information, and it is then checked whether region I has been reached: if so, the movement displacement is updated to the N pixels moved within region I; if not, the movement displacement is M pixels; if the target is in region III, the counting judgment is made only after it enters region II; finally, false detections are filtered: if the movement displacement N/M exceeds 60 pixels, the alighting count is incremented; if it is 60 pixels or less, the passenger is judged not to have alighted and is not counted.
2. The RFS-based bus passenger number detection system according to claim 1, characterized in that the video data acquisition and preprocessing module comprises:
a boarding and alighting video acquisition sub-module, for filming the boarding door area and the alighting door area with the vehicle-mounted camera to acquire video data, from which the numbers of passengers getting on and off when the bus arrives at a stop are detected, and for wirelessly transmitting the video data to the video data preprocessing sub-module;
and a video data preprocessing sub-module, for cutting the video into frames and standardizing the resulting data.
3. The RFS-based bus passenger number detection system according to claim 1, characterized in that the human head detection module comprises:
a training sub-module, for annotating the picture sequences produced by the video data acquisition and preprocessing module and training the SSD deep convolutional neural network with the annotated picture sequences;
and a prediction sub-module, for feeding the video data to be detected into the trained SSD deep convolutional neural network, detecting passenger heads in the picture sequence to be detected, and passing the centroid positions of the detected passenger heads to the RFS-based GM-PHD head tracking module.
4. The RFS-based bus passenger number detection system according to claim 3, characterized in that the RFS-based GM-PHD head tracking module comprises:
a prediction sub-module, for performing birth-target prediction, spawned-target prediction and surviving-target prediction in turn from the centroid positions of the passenger heads;
an update sub-module, for updating the Gaussian component parameters with the measurement set, the observation matrix and the measurement noise;
a pruning sub-module, for pruning the updated Gaussian components, merging similar components and deleting those with the smallest weights;
a state extraction sub-module, for extracting the expected values of Gaussian components whose weights exceed a threshold;
and a track identification sub-module, for detecting passenger head movement with an auction-based track identification algorithm and outputting the trajectory information of the passenger heads.
5. The RFS-based bus passenger number detection system according to claim 3, characterized in that the training sub-module comprises:
a prior box matching unit, for finding, for each real target, the prior box with the largest IOU (Intersection over Union), so that every real target corresponds to at least one prior box, and then matching each remaining unmatched prior box to any real target whose IOU with it exceeds a threshold, the real target being a passenger head;
a loss function selection unit, which computes a weighted sum of the position error and the confidence error;
a data augmentation unit, which enlarges the data with data augmentation methods;
a fine-tuning unit, which fine-tunes the model trained by the training sub-module with the atrous (hole) algorithm, changing the network structure to obtain a denser score map;
and a filtering unit, which deletes wrong, overlapping and inaccurate bounding boxes with the NMS algorithm.
6. The RFS-based bus passenger number detection system according to claim 3, characterized in that the prediction sub-module comprises:
a prediction box filtering unit, which determines the class of each prediction box from the largest class confidence, filters out boxes classified as background by class confidence, and then removes boxes whose confidence falls below a confidence threshold;
a prediction box decoding unit, which decodes the remaining prediction boxes, recovers their true position parameters from the prior boxes, sorts them by confidence in descending order and keeps the top k;
and a filtering unit, which removes heavily overlapping prediction boxes with the NMS algorithm and takes the remaining boxes as the detection result.
7. The RFS-based bus passenger number detection system according to claim 4, characterized in that the track identification sub-module detects head movement with an auction-based track identification algorithm optimized by an added correction mechanism.
8. The RFS-based bus passenger number detection system according to claim 1, characterized in that the people counting output module counts boarding passengers with the line-crossing counting method as follows:
first a decision line is set; whether a detected target moves downward across the line is judged from the trajectory information passed on by the RFS-based GM-PHD head tracking module; if the target crosses the line, the boarding count is incremented by one; if it does not cross the line, it is judged not to have boarded and the boarding count is not incremented.
CN202110308023.9A 2021-03-23 2021-03-23 Bus passenger number detection system based on RFS Active CN112991399B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110308023.9A CN112991399B (en) 2021-03-23 2021-03-23 Bus passenger number detection system based on RFS

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110308023.9A CN112991399B (en) 2021-03-23 2021-03-23 Bus passenger number detection system based on RFS

Publications (2)

Publication Number Publication Date
CN112991399A CN112991399A (en) 2021-06-18
CN112991399B true CN112991399B (en) 2022-08-23

Family

ID=76333096

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110308023.9A Active CN112991399B (en) 2021-03-23 2021-03-23 Bus passenger number detection system based on RFS

Country Status (1)

Country Link
CN (1) CN112991399B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114926422B (en) * 2022-05-11 2023-07-04 西南交通大学 Method and system for detecting passenger flow of getting on and off vehicles
CN116895047B (en) * 2023-07-24 2024-01-30 北京全景优图科技有限公司 Rapid people flow monitoring method and system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104637070A (en) * 2014-12-15 2015-05-20 江南大学 Probability hypothesis density based variable target number video tracking algorithm
CN104835323B (en) * 2015-05-19 2017-04-26 银江股份有限公司 Multi-target public transport passenger flow detection method combining with electronic fence
CN108805252A (en) * 2017-04-28 2018-11-13 西门子(中国)有限公司 A kind of passenger's method of counting, device and system
CN108446611A (en) * 2018-03-06 2018-08-24 深圳市图敏智能视频股份有限公司 A kind of associated binocular image bus passenger flow computational methods of vehicle door status
CN109325404A (en) * 2018-08-07 2019-02-12 长安大学 A kind of demographic method under public transport scene
CN111881749B (en) * 2020-06-24 2024-05-31 北京工业大学 Bidirectional people flow statistics method based on RGB-D multi-mode data

Also Published As

Publication number Publication date
CN112991399A (en) 2021-06-18

Similar Documents

Publication Publication Date Title
CN110487562B (en) Driveway keeping capacity detection system and method for unmanned driving
JP6570731B2 (en) Method and system for calculating passenger congestion
CN112991399B (en) Bus passenger number detection system based on RFS
CN110796168A (en) Improved YOLOv 3-based vehicle detection method
CN114299417A (en) Multi-target tracking method based on radar-vision fusion
CN107818571A (en) Ship automatic tracking method and system based on deep learning network and average drifting
CN105844229B (en) A kind of calculation method and its system of passenger's crowding
CN106541968B (en) The recognition methods of the subway carriage real-time prompt system of view-based access control model analysis
CN104200657A (en) Traffic flow parameter acquisition method based on video and sensor
CN114023062B (en) Traffic flow information monitoring method based on deep learning and edge calculation
CN110633671A (en) Bus passenger flow real-time statistical method based on depth image
CN112700473B (en) Carriage congestion degree judging system based on image recognition
CN112347864A (en) Method, device, equipment and system for sensing and inducing rail transit passenger flow
CN107145819A (en) A kind of bus crowding determines method and apparatus
CN114926422B (en) Method and system for detecting passenger flow of getting on and off vehicles
CN113034378B (en) Method for distinguishing electric automobile from fuel automobile
CN111738114A (en) Vehicle target detection method based on anchor-free accurate sampling remote sensing image
CN105930855A (en) Vehicle detection method based on deep convolution neural network
CN114926984B (en) Real-time traffic conflict collection and road safety evaluation method
CN103679128B (en) A kind of Aircraft Targets detection method of anti-interference of clouds
CN116128360A (en) Road traffic congestion level evaluation method and device, electronic equipment and storage medium
CN103605960B (en) A kind of method for identifying traffic status merged based on different focal video image
CN108520528A (en) Based on the mobile vehicle tracking for improving differential threshold and displacement field match model
CN109215059B (en) Local data association method for tracking moving vehicle in aerial video
CN117292322A (en) Deep learning-based personnel flow detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant