CN116229376A - Crowd early warning method, counting system, computing device and storage medium - Google Patents
Crowd early warning method, counting system, computing device and storage medium
- Publication number
- CN116229376A (application number CN202310499633.0A)
- Authority
- CN
- China
- Prior art keywords
- crowd
- detection
- target detection
- early warning
- warning method
- Prior art date
- Legal status
- Granted
Classifications
- G06V20/52 — Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53 — Recognition of crowd images, e.g. recognition of crowd congestion
- G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06Q10/06393 — Score-carding, benchmarking or key performance indicator [KPI] analysis
- G06Q50/40
- G06V10/10 — Image acquisition
- G06V10/22 — Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
- G06V10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V2201/07 — Target detection
- G08B21/02 — Alarms for ensuring the safety of persons
- Y02T10/40 — Engine management systems
Abstract
The invention discloses a crowd early warning method, a counting system, a computing device and a storage medium, belonging to the technical field of machine learning, and comprising the following steps: collecting image data with a monitoring camera of the subway station equipment; performing head detection on passengers in the subway station with a target detection algorithm; evaluating the crowd crowding degree of the subway station image acquisition area according to the passenger head detection result; setting a threshold for the crowding degree evaluation index according to the actual crowd flow of each subway station, and sounding an alarm once the crowding degree exceeds the set threshold. The invention can directly detect crowd head information in a subway station under crowded commuting-period conditions and count the people present; by constructing an evaluation index of crowd crowding degree, it effectively overcomes the one-sidedness of judging crowding by absolute head count alone, making the judgment of passenger crowding in subway stations more scientific.
Description
Technical Field
The invention relates to the technical field of machine learning, in particular to a crowd early warning method, a counting system, computing equipment and a storage medium.
Background
With economic development and growing populations, cities continue to expand and the pressure on urban subway systems keeps increasing. Subway stations are especially crowded during commuting hours; to avoid accidents such as crowd crushes caused by excessive density, stations are generally equipped with corresponding alarm devices.
In daily practice, the inventor has found that the prior technical solutions have the following problems:
The traditional way of ensuring crowd safety relies mainly on manual video monitoring: an operator subjectively judges the crowd state and issues a manual warning when crowd density becomes too great. Another widely used approach counts people with photoelectric sensors or millimeter-wave radar; it is likewise traditional and cannot accurately capture the spatial distribution of the crowd, and there is no mature solution for advanced demands such as counting people in irregular areas. A further approach counts people from density maps; it can output a crowd density map providing key information on crowd spatial distribution and performs well in ultra-dense areas, but subway station crowds change greatly in number and quickly in position, and such models carry too many parameters, which hinders deployment on subway station equipment and prevents real-time output of the crowd count.
In view of the foregoing, it is necessary to provide a new solution to the above-mentioned problems.
Disclosure of Invention
In order to solve the above technical problems, the application provides a crowd early warning method, a counting system, a computing device and a storage medium, which can directly detect crowd head information in a subway station under crowded commuting-period conditions and count the people present.
A crowd early warning method comprising:
collecting image data by using a monitoring camera of subway station equipment;
performing head detection on passengers in the subway station by adopting a target detection algorithm;
evaluating crowd crowding degree of the subway station image acquisition area according to the passenger head detection result;
setting thresholds of crowd crowding degree evaluation indexes according to actual crowd flowing conditions of different subway stations, and alarming by an alarm after the crowd crowding degree exceeds the set thresholds;
wherein, evaluation subway station image acquisition region's crowd crowded degree includes:
calculating the total pixel sum of all detection frames in each frame of the monitoring video, and counting the total occupied area of passengers in the detection area;
judging whether the passenger head detection frame is overlapped with the adjacent detection frame according to the position coordinates of the passenger head detection frame, and if the passenger head detection frame is overlapped with the adjacent detection frame, calculating the overlapping area of the passenger head detection frame;
and taking the ratio of the total occupied area of passengers to the overlapping area in the detection area as an evaluation index of crowd crowding degree.
Preferably, before the head of the subway station passenger is detected by adopting the target detection algorithm, training of the target detection algorithm is finished; the training of the target detection algorithm comprises the following steps:
creating a pedestrian head detection data set;
performing data enhancement on the data set;
writing a yaml configuration file of a data set;
performing head detection on the pictures in the enhanced data set by adopting a target detection algorithm;
calculating a loss function of the target detection algorithm, judging whether the loss function meets the requirement, and completing training of the target detection algorithm after the loss function meets the requirement;
and when the value of the loss function does not meet the requirement, adopting a cosine learning-rate (CosineLR) scheduler to dynamically adjust the learning rate.
Preferably, the performing header detection on the picture in the enhanced data set by using the target detection algorithm includes:
adjusting the picture size to 960 pixels at the input;
before entering a backbone network module of a target detection network, the picture firstly enters a Focus module for slicing;
and carrying out convolution operation on the sliced picture to obtain a double downsampling characteristic diagram under the condition of no information loss.
Preferably, before the picture enters the backbone network module of the target detection network, the picture enters the Focus module for slicing, and the step of entering the Focus module for slicing includes:
and taking a value from every other pixel in each picture to obtain four complementary sampled pictures, so that the width and height of each sampled picture are reduced to half of the original image while the input channels are expanded 4-fold, the spliced picture having 12 channels in place of the original three RGB channels.
Preferably, an adaptive anchor-box calculation strategy is adopted to adjust the width and the height of the detection frame:

$b_w = W \cdot \sigma(t_w)$

wherein $b_w$ represents the width of the detection frame, $W$ represents the width of the whole image, $t_w$ represents the width component of the prediction tensor, and $\sigma$ represents an activation function;

$b_h = H \cdot \sigma(t_h)$

wherein $b_h$ represents the height of the detection frame, $H$ represents the height of the whole image, $t_h$ represents the height component of the prediction tensor, and $\sigma$ represents an activation function.
Preferably, a warm-up training strategy is used before training the target detection algorithm; the warm-up training strategy comprises: first training for 5 iterations with a learning rate smaller than the preset learning rate, and then restoring the preset learning rate for the remaining training.
Preferably, the network layer part of the target detection network adopts a structure of combining a characteristic pyramid network and a path aggregation network; and the output end of the detection head of the target detection network uses the CIOU loss function as the loss function of the bounding box.
According to another aspect of the application, a counting system is further provided, applicable to the crowd early warning method, comprising a monitoring camera, an industrial computer, an alarm, surveillance-video storage and a display screen; the target detection algorithm is deployed in the industrial computer, which comprises an algorithm processing module; the industrial computer can call the video stream of the monitoring camera and process it through the algorithm processing module to obtain detection frames of passenger heads.
According to another aspect of the present application, there is also provided a computing device, comprising: the system comprises a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, when the computing device is running, the processor and the memory are communicated through the bus, and the machine-readable instructions are executed by the processor to perform the steps of the crowd early warning method.
According to another aspect of the present application, there is also provided a computer storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the crowd early warning method.
Compared with the prior art, the application has the following beneficial effects:
1. Aiming at the poor detection performance of traditional pedestrian detection when people overlap in crowded places, the invention directly detects passenger heads in the subway station from the monitoring viewpoint, so head information can be detected and people counted even in the crowded commuting period.
2. The invention constructs an evaluation index of crowd crowding degree, effectively overcoming the one-sidedness of judging crowding by the absolute number of people alone and making the judgment of passenger crowding in subway stations more scientific.
3. The model designed by the invention has a small structure and low compute requirements, which favours deployment on equipment in actual subway station scenes.
4. The model applies multiple data enhancement methods during dataset preparation, greatly increasing the data volume, enhancing the generalization performance of the model and stabilizing its effect.
5. The invention realizes the crowd counting function by upgrading the existing cameras, simplifying the system and effectively reducing cost.
Drawings
Some specific embodiments of the invention will be described in detail hereinafter by way of example and not by way of limitation with reference to the accompanying drawings. The same reference numbers will be used throughout the drawings to refer to the same or like parts or portions. It will be appreciated by those skilled in the art that the drawings are not necessarily drawn to scale. In the accompanying drawings:
FIG. 1 is a schematic overall flow chart of the present invention.
Detailed Description
For the purposes, technical solutions and advantages of the present application, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
As shown in FIG. 1, the crowd early warning method comprises the following steps:
and S1, acquiring image data by using monitoring cameras of subway station equipment.
And S2, performing head detection on passengers in the subway station by adopting a trained target detection algorithm.
And S3, evaluating the crowd crowding degree of the subway station image acquisition area according to the passenger head detection result.
And marking a head detection frame of the personnel in the video image, and displaying the number of the personnel in the frame in the video image in real time. And defining a detection area according to the requirements, and determining pixel coordinates of four vertexes of the area.
Wherein, evaluation subway station image acquisition region's crowd crowded degree includes:
step S31, calculating the total pixel sum of all detection frames in each frame of the monitoring video, and counting the total occupied area of passengers in the detection area.
And S32, judging whether the passenger head detection frame is overlapped with the adjacent detection frame according to the position coordinates of the passenger head detection frame, and if the passenger head detection frame is overlapped with the adjacent detection frame, calculating the overlapping area of the passenger head detection frame.
And S33, taking the ratio of the total occupied area of passengers in the detection area to the overlapping area as an evaluation index of crowd crowding degree.
Namely, the evaluation index of crowd crowding degree is S1/S2, wherein S1 is the total area occupied by passengers in the detection area and S2 is the overlapping area. Different index values may be set as thresholds according to the construction conditions and management capabilities of different stations.
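As a minimal illustration of the index described above, the following sketch computes S1 (total detection-box area) and S2 (summed pairwise overlap) from corner-format boxes. The function names and the (x1, y1, x2, y2) box format are assumptions for illustration, not part of the patented method:

```python
def box_area(box):
    """Area in pixels of an (x1, y1, x2, y2) detection box."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])

def overlap_area(a, b):
    """Intersection area of two (x1, y1, x2, y2) boxes, 0 if disjoint."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0

def crowding_index(boxes):
    """Ratio S1/S2 of total head-box area to total pairwise overlap area.

    Returns float('inf') when no boxes overlap at all."""
    s1 = sum(box_area(b) for b in boxes)
    s2 = sum(overlap_area(boxes[i], boxes[j])
             for i in range(len(boxes))
             for j in range(i + 1, len(boxes)))
    return s1 / s2 if s2 > 0 else float('inf')
```

The index is then compared against the per-station threshold chosen in step S4.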
And S4, setting thresholds of crowd crowding degree evaluation indexes according to actual crowd flowing conditions of different subway stations, and giving an alarm after the crowd crowding degree exceeds the set thresholds.
According to the actual space size of different subway stations, a threshold value of the crowd crowding degree evaluation index is set; when the index exceeds the threshold, the alarm sounds and the people count shown in the video changes color, reminding staff that there are too many people in the station and the crowd needs to be evacuated; meanwhile, the background records the current crowded scene for later review.
Furthermore, before the head detection of the subway station passenger by adopting the target detection algorithm, the method further comprises the following steps:
step S10, training a target detection algorithm.
The training of the target detection algorithm comprises the following steps:
and S100, manufacturing a pedestrian head detection data set.
Specifically, the HT21 dataset is collected from the Internet; it is captured from a surveillance viewpoint, covers subway station scenes, and contains both sparse and dense crowds. The dataset is initially processed and each label is converted to the form (x, y, w, h, cls), wherein x represents the x-axis coordinate of the center point of the labeling frame, y the y-axis coordinate of the center point, w the width of the labeling frame, h the height of the labeling frame, and cls the category.
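Assuming the raw annotations are corner-coordinate boxes (the source format is not specified here), a label could be converted to the normalized (x, y, w, h, cls) center format roughly as follows; the helper name and the normalization by image size are illustrative assumptions:

```python
def to_center_label(x1, y1, x2, y2, cls, img_w, img_h):
    """Convert a corner-format box to a normalized (x, y, w, h, cls) label."""
    x = (x1 + x2) / 2 / img_w   # center x, normalized to [0, 1]
    y = (y1 + y2) / 2 / img_h   # center y, normalized to [0, 1]
    w = (x2 - x1) / img_w       # box width, normalized
    h = (y2 - y1) / img_h       # box height, normalized
    return (x, y, w, h, cls)
```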
Step S200, data enhancement is carried out on the data set.
Specifically, image data is enhanced mainly in the following ways: random image rotation, adding Coarse Dropout noise, applying color perturbation, and adding Gaussian noise. The enhanced dataset is divided into a training set and a test set.
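Two of the listed enhancements can be sketched in plain Python as follows; real pipelines would typically use an augmentation library, and these minimal implementations (a single dropout hole, per-pixel noise on a grayscale grid) are illustrative assumptions only:

```python
import random

def add_gaussian_noise(img, sigma=10.0, seed=None):
    """Add per-pixel Gaussian noise, clipped to [0, 255]; img is rows of ints."""
    rng = random.Random(seed)
    return [[min(255, max(0, int(p + rng.gauss(0, sigma)))) for p in row]
            for row in img]

def coarse_dropout(img, hole=4, seed=None):
    """Zero out one random hole×hole square (CoarseDropout with one hole)."""
    rng = random.Random(seed)
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # copy so the input is untouched
    top = rng.randrange(max(1, h - hole + 1))
    left = rng.randrange(max(1, w - hole + 1))
    for r in range(top, min(h, top + hole)):
        for c in range(left, min(w, left + hole)):
            out[r][c] = 0
    return out
```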
Step S300, writing a yaml configuration file of the data set.
The yaml configuration file of the data set contains addresses of a training set and a testing set, names of data categories and numbers of the data categories.
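Such a dataset yaml might look as follows; the field names follow the common YOLOv5-style convention, and the paths and category name are hypothetical:

```yaml
# Hypothetical dataset configuration (YOLOv5-style field names assumed)
train: data/heads/train/images   # address of the training set
val: data/heads/test/images      # address of the test set
nc: 1                            # number of data categories
names: ['head']                  # names of the data categories
```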
Step S400, performing head detection on the pictures in the enhanced data set by adopting a target detection algorithm.
The target detection algorithm is established in a fusion mode and comprises a detection head of a target detection network and a network layer of the target detection network.
The detection head part of the target detection network can be one of the ThunderNet, YOLO, SSD, DETR, CenterNet, TTFNet, FCOS and NanoDet target detection algorithms.
The network layer (neck) portion of the object detection network employs a combined Feature Pyramid Network (FPN) and Path Aggregation Network (PAN) architecture. The FPN layer conveys strong semantic features from top to bottom, while the PAN structure conveys strong localization features from bottom to top; combined, they aggregate parameters for the different detection layers from different backbone layers.
The detection-head output of the target detection network uses the CIOU loss function as the loss function of the bounding box:

$L_{CIOU} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$

wherein $\rho^2(b, b^{gt})$ represents the squared distance between the center points of the predicted frame and the target frame, $c$ represents the diagonal length of the smallest enclosing rectangle, $v$ is the aspect-ratio influence factor, $\alpha$ its trade-off weight, and $IoU$ is the intersection-over-union, i.e. the overlap ratio of the predicted and ground-truth bounding boxes.
First, the picture size is adjusted to 960 pixels at the input to cope with the small size of human-head objects. The picture then enters the Focus module for slicing before entering the backbone module of the object detection network. Specifically, every other pixel in the picture is sampled, similarly to adjacent downsampling, yielding four complementary pictures with no information lost: the width and height are reduced to half of the original while the input channels are expanded 4-fold, i.e. the spliced picture has 12 channels instead of the original three RGB channels. Finally, the new picture undergoes a convolution operation, producing a double-downsampled feature map with no information loss.
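The Focus slicing operation described above can be sketched as follows for an image stored as an H×W×C nested list; this is an illustrative reimplementation, not the patent's code:

```python
def focus_slice(img):
    """Focus-style slicing: sample every other pixel in four phases.

    img: H×W×C nested list (each pixel a list of channel values).
    Returns an (H/2)×(W/2)×(4C) image with the four complementary
    sub-images concatenated along the channel axis, so an RGB image
    (C=3) becomes 12 channels with no pixel discarded."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(0, h, 2):
        row = []
        for c in range(0, w, 2):
            # four complementary samples: (r,c), (r+1,c), (r,c+1), (r+1,c+1)
            px = (img[r][c] + img[r + 1][c]
                  + img[r][c + 1] + img[r + 1][c + 1])
            row.append(px)
        out.append(row)
    return out
```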
An SGD optimizer with momentum is selected to optimize the target detection network, with the momentum value set to 0.9 and the Nesterov flag set to true. The number of training iterations is set to 100; in experiments the loss function converged and training was stopped early at about 60 iterations.
In the post-processing stage of target detection, a non-maximum suppression (NMS) operation is usually required to screen the candidate target frames. Because the CIOU loss function involves the influence factor v computed from the ground truth, and no ground truth exists at test-time inference, the network combines the CIOU loss function with DIOU-based non-maximum suppression as weighted NMS (Weighted NMS) to screen the best detection box from multiple candidate boxes.
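A greedy DIOU-based NMS of the kind referred to above might be sketched as follows; the threshold value and function names are illustrative assumptions:

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area = lambda q: (q[2] - q[0]) * (q[3] - q[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def diou(a, b):
    """DIoU = IoU minus the normalized squared center distance."""
    cx = lambda q: ((q[0] + q[2]) / 2, (q[1] + q[3]) / 2)
    (ax, ay), (bx, by) = cx(a), cx(b)
    ex_w = max(a[2], b[2]) - min(a[0], b[0])
    ex_h = max(a[3], b[3]) - min(a[1], b[1])
    c2 = ex_w ** 2 + ex_h ** 2              # enclosing-box diagonal squared
    d2 = (ax - bx) ** 2 + (ay - by) ** 2    # center distance squared
    return iou(a, b) - (d2 / c2 if c2 else 0.0)

def diou_nms(boxes, scores, thresh=0.5):
    """Greedy NMS that suppresses candidates by DIoU instead of plain IoU."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if diou(boxes[i], boxes[j]) <= thresh]
    return keep
```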
The loss function of the target detection network consists of three parts: classification loss, localization loss and confidence loss. The classification loss and the confidence loss are calculated using the binary cross-entropy loss function:

$BCEWithLogitsLoss(x, y) = -w \left[ y \cdot \log \sigma(x) + (1 - y) \cdot \log\left(1 - \sigma(x)\right) \right]$

wherein $BCEWithLogitsLoss$ represents the binary cross-entropy loss function with logits, $w$ represents the weight of the current factor, $x$ the predicted value, $y$ the target value, and $\sigma$ an activation (sigmoid) function.
The confidence target is calculated with the CIOU function, while the classification loss uses the BCE loss function, wherein only the classification loss of positive samples is calculated:

$L_{cls} = -w \left[ y \cdot \log \sigma(x) + (1 - y) \cdot \log\left(1 - \sigma(x)\right) \right]$

wherein $w$ represents the weight of the current factor, $x$ the predicted classification score, $y$ the target value, and $\sigma$ an activation (sigmoid) function.
The network predicts three prediction frames for each cell of the 80 × 80 grid; since only passenger heads are detected, the total number of classes is 1, so the prediction information of each prediction frame includes only 1 classification probability, finally forming a probability tensor of [3 × 80 × 80 × 1].
The confidence target is the CIOU between the network-predicted bounding box and the real bounding box, and the BCE loss function is used; here the confidence loss is calculated for all samples.
The positioning loss is calculated by using CIOU loss function.
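The CIOU computation used for the localization loss can be sketched as follows; this is a hedged reimplementation of the standard CIoU formulation, not the patent's own code:

```python
import math

def ciou_loss(pred, target):
    """CIoU loss = 1 - IoU + rho²/c² + alpha*v for (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    ih = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = iw * ih
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(pred) + area(target) - inter
    iou = inter / union if union else 0.0
    # squared center distance rho² and enclosing-box diagonal c²
    cx = lambda b: ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)
    (px, py), (tx, ty) = cx(pred), cx(target)
    rho2 = (px - tx) ** 2 + (py - ty) ** 2
    ex_w = max(pred[2], target[2]) - min(pred[0], target[0])
    ex_h = max(pred[3], target[3]) - min(pred[1], target[1])
    c2 = ex_w ** 2 + ex_h ** 2
    # aspect-ratio influence factor v and its trade-off weight alpha
    wp, hp = pred[2] - pred[0], pred[3] - pred[1]
    wt, ht = target[2] - target[0], target[3] - target[1]
    v = (4 / math.pi ** 2) * (math.atan(wt / ht) - math.atan(wp / hp)) ** 2
    alpha = v / (1 - iou + v) if (1 - iou + v) else 0.0
    return 1 - iou + (rho2 / c2 if c2 else 0.0) + alpha * v
```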
And calculating a loss function of the target detection algorithm, judging whether the loss function meets the requirement, and completing training of the target detection algorithm after the loss function meets the requirement.
When the value of the loss function does not meet the requirement, a cosine learning-rate (CosineLR) scheduler is adopted to dynamically adjust the learning rate:

$\eta_{t+1} = \eta_{min} + (\eta_0 - \eta_{min}) \cdot \frac{1}{2}\left(1 + \cos\left(\frac{T_{cur}}{T_{max}}\,\pi\right)\right)$

wherein $\eta_t$ represents the learning rate of the present period, $\eta_{t+1}$ the learning rate of the next period, $\eta_0$ the initial learning rate, $\eta_{min}$ the minimum learning rate (defaulting to 1e-5), and $T_{max}$ the learning period; the period constant $K$ is an integer.
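The cosine schedule, together with the warm-up strategy mentioned later, might be sketched as follows; the default values for lr0 and warmup_lr are illustrative assumptions:

```python
import math

def cosine_lr(step, total_steps, lr0=0.01, lr_min=1e-5):
    """Cosine-annealed learning rate decaying from lr0 down to lr_min."""
    cos = (1 + math.cos(math.pi * step / total_steps)) / 2
    return lr_min + (lr0 - lr_min) * cos

def warmup_cosine_lr(step, total_steps, warmup=5, lr0=0.01,
                     lr_min=1e-5, warmup_lr=1e-3):
    """First `warmup` iterations at a small fixed rate, then cosine annealing."""
    if step < warmup:
        return warmup_lr
    return cosine_lr(step - warmup, total_steps - warmup, lr0, lr_min)
```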
A test picture, after feature extraction and feature-map fusion by the target detection network, yields predicted coordinates and categories of passenger head frames, all detections sharing the same single category.
In addition, a warm-up training strategy can be used before target detection network training starts: 5 iterations are trained with a learning rate not exceeding 1e-3, after which the learning rate is changed to the preset value for further training. This helps the model avoid descending the gradient into premature overfitting in the initial stage and maintains the stability of the distribution and of the deep layers of the model.
The width and height of the detection frame are adjusted with an adaptive anchor-box calculation strategy; adjusting the width and height of the predicted target frame avoids gradient explosion and unstable training. Before target detection network training starts, the labeling information in the dataset is checked and the best possible recall of the default anchor boxes on the dataset labels is calculated; when this recall is greater than or equal to 0.98, the anchor boxes need not be updated; if it is less than 0.98, anchor boxes fitting this dataset are recalculated.
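The best-possible-recall check described above might be sketched as follows, assuming a YOLOv5-style width/height ratio test (the exact matching rule is not given in the text, so the threshold of 4.0 and the function names are assumptions):

```python
def best_possible_recall(label_whs, anchor_whs, thresh=4.0):
    """Fraction of labels whose (w, h) is within `thresh`× of some anchor.

    A label matches an anchor when max(ratio, 1/ratio) < thresh for both
    width and height (the YOLOv5-style aspect test assumed here)."""
    matched = 0
    for lw, lh in label_whs:
        for aw, ah in anchor_whs:
            rw, rh = lw / aw, lh / ah
            if max(rw, 1 / rw) < thresh and max(rh, 1 / rh) < thresh:
                matched += 1
                break
    return matched / len(label_whs) if label_whs else 1.0

def anchors_need_update(label_whs, anchor_whs):
    """Recompute anchors only when best possible recall drops below 0.98."""
    return best_possible_recall(label_whs, anchor_whs) < 0.98
```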
$b_w = W \cdot \sigma(t_w)$

wherein $b_w$ represents the width of the detection frame, $W$ the width of the whole image, $t_w$ the width component of the prediction tensor, and $\sigma$ an activation function.

$b_h = H \cdot \sigma(t_h)$

wherein $b_h$ represents the height of the detection frame, $H$ the height of the whole image, $t_h$ the height component of the prediction tensor, and $\sigma$ an activation function.
Spatially relative terms, such as "above," "over," "on the upper surface of," "on," and the like, may be used herein for ease of description to describe one device's or feature's spatial location relative to another device or feature as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as "above" or "over" other devices or structures would then be oriented "below" or "beneath" the other devices or structures. Thus, the exemplary term "above" may include both the "above" and "below" orientations. The device may also be positioned in other different ways (rotated 90 degrees or at other orientations), and the spatially relative descriptors used herein are interpreted accordingly.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the present application. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or described herein.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A crowd early warning method, characterized by comprising the following steps:
collecting image data by using a monitoring camera of subway station equipment;
performing head detection on passengers in the subway station by adopting a target detection algorithm;
evaluating crowd crowding degree of the subway station image acquisition area according to the passenger head detection result;
setting thresholds of crowd crowding degree evaluation indexes according to actual crowd flowing conditions of different subway stations, and alarming by an alarm after the crowd crowding degree exceeds the set thresholds;
wherein evaluating the crowd crowding degree of the subway station image acquisition area comprises:
calculating the total pixel sum of all detection frames in each frame of the monitoring video, and counting the total occupied area of passengers in the detection area;
judging whether the passenger head detection frame is overlapped with the adjacent detection frame according to the position coordinates of the passenger head detection frame, and if the passenger head detection frame is overlapped with the adjacent detection frame, calculating the overlapping area of the passenger head detection frame;
and taking the ratio of the total occupied area of passengers to the overlapping area in the detection area as an evaluation index of crowd crowding degree.
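The evaluation steps of claim 1 can be sketched as follows; the pairing rule for "adjacent" detection boxes is an assumption (every pair is checked here), and boxes are taken as (x1, y1, x2, y2) pixel coordinates:

```python
def box_area(box):
    """Pixel area of an axis-aligned box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)

def overlap_area(a, b):
    """Area of the intersection rectangle of two boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return box_area((x1, y1, x2, y2))

def congestion_index(boxes):
    """Total occupied pixel area of all head boxes divided by their
    summed pairwise overlap area, per the claimed evaluation index.
    Checking all pairs (not only 'adjacent' ones) is an assumption."""
    total = sum(box_area(b) for b in boxes)
    overlap = sum(overlap_area(boxes[i], boxes[j])
                  for i in range(len(boxes))
                  for j in range(i + 1, len(boxes)))
    # Guard against frames with no overlapping heads.
    return total / overlap if overlap > 0 else float("inf")
```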
2. The crowd early warning method of claim 1, further comprising training a target detection algorithm before head detection of subway station passengers using the target detection algorithm; the training of the target detection algorithm comprises the following steps:
creating a pedestrian head detection data set;
performing data enhancement on the data set;
writing a yaml configuration file of a data set;
performing head detection on the pictures in the enhanced data set by adopting a target detection algorithm;
calculating a loss function of the target detection algorithm, judging whether the loss function meets the requirement, and completing training of the target detection algorithm after the loss function meets the requirement;
and when the value of the loss function does not meet the requirement, adopting a Cosine LR scheduler to dynamically adjust the learning rate.
3. The crowd early warning method of claim 2, wherein the employing a target detection algorithm to perform head detection on pictures in the enhanced dataset comprises:
adjusting the picture size to 960 pixels at the input;
before entering a backbone network module of a target detection network, the picture firstly enters a Focus module for slicing;
and carrying out convolution operation on the sliced picture to obtain a double downsampling characteristic diagram under the condition of no information loss.
4. The crowd early warning method of claim 3, wherein the slicing performed in the Focus module before the picture enters the backbone network module of the target detection network comprises:
and taking a value from every other pixel in each picture to obtain four complementary sampling pictures, so that the channel width and the channel height of the sampling pictures are reduced to half of the original image, but the input channels are expanded by 4 times, and the spliced pictures form 12 channels relative to the original RGB three channels.
5. The crowd early warning method of claim 2, wherein an adaptive calculation anchor frame strategy is adopted to adjust the width and the height of the detection frame;
in the method, in the process of the invention,representing the width of the detection frame,/-, and>representing the width of the overall image, +.>Representing the width of tensor +.>Representing an activation function; />
6. The crowd early warning method of claim 2, wherein a warm-up training strategy is used prior to training the target detection algorithm; the warm-up training strategy comprises: first training for 5 iterations with a learning rate smaller than the preset learning rate, and then changing the learning rate to the preset value for training.
7. The crowd early warning method of claim 3, wherein the network layer part of the target detection network adopts a structure of combining a characteristic pyramid network and a path aggregation network; and the output end of the detection head of the target detection network uses the CIOU loss function as the loss function of the bounding box.
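A minimal sketch of the CIOU bounding-box loss named in claim 7, following the commonly published CIoU definition (1 − IoU plus center-distance and aspect-ratio penalties); boxes as (x1, y1, x2, y2):

```python
import math

def ciou_loss(box_a, box_b):
    """CIoU loss between two boxes: 1 - IoU + center-distance term
    + aspect-ratio term. eps guards the divisions."""
    eps = 1e-9
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection and union areas.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter + eps)
    # Squared center distance over the squared diagonal of the
    # smallest enclosing box.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw * cw + ch * ch + eps
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2
            + (ay1 + ay2 - by1 - by2) ** 2) / 4
    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (
        math.atan((bx2 - bx1) / (by2 - by1 + eps))
        - math.atan((ax2 - ax1) / (ay2 - ay1 + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return 1 - iou + rho2 / c2 + alpha * v
```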
8. A counting system which is characterized by being applicable to the crowd early warning method of any one of claims 1-7, comprising a monitoring camera, an engineering machine, an alarm, a monitoring video storage and a display screen; the engineering machine is internally provided with a target detection algorithm; the engineering machine comprises an algorithm processing module; the engineering machine can call the video information of the monitoring camera and process the video information through the algorithm processing module to obtain a detection frame of the head of the passenger.
9. A computing device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory in communication over the bus when the computing device is running, the machine-readable instructions when executed by the processor performing the steps of the crowd early warning method of any one of claims 1 to 7.
10. A computer storage medium, wherein a computer program is stored on the computer storage medium, which computer program, when being executed by a processor, performs the steps of the crowd early warning method as claimed in any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310499633.0A CN116229376B (en) | 2023-05-06 | 2023-05-06 | Crowd early warning method, counting system, computing device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116229376A true CN116229376A (en) | 2023-06-06 |
CN116229376B CN116229376B (en) | 2023-08-04 |
Family
ID=86585868
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310499633.0A Active CN116229376B (en) | 2023-05-06 | 2023-05-06 | Crowd early warning method, counting system, computing device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116229376B (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180005071A1 (en) * | 2013-06-25 | 2018-01-04 | University Of Central Florida Research Foundation, Inc. | Multi-Source, Multi-Scale Counting in Dense Crowd Images |
US20200193628A1 (en) * | 2018-12-17 | 2020-06-18 | Microsoft Technology Licensing, Llc | Detecting objects in crowds using geometric context |
CN111832489A (en) * | 2020-07-15 | 2020-10-27 | 中国电子科技集团公司第三十八研究所 | Subway crowd density estimation method and system based on target detection |
CN112115862A (en) * | 2020-09-18 | 2020-12-22 | 广东机场白云信息科技有限公司 | Crowded scene pedestrian detection method combined with density estimation |
CN112232316A (en) * | 2020-12-11 | 2021-01-15 | 科大讯飞(苏州)科技有限公司 | Crowd gathering detection method and device, electronic equipment and storage medium |
CN112801018A (en) * | 2021-02-07 | 2021-05-14 | 广州大学 | Cross-scene target automatic identification and tracking method and application |
CN114627502A (en) * | 2022-03-10 | 2022-06-14 | 安徽农业大学 | Improved YOLOv 5-based target recognition detection method |
EP4033399A1 (en) * | 2021-01-25 | 2022-07-27 | Bull Sas | Computer device and method for estimating the density of a crowd |
CN115424209A (en) * | 2022-09-15 | 2022-12-02 | 华东交通大学 | Crowd counting method based on spatial pyramid attention network |
CN115527270A (en) * | 2022-10-10 | 2022-12-27 | 杭州电子科技大学 | Method for identifying specific behaviors in intensive crowd environment |
CN115713731A (en) * | 2023-01-10 | 2023-02-24 | 武汉图科智能科技有限公司 | Crowd scene pedestrian detection model construction method and crowd scene pedestrian detection method |
CN116071696A (en) * | 2022-11-23 | 2023-05-05 | 中通服和信科技有限公司 | Building stair congestion detection method and device based on YOLOv7 |
Non-Patent Citations (3)
Title |
---|
XU Shoukun; NI Chuhan; JI Chenchen; LI Ning: "Research on an Image Description Method Based on Safety Helmet Wearing Detection", Journal of Chinese Computer Systems, no. 04 *
SHEN Shoujuan; ZHENG Guanghao; PENG Yixuan; WANG Zhanqing: "Classroom Student Detection and Counting Method Based on the YOLOv3 Algorithm", Software Guide, no. 09 *
TAN Zhiyong; YUAN Jiazheng; LIU Hongzhe; LI Qing: "Crowd Density Estimation Method Based on Deep Convolutional Neural Networks", Computer Applications and Software, no. 07 *
Also Published As
Publication number | Publication date |
---|---|
CN116229376B (en) | 2023-08-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020173226A1 (en) | Spatial-temporal behavior detection method | |
CN103824070B (en) | A kind of rapid pedestrian detection method based on computer vision | |
CN112257609B (en) | Vehicle detection method and device based on self-adaptive key point heat map | |
CN106778540B Parking event detection method based on a double-layer background model for accurate parking detection
CN111915583B (en) | Vehicle and pedestrian detection method based on vehicle-mounted thermal infrared imager in complex scene | |
CN106778633B (en) | Pedestrian identification method based on region segmentation | |
CN107944403A (en) | Pedestrian's attribute detection method and device in a kind of image | |
CN113469278B (en) | Strong weather target identification method based on deep convolutional neural network | |
CN115205264A (en) | High-resolution remote sensing ship detection method based on improved YOLOv4 | |
CN111738336A (en) | Image detection method based on multi-scale feature fusion | |
CN109785288A (en) | Transmission facility defect inspection method and system based on deep learning | |
CN109815798A (en) | Unmanned plane image processing method and system | |
CN116052026B (en) | Unmanned aerial vehicle aerial image target detection method, system and storage medium | |
CN115147745A (en) | Small target detection method based on urban unmanned aerial vehicle image | |
CN110087041A (en) | Video data processing and transmission method and system based on the base station 5G | |
KR101874968B1 (en) | Visibility measuring system base on image information and method for using the same | |
CN115546763A (en) | Traffic signal lamp identification network training method and test method based on visual ranging | |
Lin et al. | Small object detection in aerial view based on improved YoloV3 neural network | |
CN115937796A (en) | Event classification method, system, equipment and medium based on pre-training model | |
CN109271904A (en) | A kind of black smoke vehicle detection method based on pixel adaptivenon-uniform sampling and Bayesian model | |
CN116311084A (en) | Crowd gathering detection method and video monitoring equipment | |
CN116485885A (en) | Method for removing dynamic feature points at front end of visual SLAM based on deep learning | |
CN114399734A (en) | Forest fire early warning method based on visual information | |
CN113095404B (en) | X-ray contraband detection method based on front-back background convolution neural network | |
CN113936299A (en) | Method for detecting dangerous area in construction site |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
TR01 | Transfer of patent right |
Effective date of registration: 20240107 Address after: Building A, Building 1102, Yeda Zhigu, No. 300 Changjiang Road, Yantai Area, China (Shandong) Pilot Free Trade Zone, Yantai City, Shandong Province, 264000 Patentee after: Yantai Jiuyuan Technology Service Co.,Ltd. Address before: Room 522, Plant 2, No. 32, the Pearl River Road, Yantai Economic and Technological Development Zone, Shandong Province, 264000 Patentee before: Shandong Yishi Intelligent Technology Co.,Ltd. |