CN111291597A - Image-based crowd situation analysis method, device, equipment and system - Google Patents


Info

Publication number
CN111291597A
CN111291597A
Authority
CN
China
Prior art keywords
crowd
image
analyzed
branch
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811494297.6A
Other languages
Chinese (zh)
Other versions
CN111291597B (en)
Inventor
童超
车军
任烨
朱江
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201811494297.6A priority Critical patent/CN111291597B/en
Publication of CN111291597A publication Critical patent/CN111291597A/en
Application granted granted Critical
Publication of CN111291597B publication Critical patent/CN111291597B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/50: Context or environment of the image
    • G06V 20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 20/53: Recognition of crowd images, e.g. recognition of crowd congestion
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the invention provides an image-based crowd situation analysis method, device, equipment, and system, wherein the method comprises the following steps: inputting an image into a pre-trained neural network model; counting the crowd density in the image by using a crowd density statistics branch in the neural network model; analyzing the crowd behavior in the image by using a crowd behavior analysis branch in the neural network model; and judging, based on the crowd density and the crowd behavior, whether a group event occurs. In this scheme, the neural network model comprises at least two branches: one branch counts the crowd density in the image and the other analyzes the crowd behavior in the image, so whether a group event occurs is analyzed in terms of both crowd density and crowd behavior, which improves the accuracy of the analysis result.

Description

Image-based crowd situation analysis method, device, equipment and system
Technical Field
The invention relates to the technical field of monitoring, and in particular to an image-based crowd situation analysis method, device, equipment, and system.
Background
Group events occurring in public places, such as crowd gathering, fighting, and crowd stampedes, greatly affect public safety. At present, the crowd situation can be analyzed based on monitoring images so that group events can be handled in time and public safety improved.
In some related schemes, monitoring images of public places are acquired and the number of people in them is counted; if the number is large, it is determined that a group event has occurred. However, in places that are often crowded, such as subway stations and railway stations during morning and evening rush hours, a large number of people does not necessarily indicate that a group event is occurring. The analysis results of such schemes are therefore of poor accuracy.
Disclosure of Invention
The embodiment of the invention aims to provide a method, a device, equipment and a system for analyzing the crowd situation based on an image so as to improve the accuracy of an analysis result.
In order to achieve the above object, an embodiment of the present invention provides a method for analyzing a crowd situation based on an image, including:
acquiring an image to be analyzed;
inputting the image to be analyzed into a neural network model obtained by pre-training;
counting the crowd density in the image to be analyzed by using a crowd density statistics branch in the neural network model;
analyzing the crowd behaviors in the image to be analyzed by utilizing the crowd behavior analysis branch in the neural network model;
and judging whether a group event occurs or not based on the crowd density and the crowd behavior.
Optionally, the method further includes:
identifying, by using a density distribution prediction branch in the neural network model, the crowd density value at each pixel point in the image to be analyzed;
the judging whether a group event occurs based on the crowd density and the crowd behavior includes:
judging whether a group event occurs based on the crowd density, the crowd behavior, and the crowd density value at each pixel point in the image to be analyzed.
Optionally, the process of training to obtain the neural network model includes:
inputting the sample image into a neural network with a preset structure;
carrying out convolution processing on the sample image by using a convolution layer in the neural network, and respectively inputting the data obtained after the convolution processing into a crowd density statistics branch, a crowd behavior analysis branch, and a density distribution prediction branch in the neural network;
iteratively adjusting parameters in the convolutional layer based on a first loss function of the crowd density statistics branch, a first output result of the crowd density statistics branch, a second loss function of the crowd behavior analysis branch, a second output result of the crowd behavior analysis branch, a third loss function of the density distribution prediction branch, and a third output result of the density distribution prediction branch;
and when the adjustment of the parameters in the convolutional layer meets the convergence condition, obtaining the trained neural network model.
Optionally, the judging whether a group event occurs based on the crowd density and the crowd behavior includes:
if the crowd density exceeds a preset threshold and the crowd behavior is abnormal, judging that a group event occurs;
in the case where it is determined that a group event has occurred, the method further comprises: outputting alarm information.
Optionally, the acquiring an image to be analyzed includes:
acquiring the image to be analyzed through a pan-tilt head of an unmanned aerial vehicle;
in the case where it is determined that a group event has occurred, the method further comprises:
adjusting the pan-tilt head, based on position information of the unmanned aerial vehicle, to perform image acquisition for the area where the group event occurs;
or, in the case where it is determined that a group event has occurred, determining the moving direction and moving speed of the crowd based on the image to be analyzed, and controlling the unmanned aerial vehicle to track and capture images of the crowd according to the moving direction and moving speed.
Optionally, the adjusting the pan-tilt head, based on the position information of the unmanned aerial vehicle, to perform image acquisition for the area where the group event occurs includes:
determining image coordinates of the crowd in the image to be analyzed; calculating PTZ adjustment information of the pan-tilt head based on the determined image coordinates and the conversion relationship between the PTZ coordinate system of the pan-tilt head of the unmanned aerial vehicle and the coordinate system of the image to be analyzed; and adjusting the pan-tilt head based on the PTZ adjustment information to perform image acquisition for the area where the group event occurs.
Optionally, in the case where it is determined that a group event has occurred, the method further includes:
identifying a human body target with abnormal behaviors in the image to be analyzed;
extracting the characteristics of the human body target, and determining the identity information and/or the motion trail of the human body target based on the extracted characteristics.
In order to achieve the above object, an embodiment of the present invention further provides an image-based crowd situation analyzing apparatus, including:
the acquisition module is used for acquiring an image to be analyzed;
the input module is used for inputting the image to be analyzed into a neural network model obtained by pre-training;
the statistical module is used for counting the crowd density in the image to be analyzed by utilizing the crowd density statistical branch in the neural network model;
the analysis module is used for analyzing the crowd behaviors in the image to be analyzed by utilizing the crowd behavior analysis branch in the neural network model;
and the judging module is used for judging whether a group event occurs or not based on the crowd density and the crowd behavior.
Optionally, the apparatus further comprises:
the identification module is used for identifying the crowd density value at each pixel point in the image to be analyzed by using the density distribution prediction branch in the neural network model;
the judgment module is specifically configured to: judge whether a group event occurs based on the crowd density, the crowd behavior, and the crowd density value at each pixel point in the image to be analyzed.
Optionally, the apparatus further comprises:
the training module is used for inputting the sample image into a neural network with a preset structure; carrying out convolution processing on the sample image by utilizing a convolution layer in the neural network, and respectively inputting data obtained after the convolution processing into a crowd density statistic branch, a crowd behavior analysis branch and a density distribution prediction branch in the neural network; iteratively adjusting parameters in the convolutional layer based on a first loss function of the crowd density statistics branch, a first output result of the crowd density statistics branch, a second loss function of the crowd behavior analysis branch, a second output result of the crowd behavior analysis branch, a third loss function of the density distribution prediction branch, and a third output result of the density distribution prediction branch; and when the adjustment of the parameters in the convolutional layer meets the convergence condition, obtaining the trained neural network model.
Optionally, the determining module is specifically configured to:
if the crowd density exceeds a preset threshold value and the crowd behavior is abnormal, judging that a group event occurs; the device further comprises:
and the alarm module is used for outputting alarm information under the condition of judging that the group event occurs.
Optionally, the obtaining module is specifically configured to:
acquire the image to be analyzed through a pan-tilt head of an unmanned aerial vehicle; the device further comprises:
a control module, configured to adjust the pan-tilt head, based on position information of the unmanned aerial vehicle, to perform image acquisition for the area where the group event occurs;
or, in the case where it is determined that a group event has occurred, determine the moving direction and moving speed of the crowd based on the image to be analyzed, and control the unmanned aerial vehicle to track and capture images of the crowd according to the moving direction and moving speed.
Optionally, the control module is further configured to:
determine the image coordinates of the crowd in the image to be analyzed; calculate PTZ adjustment information of the pan-tilt head based on the determined image coordinates and the conversion relationship between the PTZ coordinate system of the pan-tilt head of the unmanned aerial vehicle and the coordinate system of the image to be analyzed; and adjust the pan-tilt head, based on the PTZ adjustment information, to perform image acquisition for the area where the group event occurs.
Optionally, the apparatus further comprises:
the determining module is configured to, in the case where it is determined that a group event has occurred, identify a human body target with abnormal behavior in the image to be analyzed, extract features of the human body target, and determine identity information and/or a motion trail of the human body target based on the extracted features.
In order to achieve the above object, an embodiment of the present invention further provides an image-based crowd situation analysis system, including: an unmanned aerial vehicle and a ground station; wherein,
the unmanned aerial vehicle is used for acquiring images and sending the acquired images to the ground station;
the ground station is used for receiving the image sent by the unmanned aerial vehicle as an image to be analyzed; inputting the image to be analyzed into a neural network model obtained by pre-training; counting the crowd density in the image to be analyzed by using the crowd density counting branch in the neural network model; analyzing the crowd behaviors in the image to be analyzed by utilizing the crowd behavior analysis branch in the neural network model; and judging whether a group event occurs or not based on the crowd density and the crowd behavior.
Optionally, the ground station is further configured to:
in the case where it is determined that a group event has occurred, adjust a pan-tilt head of the unmanned aerial vehicle, based on position information of the unmanned aerial vehicle, to perform image acquisition for the area where the group event occurs;
or, in the case where it is determined that a group event has occurred, determine the moving direction and moving speed of the crowd based on the image to be analyzed, and control the unmanned aerial vehicle to track and capture images of the crowd according to the moving direction and moving speed.
Optionally, the ground station is further configured to:
identifying a human body target with abnormal behaviors in the image to be analyzed;
extracting the characteristics of the human body target, and determining the identity information and/or the motion trail of the human body target based on the extracted characteristics.
In order to achieve the above object, an embodiment of the present invention further provides an electronic device, including a processor and a memory;
a memory for storing a computer program;
and the processor is used for realizing any one of the image-based crowd situation analysis methods when executing the program stored in the memory.
In order to achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium in which a computer program is stored, and the computer program, when executed by a processor, implements any one of the above image-based crowd situation analysis methods.
In the embodiment of the invention, an image is input into a pre-trained neural network model; the crowd density in the image is counted by using a crowd density statistics branch in the model; the crowd behavior in the image is analyzed by using a crowd behavior analysis branch in the model; and whether a group event occurs is judged based on the crowd density and the crowd behavior. In this scheme, the neural network model comprises at least two branches: one counts the crowd density in the image and the other analyzes the crowd behavior, so whether a group event occurs is analyzed in terms of both density and behavior, which improves the accuracy of the analysis result.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for analyzing a crowd situation based on an image according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a first structure of a neural network according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a second structure of a neural network according to an embodiment of the present invention;
FIG. 4 is a first structural diagram of a neural network model according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating a second structure of a neural network model according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an image-based crowd situation analyzing apparatus according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a system for analyzing a crowd situation based on an image according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Corresponding to the above method embodiments, embodiments of the present invention provide a method, an apparatus, a device, and a system for analyzing a crowd situation based on an image, where the method and the apparatus may be applied to various electronic devices such as an unmanned aerial vehicle, a ground station, a personal computer, and a server, and are not limited specifically. First, the image-based crowd situation analysis method provided by the embodiment of the present invention is explained in detail below.
Fig. 1 is a schematic flow chart of a method for analyzing a crowd situation based on an image according to an embodiment of the present invention, including:
s101: and acquiring an image to be analyzed.
For convenience of description, in this embodiment, an image that needs to undergo crowd situation analysis is referred to as an image to be analyzed. If the execution subject is an unmanned aerial vehicle, the unmanned aerial vehicle can capture images through its pan-tilt head and use the captured images as images to be analyzed. If the execution subject is a ground station, the ground station can receive images sent by the unmanned aerial vehicle as images to be analyzed. If the execution subject is another electronic device, that device can receive images sent by monitoring equipment as images to be analyzed. There are many ways of acquiring images, which are not listed one by one here.
S102: and inputting the image to be analyzed into a neural network model obtained by pre-training.
S103: and counting the crowd density in the image to be analyzed by using the crowd density statistics branch in the neural network model.
S104: and analyzing the crowd behaviors in the image to be analyzed by utilizing the crowd behavior analysis branch in the neural network model.
In the embodiment of the invention, the trained neural network model at least comprises two branches: one branch is used for counting the crowd density in the image and is called as a crowd density counting branch; the other branch is used for analyzing the crowd behavior in the image and is called a crowd behavior analysis branch.
As an embodiment, the process of training the neural network model may include:
acquiring a sample image required by training, and inputting the sample image into a neural network with a preset structure; carrying out convolution processing on the sample image by utilizing a convolution layer in the neural network, and respectively inputting data obtained after the convolution processing into a crowd density statistical branch and a crowd behavior analysis branch in the neural network; iteratively adjusting parameters in the convolutional layer based on a first loss function of the crowd density statistics branch, a first output result of the crowd density statistics branch, a second loss function of the crowd behavior analysis branch, and a second output result of the crowd behavior analysis branch; and when the adjustment of the parameters in the convolutional layer meets the convergence condition, obtaining the trained neural network model.
The neural network has the same structure as the neural network model, the training process is the process of adjusting the parameters of the neural network, and the trained neural network model is obtained when the parameter adjustment is completed.
In the present embodiment, for ease of distinction, the loss function of the crowd density statistics branch is referred to as the first loss function, and the loss function of the crowd behavior analysis branch is referred to as the second loss function.
For example, the second loss function may be a binary cross-entropy function, such as:
L_Behave = -(1/N) · Σ_{i=1}^{N} [ ((1+S_i)/2) · log((1+Ŝ_i)/2) + ((1-S_i)/2) · log((1-Ŝ_i)/2) ]    (Formula 1)
where L_Behave represents the loss value of the second loss function, i denotes a sample image in the training process, N denotes the total number of sample images, Ŝ_i denotes the predicted value of the crowd behavior label corresponding to sample image i, and S_i denotes the true value of the crowd behavior label corresponding to sample image i. For example, the crowd behavior label may take two values, "abnormal behavior present" and "no abnormal behavior": the label may be 1 or -1, with 1 indicating no abnormal behavior and -1 indicating abnormal behavior.
The first loss function may be a multi-class cross-entropy loss function, such as:
L_Density = -(1/N) · Σ_{i=1}^{N} Σ_{j=1}^{5} S_ij · log Ŝ_ij    (Formula 2)
where L_Density represents the loss value of the first loss function, i denotes a sample image in the training process, N denotes the total number of sample images, and j denotes the crowd density level. In Formula 2, j ranges over 5 levels; the specific number of levels is not limited, and the 5 in Formula 2 may be changed to another value. Ŝ_ij denotes the probability of predicting sample image i as density level j, and S_ij indicates whether density level j is the true level of sample image i: for example, if the density level corresponding to sample image i is 1, S_i1 may be 1 and S_i2 through S_i5 may be 0. For example, the crowd density may be divided into 5 levels by head count: very rare (0-20), sparse (21-50), medium (51-100), dense (101-500), and very dense (501 or more). The levels may be divided according to the actual situation, and the specific division is not limited.
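As an illustration, the two loss functions and the level division above can be sketched in numpy. The level boundaries follow the head-count ranges quoted in the text; the `eps` constant and the mapping of ±1 labels to probabilities are implementation assumptions, not details given in the patent.

```python
import numpy as np

def behavior_loss(pred, label, eps=1e-12):
    """Binary cross-entropy over +/-1 crowd-behavior labels (Formula 1 style).

    pred  : predicted label values in (-1, 1), shape (N,)
    label : true labels in {-1, +1}, shape (N,)
    """
    p = (1.0 + pred) / 2.0    # map predictions to (0, 1) probabilities
    t = (1.0 + label) / 2.0   # map {-1, +1} labels to {0, 1}
    return -np.mean(t * np.log(p + eps) + (1 - t) * np.log(1 - p + eps))

def density_loss(probs, true_level, eps=1e-12):
    """Multi-class cross-entropy over density levels (Formula 2 style).

    probs      : predicted probabilities per level, shape (N, 5)
    true_level : true level index in 1..5 per sample
    """
    onehot = np.eye(probs.shape[1])[np.asarray(true_level) - 1]  # S_ij
    return -np.sum(onehot * np.log(probs + eps)) / probs.shape[0]

# Head-count ranges for the five density levels quoted in the text.
LEVELS = [(0, 20), (21, 50), (51, 100), (101, 500), (501, float("inf"))]

def density_level(head_count):
    """Map a head count to a density level index (1 = very rare .. 5 = very dense)."""
    for idx, (lo, hi) in enumerate(LEVELS, start=1):
        if lo <= head_count <= hi:
            return idx
    raise ValueError(head_count)
```

For instance, a head count of 75 falls in the medium range, so `density_level(75)` returns 3.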
Referring to fig. 2, a sample image to be trained is input to a convolutional layer in a neural network, and data output by the convolutional layer is respectively input to a fully connected layer of a crowd density statistics branch and a fully connected layer of a crowd behavior analysis branch; the neural network is trained, i.e. parameters in the neural network are adjusted, using the first loss function and the second loss function.
In one case, the neural network of the preset structure may be a residual neural network (ResNet). In this case, the convolutional layer in fig. 2 may be resnet18, which contains five stages of convolutions. The recognition performance of a residual neural network is better. Alternatively, another type of neural network may be used; the type is not specifically limited.
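The key idea of a residual network is the identity shortcut. A minimal sketch in plain numpy, with illustrative shapes and fully connected transforms standing in for convolutions:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Compute relu(x + F(x)); the shortcut lets the block learn a residual F."""
    h = relu(x @ w1)          # first transform of the residual branch F
    return relu(x + h @ w2)   # add the identity shortcut, then activate

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))         # a batch of 4 feature vectors
w1 = 0.1 * rng.standard_normal((16, 16))
w2 = 0.1 * rng.standard_normal((16, 16))
y = residual_block(x, w1, w2)            # output keeps the input shape
```

With all-zero weights the branch F vanishes and the block reduces to relu(x), which is why deep stacks of such blocks remain trainable.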
As another embodiment, a density distribution prediction branch may be further included in the neural network model, in which case, the density distribution prediction branch may be used to identify a population density value at each pixel point in the image to be analyzed.
In this case, the process of training the neural network model may include:
inputting the sample image into a neural network with a preset structure;
carrying out convolution processing on the sample image by using a convolution layer in the neural network, and respectively inputting the data obtained after the convolution processing into a crowd density statistics branch, a crowd behavior analysis branch, and a density distribution prediction branch in the neural network;
iteratively adjusting parameters in the convolutional layer based on a first loss function of the crowd density statistics branch, a first output result of the crowd density statistics branch, a second loss function of the crowd behavior analysis branch, a second output result of the crowd behavior analysis branch, a third loss function of the density distribution prediction branch, and a third output result of the density distribution prediction branch;
and when the adjustment of the parameters in the convolutional layer meets the convergence condition, obtaining the trained neural network model.
The first loss function and the second loss function may be as described above, and the third loss function may be:
L_CrowdHeatmap = (1/M) · Σ_{k=1}^{M} (S_k - Ŝ_k)²    (Formula 3)
where L_CrowdHeatmap represents the loss value of the third loss function, k denotes the k-th pixel point in the sample image, M denotes the total number of pixel points in the sample image, S_k denotes the true crowd density value at pixel point k, and Ŝ_k denotes the predicted crowd density value at pixel point k.
For example, if no person is present at pixel point k, the crowd density value at pixel point k is 0. If area A contains 3 people and 1000 pixel points, the crowd density value at each pixel point in area A is 3/1000.
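The per-pixel density convention above can be checked numerically: summing the density map over a region recovers the head count. The region shape and the squared-error comparison below are illustrative:

```python
import numpy as np

# A hypothetical 25 x 40 region (1000 pixel points) containing 3 people:
# every pixel point in the region carries the value 3/1000.
region = np.full((25, 40), 3 / 1000.0)

def heatmap_loss(pred, true):
    """Mean squared error between predicted and true per-pixel density values."""
    return np.mean((pred - true) ** 2)

head_count = region.sum()   # summing the density map recovers the 3 people
```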
Referring to fig. 3, a sample image to be trained is input to a convolutional layer in a neural network, and data output by the convolutional layer is respectively input to a fully connected layer of a crowd density statistics branch, a fully connected layer of a crowd behavior analysis branch, and a fully connected layer of a density distribution prediction branch; the neural network is trained, i.e. parameters in the neural network are adjusted, using the first loss function, the second loss function and the third loss function.
In one case, the neural network of the preset structure may be a residual neural network (ResNet), and the convolutional layer in fig. 3 may be resnet18, which contains five stages of convolutions. In the density distribution prediction branch, the data output by resnet18 may be passed through a 1×1 convolution to obtain a density distribution heat map, which contains the crowd density value at each pixel point in the sample image.
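A 1×1 convolution is simply a per-pixel linear map across channels, which is how such a branch can collapse a C-channel feature map into a single-channel heat map. A numpy sketch (sizes and weights are illustrative):

```python
import numpy as np

def conv1x1(features, weights, bias=0.0):
    """Apply a 1x1 convolution: a linear map over the channel axis.

    features : (H, W, C) feature map output by the backbone
    weights  : (C,) kernel collapsing the C channels into one
    """
    return features @ weights + bias    # result: (H, W) density heat map

rng = np.random.default_rng(1)
feat = rng.random((8, 8, 32))           # toy 8x8 feature map with 32 channels
heat = conv1x1(feat, rng.random(32) / 32)
```

Because the kernel covers a single pixel, the spatial resolution of the heat map matches the input feature map exactly.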
S105: and judging whether a group event occurs or not based on the crowd density and the crowd behavior.
Referring to fig. 4, fig. 4 may be understood as the neural network model obtained after the neural network in fig. 2 is trained; the crowd density statistics branch in the model may output the crowd density in the image, and the crowd behavior analysis branch may output whether the crowd in the image exhibits abnormal behavior.
In this case, if the output result of the neural network model indicates that the crowd density exceeds the preset threshold and the crowd behavior is abnormal, it is judged that a group event occurs.
Or, in another case, the output of the crowd density statistics branch may be the density level corresponding to the crowd density in the image; in this case, if the output result of the neural network model indicates that the crowd density level reaches a preset level and the crowd behavior is abnormal, it is judged that a group event occurs.
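The two-branch judgment described above amounts to a conjunction of a density test and a behavior test. A minimal sketch (the threshold value is an assumption; the patent leaves the preset threshold unspecified):

```python
def group_event(crowd_density, behavior_abnormal, threshold=100):
    """Judge a group event: crowd density above the preset threshold
    AND abnormal crowd behavior reported by the behavior branch."""
    return crowd_density > threshold and behavior_abnormal
```

A dense but calm crowd (e.g. rush hour) fails the behavior test, which is exactly the false-alarm case the background section criticizes in count-only schemes.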
As described above, in another embodiment, the neural network model further includes a density distribution prediction branch. Referring to fig. 5, fig. 5 may be understood as the neural network model obtained after the training of the neural network in fig. 3 is completed; the crowd density statistics branch may output the crowd density (or the crowd density level) in the image, the crowd behavior analysis branch may output whether the crowd in the image exhibits abnormal behavior, and the density distribution prediction branch may output the crowd density value at each pixel point in the image.
In one case, the convolutional layer in fig. 5 may be resnet18, which contains five stages of convolutions. In the density distribution prediction branch, the data output by resnet18 may be passed through a 1×1 convolution to obtain a density distribution heat map, which contains the crowd density value at each pixel point in the image to be analyzed.
In this embodiment, whether a group event occurs may be determined according to the output results of the three branches; that is, S105 may include: determining whether a group event occurs based on the crowd density, the crowd behavior, and the crowd density value at each pixel point in the image to be analyzed.
For example, three determination conditions may be set for the three branches: first, the crowd density exceeds a first preset threshold; second, the number of target pixel points exceeds a second preset threshold, where the target pixel points are pixel points whose crowd density values exceed a third preset threshold; third, the crowd behavior is abnormal. The specific values of the first preset threshold, the second preset threshold, the third preset threshold, and the other preset thresholds mentioned in this embodiment are not limited.
If the output result of the neural network model indicates that any two of the three conditions are satisfied, it may be determined that a group event occurs. Alternatively, it may be determined that a group event occurs only when the output result of the neural network model indicates that all three conditions are satisfied.
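The two-of-three (or three-of-three) judgment can be sketched as follows; all threshold values are illustrative assumptions, since the embodiment does not limit them.

```python
# Illustrative three-branch judgment. The three thresholds are hypothetical
# stand-ins for the first, second, and third preset thresholds.

FIRST_THRESHOLD = 3.0    # crowd density threshold
SECOND_THRESHOLD = 50    # minimum count of target pixel points
THIRD_THRESHOLD = 0.5    # per-pixel density defining a target pixel point

def group_event_by_three_branches(crowd_density, behavior_abnormal,
                                  density_map, require_all=False):
    """density_map: H x W crowd density values from the prediction branch.
    Returns True when at least two (or, if require_all, all three)
    branch conditions hold."""
    cond_density = crowd_density > FIRST_THRESHOLD
    target_pixels = sum(1 for row in density_map for v in row
                        if v > THIRD_THRESHOLD)
    cond_pixels = target_pixels > SECOND_THRESHOLD
    cond_behavior = behavior_abnormal
    satisfied = sum([cond_density, cond_pixels, cond_behavior])
    return satisfied == 3 if require_all else satisfied >= 2
```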
In one embodiment, in the case that it is determined that a group event occurs, alarm information may be output. The alarm information may be text information or voice information, or may be a flashing-lamp alarm, etc.; the specific alarm mode is not limited.
As described above, in one embodiment, the image to be analyzed may be acquired through the pan-tilt head of an unmanned aerial vehicle; in this case, when it is determined that a group event occurs, the pan-tilt head may be adjusted, based on the position information of the unmanned aerial vehicle, to perform image acquisition for the area where the group event occurs.
For example, the image coordinates of the crowd in the image to be analyzed may be determined; PTZ adjustment information of the pan-tilt head is calculated based on the conversion relationship between the PTZ coordinate system of the unmanned aerial vehicle pan-tilt head and the coordinate system of the image to be analyzed, together with the determined image coordinates, and the pan-tilt head is adjusted based on the PTZ adjustment information to perform image acquisition for the area where the group event occurs.
The PTZ coordinate system is the Pan-Tilt-Zoom coordinate system; the conversion relationship between the image coordinate system and the PTZ coordinate system of the unmanned aerial vehicle pan-tilt head may be acquired in advance. Assume that the image coordinates of the crowd in the image to be analyzed are determined to be (X1, Y1) and the coordinates of the center point of the image are (X0, Y0); according to the conversion relationship between the image coordinate system and the PTZ coordinate system, the positional deviation from (X1, Y1) to (X0, Y0) is converted into PTZ adjustment information for the unmanned aerial vehicle pan-tilt head. Assuming that the PTZ coordinates of the pan-tilt head when the image to be analyzed is acquired are (P1, T1, Z1), the pan-tilt head is adjusted from (P1, T1, Z1) based on the PTZ adjustment information. In this way, the adjusted pan-tilt head is aimed at the crowd for image acquisition, and the crowd region in the image can be enlarged by adjusting the Zoom, so as to obtain more details about the crowd.
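The PTZ adjustment described above can be sketched as follows. The linear degrees-per-pixel mapping stands in for the pre-acquired conversion relationship between the image coordinate system and the PTZ coordinate system, and the calibration constants are hypothetical.

```python
# Illustrative computation of PTZ adjustment information from the
# (X1, Y1) -> (X0, Y0) positional deviation. The degrees-per-pixel
# constants are assumed calibration values, not part of the embodiment.

DEG_PER_PIXEL_PAN = 0.05   # hypothetical pan calibration constant
DEG_PER_PIXEL_TILT = 0.05  # hypothetical tilt calibration constant

def ptz_adjustment(crowd_xy, center_xy, current_ptz):
    """Return the new (P, T, Z) that centers the crowd in the frame,
    starting from the current pan-tilt state (P1, T1, Z1)."""
    dx = crowd_xy[0] - center_xy[0]
    dy = crowd_xy[1] - center_xy[1]
    p1, t1, z1 = current_ptz
    return (p1 + dx * DEG_PER_PIXEL_PAN,
            t1 + dy * DEG_PER_PIXEL_TILT,
            z1)  # Zoom is unchanged here
```

Zoom is left unchanged in this sketch; as noted above, it may additionally be increased to enlarge the crowd region in the image.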
Alternatively, the conversion relationship between the image coordinate system and the world coordinate system may also be acquired in advance. If the unmanned aerial vehicle is far away from the crowd, the coordinates of the crowd in the image coordinate system can be converted into coordinates in the world coordinate system; the unmanned aerial vehicle is then adjusted to fly toward the crowd according to the coordinates of the crowd and of the unmanned aerial vehicle in the world coordinate system, and when the unmanned aerial vehicle has flown closer to the crowd, the pan-tilt head is adjusted to acquire images of the area where the group event occurs.
In another embodiment, after the image to be analyzed is acquired through the pan-tilt head of the unmanned aerial vehicle, and in the case that it is determined that a group event occurs, the moving direction and the moving speed of the crowd may be determined based on the image to be analyzed, and the unmanned aerial vehicle may be controlled to track and capture the crowd according to the moving direction and the moving speed.
For example, the image to be analyzed may be a video frame image, and based on the video frame images, the moving direction and the moving speed of the crowd in the image coordinate system may be determined. The conversion relationship between the image coordinate system and the world coordinate system may be acquired in advance, and the moving direction and the moving speed of the crowd in the world coordinate system are determined according to this conversion relationship and the moving direction and moving speed of the crowd in the image coordinate system. The flight direction and flight speed of the unmanned aerial vehicle are then adjusted according to the moving direction and moving speed of the crowd in the world coordinate system, so as to track and capture the crowd.
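The motion estimation described above can be sketched as follows, using the crowd centroid in two consecutive video frames; the meters-per-pixel scale factor is a hypothetical stand-in for the pre-acquired conversion relationship between the image coordinate system and the world coordinate system.

```python
# Illustrative crowd-motion estimate from two frame centroids.
# METERS_PER_PIXEL is an assumed ground sampling distance.

import math

METERS_PER_PIXEL = 0.1   # hypothetical scale factor

def crowd_motion(centroid_t0, centroid_t1, dt_seconds):
    """Return (direction_degrees, speed_m_per_s) of the crowd in the
    world coordinate system, from image-coordinate centroids."""
    dx = (centroid_t1[0] - centroid_t0[0]) * METERS_PER_PIXEL
    dy = (centroid_t1[1] - centroid_t0[1]) * METERS_PER_PIXEL
    direction = math.degrees(math.atan2(dy, dx))
    speed = math.hypot(dx, dy) / dt_seconds
    return direction, speed
```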
As an embodiment, in the case that it is determined that a group event occurs, a human target with abnormal behavior may be identified in the image to be analyzed; features of the human target are extracted, and the identity information and/or the motion trail of the human target is determined based on the extracted features.
The extracted features may be face features, clothing features such as clothing color or whether a backpack is carried, or attribute features such as gender and age; the features are not specifically limited. The extracted features may be stored in the form of a preset structure, that is, as structured information.
In one case, assume that the device executing the scheme (the execution main body) is a ground station, the features extracted by the ground station are face features, and the ground station is connected with a database that stores the identity information and face features of a plurality of persons; the ground station matches the extracted face features with the face features in the database, and determines the identity information of the person according to the matching result. Further, the persons participating in the group event can subsequently be investigated according to the identity information.
In one case, assume that the device executing the scheme (the execution main body) is a ground station connected with a plurality of monitoring devices; the ground station identifies, in the images acquired by the plurality of monitoring devices, the human target matching the extracted features, and determines the motion trail of the human target according to the time order in which the human target appears at each monitoring device. This human target, that is, the human target with abnormal behavior in the image to be analyzed, is a person participating in the group event; determining the motion trail of this person facilitates tracking and subsequent processing of the person.
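The motion trail reconstruction can be sketched as follows: sightings of the matched human target are ordered by time across the monitoring devices. The device identifiers and timestamps in the example are hypothetical.

```python
# Illustrative motion trail reconstruction: sort sightings of one target
# by timestamp and return the device sequence it passed through.

def motion_trail(sightings):
    """sightings: list of (timestamp, device_id) pairs for one target.
    Returns device ids in the order the target appeared at them."""
    return [device for _, device in sorted(sightings)]
```

For example, `motion_trail([(1020, "cam_east"), (1000, "cam_gate"), (1080, "cam_plaza")])` returns `["cam_gate", "cam_east", "cam_plaza"]`.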
In the embodiments of the present invention, in the first aspect, the neural network model includes at least two branches: one branch counts the crowd density in the image and the other analyzes the crowd behavior in the image, so that whether a group event occurs is analyzed from the two aspects of crowd density and behavior, improving the accuracy of the analysis result. In the second aspect, the neural network model obtained based on residual neural network training has better recognition performance. In the third aspect, in the case that it is determined that a group event occurs, an alarm is automatically raised to prompt relevant personnel to handle it in time. In the fourth aspect, in the case that it is determined that a group event occurs, detailed images of the crowd can be acquired, the unmanned aerial vehicle can be controlled to track and capture the crowd, or the identity information or motion trail of the persons participating in the event can be determined, so as to facilitate subsequent processing of the relevant persons.
Corresponding to the above method embodiment, an embodiment of the present invention further provides an image-based crowd situation analysis apparatus, as shown in fig. 6, including:
an obtaining module 601, configured to obtain an image to be analyzed;
an input module 602, configured to input the image to be analyzed into a neural network model obtained through pre-training;
a statistics module 603, configured to count the crowd density in the image to be analyzed by using the crowd density statistics branch in the neural network model;
an analysis module 604, configured to analyze the crowd behavior in the image to be analyzed by using the crowd behavior analysis branch in the neural network model;
a determining module 605, configured to determine whether a group event occurs based on the crowd density and the crowd behavior.
As an embodiment, the apparatus further comprises:
an identification module (not shown in the figure) for identifying a crowd density value at each pixel point in the image to be analyzed by using a density distribution prediction branch in the neural network model;
the determining module 605 is specifically configured to: determine whether a group event occurs based on the crowd density, the crowd behavior, and the crowd density value at each pixel point in the image to be analyzed.
As an embodiment, the apparatus further comprises:
a training module (not shown in the figure), configured to input a sample image into a neural network of a preset structure; perform convolution processing on the sample image by using a convolutional layer in the neural network, and input the data obtained after the convolution processing into the crowd density statistics branch, the crowd behavior analysis branch, and the density distribution prediction branch in the neural network, respectively; iteratively adjust the parameters in the convolutional layer based on a first loss function and a first output result of the crowd density statistics branch, a second loss function and a second output result of the crowd behavior analysis branch, and a third loss function and a third output result of the density distribution prediction branch; and obtain the trained neural network model when the adjustment of the parameters in the convolutional layer satisfies a convergence condition.
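One common way to drive a shared convolutional layer from several branch losses is to minimize their weighted sum, stopping when the loss stabilizes. The following sketch assumes such a weighted combination and a simple loss-change convergence test, neither of which is mandated by the embodiment, which only states that all three loss functions and output results drive the parameter updates.

```python
# Illustrative combination of the three branch losses into one training
# objective, plus an assumed convergence condition. The equal default
# weights and the tolerance value are hypothetical choices.

def total_loss(density_loss, behavior_loss, distribution_loss,
               weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the losses of the three branches."""
    w1, w2, w3 = weights
    return w1 * density_loss + w2 * behavior_loss + w3 * distribution_loss

def converged(loss_history, tolerance=1e-4):
    """Assumed convergence condition: the change in total loss between
    the two most recent iterations falls below the tolerance."""
    return (len(loss_history) >= 2
            and abs(loss_history[-1] - loss_history[-2]) < tolerance)
```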
As an embodiment, the determining module 605 is specifically configured to:
determine that a group event occurs if the crowd density exceeds a preset threshold and the crowd behavior is abnormal; the apparatus further comprises:
an alarm module (not shown in the figure), configured to output alarm information in the case where it is determined that a group event occurs.
As an embodiment, the obtaining module 601 is specifically configured to:
acquire the image to be analyzed through the pan-tilt head of an unmanned aerial vehicle; the apparatus further comprises:
a control module (not shown in the figure), configured to adjust the pan-tilt head, based on the position information of the unmanned aerial vehicle, to perform image acquisition for the area where a group event occurs;
or, in the case where it is determined that a group event occurs, determine the moving direction and the moving speed of the crowd based on the image to be analyzed, and control the unmanned aerial vehicle to track and capture the crowd according to the moving direction and the moving speed.
As an embodiment, the control module is further configured to:
determine the image coordinates of the crowd in the image to be analyzed; calculate PTZ adjustment information of the pan-tilt head based on the conversion relationship between the PTZ coordinate system of the unmanned aerial vehicle pan-tilt head and the coordinate system of the image to be analyzed, together with the determined image coordinates; and adjust the pan-tilt head based on the PTZ adjustment information to perform image acquisition for the area where the group event occurs.
As an embodiment, the apparatus further comprises:
a determining module (not shown in the figure), configured to identify, in the case where it is determined that a group event occurs, a human target with abnormal behavior in the image to be analyzed; extract features of the human target; and determine identity information and/or a motion trail of the human target based on the extracted features.
In the embodiments of the present invention, an image is input into a neural network model obtained by pre-training; the crowd density in the image is counted by using the crowd density statistics branch in the neural network model; the crowd behavior in the image is analyzed by using the crowd behavior analysis branch in the neural network model; and whether a group event occurs is determined based on the crowd density and the crowd behavior. Thus, in this scheme, the neural network model includes at least two branches, one of which counts the crowd density in the image while the other analyzes the crowd behavior in the image, so that whether a group event occurs is analyzed from both the crowd density and the behavior, improving the accuracy of the analysis result.
An embodiment of the present invention further provides an image-based crowd situation analysis system. As shown in fig. 7, the system includes: an unmanned aerial vehicle and a ground station; wherein,
the unmanned aerial vehicle is configured to acquire images and send the acquired images to the ground station; and
the ground station is configured to receive an image sent by the unmanned aerial vehicle as the image to be analyzed; input the image to be analyzed into a neural network model obtained by pre-training; count the crowd density in the image to be analyzed by using the crowd density statistics branch in the neural network model; analyze the crowd behavior in the image to be analyzed by using the crowd behavior analysis branch in the neural network model; and determine whether a group event occurs based on the crowd density and the crowd behavior.
Alternatively, in another embodiment, the unmanned aerial vehicle takes the image it acquires as the image to be analyzed; inputs the image to be analyzed into a neural network model obtained by pre-training; counts the crowd density in the image to be analyzed by using the crowd density statistics branch in the neural network model; analyzes the crowd behavior in the image to be analyzed by using the crowd behavior analysis branch in the neural network model; and determines whether a group event occurs based on the crowd density and the crowd behavior. The unmanned aerial vehicle then sends the acquired image and the determination result to the ground station.
In this embodiment, in the case where it is determined that a group event occurs, the unmanned aerial vehicle may further convert the coordinates of the crowd in the image coordinate system into coordinates in the world coordinate system, and transmit the coordinates of the crowd in the world coordinate system to the ground station, so as to facilitate subsequent processing by ground station personnel.
As an embodiment, in the case where it is determined that a group event occurs, the ground station may adjust the pan-tilt head of the unmanned aerial vehicle, based on the position information of the unmanned aerial vehicle, to perform image acquisition for the area where the group event occurs.
For example, the ground station may determine the image coordinates of the crowd in the image to be analyzed; calculate PTZ adjustment information of the pan-tilt head based on the conversion relationship between the PTZ coordinate system of the unmanned aerial vehicle pan-tilt head and the coordinate system of the image to be analyzed, together with the determined image coordinates; and adjust the pan-tilt head based on the PTZ adjustment information to perform image acquisition for the area where the group event occurs.
The PTZ coordinate system is the Pan-Tilt-Zoom coordinate system; the conversion relationship between the image coordinate system and the PTZ coordinate system of the unmanned aerial vehicle pan-tilt head may be acquired in advance. Assume that the image coordinates of the crowd in the image to be analyzed are determined to be (X1, Y1) and the coordinates of the center point of the image are (X0, Y0); according to the conversion relationship between the image coordinate system and the PTZ coordinate system, the positional deviation from (X1, Y1) to (X0, Y0) is converted into PTZ adjustment information for the unmanned aerial vehicle pan-tilt head. Assuming that the PTZ coordinates of the pan-tilt head when the image to be analyzed is acquired are (P1, T1, Z1), the pan-tilt head is adjusted from (P1, T1, Z1) based on the PTZ adjustment information. In this way, the adjusted pan-tilt head is aimed at the crowd for image acquisition, and the crowd region in the image can be enlarged by adjusting the Zoom, so as to obtain more details about the crowd.
Alternatively, the conversion relationship between the image coordinate system and the world coordinate system may also be acquired in advance. If the unmanned aerial vehicle is far away from the crowd, the coordinates of the crowd in the image coordinate system can be converted into coordinates in the world coordinate system; the unmanned aerial vehicle is then adjusted to fly toward the crowd according to the coordinates of the crowd and of the unmanned aerial vehicle in the world coordinate system, and when the unmanned aerial vehicle has flown closer to the crowd, the pan-tilt head is adjusted to acquire images of the area where the group event occurs.
As another embodiment, in the case where it is determined that a group event occurs, the ground station may determine the moving direction and the moving speed of the crowd based on the image to be analyzed, and control the unmanned aerial vehicle to track and capture the crowd according to the moving direction and the moving speed.
For example, the image to be analyzed may be a video frame image, and based on the video frame images, the moving direction and the moving speed of the crowd in the image coordinate system may be determined. The conversion relationship between the image coordinate system and the world coordinate system may be acquired in advance, and the moving direction and the moving speed of the crowd in the world coordinate system are determined according to this conversion relationship and the moving direction and moving speed of the crowd in the image coordinate system. The flight direction and flight speed of the unmanned aerial vehicle are then adjusted according to the moving direction and moving speed of the crowd in the world coordinate system, so as to track and capture the crowd.
As an embodiment, the ground station may identify a human target with abnormal behavior in the image to be analyzed; extract features of the human target; and determine the identity information and/or the motion trail of the human target based on the extracted features.
The extracted features may be face features, clothing features such as clothing color or whether a backpack is carried, or attribute features such as gender and age; the features are not specifically limited. The extracted features may be stored in the form of a preset structure, that is, as structured information.
In one case, the features extracted by the ground station are face features, and the ground station is connected with a database that stores the identity information and face features of a plurality of persons; the ground station matches the extracted face features with the face features in the database, and determines the identity information of the person according to the matching result. Further, the persons participating in the group event can subsequently be investigated according to the identity information.
In one case, the ground station is connected with a plurality of monitoring devices; the ground station identifies, in the images acquired by the plurality of monitoring devices, the human target matching the extracted features, and determines the motion trail of the human target according to the time order in which the human target appears at each monitoring device. This human target, that is, the human target with abnormal behavior in the image to be analyzed, is a person participating in the group event; determining the motion trail of this person facilitates tracking and subsequent processing of the person.
In the embodiments of the present invention, in the first aspect, the neural network model includes at least two branches: one branch counts the crowd density in the image and the other analyzes the crowd behavior in the image, so that whether a group event occurs is analyzed from the two aspects of crowd density and behavior, improving the accuracy of the analysis result. In the second aspect, in the case that it is determined that a group event occurs, the ground station can control the unmanned aerial vehicle to acquire detailed images of the crowd, control the unmanned aerial vehicle to track and capture the crowd, or determine the identity information or motion trail of the persons participating in the event, so as to facilitate subsequent processing of the relevant persons.
An embodiment of the present invention further provides an electronic device, as shown in fig. 8, including a processor 801 and a memory 802, wherein
the memory 802 is configured to store a computer program; and
the processor 801 is configured to implement any of the above image-based crowd situation analysis methods when executing the program stored in the memory 802.
The Memory mentioned in the above electronic device may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component.
The electronic device may be any of various electronic devices such as an unmanned aerial vehicle, a ground station, a personal computer, or a server, and is not specifically limited.
An embodiment of the present invention further discloses a computer-readable storage medium in which a computer program is stored; when the computer program is executed by a processor, the image-based crowd situation analysis method described above is implemented.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, apparatus embodiments, system embodiments, device embodiments, and computer-readable storage medium embodiments described above are substantially similar to method embodiments, so that the description is relatively simple, and reference may be made to some descriptions of the method embodiments for relevant points.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (19)

1. An image-based crowd situation analysis method, characterized by comprising:
acquiring an image to be analyzed;
inputting the image to be analyzed into a neural network model obtained by pre-training;
counting the crowd density in the image to be analyzed by using a crowd density statistics branch in the neural network model;
analyzing the crowd behavior in the image to be analyzed by using a crowd behavior analysis branch in the neural network model; and
determining whether a group event occurs based on the crowd density and the crowd behavior.
2. The method of claim 1, further comprising:
identifying the crowd density value at each pixel point in the image to be analyzed by using a density distribution prediction branch in the neural network model;
wherein the determining whether a group event occurs based on the crowd density and the crowd behavior comprises:
determining whether a group event occurs based on the crowd density, the crowd behavior, and the crowd density value at each pixel point in the image to be analyzed.
3. The method of claim 2, wherein training the neural network model comprises:
inputting a sample image into a neural network of a preset structure;
performing convolution processing on the sample image by using a convolutional layer in the neural network, and inputting the data obtained after the convolution processing into a crowd density statistics branch, a crowd behavior analysis branch, and a density distribution prediction branch in the neural network, respectively;
iteratively adjusting parameters in the convolutional layer based on a first loss function of the crowd density statistics branch, a first output result of the crowd density statistics branch, a second loss function of the crowd behavior analysis branch, a second output result of the crowd behavior analysis branch, a third loss function of the density distribution prediction branch, and a third output result of the density distribution prediction branch; and
obtaining the trained neural network model when the adjustment of the parameters in the convolutional layer satisfies a convergence condition.
4. The method of claim 1, wherein the determining whether a group event occurs based on the crowd density and the crowd behavior comprises:
determining that a group event occurs if the crowd density exceeds a preset threshold and the crowd behavior is abnormal;
wherein, in the case where it is determined that a group event occurs, the method further comprises: outputting alarm information.
5. The method of claim 1, wherein the acquiring an image to be analyzed comprises:
acquiring the image to be analyzed through a pan-tilt head of an unmanned aerial vehicle;
wherein, in the case where it is determined that a group event occurs, the method further comprises:
adjusting the pan-tilt head, based on position information of the unmanned aerial vehicle, to perform image acquisition for an area where the group event occurs;
or, in the case where it is determined that a group event occurs, determining a moving direction and a moving speed of the crowd based on the image to be analyzed, and controlling the unmanned aerial vehicle to track and capture the crowd according to the moving direction and the moving speed.
6. The method of claim 5, wherein the adjusting the pan-tilt head, based on the position information of the unmanned aerial vehicle, to perform image acquisition for the area where the group event occurs comprises:
determining image coordinates of the crowd in the image to be analyzed; calculating PTZ adjustment information of the pan-tilt head based on the conversion relationship between the PTZ coordinate system of the unmanned aerial vehicle pan-tilt head and the coordinate system of the image to be analyzed, together with the determined image coordinates; and adjusting the pan-tilt head based on the PTZ adjustment information to perform image acquisition for the area where the group event occurs.
7. The method of claim 1, wherein, in the case where it is determined that a group event occurs, the method further comprises:
identifying a human target with abnormal behavior in the image to be analyzed; and
extracting features of the human target, and determining identity information and/or a motion trail of the human target based on the extracted features.
8. An image-based crowd situation analysis apparatus, characterized by comprising:
an obtaining module, configured to obtain an image to be analyzed;
an input module, configured to input the image to be analyzed into a neural network model obtained by pre-training;
a statistics module, configured to count the crowd density in the image to be analyzed by using a crowd density statistics branch in the neural network model;
an analysis module, configured to analyze the crowd behavior in the image to be analyzed by using a crowd behavior analysis branch in the neural network model; and
a judging module, configured to determine whether a group event occurs based on the crowd density and the crowd behavior.
9. The apparatus of claim 8, further comprising:
an identification module, configured to identify the crowd density value at each pixel point in the image to be analyzed by using a density distribution prediction branch in the neural network model;
wherein the judging module is specifically configured to: determine whether a group event occurs based on the crowd density, the crowd behavior, and the crowd density value at each pixel point in the image to be analyzed.
10. The apparatus of claim 9, further comprising:
a training module, configured to input a sample image into a neural network of a preset structure; perform convolution processing on the sample image by using a convolutional layer in the neural network, and input the data obtained after the convolution processing into a crowd density statistics branch, a crowd behavior analysis branch, and a density distribution prediction branch in the neural network, respectively; iteratively adjust parameters in the convolutional layer based on a first loss function of the crowd density statistics branch, a first output result of the crowd density statistics branch, a second loss function of the crowd behavior analysis branch, a second output result of the crowd behavior analysis branch, a third loss function of the density distribution prediction branch, and a third output result of the density distribution prediction branch; and obtain the trained neural network model when the adjustment of the parameters in the convolutional layer satisfies a convergence condition.
11. The apparatus of claim 8, wherein the judging module is specifically configured to:
judge that a group event occurs if the crowd density exceeds a preset threshold and the crowd behavior is abnormal; and the apparatus further comprises:
an alarm module, which is used for outputting alarm information when it is judged that the group event occurs.
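Claim 11's decision rule is a conjunction of two signals: density over a preset threshold and an abnormal behavior flag. A minimal sketch (the threshold value, message text and callback-style alarm output are assumed, not specified by the patent):

```python
def judge_and_alarm(crowd_density, behavior_abnormal,
                    density_threshold=0.5, alarm=print):
    """Judge whether a group event occurs and, if so, output alarm info.

    density_threshold and the message format are illustrative assumptions;
    alarm is any callable accepting the alarm message (print by default).
    """
    event = crowd_density > density_threshold and behavior_abnormal
    if event:
        alarm(f"ALARM: group event (crowd density {crowd_density:.2f} "
              f"exceeds threshold {density_threshold:.2f}, behavior abnormal)")
    return event
```

Requiring both conditions keeps a dense but orderly queue, or a single erratic individual in a sparse scene, from triggering the alarm.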
12. The apparatus of claim 8, wherein the obtaining module is specifically configured to:
acquire the image to be analyzed through a gimbal of an unmanned aerial vehicle; and the apparatus further comprises:
a control module, which is used for adjusting the gimbal, based on the position information of the unmanned aerial vehicle, to acquire images of the region where the group event occurs;
or, when it is judged that a group event occurs, determining the moving direction and the moving speed of the crowd based on the image to be analyzed, and controlling the unmanned aerial vehicle to track the crowd and acquire images according to the moving direction and the moving speed.
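The moving direction and speed in claim 12 can be estimated from the crowd centroid in two consecutive analyzed frames. The pixel-to-metre scale and the frame interval below are assumed calibration values, and the function name is illustrative:

```python
import math

def crowd_motion(centroid_prev, centroid_curr, dt, metres_per_pixel=0.05):
    """Estimate crowd moving direction and speed from two frame centroids.

    Returns (direction in degrees, with 0 along the image +x axis and
    angles measured toward +y; speed in m/s). dt is the time between the
    two analyzed frames; metres_per_pixel is an assumed ground-sampling
    calibration for the aerial view.
    """
    dx = centroid_curr[0] - centroid_prev[0]
    dy = centroid_curr[1] - centroid_prev[1]
    direction = math.degrees(math.atan2(dy, dx)) % 360.0
    speed = math.hypot(dx, dy) * metres_per_pixel / dt
    return direction, speed
```

A tracker would feed these estimates forward to predict where the crowd will be when the unmanned aerial vehicle arrives, rather than steering toward the crowd's last observed position.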
13. The apparatus of claim 12, wherein the control module is further configured to:
determine the image coordinates of the crowd in the image to be analyzed; calculate PTZ adjustment information for the gimbal based on the determined image coordinates and the conversion relationship between the PTZ coordinate system of the unmanned aerial vehicle's gimbal and the coordinate system of the image to be analyzed; and adjust the gimbal based on the PTZ adjustment information to acquire images of the region where the group event occurs.
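Claim 13's conversion from image coordinates to PTZ adjustment can be sketched under a pinhole-camera assumption: the angular offset of the target pixel from the image centre gives the pan/tilt deltas. The focal length in pixels is an assumed calibration value, and a real gimbal controller would also account for zoom and the gimbal's current pose:

```python
import math

def ptz_adjustment(target_xy, image_size, focal_px):
    """Pan/tilt deltas (degrees) that would center the target pixel.

    Pinhole-camera assumption: angular offset = atan(pixel offset /
    focal length in pixels). target_xy is the crowd's image coordinate,
    image_size is (width, height), focal_px an assumed calibration value.
    """
    cx, cy = image_size[0] / 2.0, image_size[1] / 2.0
    delta_pan = math.degrees(math.atan2(target_xy[0] - cx, focal_px))
    delta_tilt = math.degrees(math.atan2(target_xy[1] - cy, focal_px))
    return delta_pan, delta_tilt
```

Using `atan2` rather than a linear pixels-to-degrees factor keeps the mapping accurate for targets far from the image centre, where the small-angle approximation breaks down.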
14. The apparatus of claim 8, further comprising:
the determining module is used for identifying a human body target with abnormal behavior in the image to be analyzed when it is judged that a group event occurs; and for extracting features of the human body target and determining the identity information and/or the motion trajectory of the human body target based on the extracted features.
15. An image-based crowd situation analysis system, comprising: an unmanned aerial vehicle and a ground station; wherein,
the unmanned aerial vehicle is used for acquiring images and sending the acquired images to the ground station;
the ground station is used for receiving the image sent by the unmanned aerial vehicle as the image to be analyzed; inputting the image to be analyzed into a neural network model obtained by pre-training; counting the crowd density in the image to be analyzed by using the crowd density statistics branch in the neural network model; analyzing the crowd behavior in the image to be analyzed by using the crowd behavior analysis branch in the neural network model; and judging whether a group event occurs based on the crowd density and the crowd behavior.
16. The system of claim 15, wherein the ground station is further configured to:
when it is judged that a group event occurs, adjust a gimbal of the unmanned aerial vehicle, based on the position information of the unmanned aerial vehicle, to acquire images of the region where the group event occurs;
or, when it is judged that a group event occurs, determine the moving direction and the moving speed of the crowd based on the image to be analyzed, and control the unmanned aerial vehicle to track the crowd and acquire images according to the moving direction and the moving speed.
17. The system of claim 15, wherein the ground station is further configured to:
identify a human body target with abnormal behavior in the image to be analyzed;
extract features of the human body target, and determine the identity information and/or the motion trajectory of the human body target based on the extracted features.
18. An electronic device comprising a processor and a memory;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1 to 7 when executing a program stored in the memory.
19. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 7.
CN201811494297.6A 2018-12-07 2018-12-07 Crowd situation analysis method, device, equipment and system based on image Active CN111291597B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811494297.6A CN111291597B (en) 2018-12-07 2018-12-07 Crowd situation analysis method, device, equipment and system based on image

Publications (2)

Publication Number Publication Date
CN111291597A true CN111291597A (en) 2020-06-16
CN111291597B CN111291597B (en) 2023-10-13

Family

ID=71021286

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811494297.6A Active CN111291597B (en) 2018-12-07 2018-12-07 Crowd situation analysis method, device, equipment and system based on image

Country Status (1)

Country Link
CN (1) CN111291597B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112232124A (en) * 2020-09-11 2021-01-15 浙江大华技术股份有限公司 Crowd situation analysis method, video processing device and device with storage function

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090222388A1 (en) * 2007-11-16 2009-09-03 Wei Hua Method of and system for hierarchical human/crowd behavior detection
CN105447458A (en) * 2015-11-17 2016-03-30 深圳市商汤科技有限公司 Large scale crowd video analysis system and method thereof
WO2017156443A1 (en) * 2016-03-10 2017-09-14 Rutgers, The State University Of New Jersey Global optimization-based method for improving human crowd trajectory estimation and tracking
US20180005047A1 (en) * 2016-06-30 2018-01-04 Beijing Kuangshi Technology Co., Ltd. Video monitoring method and video monitoring device
CN107729799A * 2017-06-13 2018-02-23 银江股份有限公司 Vision-based crowd abnormal behavior detection and alarm analysis system based on deep convolutional neural networks
CN107944327A * 2016-10-10 2018-04-20 杭州海康威视数字技术股份有限公司 People counting method and device
WO2018121690A1 * 2016-12-29 2018-07-05 北京市商汤科技开发有限公司 Object attribute detection method and device, neural network training method and device, and regional detection method and device
CN108256447A * 2017-12-29 2018-07-06 广州海昇计算机科技有限公司 Unmanned aerial vehicle video analysis method based on deep neural networks
US20180253605A1 * 2017-03-03 2018-09-06 International Business Machines Corporation Crowd detection, analysis, and categorization
CN108549835A * 2018-03-08 2018-09-18 深圳市深网视界科技有限公司 Crowd counting and model construction method, terminal device and storage medium
CN108647592A * 2018-04-26 2018-10-12 长沙学院 Group abnormal event detection method and system based on fully convolutional neural networks
CN108848348A * 2018-07-12 2018-11-20 西南科技大学 Crowd abnormal behavior monitoring device and method based on unmanned aerial vehicle
CN108897342A * 2018-08-22 2018-11-27 江西理工大学 Positioning and tracking method and system for fast-moving civilian multi-rotor unmanned aerial vehicles
CN108921137A * 2018-08-01 2018-11-30 深圳市旭发智能科技有限公司 Unmanned aerial vehicle and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Ye Zhipeng et al.: "Detection of abnormal crowd motion states in video scenes", vol. 7, no. 4, pages 45-47 *
Li Baiping; Han Xinyi; Wu Dongmei: "Real-time crowd density estimation based on convolutional neural networks", Journal of Graphics, no. 04 *

Also Published As

Publication number Publication date
CN111291597B (en) 2023-10-13

Similar Documents

Publication Publication Date Title
CN108062349B (en) Video monitoring method and system based on video structured data and deep learning
US10366595B2 (en) Surveillance method and system based on human behavior recognition
CN109214280B (en) Shop identification method and device based on street view, electronic equipment and storage medium
CN108009473A (en) Based on goal behavior attribute video structural processing method, system and storage device
US20180107182A1 (en) Detection of drones
CN112307868B (en) Image recognition method, electronic device, and computer-readable medium
EP4035070B1 (en) Method and server for facilitating improved training of a supervised machine learning process
US11429820B2 (en) Methods for inter-camera recognition of individuals and their properties
CN111353338B (en) Energy efficiency improvement method based on business hall video monitoring
CN112183166A (en) Method and device for determining training sample and electronic equipment
CN110969215A (en) Clustering method and device, storage medium and electronic device
CN108334831A (en) A kind of monitoring image processing method, monitoring terminal and system
KR102333143B1 (en) System for providing people counting service
CN111488803A (en) Airport target behavior understanding system integrating target detection and target tracking
KR20210062256A (en) Method, program and system to judge abnormal behavior based on behavior sequence
CN111191507A (en) Safety early warning analysis method and system for smart community
CN111291646A (en) People flow statistical method, device, equipment and storage medium
CN114092877A (en) Garbage can unattended system design method based on machine vision
Leonid et al. Human wildlife conflict mitigation using YOLO algorithm
CN117201733B (en) Real-time unmanned aerial vehicle monitoring and sharing system
CN111291597B (en) Crowd situation analysis method, device, equipment and system based on image
KR102411209B1 (en) System and Method for Image Classification Based on Object Detection Events by Edge Device
CN108596068B (en) Method and device for recognizing actions
CN114724011B (en) Behavior determination method and device, storage medium and electronic device
KR20210048271A (en) Apparatus and method for performing automatic audio focusing to multiple objects

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant