CN111738225B - Crowd gathering detection method, device, equipment and storage medium - Google Patents

Crowd gathering detection method, device, equipment and storage medium Download PDF

Info

Publication number
CN111738225B
CN111738225B (application CN202010741058.7A)
Authority
CN
China
Prior art keywords
crowd
pedestrian
video frame
real
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010741058.7A
Other languages
Chinese (zh)
Other versions
CN111738225A (en)
Inventor
张力元
胡金晖
孟建
程静
袁明冬
杨逢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Smart City Research Institute Of China Electronics Technology Group Corp
Original Assignee
Smart City Research Institute Of China Electronics Technology Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Smart City Research Institute Of China Electronics Technology Group Corp filed Critical Smart City Research Institute Of China Electronics Technology Group Corp
Priority to CN202010741058.7A priority Critical patent/CN111738225B/en
Publication of CN111738225A publication Critical patent/CN111738225A/en
Application granted granted Critical
Publication of CN111738225B publication Critical patent/CN111738225B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53Recognition of crowd images, e.g. recognition of crowd congestion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103Static body considered as a whole, e.g. static pedestrian or occupant recognition

Abstract

The application provides a crowd gathering detection method, device, equipment and storage medium, relating to the technical field of video surveillance, which can improve the real-time performance of crowd gathering detection without requiring additional computing resources. The method comprises the following steps: acquiring a real-time video stream of a preset area and, for any video frame in the real-time video stream, determining the pedestrians in the video frame; acquiring the position coordinates of each pedestrian in the video frame; performing kernel function mapping on the position coordinates of each pedestrian to obtain a point set corresponding to the position coordinates of each pedestrian and a pixel value for each point in the point set; accumulating the pixel values of overlapping points across the point sets to obtain a crowd density distribution map of the video frame; and determining a crowd gathering area in the video frame according to the crowd density distribution map and a preset crowd gathering threshold value.

Description

Crowd gathering detection method, device, equipment and storage medium
Technical Field
The present application relates to the field of video surveillance technologies, and in particular, to a method, an apparatus, a device, and a storage medium for detecting crowd accumulation.
Background
With the construction of new smart cities, the functions and capabilities of city surveillance systems have become increasingly powerful. How to use a city surveillance system to discover crowd gathering events in time is a technical problem of concern in the construction of smart cities. At present, common crowd gathering detection methods either identify the target pedestrians in a video frame and then calculate the pedestrian density from the distances between the target pedestrians, or obtain the pedestrian density by generating a crowd thermodynamic diagram and then performing integral calculation on it. The method that calculates pedestrian density from the distances between target pedestrians requires multi-target tracking, which consumes huge computing resources in complicated scenes and adapts poorly. The method that integrates over a thermodynamic diagram requires a large amount of labeled training data from the same scene to guarantee the accuracy of the generated thermodynamic diagram, and the whole training-data acquisition process is complex, which is not conducive to real-time analysis of crowd gathering.
Disclosure of Invention
The embodiments of the present application provide a crowd gathering detection method, device, equipment and storage medium. A density distribution map of the target pedestrians is obtained by kernel function mapping, which avoids the complexity of calculating pairwise distances between multiple pedestrian targets in conventional methods, requires no additional computing resources, and improves the real-time performance of crowd gathering detection. In a first aspect, the present application provides a method for crowd gathering detection, comprising:
acquiring a real-time video stream of a preset area, and determining pedestrians in any video frame in the real-time video stream;
acquiring the position coordinates of each pedestrian in the video frame;
performing kernel function mapping on the position coordinate of each pedestrian to obtain a point set corresponding to the position coordinate of each pedestrian and a pixel value of each point in the point set, wherein the pixel value represents the degree of people clustering;
accumulating the pixel values of overlapping points across the point sets to obtain a crowd density distribution map of the video frame;
and determining a crowd gathering area in the video frame according to the crowd density distribution map and a preset crowd gathering threshold value.
In an optional implementation manner, the function corresponding to the kernel function mapping is a radial basis kernel function;
the point set corresponding to the position coordinate of each pedestrian is a circle with the position coordinate as an origin and a preset length as a radius, and the pixel value of each point in the circle decreases progressively from the origin along the outward direction of the radius.
In an optional implementation manner, for any video frame in the real-time video stream, determining a pedestrian in the video frame includes:
aiming at any video frame in the real-time video stream, recognizing the head and shoulders of each pedestrian in the video frame according to a preset head and shoulder detection model; and determining the pedestrian in the video frame according to the head and the shoulder of each pedestrian.
In an optional implementation manner, acquiring the position coordinates of each pedestrian in the video frame includes:
and acquiring a first position coordinate of the head of each pedestrian in the video frame and a second position coordinate of the shoulder of each pedestrian in the video frame.
In an optional implementation manner, performing kernel function mapping on the position coordinate of each pedestrian to obtain a point set corresponding to the position coordinate of each pedestrian and a pixel value of each point in the point set, where the pixel value represents a degree of people clustering, includes:
performing kernel function mapping on the first position coordinate of each pedestrian to obtain a first point set corresponding to the first position coordinate of each pedestrian and a first pixel value of each point in the first point set;
and performing kernel function mapping on the second position coordinate of each pedestrian to obtain a second point set corresponding to the second position coordinate of each pedestrian and a second pixel value of each point in the second point set, wherein the first pixel value and the second pixel value both represent the crowd gathering degree.
In an optional implementation manner, after determining the crowd gathering area in the video frame according to the crowd density distribution map and a preset threshold, the method further includes:
determining the probability of including the crowd gathering area in the real-time video stream according to the number of the crowd gathering areas included in the video frame corresponding to each moment;
if the probability of the real-time video stream including the crowd gathering area is larger than a preset probability threshold, determining that the real-time video stream includes the crowd gathering area;
and if the probability of the real-time video stream including the crowd gathering area is smaller than or equal to a preset probability threshold, determining that the real-time video stream does not include the crowd gathering area.
In an optional implementation manner, after determining the crowd gathering area in the video frame according to the crowd density distribution map and a preset threshold, the method further includes:
calculating a crowd gradient map corresponding to the real-time video stream according to the crowd density distribution map;
and determining the movement direction of people in the real-time video stream according to the people gradient map.
In a second aspect, the present application provides a crowd gathering detection device comprising:
a first determining module, configured to acquire a real-time video stream of a preset area and determine, for any video frame in the real-time video stream, the pedestrians in the video frame;
the acquisition module is used for acquiring the position coordinates of each pedestrian in the video frame;
a first obtaining module, configured to perform kernel function mapping on the position coordinate of each pedestrian to obtain a point set corresponding to the position coordinate of each pedestrian and a pixel value of each point in the point set, where the pixel value represents a degree of people clustering;
a second obtaining module, configured to accumulate the pixel values of overlapping points across the point sets to obtain a crowd density distribution map of the video frame;
and the second determining module is used for determining the crowd gathering area in the video frame according to the crowd density distribution map and a preset crowd gathering threshold value.
In an optional implementation manner, the function corresponding to the kernel function mapping is a radial basis kernel function;
the point set corresponding to the position coordinate of each pedestrian is a circle with the position coordinate as an origin and a preset length as a radius, and the pixel value of each point in the circle decreases progressively from the origin along the outward direction of the radius.
In an optional implementation manner, for any video frame in the real-time video stream, determining a pedestrian in the video frame includes:
aiming at any video frame in the real-time video stream, recognizing the head and shoulders of each pedestrian in the video frame according to a preset head and shoulder detection model; and determining the pedestrian in the video frame according to the head and the shoulder of each pedestrian.
In an optional implementation manner, the obtaining module is specifically configured to:
and acquiring a first position coordinate of the head of each pedestrian in the video frame and a second position coordinate of the shoulder of each pedestrian in the video frame.
In an optional implementation manner, the first obtaining module includes:
a first obtaining unit, configured to perform kernel function mapping on the first position coordinate of each pedestrian to obtain a first point set corresponding to the first position coordinate of each pedestrian and a first pixel value of each point in the first point set;
a second obtaining unit, configured to perform kernel function mapping on the second position coordinate of each pedestrian to obtain a second point set corresponding to the second position coordinate of each pedestrian and a second pixel value of each point in the second point set, where the first pixel value and the second pixel value both represent a crowd aggregation degree.
In an optional implementation manner, the method further includes:
a third determining module, configured to determine, according to the number of crowd aggregation areas included in the video frame corresponding to each time, a probability that the real-time video stream includes the crowd aggregation areas;
a fourth determining module, configured to determine that the real-time video stream includes the crowd aggregation region if the probability that the real-time video stream includes the crowd aggregation region is greater than a preset probability threshold;
a fifth determining module, configured to determine that the real-time video stream does not include the crowd aggregation region if the probability that the real-time video stream includes the crowd aggregation region is smaller than or equal to a preset probability threshold.
In an optional implementation manner, the method further includes:
the calculation module is used for calculating a crowd gradient map corresponding to the real-time video stream according to the crowd density distribution map;
and the sixth determining module is used for determining the movement direction of the crowd in the real-time video stream according to the crowd gradient map.
In a third aspect, the present application provides a crowd detection device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, the processor implementing the method according to the first aspect or any alternative of the first aspect when executing the computer program.
In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements a method according to the first aspect or any of the alternatives of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product which, when run on a crowd gathering detection device, causes the crowd gathering detection device to perform the steps of the crowd gathering detection method according to the first aspect.
With the crowd gathering detection method provided by the present application, the position coordinates of the pedestrians in any video frame of the real-time video stream are acquired and mapped through a kernel function to obtain the point set corresponding to each pedestrian's position coordinates and the crowd gathering degree represented by the pixel value of each point in the point set, from which the crowd density distribution map of the video frame is determined. This avoids the complexity of calculating the distances between multiple pedestrian targets in conventional methods and improves the real-time performance of crowd gathering detection without requiring additional computing resources.
It is understood that the beneficial effects of the second aspect to the fifth aspect can be referred to the related description of the first aspect, and are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of target object recognition provided by an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram of a crowd gathering detection method provided by an embodiment of the present application;
FIG. 3 is a schematic flow chart diagram of a crowd gathering detection method provided by another embodiment of the present application;
FIG. 4 is a schematic diagram of a crowd detection device provided in an embodiment of the present application;
fig. 5 is a schematic diagram of a crowd gathering detection device provided in an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "first," "second," "third," and the like in the description of the present application and in the appended claims are used only to distinguish between descriptions and are not intended to indicate or imply relative importance.
It should also be appreciated that reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise.
Before describing the crowd gathering detection method provided by the present application, the background of crowd gathering detection and the shortcomings of existing crowd gathering detection methods are described by way of example. With the construction of new smart cities, the functions and capabilities of city surveillance systems have become increasingly powerful. How to use a city surveillance system to discover crowd gathering events in time is a technical problem of concern in the construction of smart cities.
Existing crowd gathering detection methods mainly fall into two categories. The first recognizes the positions of people through feature extraction and then calculates density: it must compute the distances between pedestrians and obtain the gathering direction through target tracking, and multi-target tracking in complicated scenes consumes huge computing resources; moreover, the critical density threshold used for the gathering judgment changes with parameters such as the video scene, depth of field and resolution, so it requires targeted manual adjustment and has poor timeliness. The second generates a thermodynamic diagram with a neural network and then performs integral calculation on it: to guarantee the accuracy of the generated thermodynamic diagram, it needs a large amount of labeled training data from the same scene as support, and the whole training-data acquisition process is complex, which is not conducive to analyzing crowd gathering in real time.
Next, the principle of crowd gathering detection adopted by the method proposed in the present application, and the related concepts involved in the detection process, are described by way of example with reference to FIG. 1.
For convenience of description, in the embodiment of the present application, the video frame at any time t in the real-time video stream captured by the crowd gathering detection device is denoted by Ft. As shown in FIG. 1, taking the video frame Ft at any time t as an example, assume that Ft contains at least one pedestrian (only one pedestrian is shown in FIG. 1). The pedestrians in Ft can be recognized by a pre-trained target detection model; for example, the pre-trained target detection model is a head-shoulder detection model, with which the heads and shoulders of all pedestrians in the video frame Ft can be identified, and the positions of the pedestrians are determined according to the heads and shoulders. The position information of each pedestrian is then acquired; for example, as shown in FIG. 1, the position information acquired for the pedestrian is the pixel coordinate of point A, and after kernel function mapping is performed on this pixel coordinate, a pixel value corresponding to the position information is obtained. It can be understood that after kernel function mapping is performed on the acquired position information of multiple pedestrians, a point set of pixel coordinates corresponding to each pedestrian's position information and a pixel value for each point in the point set are obtained. For example, the point set obtained by mapping one pedestrian's position information forms a circle, and the pixel values of points at different radii in the circle differ, e.g. the pixel value of each point decreases outward from the centre of the circle. In the embodiment of the application, the pixel value indicates the degree of crowd gathering. A crowd density distribution map composed of all the target pedestrians is then obtained; a crowd gathering area is determined on the basis of the crowd density distribution map, and crowd gathering and its moving direction are judged according to the gradient maps of adjacent video frames.
It can be seen that, in the embodiment of the present application, the crowd density distribution map is generated by kernel function mapping, which avoids the O(n²) time complexity required by conventional methods to calculate the pairwise distances between n pedestrians; and since the kernel function mapping involves a very small amount of calculation, it brings no overhead of additional computing resources. The method is therefore not affected by the computational burden caused by an increase in n, and the real-time performance of the algorithm is not affected. Meanwhile, since the crowd density distribution map is not generated by deep learning, a large amount of expensive training data and computing resources is not needed.
On the basis of the crowd density distribution map, the invention can further judge the moving direction of the crowd gathering by comparing the gradient maps of adjacent video frames. This avoids the pitfalls of the multi-target tracking used in conventional methods, improving real-time performance and accuracy in complex scenes. It should be noted that multi-target tracking not only consumes huge computing resources and makes real-time performance difficult to guarantee, but its tracking accuracy is also difficult to guarantee in a complex scene.
The crowd gathering detection method provided by the present application is exemplarily described below by specific examples.
Referring to fig. 2, fig. 2 is a schematic flow chart of a crowd gathering detection method according to an embodiment of the present application. The main execution body of the crowd gathering detection method in the embodiment is crowd gathering detection equipment, including but not limited to mobile terminals such as smart phones, tablet computers, wearable equipment, and the like, and also cameras, robots, servers, and the like in various application scenarios. The crowd gathering detection method as shown in fig. 2 may include:
S201, acquiring a real-time video stream of a preset area and determining, for any video frame in the real-time video stream, the pedestrians in the video frame.
In this embodiment, the real-time video stream is the video stream monitored in real time by the monitoring device in the preset area within a preset duration. After the crowd gathering detection device acquires the real-time video stream detected by the monitoring device, it identifies, for any video frame in the real-time video stream, the target pedestrians in the video frame according to a preset target pedestrian identification method.
The preset pedestrian identification method is, for example, a pedestrian identification method based on a machine learning model. In this embodiment, the preset pedestrian identification method identifies the head and shoulders of each pedestrian in the video frame through a preset head-shoulder detection model, and then determines the pedestrians in the video frame according to the head and shoulders of each pedestrian.
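By way of a non-limiting illustration only, the following Python sketch shows how video frames might be pulled from the real-time stream and handed to a head-shoulder detector. The head_shoulder_detector callable and the structure of its output are assumptions made for this sketch; the embodiment does not prescribe a particular model or interface.

```python
# Sketch of step S201 under stated assumptions: pull frames from a real-time
# stream with OpenCV and pass each one to an assumed head-shoulder detector.
import cv2  # OpenCV, used here only for reading the video stream


def iter_frames(stream_url):
    """Yield video frames from a real-time stream (e.g. an RTSP camera URL)."""
    cap = cv2.VideoCapture(stream_url)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yield frame
    finally:
        cap.release()


def detect_pedestrians(frame, head_shoulder_detector):
    """Treat each head/shoulder box pair returned by the (assumed) detector as
    one pedestrian. The detector is a placeholder callable returning, for
    example, a list of dicts like
    {"head": (x1, y1, x2, y2), "shoulder": (x1, y1, x2, y2)}."""
    return head_shoulder_detector(frame)
```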
Illustratively, the position coordinates of each pedestrian are recorded and a set of position coordinates is generated, for example the set Objects = {(xi, yi)}, where (xi, yi) are the position coordinates of the i-th pedestrian, i ∈ {1, 2, 3, …, N}, and N is the number of pedestrian targets.
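Continuing the sketch above, the set Objects = {(xi, yi)} could be built from the detector output as follows; taking the centre of the head box as a pedestrian's position coordinate is an illustrative choice rather than something fixed by the embodiment.

```python
# Sketch of recording the position-coordinate set Objects = {(x_i, y_i)}.
def build_position_set(detections):
    objects = []
    for det in detections:
        x1, y1, x2, y2 = det["head"]               # head box from the detector
        objects.append(((x1 + x2) / 2.0, (y1 + y2) / 2.0))
    return objects                                  # [(x_1, y_1), ..., (x_N, y_N)]
```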
S202, acquiring the position coordinates of each pedestrian in the video frame.
It can be understood that a video frame in the video stream is a two-dimensional visual image, and the pixel coordinates of each pedestrian displayed in this two-dimensional image are obtained and taken as the pedestrian's position coordinates.
In an alternative implementation, the position coordinates of each pedestrian respectively include a first position coordinate corresponding to the head and a second position coordinate corresponding to the shoulder; acquiring the position coordinates of each pedestrian in the video frame, wherein the acquiring comprises the following steps: and acquiring a first position coordinate of the head of each pedestrian in the video frame and a second position coordinate of the shoulder of each pedestrian in the video frame.
S203, performing kernel function mapping on the position coordinates of each pedestrian to obtain a point set corresponding to the position coordinates of each pedestrian and a pixel value of each point in the point set, wherein the pixel value represents the degree of people clustering.
In an embodiment of the present application, a radial basis kernel function is used to perform kernel function mapping on the position coordinates of each pedestrian, the point set corresponding to the position coordinates of each pedestrian is a circle using the position coordinates as an origin and using a preset length as a radius, and a pixel value of each point in the circle decreases progressively from the origin along an outward direction of the radius. The radius of the circle can be adaptively set to be a preset multiple, for example, 4 to 6 times, of the length of the detection frame corresponding to the target pedestrian.
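A minimal sketch of the kernel function mapping under stated assumptions is given below: a Gaussian (radial basis) kernel peaking at the position coordinate, with the peak value of 255 and a sigma derived from the radius chosen only to be consistent with the [0, 255] pixel range and the decreasing-outward behaviour described here.

```python
# Sketch of the radial-basis kernel mapping: a circular patch whose values
# decrease from the centre outwards and vanish outside the given radius.
import numpy as np


def rbf_kernel_patch(radius, peak=255.0):
    """Return a (2*radius+1, 2*radius+1) patch for one position coordinate."""
    radius = int(radius)
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    dist2 = x ** 2 + y ** 2
    sigma = radius / 3.0                      # value is close to zero at the rim
    patch = peak * np.exp(-dist2 / (2.0 * sigma ** 2))
    patch[dist2 > radius ** 2] = 0.0          # restrict the point set to a circle
    return patch
```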
It can be understood that a larger pixel value corresponds to a higher crowd density.
In an optional implementation manner, performing kernel function mapping on the position coordinate of each pedestrian to obtain a point set corresponding to the position coordinate of each pedestrian and a pixel value of each point in the point set, includes: performing kernel function mapping on the first position coordinate of each pedestrian to obtain a first point set corresponding to the first position coordinate of each pedestrian and a first pixel value of each point in the first point set; performing kernel function mapping on the second position coordinate of each pedestrian to obtain a second point set corresponding to the second position coordinate of each pedestrian and a second pixel value of each point in the second point set; wherein the first pixel value and the second pixel value both characterize a degree of crowd gathering.
And S204, accumulating the pixel values of overlapping points across the point sets to obtain the crowd density distribution map of the video frame.
Illustratively, the formula for accumulating the pixel values of overlapping points across the point sets is:

Density(x, y) = Σ_{i=1}^{N} P_i(x, y)

where N is the number of elements of the set Objects and P_i(x, y) is the pixel value contributed at pixel (x, y) by the kernel function mapping of the i-th position coordinate.
In this embodiment, the pixel values of the calculated crowd density distribution map lie in the range [0, 255]; if the accumulated sum exceeds 255, the value is set to 255. It can be understood that a higher pixel value of a point means a higher degree of crowd gathering at that point.
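Building on the kernel sketch above (rbf_kernel_patch is the helper shown earlier), the accumulation of step S204 could look as follows; the capping at 255 matches the description above, while the image-border handling is an implementation detail assumed here.

```python
# Sketch of step S204: pixel values of overlapping points are accumulated per
# pixel, matching Density = sum_i P_i, then capped at 255.
import numpy as np


def density_map(frame_shape, objects, radius):
    h, w = frame_shape[:2]
    density = np.zeros((h, w), dtype=np.float32)
    patch = rbf_kernel_patch(radius)               # helper from the sketch above
    for (cx, cy) in objects:
        cx, cy = int(round(cx)), int(round(cy))
        x0, x1 = max(cx - radius, 0), min(cx + radius + 1, w)
        y0, y1 = max(cy - radius, 0), min(cy + radius + 1, h)
        if x0 >= x1 or y0 >= y1:
            continue                               # coordinate lies outside the frame
        # crop the patch so it stays inside the image near the borders
        px0, py0 = x0 - (cx - radius), y0 - (cy - radius)
        density[y0:y1, x0:x1] += patch[py0:py0 + (y1 - y0), px0:px0 + (x1 - x0)]
    return np.clip(density, 0, 255).astype(np.uint8)
```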
Optionally, in order to make it more convenient for a user to judge the position and degree of gathering, a more intuitive thermodynamic diagram can be generated, for example by mapping each point of the crowd density distribution map into the color space of a thermodynamic diagram, so that the position and degree of gathering can be observed visually.
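For this optional visualisation, a density map in the [0, 255] range can be mapped into a thermodynamic-diagram colour space with a standard colormap; OpenCV's COLORMAP_JET is used below purely as one convenient choice.

```python
# Sketch of the optional heat-map visualisation of the crowd density map.
import cv2


def density_to_heatmap(density_u8):
    # To overlay on the original frame afterwards, something like
    # cv2.addWeighted(frame, 0.6, heat, 0.4, 0) could be used.
    heat = cv2.applyColorMap(density_u8, cv2.COLORMAP_JET)   # BGR heat map
    return heat
```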
S205, determining a crowd gathering area in the video frame according to the crowd density distribution map and a preset crowd gathering threshold value.
In the embodiment of the present application, a crowd gathering threshold is preset; the points of the crowd density distribution map whose values are greater than the preset crowd gathering threshold are selected, and each connected region formed by such points is taken as one crowd gathering area. The crowd gathering areas are denoted {Gatheri(x, y)}, where i ∈ {1, 2, …, M} and M is the number of crowd gathering areas. If the set is empty or M = 0, the video frame contains no crowd gathering area.
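A sketch of step S205 under these assumptions: points whose density value exceeds the preset crowd gathering threshold are kept, and each connected region of such points is reported as one gathering area Gatheri. Using cv2.connectedComponentsWithStats to extract the connected regions is an implementation choice, not part of the claimed method.

```python
# Sketch of step S205: threshold the density map and report connected regions.
import cv2
import numpy as np


def gathering_regions(density_u8, gather_threshold):
    mask = (density_u8 > gather_threshold).astype(np.uint8)
    num, labels, stats, centroids = cv2.connectedComponentsWithStats(mask, connectivity=8)
    regions = []
    for i in range(1, num):                        # label 0 is the background
        x, y, w, h, area = stats[i]
        regions.append({"bbox": (int(x), int(y), int(w), int(h)),
                        "area": int(area),
                        "centroid": tuple(map(float, centroids[i]))})
    return regions                                 # empty list means M = 0: no gathering area
```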
Based on the above analysis, with the crowd gathering detection method provided by the present application, the position coordinates of the pedestrians in any video frame of the real-time video stream are acquired and mapped through a kernel function to obtain the point set corresponding to each pedestrian's position coordinates and the crowd gathering degree represented by the pixel value of each point in the point set, from which the crowd density distribution map of the video frame is determined. This avoids the complexity of calculating the distances between multiple pedestrian targets in conventional methods and improves the real-time performance of crowd gathering detection without requiring additional computing resources.
Fig. 3 is a schematic flow chart of a crowd detection method according to another embodiment of the present application. Compared with the embodiment shown in fig. 2, the specific implementation processes of S301 to S305 are the same as those of S201 to S205, except that S306 to S308 are further included after S305. The details are as follows:
S301, acquiring a real-time video stream of a preset area and determining, for any video frame in the real-time video stream, the pedestrians in the video frame.
S302, acquiring the position coordinates of each pedestrian in the video frame.
And S303, performing kernel function mapping on the position coordinate of each pedestrian to obtain a point set corresponding to the position coordinate of each pedestrian and a pixel value of each point in the point set, wherein the pixel value represents the degree of people clustering.
And S304, accumulating the pixel values of overlapping points across the point sets to obtain the crowd density distribution map of the video frame.
S305, determining a crowd gathering area in the video frame according to the crowd density distribution map and a preset crowd gathering threshold value.
S306, determining the probability of the crowd gathering areas in the real-time video stream according to the number of the crowd gathering areas included in the video frame corresponding to each moment.
S307, if the probability that the real-time video stream comprises the crowd gathering area is larger than a preset probability threshold, determining that the real-time video stream comprises the crowd gathering area.
S308, if the probability that the real-time video stream comprises the crowd gathering area is smaller than or equal to a preset probability threshold, determining that the real-time video stream does not comprise the crowd gathering area.
That is, in this embodiment, on the basis of the embodiment shown in FIG. 2, it is further verified whether the real-time video stream contains crowd gathering according to whether the probability of the crowd gathering area appearing in the real-time video stream is greater than a preset probability threshold. Exemplarily, among N consecutive video frames, the number of frames containing a crowd gathering area is counted; if this number is n and the ratio of n to N is greater than the preset probability threshold, it is determined that the current real-time video stream includes a crowd gathering area.
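The check of steps S306 to S308 then reduces to a ratio test over a window of consecutive frames, as in the sketch below; the window length N and the probability threshold are tunable, and the default value shown is only a placeholder.

```python
# Sketch of steps S306-S308: among N consecutive frames, n frames contain a
# gathering area; the stream is judged to contain crowd gathering if n/N
# exceeds the preset probability threshold.
def stream_has_gathering(per_frame_region_counts, prob_threshold=0.8):
    N = len(per_frame_region_counts)       # number of consecutive frames inspected
    n = sum(1 for count in per_frame_region_counts if count > 0)
    return N > 0 and (n / N) > prob_threshold
```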
Further, after the current real-time video stream is judged to include the crowd gathering area, the crowd moving direction can be determined, and specifically, a crowd gradient map corresponding to the real-time video stream is calculated according to the crowd density distribution map; and then determining the movement direction of people in the real-time video stream according to the people gradient map.
Illustratively, the maximum value of each crowd gathering area in the crowd gradient map is taken, the positions of this maximum for the same gathering area in two adjacent video frames are compared, and the direction in which the position changes is taken as the moving direction of the crowd in that gathering area.
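A sketch of this movement-direction estimate follows, assuming the crowd gradient map is obtained with a Sobel operator and that the same gathering area has already been matched between the two adjacent frames (for example by nearest centroid); both points go beyond what the embodiment specifies and are for illustration only.

```python
# Sketch of the crowd-movement estimate: compare the gradient-map peak of the
# same gathering area in two adjacent frames; the displacement of the peak is
# read as the moving direction of that crowd.
import cv2
import numpy as np


def gradient_map(density_u8):
    gx = cv2.Sobel(density_u8, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(density_u8, cv2.CV_32F, 0, 1, ksize=3)
    return cv2.magnitude(gx, gy)


def movement_direction(grad_prev, grad_curr, region_prev, region_curr):
    """Return (dx, dy): the displacement of the gradient peak of one region."""
    def peak(grad, region):
        x, y, w, h = region["bbox"]
        roi = grad[y:y + h, x:x + w]
        dy_, dx_ = np.unravel_index(np.argmax(roi), roi.shape)
        return np.array([x + dx_, y + dy_], dtype=np.float32)
    return peak(grad_curr, region_curr) - peak(grad_prev, region_prev)
```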
From the above analysis, the method of generating the crowd density distribution map through kernel function mapping in this embodiment needs only O(n) time complexity, avoiding the O(n²) operation required by conventional methods to calculate the pairwise distances between n pedestrian targets. Moreover, the calculation amount of the kernel function mapping is very small and brings no overhead of additional computing resources, so the method is not affected by the computational load caused by an increase in n, and the real-time performance of the algorithm is not affected. Meanwhile, since the crowd density distribution map is not generated by deep learning, a large amount of expensive training data and computing resources is not needed.
Further, a crowd thermodynamic diagram can be generated very conveniently on the basis of the crowd density distribution map, providing a better visual display. Conventional methods of generating a thermodynamic diagram require a large amount of integral calculation or modeling, whereas this method only needs to map the density distribution map into the thermodynamic diagram color space, which is a clear advantage.
Furthermore, the invention judges the moving direction of the crowd gathering by comparing the gradient maps of adjacent video frames, avoiding the multi-target tracking used in conventional methods. Multi-target tracking not only consumes huge computing resources and makes real-time performance difficult to guarantee, but its tracking accuracy is also difficult to guarantee in a complex scene.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Based on the crowd gathering detection method provided by the embodiment, the embodiment of the invention further provides an embodiment of a device for realizing the embodiment of the method.
Referring to fig. 4, fig. 4 is a schematic diagram of a crowd gathering detection apparatus according to an embodiment of the present application. The apparatus includes units for performing the steps in the embodiment corresponding to fig. 2; please refer to the related description of that embodiment. For convenience of explanation, only the portions related to the present embodiment are shown. Referring to fig. 4, the crowd gathering detection apparatus 400 includes:
a first determining module 401, configured to obtain a real-time video stream of a preset area, and determine, for any video frame in the real-time video stream, a pedestrian in the video frame;
an obtaining module 402, configured to obtain a position coordinate of each pedestrian in the video frame;
a first obtaining module 403, configured to perform kernel function mapping on the position coordinate of each pedestrian to obtain a point set corresponding to the position coordinate of each pedestrian and a pixel value of each point in the point set, where the pixel value represents a degree of people clustering;
a second obtaining module 404, configured to accumulate the pixel values of overlapping points across the point sets to obtain a crowd density distribution map of the video frame;
a second determining module 405, configured to determine a crowd region in the video frame according to the crowd density distribution map and a preset crowd threshold.
In an optional implementation manner, the function corresponding to the kernel function mapping is a radial basis kernel function;
the point set corresponding to the position coordinate of each pedestrian is a circle with the position coordinate as an origin and a preset length as a radius, and the pixel value of each point in the circle decreases progressively from the origin along the outward direction of the radius.
In an optional implementation manner, for any video frame in the real-time video stream, determining a pedestrian in the video frame includes:
aiming at any video frame in the real-time video stream, recognizing the head and shoulders of each pedestrian in the video frame according to a preset head and shoulder detection model; and determining the pedestrian in the video frame according to the head and the shoulder of each pedestrian.
In an optional implementation manner, the obtaining module 402 is specifically configured to:
and acquiring a first position coordinate of the head of each pedestrian in the video frame and a second position coordinate of the shoulder of each pedestrian in the video frame.
In an optional implementation manner, the first obtaining module 403 includes:
a first obtaining unit, configured to perform kernel function mapping on the first position coordinate of each pedestrian to obtain a first point set corresponding to the first position coordinate of each pedestrian and a first pixel value of each point in the first point set;
a second obtaining unit, configured to perform kernel function mapping on the second position coordinate of each pedestrian to obtain a second point set corresponding to the second position coordinate of each pedestrian and a second pixel value of each point in the second point set, where the first pixel value and the second pixel value both represent a crowd aggregation degree.
In an optional implementation manner, the method further includes:
a third determining module, configured to determine, according to the number of crowd aggregation areas included in the video frame corresponding to each time, a probability that the real-time video stream includes the crowd aggregation areas;
a fourth determining module, configured to determine that the real-time video stream includes the crowd aggregation region if the probability that the real-time video stream includes the crowd aggregation region is greater than a preset probability threshold;
a fifth determining module, configured to determine that the real-time video stream does not include the crowd aggregation region if the probability that the real-time video stream includes the crowd aggregation region is smaller than or equal to a preset probability threshold.
In an optional implementation manner, the method further includes:
the calculation module is used for calculating a crowd gradient map corresponding to the real-time video stream according to the crowd density distribution map;
and the sixth determining module is used for determining the movement direction of the crowd in the real-time video stream according to the crowd gradient map.
It should be noted that, because the contents of information interaction, execution process, and the like between the modules are based on the same concept as that of the embodiment of the method of the present application, specific functions and technical effects thereof may be specifically referred to a part of the embodiment of the method, and details are not described here.
Fig. 5 is a schematic diagram of a crowd gathering detection device provided in an embodiment of the present application. As shown in fig. 5, the crowd gathering detection device 5 of this embodiment includes: a processor 500, a memory 501, and a computer program 502, such as a crowd gathering detection program, stored in the memory 501 and executable on the processor 500. When executing the computer program 502, the processor 500 implements the steps of the above embodiments of the crowd gathering detection method, such as steps S201 to S205 shown in fig. 2. Alternatively, when executing the computer program 502, the processor 500 implements the functions of the modules/units in the above device embodiments, such as the functions of the modules 401 to 405 shown in fig. 4.
Illustratively, the computer program 502 may be partitioned into one or more modules/units that are stored in the memory 501 and executed by the processor 500 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 502 in the crowd detection device 5. For example, the computer program 502 may be divided into a first determining module, an obtaining module, a first obtaining module, and a second obtaining module, and specific functions of each module are described in the embodiment corresponding to fig. 4, which is not described herein again.
The crowd gathering detection device may include, but is not limited to, a processor 500 and a memory 501. It will be appreciated by those skilled in the art that fig. 5 is only an example of the crowd gathering detection device 5 and does not constitute a limitation of the crowd gathering detection device 5, which may comprise more or fewer components than shown, or combine some components, or use different components; for example, the crowd gathering detection device may also comprise input and output devices, a network access device, a bus, and the like.
The Processor 500 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 501 may be an internal storage unit of the crowd detection device 5, such as a hard disk or a memory of the crowd detection device 5. The memory 501 may also be an external storage device of the crowd detection device 5, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like, provided on the crowd detection device 5. Further, the memory 501 may also comprise both an internal storage unit and an external storage device of the crowd detection device 5. The memory 501 is used to store the computer program and other programs and data required by the crowd detection device. The memory 501 may also be used to temporarily store data that has been output or is to be output.
An embodiment of the present application further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the method for detecting crowd aggregation may be implemented.
An embodiment of the present application provides a computer program product which, when run on a crowd gathering detection device, enables the crowd gathering detection device to implement the crowd gathering detection method described above.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (9)

1. A method of crowd gathering detection, comprising:
acquiring a real-time video stream of a preset area, and determining pedestrians in any video frame in the real-time video stream;
acquiring the position coordinates of each pedestrian in the video frame;
performing kernel function mapping on the position coordinate of each pedestrian to obtain a point set corresponding to the position coordinate of each pedestrian and a pixel value of each point in the point set, wherein the pixel value represents the degree of people clustering; the function corresponding to the kernel function mapping is a radial basis kernel function;
the point set corresponding to the position coordinate of each pedestrian is a circle with the position coordinate as an origin and a preset length as a radius, and the pixel value of each point in the circle decreases progressively from the origin along the outward direction of the radius;
accumulating the pixel values corresponding to the point which is overlapped in each point set to obtain a crowd density distribution map of the video frame;
and determining a crowd gathering area in the video frame according to the crowd density distribution map and a preset crowd gathering threshold value.
2. The method of claim 1, wherein determining, for any video frame in the real-time video stream, a pedestrian in the video frame comprises:
aiming at any video frame in the real-time video stream, recognizing the head and shoulders of each pedestrian in the video frame according to a preset head and shoulder detection model; and determining the pedestrian in the video frame according to the head and the shoulder of each pedestrian.
3. The method of claim 2, wherein obtaining the position coordinates of each of the pedestrians in the video frame comprises:
and acquiring a first position coordinate of the head of each pedestrian in the video frame and a second position coordinate of the shoulder of each pedestrian in the video frame.
4. The method of claim 3, wherein performing a kernel function mapping on the position coordinates of each of the pedestrians to obtain a point set corresponding to the position coordinates of each of the pedestrians and a pixel value of each point in the point set, wherein the pixel value is indicative of a degree of people clustering, and the method comprises:
performing kernel function mapping on the first position coordinate of each pedestrian to obtain a first point set corresponding to the first position coordinate of each pedestrian and a first pixel value of each point in the first point set;
and performing kernel function mapping on the second position coordinate of each pedestrian to obtain a second point set corresponding to the second position coordinate of each pedestrian and a second pixel value of each point in the second point set, wherein the first pixel value and the second pixel value both represent the crowd gathering degree.
5. The method of claim 4, further comprising, after determining the crowd-sourcing region in the video frame based on the crowd-density profile and a predetermined threshold:
determining the probability of including the crowd gathering area in the real-time video stream according to the number of the crowd gathering areas included in the video frame corresponding to each moment;
if the probability of the real-time video stream including the crowd gathering area is larger than a preset probability threshold, determining that the real-time video stream includes the crowd gathering area;
and if the probability of the real-time video stream including the crowd gathering area is smaller than or equal to a preset probability threshold, determining that the real-time video stream does not include the crowd gathering area.
6. The method of claim 5, further comprising, after determining the crowd-sourcing region in the video frame based on the crowd-density profile and a predetermined threshold:
calculating a crowd gradient map corresponding to the real-time video stream according to the crowd density distribution map;
and determining the movement direction of people in the real-time video stream according to the people gradient map.
7. A crowd gathering detection device, comprising:
a first determining module, configured to acquire a real-time video stream of a preset area and determine, for any video frame in the real-time video stream, the pedestrians in the video frame;
the acquisition module is used for acquiring the position coordinates of each pedestrian in the video frame;
a first obtaining module, configured to perform kernel function mapping on the position coordinate of each pedestrian to obtain a point set corresponding to the position coordinate of each pedestrian and a pixel value of each point in the point set, where the pixel value represents a degree of people clustering; the function corresponding to the kernel function mapping is a radial basis kernel function;
the point set corresponding to the position coordinate of each pedestrian is a circle with the position coordinate as an origin and a preset length as a radius, and the pixel value of each point in the circle decreases progressively from the origin along the outward direction of the radius;
a second obtaining module, configured to accumulate the pixel values corresponding to the point in each of the point sets, so as to obtain a crowd density distribution map of the video frame;
and the second determining module is used for determining the crowd gathering area in the video frame according to the crowd density distribution map and a preset crowd gathering threshold value.
8. A crowd detection device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, the computer program when executed by the processor implementing the crowd detection method according to any one of claims 1 to 6.
9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method of people group detection according to any one of claims 1 to 6.
CN202010741058.7A 2020-07-29 2020-07-29 Crowd gathering detection method, device, equipment and storage medium Active CN111738225B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010741058.7A CN111738225B (en) 2020-07-29 2020-07-29 Crowd gathering detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010741058.7A CN111738225B (en) 2020-07-29 2020-07-29 Crowd gathering detection method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111738225A CN111738225A (en) 2020-10-02
CN111738225B true CN111738225B (en) 2020-12-11

Family

ID=72656328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010741058.7A Active CN111738225B (en) 2020-07-29 2020-07-29 Crowd gathering detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111738225B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767629B (en) * 2020-12-24 2021-08-31 中标慧安信息技术股份有限公司 Indoor security method and system based on holder monitoring
CN113128430A (en) * 2021-04-25 2021-07-16 科大讯飞股份有限公司 Crowd gathering detection method and device, electronic equipment and storage medium
CN115810178B (en) * 2023-02-03 2023-04-28 中电信数字城市科技有限公司 Crowd abnormal aggregation early warning method and device, electronic equipment and medium
CN117156259B (en) * 2023-10-30 2024-03-22 海信集团控股股份有限公司 Video stream acquisition method and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105447458A (en) * 2015-11-17 2016-03-30 深圳市商汤科技有限公司 Large scale crowd video analysis system and method thereof
CN107330364A (en) * 2017-05-27 2017-11-07 上海交通大学 A kind of people counting method and system based on cGAN networks
CN107944327A (en) * 2016-10-10 2018-04-20 杭州海康威视数字技术股份有限公司 A kind of demographic method and device
CN110555397A (en) * 2019-08-21 2019-12-10 武汉大千信息技术有限公司 crowd situation analysis method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10733756B2 (en) * 2017-08-31 2020-08-04 Nec Corporation Online flow guided memory networks for object detection in video
CN111274864A (en) * 2019-12-06 2020-06-12 长沙千视通智能科技有限公司 Method and device for judging crowd aggregation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105447458A (en) * 2015-11-17 2016-03-30 深圳市商汤科技有限公司 Large scale crowd video analysis system and method thereof
CN107944327A (en) * 2016-10-10 2018-04-20 杭州海康威视数字技术股份有限公司 A kind of demographic method and device
CN107330364A (en) * 2017-05-27 2017-11-07 上海交通大学 A kind of people counting method and system based on cGAN networks
CN110555397A (en) * 2019-08-21 2019-12-10 武汉大千信息技术有限公司 crowd situation analysis method

Also Published As

Publication number Publication date
CN111738225A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
CN111738225B (en) Crowd gathering detection method, device, equipment and storage medium
CN110427905B (en) Pedestrian tracking method, device and terminal
CN110400332B (en) Target detection tracking method and device and computer equipment
CN111857356B (en) Method, device, equipment and storage medium for recognizing interaction gesture
CN108182396B (en) Method and device for automatically identifying photographing behavior
US8792722B2 (en) Hand gesture detection
US8750573B2 (en) Hand gesture detection
CN108875540B (en) Image processing method, device and system and storage medium
JP2017191501A (en) Information processing apparatus, information processing method, and program
US9280703B2 (en) Apparatus and method for tracking hand
CN110807410B (en) Key point positioning method and device, electronic equipment and storage medium
CN111047626A (en) Target tracking method and device, electronic equipment and storage medium
Yoshinaga et al. Object detection based on spatiotemporal background models
CN109446364A (en) Capture search method, image processing method, device, equipment and storage medium
CN111382637A (en) Pedestrian detection tracking method, device, terminal equipment and medium
CN112287802A (en) Face image detection method, system, storage medium and equipment
WO2023284358A1 (en) Camera calibration method and apparatus, electronic device, and storage medium
CN113989858B (en) Work clothes identification method and system
CN111382606A (en) Tumble detection method, tumble detection device and electronic equipment
WO2024022301A1 (en) Visual angle path acquisition method and apparatus, and electronic device and medium
TWI732374B (en) Method and apparatus for object recognition
CN111915713A (en) Three-dimensional dynamic scene creating method, computer equipment and storage medium
CN113762027B (en) Abnormal behavior identification method, device, equipment and storage medium
CN114399729A (en) Monitoring object movement identification method, system, terminal and storage medium
CN109493349B (en) Image feature processing module, augmented reality equipment and corner detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant