CN111626134A

CN111626134A - Dense crowd counting method, system and terminal based on hidden density distribution

Info

Publication number: CN111626134A
Application number: CN202010349623.5A
Authority: CN
Inventors: 杨华; 高宇康
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2020-04-28
Filing date: 2020-04-28
Publication date: 2020-09-04
Anticipated expiration: 2040-04-28
Also published as: CN111626134B

Abstract

The invention discloses a dense crowd counting method, system and terminal based on hidden density distribution. The method comprises: obtaining an adaptive hidden Gaussian density map from a crowd point map through a Gaussian network; guiding the optimization of the hidden Gaussian density map with a counting loss term, a smoothing term and a Bayesian term so that the generated map is of higher quality; taking the hidden Gaussian density map as the training target and, combining an adversarial loss function and a Bayesian loss function, outputting a predicted density distribution map from the dense crowd image; and summing the predicted density distribution map to obtain the predicted number of people. The density predictor, the hidden Gaussian density generator and the discriminator are trained alternately and optimized cooperatively. The invention improves counting precision and has good robustness, and it has strong application value because the number of parameters and the amount of computation in the inference stage are not increased.

Description

Dense crowd counting method, system and terminal based on hidden density distribution

Technical Field

The invention relates to the technical field of computer vision, in particular to a dense crowd counting method, a dense crowd counting system and a dense crowd counting terminal based on hidden density distribution.

Background

With the rapid growth of the world population and the acceleration of urbanization, how to count high-density crowds accurately, so as to give timely early warning and effectively control and divert the flow of people, has become a very important problem. Most existing methods extract image features with a multi-layer convolutional neural network and regress the count.

However, existing methods often generate density distribution maps of low quality, predict high-density regions inaccurately, carry high parameter redundancy and a large amount of computation, and generalize poorly because hyper-parameters must be adjusted manually for each scene. In practical applications, a model is often required to save storage and computing resources while maintaining considerable prediction precision, and the generated density distribution map must be robust across different scenes.

A search found that Chinese patent application No. 201810986919.0 discloses a dense crowd counting method and apparatus, which acquires an image to be detected that contains human figures, inputs the image into a convolutional neural network model to obtain a crowd density map of the image, and determines the number of human figures in the image according to the crowd density map. That process fully extracts the feature information in the image to be detected, achieves effective crowd counting and density estimation, and greatly facilitates subsequent applications such as safety monitoring and crowd management and control. However, the counting precision of that patent is low, it has difficulty with the challenges posed by cross-scene, cross-scale and cross-density-level conditions, and its hyper-parameters must be adjusted manually for each scene, which limits its applicability in actual scenes.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing a dense crowd counting method, system and terminal based on hidden density distribution, which improve performance and adaptively solve crowd counting in various scenes.

According to a first aspect of the present invention, there is provided a dense population counting method based on implicit density distribution, comprising:

acquiring a dense crowd image I_c(x, y) and dense crowd coordinate data, and converting the coordinate data into a dense crowd point map D_t(x, y);

passing the dense crowd point map D_t(x, y) through a hidden Gaussian density generator to obtain an adaptive hidden Gaussian density map G(x, y);

taking the hidden Gaussian density map G(x, y) as the learning target of a density predictor, and constraining the generation target with a multi-level loss function;

inputting the dense crowd image I_c(x, y) into the density predictor and outputting a density prediction map D_p(x, y);

summing all pixel values of the density prediction map D_p(x, y) to obtain the final predicted number of people.

Optionally, passing the dense crowd point map D_t(x, y) through the hidden Gaussian density generator to obtain the adaptive hidden Gaussian density map G(x, y) comprises:

the hidden Gaussian density generator adopts a Gaussian network, in which the dense crowd point map D_t(x, y) is convolved with N Gaussian kernels K having different variance σ values to obtain first feature maps carrying information at different scales, and the same convolution operation is applied to the first feature maps to obtain second feature maps;

the second feature maps are processed by a plurality of mask Gaussian convolution modules which, with parameters initialized under a Gaussian envelope constraint, extract and decompose the second feature maps into features of different levels; the features of different levels from each pair of adjacent mask Gaussian convolution modules are added in sequence by a residual operation to obtain more robust features;

the more robust features pass through a decoding module formed by multiple convolution layers, in which the number of output channels of each convolution layer is smaller than its number of input channels, finally yielding the hidden Gaussian density map D_s(x, y).

Optionally, constraining the generation target with a multi-level loss function comprises:

adopting a mean square error loss function to constrain, pixel by pixel, the density prediction map D_p(x, y) output by the density predictor to have a distribution similar to that of the hidden Gaussian density map D_s(x, y);

adopting a Bayesian loss function to constrain, over the range of each pedestrian point, the density prediction map D_p(x, y) output by the density predictor to stay close to the probability distribution of the manually annotated pedestrian coordinate positions;

adopting an adversarial loss function, with a discriminator judging the authenticity of the predicted density prediction map D_p(x, y), so that the generated density prediction map D_p(x, y) retains more high-frequency information, that is, the prediction accuracy in dense regions is improved.

Optionally, inputting the dense crowd image I_c(x, y) into the density predictor and outputting the density prediction map D_p(x, y) comprises:

taking a pre-trained VggNet as the feature extraction network, inputting the dense crowd image I_c(x, y) into the VggNet network to obtain a feature map, up-sampling the feature map, and passing it through multiple convolution layers to obtain the output density prediction map D_p(x, y).

Optionally, the discriminator parameters are updated with an LSGAN loss function according to the density prediction map D_p(x, y) output by the density predictor and the hidden Gaussian density map G(x, y).

Optionally, the method further comprises optimizing the hidden Gaussian density map G(x, y), the optimization being guided by a counting loss term alone or a Bayesian term alone, or by a smoothing term together with a counting loss term, or by a smoothing term together with a Bayesian term, so as to generate a higher-quality hidden Gaussian density map; wherein:

the counting loss term uses an L1 distance to constrain the total count of the hidden Gaussian density map G(x, y) to be close to the annotated total number of people;

the smoothing term uses a smoothing loss function to constrain each pixel of the hidden Gaussian density map G(x, y) to be coherent with its surrounding pixels;

the Bayesian term applies Gaussian modeling to the foreground points and the background, and constrains the probability distribution of the hidden Gaussian density map G(x, y) to be consistent with that of the manually annotated points in the training data, so as to reduce the interference of background noise regions with the target crowd regions;

and the higher-quality hidden Gaussian density map is taken as the learning target of the density predictor.

Optionally, the method further comprises: updating the parameters of the hidden Gaussian density generator according to the smoothing term, the Bayesian term and the counting error term of the target loss function.

Optionally, the density map predictor is updated according to the mean square error term computed against the output of the hidden Gaussian density generator, the adversarial loss term fed back by the discriminator, and the Bayesian loss term.

According to a second aspect of the present invention, there is provided a dense population counting system based on hidden density distribution, comprising:

a dense crowd point map acquisition module, configured to acquire a dense crowd image I_c(x, y) and dense crowd coordinate data, and to convert the coordinate data into a dense crowd point map D_t(x, y);

a hidden Gaussian density generator, configured to obtain an adaptive hidden Gaussian density map G(x, y) from the dense crowd point map D_t(x, y);

a density map predictor, which takes the hidden Gaussian density map G(x, y) as its learning target, with a multi-level loss function constraining the generation target, and which is configured to output a density prediction map D_p(x, y) from the dense crowd image I_c(x, y);

a people number prediction module, configured to sum all pixel values of the density prediction map D_p(x, y) to obtain the final predicted number of people.

According to a third aspect of the present invention, there is provided an electronic terminal, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the above dense crowd counting method based on implicit density distribution when executing the computer program.

Compared with the prior art, the invention has at least one of the following beneficial effects:

In the method, system and terminal of the invention, the adaptive hidden Gaussian density map is obtained from the crowd point map through the Gaussian network; it is of higher quality and more conducive to the learning and optimization of the predictor network. Meanwhile, the multi-level loss function optimizes the learning target and improves the precision of the method without increasing the model parameters or the amount of computation.

The method, the system and the terminal extract more robust characteristics based on the pre-trained VggNet network, improve the precision and have good robustness.

The method, system and terminal of the invention guide the optimization of the hidden Gaussian density map according to the counting loss term, the smoothing term and the Bayesian term, which constrain the optimization process and constrain the learning target at multiple scales, such as pixel by pixel, pedestrian point by pedestrian point, and image patch by image patch, so that the generated map is of higher quality.

According to the method, the system and the terminal, the density predictor, the hidden Gaussian density generator and the discriminator are alternately trained and cooperatively optimized to form a cooperative learning framework, so that the precision can be further improved, and the method has a strong application value because the parameters and the operation amount of an inference stage are not increased.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a flow chart of a dense population counting method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a dense population counting method in a preferred embodiment of the present invention;

FIG. 3 is a schematic diagram illustrating the effect of the dense crowd counting method according to an embodiment of the present invention.

Detailed Description

The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention, all of which fall within the scope of protection of the present invention.

Fig. 1 is a schematic diagram illustrating a dense population counting method based on implicit density distribution according to an embodiment of the present invention. As shown in fig. 1, the dense population counting method based on the implicit density distribution includes:

S100, acquiring a dense crowd image I_c(x, y) and dense crowd coordinate data, and converting the coordinate data into a dense crowd point map D_t(x, y);

S200, passing the dense crowd point map D_t(x, y) through a hidden Gaussian density generator to obtain an adaptive hidden Gaussian density map G(x, y);

S300, taking the hidden Gaussian density map G(x, y) as the learning target of a density predictor, and constraining the generation target with a multi-level loss function;

S400, inputting the dense crowd image I_c(x, y) into the density predictor and outputting a density prediction map D_p(x, y);

S500, summing all pixel values of the density prediction map D_p(x, y) to obtain the final predicted number of people.

According to the embodiment of the invention, the adaptive hidden Gaussian density map is obtained through the Gaussian network according to the crowd point map, so that the quality is higher, and the network learning and optimization of a predictor are facilitated. Meanwhile, by adopting a multi-level loss function, the learning target is optimized and the precision of the method is improved under the condition that model parameters and calculated amount are not increased.

FIG. 2 is a schematic diagram of a dense crowd counting method in a preferred embodiment of the present invention. In the figure, the hidden Gaussian density generation network serves as the hidden Gaussian density generator, the density prediction network serves as the density map predictor, and the discrimination network serves as the discriminator. The crowd annotation information is converted into a point map, an initial density map is obtained by Gaussian convolution, and this density map is concatenated with the dense crowd coordinate data (coordinate map) and input into the Gaussian convolution modules to obtain the hidden Gaussian density map, which serves as the learning target of the density predictor. The density predictor consists of a VggNet that extracts high-dimensional features and an up-sampling decoding layer that recovers the details of the generated density map. The discriminator constrains the output of the density predictor so that more high-frequency features are retained.

In another embodiment, the present invention further provides a dense crowd counting system based on hidden density distribution, which can be used to implement the above method. The system comprises a dense crowd point map acquisition module, a hidden Gaussian density generator, a density map predictor and a people number prediction module, wherein: the dense crowd point map acquisition module acquires a dense crowd image I_c(x, y) and dense crowd coordinate data, and converts the coordinate data into a dense crowd point map D_t(x, y); the hidden Gaussian density generator obtains an adaptive hidden Gaussian density map G(x, y) from the dense crowd point map D_t(x, y); the density map predictor takes the hidden Gaussian density map G(x, y) as its learning target, with a multi-level loss function constraining the generation target, and outputs a density prediction map D_p(x, y) from the dense crowd image I_c(x, y); and the people number prediction module sums all pixel values of the density prediction map D_p(x, y) to obtain the final predicted number of people.

In order to better illustrate the implementation of the technical solution of the present invention, a specific application example of the dense population counting method based on implicit density distribution is given below, and the specific operation steps may include:

S101, acquiring a dense crowd image I_c(x, y);

In this embodiment, the original target set may include three-channel color images, or may include single-channel grayscale images.

S102, acquiring dense crowd coordinate data and converting the coordinate data into a dense crowd point map D_t(x, y);

In this embodiment, the crowd point map is a picture scaled to 1/8 of the size of the dense crowd image, in which pixels where a pedestrian is present have the value 1 and all other pixels have the value 0;
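The conversion from annotated coordinates to a 1/8-scale point map can be sketched as follows. This is a minimal illustration and not the patented implementation: the function name `make_point_map`, the `scale` parameter and the sample coordinates are hypothetical, and the literal reading of the text (occupied pixels are set to 1) means that two annotations falling into the same 8 × 8 cell collapse into a single pixel.

```python
import numpy as np

def make_point_map(coords, image_hw, scale=8):
    """Project (x, y) annotations onto a binary map at 1/scale resolution.

    `scale=8` follows the 1/8 size stated in the text; everything else
    (names, clamping at the border) is an assumption of this sketch.
    """
    h, w = image_hw
    out_h, out_w = h // scale, w // scale
    point_map = np.zeros((out_h, out_w), dtype=np.float32)
    for x, y in coords:
        col = min(int(x) // scale, out_w - 1)  # clamp to the last column
        row = min(int(y) // scale, out_h - 1)  # clamp to the last row
        point_map[row, col] = 1.0  # pixel with a pedestrian is 1, others stay 0
    return point_map

# Hypothetical annotations on a 128 x 256 (height x width) image.
coords = [(10, 20), (100, 50), (255, 127)]
point_map = make_point_map(coords, image_hw=(128, 256))
```

If exact counts must survive the down-scaling, accumulating with `+= 1.0` instead of assigning 1 would be one alternative, but that departs from the wording of this embodiment.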

S103, obtaining the hidden Gaussian density map G(x, y) from the crowd point map D_t(x, y).

In this embodiment, the dense crowd point map D_t(x, y) is convolved with 18 Gaussian kernels K having different variance σ values, which are sampled from three normal distributions: σ₁ ~ N(0.5, 0.02), σ₂ ~ N(1, 0.02), σ₃ ~ N(1.5, 0.02). This yields first feature maps carrying information at different scales, and the same convolution operation is applied to the first feature maps to obtain second feature maps.
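The first Gaussian convolution layer can be illustrated with a small NumPy sketch. Several details are assumptions not stated in the text: the 18 kernels are split evenly (6 per normal distribution), the second parameter 0.02 is treated as a standard deviation, the kernel size is 5 × 5, and zero padding keeps the output size. The helper names are hypothetical, and the second Gaussian layer (how the 18 channels are combined) is not modeled here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gaussian_kernel(sigma, size=5):
    """Normalized 2-D Gaussian kernel (size is an assumed value)."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def sample_sigmas(rng, means=(0.5, 1.0, 1.5), spread=0.02, per_mean=6):
    """Draw sigma values around the three means given in the text."""
    sig = np.concatenate([rng.normal(m, spread, per_mean) for m in means])
    return np.clip(sig, 1e-3, None)  # guard against non-positive sigma

def gaussian_layer(x, kernels):
    """Convolve an (H, W) map with each kernel; returns (K, H, W)."""
    size = kernels.shape[-1]
    windows = sliding_window_view(np.pad(x, size // 2), (size, size))
    # Kernels are symmetric, so correlation and convolution coincide.
    return np.einsum("hwij,kij->khw", windows, kernels)

rng = np.random.default_rng(0)
sigmas = sample_sigmas(rng)
kernels = np.stack([gaussian_kernel(s) for s in sigmas])  # (18, 5, 5)

point_map = np.zeros((16, 32))
point_map[8, 16] = 1.0  # a single annotated pedestrian
first_maps = gaussian_layer(point_map, kernels)  # (18, 16, 32)
```

Because each kernel is normalized, every channel of `first_maps` still sums to the number of annotated people as long as the kernel support stays inside the map, which is what lets a density map be summed into a count.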

In this embodiment, a mask Gaussian convolution module is designed to further process the feature maps. Its initialization parameters W are constrained by a Gaussian envelope G, so that the feature map X is extracted and decomposed into features O of different levels. The original feature map input to the mask Gaussian convolution module is added to the feature map obtained by point-wise multiplication with the mask to obtain more robust features; these features are then input into a decoding module formed by multiple convolution layers to obtain the hidden Gaussian density map G(x, y).

Specifically, in a preferred embodiment, the hidden Gaussian density generator consists of two Gaussian convolution layers, six mask Gaussian convolution modules and a decoding module formed by ordinary convolutions. To extract multi-scale features, the feature map output after every two mask Gaussian convolution modules is taken out through a shortcut connection, these feature maps are concatenated, and the hidden Gaussian density map is then decoded by a series of ordinary convolution layers with gradually decreasing channel numbers. For example, the number of output channels of each convolution is 1/2 of its number of input channels. The features O of different levels arise from the internal structure of the mask Gaussian convolution module: the module generally consists of three columns, each with a different Gaussian kernel variance parameter, which yields features of different levels.

The first Gaussian convolution layer convolves the dense crowd point map D_t(x, y) with N Gaussian kernels K having different variance σ values to obtain first feature maps carrying information at different scales, and the second Gaussian convolution layer applies the same convolution operation to the first feature maps to obtain second feature maps. Six mask Gaussian convolution modules then process the second feature maps and, with parameters initialized under the Gaussian envelope constraint, extract and decompose them into six features of different levels. Of course, in other embodiments, a different number of mask Gaussian convolution modules may be used as required.

The feature map that is point-multiplied with the mask is the input of the mask Gaussian convolution module. For example, if the input feature map is x1, the weight mask is w1 and the output is y1, then y1 = x1 + x1 × w1. Because there are several mask Gaussian convolution modules, the modules are chained: the second feature map is the input (the original feature map) of the first mask Gaussian convolution module, the output of the first module is the input of the second module, and so on, so that each module point-multiplies the output of the module before it. The overall flow is: coordinate point map (manually annotated data), then the first feature maps, then the second feature maps, then the first mask Gaussian convolution module, with the output of each mask Gaussian convolution module serving as the input of the next.
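The residual relation y1 = x1 + x1 × w1 and the chaining of modules can be sketched as below. This is a simplification under stated assumptions: each module is reduced to a fixed point-wise mask `w`, whereas the text describes a three-column module whose convolution weights are initialized under a Gaussian envelope. The function names and the constant mask value 0.1 are hypothetical.

```python
import numpy as np

def mask_gaussian_module(x, w):
    """Residual form from the text: output = input + input * mask."""
    return x + x * w

def run_chain(second_map, masks):
    """Feed each module's output into the next; collect every output."""
    outputs = []
    x = second_map  # the first module receives the second feature map
    for w in masks:
        x = mask_gaussian_module(x, w)
        outputs.append(x)
    return outputs

rng = np.random.default_rng(0)
second_map = rng.random((4, 4))
masks = [np.full((4, 4), 0.1) for _ in range(6)]  # six modules, as in the preferred embodiment
outputs = run_chain(second_map, masks)
```

Keeping every intermediate output in `outputs` mirrors the idea that features from different depths of the chain are later concatenated before decoding.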

S104, guiding the optimization of the hidden Gaussian density map according to the counting loss term, the smoothing term and the Bayesian term to generate a higher-quality hidden Gaussian density map G(x, y).

In this embodiment, a counting loss term is designed, which uses an L1 distance to constrain the total count of the hidden Gaussian density map to be close to the annotated total number of people. Let C_gt be the real number of people in the picture; then:

L_c = ||∑D_s(x, y) - C_gt||₁

where L_c denotes the counting loss term.
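As a quick numerical check of the counting loss term, the following sketch evaluates the absolute difference between the sum of a density map and the annotated count. The function name `count_loss` and the toy values are hypothetical.

```python
import numpy as np

def count_loss(density_map, gt_count):
    """L1 distance between the summed density map and the annotated count."""
    return abs(float(density_map.sum()) - float(gt_count))

density_map = np.full((4, 4), 0.5)  # sums to 8 people
loss = count_loss(density_map, gt_count=10)
```

For a scalar count the L1 norm reduces to an absolute value, so this term only constrains the total and says nothing about where the density is placed; the other terms handle the spatial distribution.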

In this embodiment, a smoothing term is designed. Regions of the hidden Gaussian density map where pixel values change sharply are unfavorable for network modeling and learning, so a smoothing loss function is used to keep each pixel coherent with its surrounding pixels. This reduces the proportion of abnormal regions, makes network convergence easier and improves performance.

L_s denotes the smoothing term, and H and W denote the length and width of the generated hidden Gaussian density map.

In this embodiment, a Bayesian term is designed. By Gaussian modeling of the foreground points and the background, the probability distribution of the hidden Gaussian density map is constrained to be consistent with that of the annotated points, which reduces the interference of background noise regions with the target crowd regions. Let F(·) denote the L1 distance function, c_n the total number of people associated with each annotated point, and c₀ the total number of people associated with the background:

L_bay = F(1 - E[c_n]) + F(0 - E[c₀])

where L_bay denotes the Bayesian term. E[c_n] is the expectation of the count distribution associated with each annotated point, which should be as close to 1 as possible. E[c₀] is the expectation of the count distribution associated with the background.
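One way to read the Bayesian term is sketched below. The text does not specify how the posterior is built, so the following are assumptions of this sketch: each annotated point gets an isotropic Gaussian likelihood with a fixed `sigma`, the background gets a constant likelihood `bg_likelihood`, the posterior over {points, background} is normalized per pixel, and the foreground part of the loss is summed over all annotated points. All names and values are hypothetical.

```python
import numpy as np

def posterior(points, hw, sigma=2.0, bg_likelihood=1e-3):
    """Per-pixel posterior; rows 0..N-1 are points, the last row is background."""
    h, w = hw
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)  # (M, 2) as (x, y)
    pts = np.asarray(points, dtype=float)                           # (N, 2) as (x, y)
    d2 = ((pts[:, None, :] - pix[None, :, :]) ** 2).sum(-1)         # squared distances
    like = np.exp(-d2 / (2.0 * sigma ** 2))                         # foreground likelihood
    like = np.vstack([like, np.full((1, pix.shape[0]), bg_likelihood)])
    return like / like.sum(axis=0, keepdims=True)                   # columns sum to 1

def bayesian_term(density_map, points, sigma=2.0, bg_likelihood=1e-3):
    """Return (loss, expected counts); the last expected count is the background."""
    post = posterior(points, density_map.shape, sigma, bg_likelihood)
    expected = post @ density_map.ravel()
    fg, bg = expected[:-1], expected[-1]
    loss = float(np.abs(1.0 - fg).sum() + abs(0.0 - bg))
    return loss, expected

density = np.zeros((16, 16))
density[4, 4] = 1.0    # row 4, col 4
density[10, 12] = 1.0  # row 10, col 12
points = [(4, 4), (12, 10)]  # (x, y) annotations matching the two peaks

loss_good, expected = bayesian_term(density, points)
loss_empty, _ = bayesian_term(np.zeros((16, 16)), points)
```

A density map whose mass sits on the annotations gives a loss near zero, while an empty map is penalized by one unit per annotated person; mass placed far from every annotation is attributed to the background and penalized through E[c₀].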

In this embodiment, the annotated points are the labels marked manually in the training data. For example, if there are 100 people in a picture, 100 coordinate positions need to be annotated when the training data are prepared; projecting these coordinates onto the picture gives the dense crowd point map D_t(x, y), that is, the input of the hidden Gaussian density generator.

In this embodiment, the counting loss term, the smoothing term and the Bayesian term are optimized jointly, which gives the best effect. In other embodiments, the counting loss term or the Bayesian term may be used alone, or the smoothing term may be combined with either the counting loss term or the Bayesian term. For example, in practice only the counting loss term may be retained, at the cost of some loss of precision.

S105, taking the pre-trained VggNet as the feature extraction network, inputting the dense crowd image I_c(x, y) into the VggNet network to obtain a feature map, and up-sampling the feature map to obtain the output density prediction map D_p(x, y). All pixel values of the density prediction map are summed to obtain the final predicted number of people.

In this embodiment, the VggNet network is used as a front-end portion of the density predictor, and forms a complete density predictor together with upsampling and multilayer convolution.

In this embodiment, VggNet keeps the feature extraction part without the fully connected layers and down-samples to 1/16 of the original image size; after decoding by one up-sampling layer and multiple convolution layers, the output density prediction map D_p(x, y) is 1/8 the size of the input dense crowd image.

In another preferred embodiment, based on the above embodiment, a multi-level loss function can be used to constrain the generation target. The multi-level loss function may include a pixel-by-pixel mean square error loss function, a Bayesian loss function applied per pedestrian point, and an adversarial loss function. In the specific implementation:

Using the mean square error loss function, a pixel-by-pixel constraint keeps the output D_p(x, y) of the density prediction network and the output D_s(x, y) of the hidden Gaussian density generator similarly distributed. In this embodiment, the mean square error loss function L₁ is as follows:

L₁ = ||D_p(x, y) - D_s(x, y)||₁

Using the Bayesian loss function, a constraint over the range of each pedestrian point keeps the density prediction map D_p(x, y) output by the density prediction network close to the probability distribution of the manually annotated ground-truth pedestrian coordinate positions. Let F(·) denote the L1 distance function, c_n the total number of people associated with each annotated point, and c₀ the total number of people associated with the background. In this embodiment, the Bayesian loss function L₂ is as follows:

L₂ = F(1 - E[c_n]) + F(0 - E[c₀])

E[c_n] is the expectation of the count distribution associated with each annotated point, which should be as close to 1 as possible. E[c₀] is the expectation of the count distribution associated with the background.

Using the adversarial loss function, the discriminator judges the authenticity of the predicted density prediction map D_p(x, y), so that the generated density prediction map D_p(x, y) retains more high-frequency information, that is, the prediction accuracy in dense regions is improved. x_r denotes the real density distribution map, i.e. D_s(x, y), and x_f denotes the output of the predictor, i.e. D_p(x, y). In this embodiment, the adversarial loss function L₃ is as follows:

L₃ = E[(Dis(x_r) - 0)²] + E[(Dis(x_f) - 1)²]
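The least-squares form of L₃ can be evaluated directly from discriminator scores, as in the sketch below. It follows the targets exactly as printed in the formula (0 for the real map x_r, 1 for the predicted map x_f) and takes the expectation as a mean over patch scores. The function name `discriminator_loss` and the toy score arrays are hypothetical, and the discriminator network itself is not modeled.

```python
import numpy as np

def discriminator_loss(dis_real, dis_fake, real_target=0.0, fake_target=1.0):
    """Least-squares loss on discriminator scores, with targets as in L3."""
    real_part = np.mean((np.asarray(dis_real) - real_target) ** 2)
    fake_part = np.mean((np.asarray(dis_fake) - fake_target) ** 2)
    return float(real_part + fake_part)

# Toy patch scores, e.g. one score per patch of a PatchGAN-style output.
perfect = discriminator_loss(np.zeros((2, 2)), np.ones((2, 2)))
worst = discriminator_loss(np.ones((2, 2)), np.zeros((2, 2)))
```

Using a squared error on the scores rather than a sigmoid cross-entropy is the defining choice of LSGAN; it keeps gradients from vanishing for samples that are already classified correctly but lie far from the target.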

The discriminator adopts the PatchGAN network structure with a receptive field of 16 × 16; the dense crowd image I_c(x, y) is down-sampled to 1/8 size and concatenated with the Gaussian density map as the input of the discriminator. The discriminator loss function uses LSGAN to ensure stability and robustness.

In the preferred embodiment, the parameters of the hidden Gaussian density generator are updated according to the smoothing term, the Bayesian term and the counting error term of the target loss function.

In another preferred embodiment, on the basis of the above embodiment, the density predictor, the discriminator and the hidden Gaussian density generator can be optimized cooperatively. The discriminator parameters are updated with the LSGAN loss function according to the output of the density predictor and the output of the hidden Gaussian density generator. The density map predictor is updated according to the mean square error term computed against the output of the hidden Gaussian density generator, the adversarial loss term fed back by the discriminator, and the Bayesian loss term.

To make the optimization target of the density predictor less affected by noise, the hidden Gaussian density generator can first be pre-trained on the training data, after which its output G(x, y) is used as the ground truth of the density predictor.

To ensure the stability of the training process, the density predictor and the discriminator are updated alternately in sequence, the hidden Gaussian density generator is updated once every 100 iterations, and the learning rate of the hidden Gaussian density generator is set to 1/5 of the learning rate of the density predictor.
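The update schedule described here can be sketched as a small helper that reports which networks are updated at a given training step. This assumes that each step updates the predictor and then the discriminator, and that "once every 100" counts those steps; the function names, the step numbering from 1 and the base learning rate `1e-5` are hypothetical.

```python
def modules_to_update(step, generator_every=100):
    """Return the networks updated at this step (steps are counted from 1)."""
    updates = ["predictor", "discriminator"]  # updated in sequence at every step
    if step % generator_every == 0:
        updates.append("generator")  # hidden Gaussian density generator
    return updates

def learning_rates(predictor_lr=1e-5, generator_ratio=1 / 5):
    """Generator learning rate is 1/5 of the predictor's, as stated in the text."""
    return {"predictor": predictor_lr, "generator": predictor_lr * generator_ratio}

schedule = [modules_to_update(s) for s in range(1, 1001)]
generator_updates = sum("generator" in u for u in schedule)
rates = learning_rates()
```

Updating the generator rarely and with a smaller learning rate keeps the learning target of the predictor nearly stationary between generator updates, which is consistent with the stated goal of a stable training process.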

Based on the above embodiment steps, the training data of this specific example come from the ShanghaiTech dataset and the UCF-QNRF dataset, respectively. The former contains 300 crowd pictures of different scenes and sizes; the latter contains 1200 larger images with different viewing angles, sizes and density levels. The test data comprise 182 and 334 pictures, respectively. Each picture has 50 to 3500 pedestrians.

The evaluation criteria are MAE (mean absolute error) and MSE (mean square error). They are computed over the N pictures of the test set from C_i, the number of people predicted for the i-th picture, and the real number of people in the i-th picture.
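The metric formulas did not survive in this text, so the sketch below uses the textbook definitions of mean absolute error and mean square error over the N test pictures; this is an assumption, not the document's own formula. Some crowd counting work reports the square root of the mean squared error under the name MSE, so a hypothetical `root` flag is included. The function names and sample counts are illustrative only.

```python
import numpy as np

def mae(pred_counts, gt_counts):
    """Mean absolute error between predicted and real counts."""
    pred = np.asarray(pred_counts, dtype=float)
    gt = np.asarray(gt_counts, dtype=float)
    return float(np.mean(np.abs(pred - gt)))

def mse(pred_counts, gt_counts, root=False):
    """Mean squared error; set root=True for the root-mean-square variant."""
    pred = np.asarray(pred_counts, dtype=float)
    gt = np.asarray(gt_counts, dtype=float)
    value = np.mean((pred - gt) ** 2)
    return float(np.sqrt(value)) if root else float(value)

pred_counts = [10, 20]  # hypothetical predictions for two test pictures
gt_counts = [12, 16]    # hypothetical real counts
mae_value = mae(pred_counts, gt_counts)
mse_value = mse(pred_counts, gt_counts)
```

MAE reflects the average counting accuracy, while the squared variant weights large errors more heavily and is therefore often read as an indicator of robustness.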

The results obtained by the embodiment of the invention show improved accuracy and good robustness. In addition, compared with the baseline, the embodiment does not increase the number of parameters or the amount of computation in the inference stage, and therefore has strong application value.

Fig. 3 is a schematic diagram illustrating the effect of the dense crowd counting method based on implicit density distribution according to the embodiment of the present invention, as shown in fig. 3, if the implicit density distribution method according to the above embodiment of the present invention is not used, the generated density map has low counting accuracy and poor quality, and it is difficult to accurately reflect the crowd distribution.

In another embodiment of the present invention, an electronic terminal is further provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the dense crowd counting method based on hidden density distribution described above is implemented.

It should be noted that the steps of the method provided by the present invention may be implemented by the corresponding modules, devices, units and the like of the system, and those skilled in the art may refer to the technical solution of the system to implement the step flow of the method; that is, the embodiments of the system may be understood as preferred examples for implementing the method, and details are not repeated here.

Those skilled in the art will appreciate that, in addition to implementing the system and its devices provided by the present invention purely as computer-readable program code, the method steps can be logically programmed so that the system and its devices realize the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and its devices provided by the present invention may be regarded as a hardware component, and the devices included therein for realizing various functions may be regarded as structures within the hardware component; the devices for realizing the functions may also be regarded both as software modules for implementing the method and as structures within the hardware component.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims

1. A dense crowd counting method based on hidden density distribution is characterized by comprising the following steps:

2. The dense crowd counting method based on hidden density distribution according to claim 1, characterized in that obtaining an adaptive hidden Gaussian density map G(x, y) from the dense crowd point map D_t(x, y) by a hidden Gaussian density generator comprises:

inputting the more robust features into a decoding module composed of multiple convolution layers, wherein the number of output channels of each convolution layer is gradually reduced relative to its input channels, and finally obtaining a hidden Gaussian density map D_s(x, y).

3. The dense crowd counting method based on hidden density distribution according to claim 1, wherein constraining the generation target by using the multi-level loss function comprises:

discriminating the predicted density prediction map D_p(x, y) by a discriminator using an adversarial loss function, so that the generated density prediction map D_p(x, y) retains more high-frequency information, i.e., the prediction accuracy in dense areas is increased.

4. The method according to claim 1, wherein inputting the dense crowd image I_c(x, y) into the density predictor and outputting the predicted density prediction map D_p(x, y) comprises:

taking a pre-trained VggNet as the feature extraction network, inputting the dense crowd image I_c(x, y) into the VggNet network to obtain a feature map, up-sampling the feature map, and obtaining the output density prediction map D_p(x, y) after multiple convolution layers, wherein the feature extraction network, the up-sampling layer and the multiple convolution layers form the density predictor.

5. The dense crowd counting method based on hidden density distribution according to claim 4, wherein discriminator parameters are updated using an LSGAN loss function according to the density prediction map D_p(x, y) output by the density predictor and the hidden Gaussian density map G(x, y).

6. The dense crowd counting method based on hidden density distribution according to any one of claims 1 to 5, further comprising optimizing the hidden Gaussian density map G(x, y), wherein the optimization is performed according to a counting loss term or a Bayesian term, or according to a smoothing term and a counting loss term, or according to a smoothing term and a Bayesian term, so as to generate a higher-quality hidden Gaussian density map; wherein:

7. The method of claim 6, further comprising: updating the parameters of the hidden Gaussian density generator according to the smoothing term, the Bayesian term and the counting error term of the target loss function.

8. The dense crowd counting method based on hidden density distribution according to claim 6, characterized in that the density predictor is updated according to a mean square error term with respect to the output of the hidden Gaussian density generator, an adversarial generation loss term fed back by the discriminator, and a Bayesian loss function.

9. A dense crowd counting system based on hidden density distribution, comprising:

10. An electronic terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to any of claims 1-8 when executing the computer program.
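For illustration only, the LSGAN loss function referred to in claim 5 can be sketched as follows using the standard least-squares GAN formulation. It is assumed here, and not stated in the claims, that discriminator scores on the hidden Gaussian density map G(x, y) play the role of "real" samples and scores on the density prediction map D_p(x, y) play the role of "fake" samples; the function names, the 1.0 and 0.0 target labels, and the example scores are hypothetical.

```python
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    if not values:
        raise ValueError("at least one score is required")
    return sum(values) / len(values)


def lsgan_discriminator_loss(
    scores_real: Sequence[float],
    scores_fake: Sequence[float],
    real_label: float = 1.0,
    fake_label: float = 0.0,
) -> float:
    """Least-squares discriminator loss.

    scores_real: discriminator outputs for G(x, y) (assumed "real").
    scores_fake: discriminator outputs for D_p(x, y) (assumed "fake").
    """
    real_term = _mean([(s - real_label) ** 2 for s in scores_real])
    fake_term = _mean([(s - fake_label) ** 2 for s in scores_fake])
    return 0.5 * (real_term + fake_term)


def lsgan_predictor_loss(scores_fake: Sequence[float], real_label: float = 1.0) -> float:
    """Least-squares adversarial term fed back to the density predictor."""
    return 0.5 * _mean([(s - real_label) ** 2 for s in scores_fake])


if __name__ == "__main__":
    # Made-up discriminator scores for two samples of each kind.
    print(lsgan_discriminator_loss([0.9, 0.8], [0.2, 0.1]))
    print(lsgan_predictor_loss([0.2, 0.1]))
```

Compared with a cross-entropy GAN loss, the least-squares form penalizes scores by their distance from the target label, which is generally associated with smoother gradients during adversarial training.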