WO2020169043A1 - Dense crowd counting method, apparatus and device, and storage medium - Google Patents

Dense crowd counting method, apparatus and device, and storage medium

Info

Publication number
WO2020169043A1
Authority
WO
WIPO (PCT)
Prior art keywords
column
neural network
image
convolutional
convolutional neural
Prior art date
Application number
PCT/CN2020/075795
Other languages
French (fr)
Chinese (zh)
Inventor
张莉
陆金刚
周伟达
王邦军
章晓芳
屈蕴茜
赵雷
Original Assignee
苏州大学
Priority date
Filing date
Publication date
Application filed by 苏州大学
Publication of WO2020169043A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Definitions

  • the present invention relates to the field of computer vision technology, in particular to a method, device, equipment and computer-readable storage medium for counting dense crowds.
  • the mainstream estimation approach is based on density maps: a neural network is designed whose input is the original image and whose output is the density map of the crowd.
  • the first step of such methods in processing dense crowd images is to apply a Gaussian filter to the ground-truth annotations of the image to obtain the corresponding density map.
  • Zhang et al. proposed a multi-column convolutional neural network in "Single-Image Crowd Counting via Multi-Column Convolutional Neural Network".
  • the network is composed of three parallel convolutional neural networks.
  • the present invention provides a dense crowd counting method, including: inputting the image to be tested into a pre-trained target multi-scale multi-column convolutional neural network model, wherein the target multi-scale multi-column convolutional neural network model includes multiple columns of parallel convolutional neural networks, and each column includes multiple convolutional layers with different kernel sizes and numbers of kernels; inputting the image to be tested into each column of the convolutional neural network, processing it with each convolutional layer in that column, and fusing the feature maps output by the preselected convolutional layers in each column, so as to obtain the estimated density map output by each column; fusing the estimated density maps output by the columns to obtain the target estimated density map of the image to be tested; and calculating the number of people in the image to be tested according to the target estimated density map.
  • obtaining a density map of each image in the crowd image data set to construct a target training set includes:
  • the training a multi-scale and multi-column convolutional neural network model using the target training set includes:
  • each column of the convolutional neural network in the multi-scale and multi-column convolutional neural network model is parallel to each other, and the convolutional neural network of each column has the same network structure except for the size and number of convolution kernels;
  • each column of the convolutional neural network of the multi-scale and multi-column convolutional neural network model includes:
  • the convolution kernel size of the first convolutional layer differs from that of the other convolutional layers; the second, third, fourth, fifth and sixth convolutional layers have the same convolution kernel size; and the third, fourth, fifth and sixth convolutional layers have the same number of convolution kernels;
  • the pooling layers between the first, second, third and fourth convolutional layers use max pooling over a 2*2 region with a stride of 2;
  • the pooling layer between the fourth and fifth convolutional layers uses max pooling over a 3*3 region with a stride of 1, so that the feature map obtained by pooling the output of the fourth convolutional layer has the same size as the output feature map of the fourth convolutional layer;
  • the activation function of each convolutional layer adopts the ReLU function
  • the output module includes:
  • the present invention also provides a device for counting dense crowds, including:
  • the present invention also provides a computer-readable storage medium having a computer program stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the above-mentioned dense crowd counting method are realized.
  • the dense crowd counting method provided by the present invention uses a pre-trained target multi-scale multi-column convolutional neural network model to predict the test image.
  • the target multi-scale multi-column convolutional neural network model includes multiple parallel convolutional neural networks. After inputting the image to be tested into the target multi-scale and multi-column convolutional neural network model, inputting the image to be tested into the convolutional neural network of each column respectively.
  • each column of the convolutional neural network includes multiple convolutional layers with different kernel sizes and numbers of kernels; the convolutional layers in each column process the image to be tested, and the feature maps output by the preselected convolutional layers in each column are fused to extract features of the image to be tested at different scales.
  • this addresses the problem in the prior art that some features extracted by the earlier convolutional layers may be discarded in subsequent processing, leaving insufficient features and reducing the accuracy of the prediction for the image to be tested.
  • the method provided by the present invention introduces the multi-scale idea: features extracted by earlier convolutional layers are combined with features extracted by later convolutional layers, that is, features with different levels of detail are combined. This compensates for features in the feature maps of the earlier convolutional layers that a traditional neural network may discard after pooling, and improves both the performance of the dense crowd counting network and the accuracy of the prediction for dense crowd images.
  • FIG. 2 is a structure diagram of a multi-scale multi-column convolutional neural network provided by the present invention.
  • FIG. 3 is a flowchart of a second specific embodiment of the method for counting dense crowds provided by the present invention.
  • FIG. 4 is a structural block diagram of a device for counting dense crowds according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of a first specific embodiment of the method for counting dense crowds provided by the present invention. The specific operation steps are as follows:
  • a Gaussian filter is first used to filter the pre-created crowd image data set; the density map M_i of each image X_i is then obtained to construct the target training set
  • X_i is the i-th image of the crowd image data set
  • its size is m*n
  • Y_i is the head coordinate point map corresponding to the i-th image, of size m*n
  • N is the total number of images in the crowd image data set.
  • the multi-scale multi-column convolutional neural network may include multiple columns of convolutional neural networks.
  • a three-column convolutional neural network is taken as an example.
  • each column of the convolutional neural network includes a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, a fifth convolutional layer, a deconvolution layer, a sixth convolutional layer, and a seventh convolutional layer.
  • the convolution kernel size of the first convolutional layer differs from that of the other convolutional layers; the second, third, fourth, fifth and sixth convolutional layers have the same convolution kernel size; and the third, fourth, fifth and sixth convolutional layers have the same number of convolution kernels.
  • the activation function of each convolutional layer adopts the ReLU function.
  • the pooling layers between the first, second, third and fourth convolutional layers use max pooling over a 2*2 region with a stride of 2; the pooling layer between the fourth and fifth convolutional layers uses max pooling over a 3*3 region with a stride of 1, so that the feature map obtained by pooling the output of the fourth convolutional layer has the same size as the output feature map of the fourth convolutional layer.
  • the feature map output by the fourth convolutional layer and the feature map output by the fifth convolutional layer are concatenated along the channel dimension and then input to the deconvolution layer.
  • the feature map output by the deconvolution layer and the feature map output by the third convolutional layer are concatenated along the channel dimension and then input to the sixth convolutional layer.
  • the eighth convolutional layer outputs the estimated density map of the image to be tested as the output of each column of the convolutional neural network model.
  • after the estimated density maps of the current crowd image output by each column of the convolutional neural network are concatenated along the channel dimension, they pass through a total convolutional layer with a 1*1 convolution kernel, and the feature map output by the total convolutional layer is mapped to the target estimated density map of the current crowd image, which serves as the network output of the multi-scale multi-column convolutional neural network model.
  • Step S102: input the image to be tested into each column of the convolutional neural network, process it with each convolutional layer in that column, and fuse the feature maps output by the preselected convolutional layers in each column, so as to obtain the estimated density map output by each column;
  • the convolutional layers in each column of the convolutional neural network process the image to be tested.
  • the deconvolution layer up-samples the preceding feature maps, which are then concatenated along the channel dimension with the feature maps produced by the third convolutional layer.
  • Step S103: fuse the estimated density maps output by each column of the convolutional neural network to obtain the target estimated density map of the image to be tested;
  • Step S104: calculate the number of people in the image to be tested according to the target estimated density map of the image to be tested.
  • Step S301: filter the crowd images in the second part of the ShanghaiTech data set with a Gaussian filter and obtain the density maps of those images to construct the target training set;
  • the ShanghaiTech data set contains 1,198 annotated images and 330,165 head-center annotations; it is divided into two parts: the first part includes 482 images randomly crawled from the Internet, of which 300 are used for training and 182 for testing; the second part includes 716 images taken on the streets of Shanghai, of which 400 are used for training and 316 for testing.
  • Step S303: input the image T to be tested into the target multi-scale multi-column convolutional neural network model, where the model includes multiple columns of parallel convolutional neural networks, and each column includes multiple convolutional layers with different kernel sizes and numbers of kernels;
  • Step S305: calculate the sum of all pixel values in the estimated density map to obtain the number of people in the image to be tested
  • the multi-scale multi-column convolutional neural network model provided in this embodiment is compared with the multi-column convolutional neural network model on the same crowd counting data set. Table 1 shows that both the mean absolute error (MAE) and the mean square error (MSE) of the counts produced by the network model proposed in this embodiment are smaller than those of the prior-art network model, indicating better performance.
  • MAE mean absolute error
  • MSE mean square error
  • FIG. 4 is a block diagram of a device for counting dense crowds according to an embodiment of the present invention.
  • Specific devices may include:
  • the processing module 200 is configured to input the image to be tested into each column of the convolutional neural network, use each convolutional layer in each column of the convolutional neural network to process the image to be tested, and Fusing the feature maps output by the preselected convolutional layers in each column of the convolutional neural network, so as to obtain the estimated density maps output by the convolutional neural network of each column respectively;
  • the output module 300 is configured to fuse the estimated density map output by each column of the convolutional neural network to obtain the target estimated density map of the image to be tested;
  • the calculation module 400 is configured to calculate the number of people in the image to be tested according to the target estimated density map of the image to be tested.
  • the device for counting dense crowds of this embodiment is used to implement the aforementioned method for counting dense crowds, so its specific implementation can be found in the foregoing method embodiments. For example, the input module 100, the processing module 200, the output module 300, and the calculation module 400 implement steps S101, S102, S103, and S104 of the above dense crowd counting method, respectively. For details, refer to the descriptions of the corresponding embodiments, which are not repeated here.
  • a specific embodiment of the present invention also provides a device for counting dense crowds, including: a memory for storing a computer program; and a processor for implementing the steps of the above dense crowd counting method when executing the computer program.
  • a specific embodiment of the present invention also provides a computer-readable storage medium having a computer program stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the above-mentioned dense crowd counting method are realized.
  • the steps of the method or algorithm described in the embodiments disclosed in this document can be directly implemented by hardware, a software module executed by a processor, or a combination of the two.
  • the software module can reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
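
The definitions above compare models by mean absolute error (MAE) and mean square error (MSE) of the predicted counts. A minimal sketch of those two metrics follows, assuming per-image true and predicted counts are available as plain lists; the function names and the sample numbers are hypothetical and are not taken from the patent's Table 1. Some crowd counting papers report the square root of the squared-error term instead; that variant is not shown here.

```python
def mean_absolute_error(true_counts, predicted_counts):
    """Average of |true - predicted| over all test images."""
    n = len(true_counts)
    return sum(abs(t - p) for t, p in zip(true_counts, predicted_counts)) / n


def mean_square_error(true_counts, predicted_counts):
    """Average of (true - predicted)^2 over all test images."""
    n = len(true_counts)
    return sum((t - p) ** 2 for t, p in zip(true_counts, predicted_counts)) / n


# Hypothetical counts for three test images (illustration only).
true_counts = [100, 250, 40]
predicted_counts = [110, 240, 45]

mae = mean_absolute_error(true_counts, predicted_counts)
mse = mean_square_error(true_counts, predicted_counts)
```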

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Provided are a dense crowd counting method, apparatus and device, and a computer-readable storage medium. The method comprises: inputting an image to be tested into a target multi-scale multi-column convolutional neural network model comprising multiple columns of parallel convolutional neural networks, wherein each column of convolutional neural networks comprises multiple convolutional layers with different convolutional kernel sizes and quantities; processing the image to be tested by using each convolutional layer in each column of convolutional neural networks, and fusing feature maps output by pre-selected convolutional layers in each column of convolutional neural networks, so as to obtain estimated density maps output by each column of convolutional neural networks; fusing the estimated density maps output by each column of convolutional neural networks to obtain a target estimated density map of the image to be tested; and calculating the number of people in the image to be tested according to the target estimated density map. By means of the provided method, apparatus and device and computer-readable storage medium, the accuracy of a dense crowd image prediction result is improved.

Description

Method, apparatus, device and storage medium for dense crowd counting
This application claims priority to Chinese patent application No. 201910129612.3, filed with the Chinese Patent Office on February 21, 2019 and entitled "Method, apparatus, device and storage medium for dense crowd counting", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of computer vision, and in particular to a method, apparatus, device and computer-readable storage medium for dense crowd counting.
Background Art
For crowd control and public safety, accurately estimating crowds from images or video has become an increasingly important application of computer vision. The crowd counting task in computer vision is to automatically count the number of people in an image or video. Accurate crowd counts are needed to support crowd control and public safety in many scenarios, such as public gatherings and sporting events.
Traditional dense crowd counting methods fall into two types: detection-based methods and regression-based methods. Detection-based methods treat a crowd as a set of detected individual entities. However, pedestrians are often occluded in dense crowds, which makes estimating crowds in still images especially challenging. Regression-based methods regress a scalar value (such as the number of people) or a density map from various features extracted from the crowd image. They essentially have two steps: first, extract effective features from the crowd image; second, use a regression function to estimate the crowd count. However, crowd counting by regression is sensitive to the sharp changes in perspective and scale that are common in crowd images.
Meanwhile, deep learning has been applied successfully to the estimation of dense crowd images. The mainstream estimation approach is based on density maps: a neural network is designed whose input is the original image and whose output is the density map of the crowd. The first step of such methods is to apply a Gaussian filter to the ground-truth annotations of the image to obtain the corresponding density map. Zhang et al. proposed a multi-column convolutional neural network in "Single-Image Crowd Counting via Multi-Column Convolutional Neural Network". That network consists of three parallel columns of convolutional neural networks. Each column uses convolution kernels with a different receptive field size, corresponding to heads of different scales; apart from the size and number of convolution kernels, the columns are otherwise identical; max pooling and the ReLU activation function are used; finally, the feature maps of the three columns are concatenated along the channel dimension and a convolution kernel maps them to the estimated density map output. However, the structure of the multi-column convolutional neural network is relatively simple and has few layers. Some features extracted by the earlier convolutional layers may be discarded in subsequent processing, and the extracted features are insufficient, which affects the final result.
In summary, how to improve the accuracy of prediction results for dense crowd images is a problem that remains to be solved.
Summary of the Invention
The object of the present invention is to provide a method, apparatus, device and computer-readable storage medium for dense crowd counting, so as to solve the problem that the neural networks for dense crowd counting provided in the prior art perform poorly.
To solve the above technical problem, the present invention provides a dense crowd counting method, including: inputting the image to be tested into a pre-trained target multi-scale multi-column convolutional neural network model, wherein the target multi-scale multi-column convolutional neural network model includes multiple columns of parallel convolutional neural networks, and each column includes multiple convolutional layers with different kernel sizes and numbers of kernels; inputting the image to be tested into each column of the convolutional neural network, processing it with each convolutional layer in that column, and fusing the feature maps output by the preselected convolutional layers in each column, so as to obtain the estimated density map output by each column; fusing the estimated density maps output by the columns to obtain the target estimated density map of the image to be tested; and calculating the number of people in the image to be tested according to the target estimated density map.
Preferably, before inputting the image to be tested into the pre-trained target multi-scale multi-column convolutional neural network model, the method includes:
filtering a pre-created crowd image data set with a Gaussian filter and obtaining the density map of each image in the crowd image data set, thereby constructing a target training set;
training a multi-scale multi-column convolutional neural network model with the target training set to obtain the trained target multi-scale multi-column convolutional neural network model.
Preferably, filtering the pre-created crowd image data set with a Gaussian filter and obtaining the density map of each image in the crowd image data set, thereby constructing the target training set, includes:
obtaining a pre-collected crowd image data set (formula image PCTCN2020075795-appb-000001), where X_i is the i-th image of the crowd image data set, of size m*n; Y_i is the head coordinate point map corresponding to the i-th image, of size m*n; and N is the total number of images in the crowd image data set;
filtering each image X_i in the crowd image data set (formula image PCTCN2020075795-appb-000002) with a Gaussian filter to obtain the density map M_i of each image X_i, and constructing the target training set (formula image PCTCN2020075795-appb-000003) from the density maps M_i of the images X_i.
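
The density-map construction described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it assumes head annotations are given as (row, column) coordinates, uses a fixed Gaussian spread `sigma` (the source does not state the filter parameters), and normalizes each Gaussian so that every head contributes exactly 1 to the map. The name `density_map` is hypothetical.

```python
import math


def density_map(head_points, height, width, sigma=2.0):
    """Build a density map by placing one normalized Gaussian per head point.

    head_points: list of (row, col) head-center coordinates (assumed format).
    sigma: Gaussian spread in pixels (assumed value, not from the source).
    """
    density = [[0.0] * width for _ in range(height)]
    for r0, c0 in head_points:
        # Unnormalized Gaussian weights over the whole image for this head.
        weights = [
            [math.exp(-((r - r0) ** 2 + (c - c0) ** 2) / (2.0 * sigma ** 2))
             for c in range(width)]
            for r in range(height)
        ]
        total = sum(sum(row) for row in weights)
        # Normalize so that this head adds exactly 1 to the sum of the map.
        for r in range(height):
            for c in range(width):
                density[r][c] += weights[r][c] / total
    return density


# Three hypothetical head annotations in a 16*20 image.
heads = [(3, 4), (10, 12), (10, 13)]
M = density_map(heads, height=16, width=20)
count = sum(sum(row) for row in M)  # equals the number of heads up to rounding
```

Because each Gaussian is normalized inside the image, the sum of the map equals the number of annotated heads, which is the property the later counting step relies on.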
Preferably, training the multi-scale multi-column convolutional neural network model with the target training set includes:
inputting the current crowd image in the target training set into each column of the convolutional neural network of the multi-scale multi-column convolutional neural network model;
wherein the columns of the convolutional neural network in the multi-scale multi-column convolutional neural network model are parallel to each other, and the columns have the same network structure except for the size and number of convolution kernels;
concatenating, along the channel dimension, the estimated density maps of the current crowd image output by each column of the convolutional neural network, passing the result through a total convolutional layer with a 1*1 convolution kernel, and mapping the feature map output by the total convolutional layer to the target estimated density map of the current crowd image, so that the target estimated density map of the current crowd image serves as the network output of the multi-scale multi-column convolutional neural network model.
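
A 1*1 convolution over channel-stacked maps reduces to a per-pixel weighted sum of the channels, which the sketch below makes explicit. It assumes each column outputs a single-channel density map and that the fused output has one channel; the weights shown are hypothetical fixed values, whereas in a trained network they would be learned. The name `fuse_columns` is illustrative only.

```python
def fuse_columns(column_maps, weights, bias=0.0):
    """Apply a 1*1 convolution to channel-stacked maps (per-pixel weighted sum).

    column_maps: list of 2-D lists, one estimated density map per column.
    weights: one scalar per column (hypothetical values; learned in practice).
    """
    height, width = len(column_maps[0]), len(column_maps[0][0])
    fused = [[bias] * width for _ in range(height)]
    for w, cmap in zip(weights, column_maps):
        for r in range(height):
            for c in range(width):
                fused[r][c] += w * cmap[r][c]
    return fused


# Hypothetical 2*2 outputs of three columns.
col1 = [[0.1, 0.2], [0.0, 0.3]]
col2 = [[0.2, 0.2], [0.1, 0.1]]
col3 = [[0.0, 0.5], [0.2, 0.2]]
fused = fuse_columns([col1, col2, col3], weights=[0.5, 0.25, 0.25])
```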
Preferably, each column of the convolutional neural network of the multi-scale multi-column convolutional neural network model includes:
a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, a fifth convolutional layer, a deconvolution layer, a sixth convolutional layer, and a seventh convolutional layer;
wherein the convolution kernel size of the first convolutional layer differs from that of the other convolutional layers; the second, third, fourth, fifth and sixth convolutional layers have the same convolution kernel size; and the third, fourth, fifth and sixth convolutional layers have the same number of convolution kernels;
the pooling layers between the first, second, third and fourth convolutional layers use max pooling over a 2*2 region with a stride of 2;
the pooling layer between the fourth and fifth convolutional layers uses max pooling over a 3*3 region with a stride of 1, so that the feature map obtained by pooling the output of the fourth convolutional layer has the same size as the output feature map of the fourth convolutional layer;
the activation function of each convolutional layer is the ReLU function;
the feature map output by the fourth convolutional layer and the feature map output by the fifth convolutional layer are concatenated along the channel dimension and then input to the deconvolution layer; the feature map output by the deconvolution layer and the feature map output by the third convolutional layer are concatenated along the channel dimension and then input to the sixth convolutional layer; and the eighth convolutional layer outputs the estimated density map of the image to be tested as the output of each column of the convolutional neural network model.
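
To see why the two concatenations described above are shape-compatible, the following sketch traces feature-map sizes through one column. It rests on several assumptions that the source does not state: the convolutions preserve spatial size, the 3*3 stride-1 pooling uses a padding of 1, and the deconvolution layer up-samples by a factor of 2 (the factor needed to match the third convolutional layer's output). The function names are hypothetical.

```python
def pool_out(size, kernel, stride, padding=0):
    """Output length of a pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_column(height, width):
    """Trace feature-map sizes through one column (size-preserving convs assumed)."""
    shapes = {}
    h, w = height, width
    shapes["conv1"] = (h, w)
    # 2*2 max pooling with stride 2 precedes conv2, conv3 and conv4.
    for name in ("conv2", "conv3", "conv4"):
        h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)
        shapes[name] = (h, w)
    # 3*3 max pooling with stride 1; padding of 1 is assumed to keep the size.
    h5, w5 = pool_out(h, 3, 1, padding=1), pool_out(w, 3, 1, padding=1)
    shapes["conv5"] = (h5, w5)
    # conv4 and conv5 outputs are concatenated; deconv assumed to up-sample by 2.
    shapes["deconv"] = (2 * h5, 2 * w5)
    # deconv output is concatenated with conv3 output and fed to conv6, conv7.
    shapes["conv6"] = shapes["deconv"]
    shapes["conv7"] = shapes["deconv"]
    return shapes


shapes = trace_column(256, 384)
```

Under these assumptions, and for input sizes divisible by 8, the fourth and fifth convolutional layers produce maps of equal size, and the deconvolution output matches the third convolutional layer's output, so both channel-wise concatenations are well defined.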
Preferably, calculating the number of people in the image to be tested according to the target estimated density map of the image to be tested includes:
inputting the image T to be tested into the target multi-scale multi-column convolutional neural network model to obtain the estimated density map of the image T (Figure PCTCN2020075795-appb-000004), and then computing the sum of all pixel values in the estimated density map (Figure PCTCN2020075795-appb-000005) to obtain the number of people in the image to be tested (Figure PCTCN2020075795-appb-000006).
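The counting step reduces to summing every pixel of the estimated density map. A minimal sketch, assuming the density map is available as a nested list of floats; the function name `count_from_density_map` and the toy values are illustrative and do not come from the source.

```python
def count_from_density_map(density_map):
    """Return the estimated head count: the sum of all pixel values."""
    return sum(sum(row) for row in density_map)


# Toy 3*3 density map (hypothetical values) whose mass adds up to two people.
toy_density_map = [
    [0.0, 0.25, 0.25],
    [0.0, 0.25, 0.25],
    [0.5, 0.5, 0.0],
]

estimated_count = count_from_density_map(toy_density_map)
print(estimated_count)
```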
The present invention also provides an apparatus for counting dense crowds, including:
an input module, configured to input the image to be tested into a pre-trained target multi-scale multi-column convolutional neural network model, wherein the target multi-scale multi-column convolutional neural network model includes multiple parallel columns of convolutional neural networks, and each column includes multiple convolutional layers with different convolution kernel sizes and numbers;
a processing module, configured to input the image to be tested into each column of the convolutional neural network separately, process the image with the convolutional layers of each column, and fuse the feature maps output by preselected convolutional layers in each column, so as to obtain the estimated density map output by each column;
an output module, configured to fuse the estimated density maps output by the columns to obtain the target estimated density map of the image to be tested;
a calculation module, configured to calculate the number of people in the image to be tested according to the target estimated density map of the image to be tested.
Preferably, the apparatus further includes, before the output module:
a training module, configured to filter a pre-created crowd image data set with a Gaussian filter and obtain the density map of each image in the crowd image data set, thereby constructing a target training set;
and to train the multi-scale multi-column convolutional neural network model on the target training set to obtain the trained target multi-scale multi-column convolutional neural network model.
The present invention also provides a device for counting dense crowds, including:
a memory for storing a computer program; and a processor for implementing the steps of the above dense crowd counting method when executing the computer program.
The present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above dense crowd counting method are implemented.
The dense crowd counting method provided by the present invention uses a pre-trained target multi-scale multi-column convolutional neural network model to make a prediction for the image to be tested. The target model includes multiple parallel columns of convolutional neural networks. After the image to be tested is input into the target model, it is fed into each column separately. Each column includes multiple convolutional layers with different convolution kernel sizes and numbers; the different convolutional layers in each column process the image to be tested, and the feature maps output by preselected convolutional layers in each column are fused, so that features of the image at different scales are extracted. This addresses a problem in prior-art convolutional neural networks: some features extracted by earlier convolutional layers may be discarded in later stages, leaving too few features and reducing the accuracy of the prediction for the image to be tested. By introducing the multi-scale idea, the method combines features extracted by earlier convolutional layers with features extracted by later ones, that is, features at different levels of detail, before extracting further features. This compensates for features that a conventional neural network may discard when the feature maps of its earlier convolutional layers are pooled, and improves both the performance of the dense crowd counting network and the accuracy of its predictions on dense crowd images.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flowchart of a first specific embodiment of the dense crowd counting method provided by the present invention;
Figure 2 is a structure diagram of the multi-scale multi-column convolutional neural network provided by the present invention;
Figure 3 is a flowchart of a second specific embodiment of the dense crowd counting method provided by the present invention;
Figure 4 is a structural block diagram of an apparatus for counting dense crowds provided by an embodiment of the present invention.
Detailed description
The core of the present invention is to provide a dense crowd counting method, apparatus, device and computer-readable storage medium that improve the performance of the neural network used for dense crowd counting and the accuracy of predictions on dense crowd images.
To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to the drawings and specific embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Please refer to Figure 1, which is a flowchart of a first specific embodiment of the dense crowd counting method provided by the present invention. The specific steps are as follows:
Step S101: input the image to be tested into a pre-trained target multi-scale multi-column convolutional neural network model, wherein the target multi-scale multi-column convolutional neural network model includes multiple parallel columns of convolutional neural networks, and each column includes multiple convolutional layers with different convolution kernel sizes and numbers;
Before the image to be tested is input into the pre-trained target model, the multi-scale multi-column convolutional neural network (SaMCNN) must be trained.
When training the multi-scale multi-column convolutional neural network, a Gaussian filter is first applied to the pre-created crowd image data set (Figure PCTCN2020075795-appb-000007) to obtain the density map M_i of each image X_i in the data set, thereby constructing the target training set (Figure PCTCN2020075795-appb-000008). Here X_i is the i-th image of the crowd image data set, of size m*n; Y_i is the head-coordinate point map corresponding to the i-th image, also of size m*n; and N is the total number of images in the crowd image data set. The target training set (Figure PCTCN2020075795-appb-000009) is then used to train the multi-scale multi-column convolutional neural network model, yielding the trained target multi-scale multi-column convolutional neural network model.
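The density-map construction can be illustrated with a small pure-Python sketch. It is not the source's implementation: the fixed `sigma`, the truncation `radius`, the per-head renormalisation at image borders, and the function name `gaussian_density_map` are all assumptions made for illustration, since the text only says that a Gaussian filter is used.

```python
import math


def gaussian_density_map(heads, height, width, sigma=1.5, radius=4):
    """Spread each annotated head (row, col) into a small Gaussian blob.

    Each blob is renormalised to sum to 1 (assumption), so the whole map
    sums to the number of annotated heads.
    """
    density = [[0.0] * width for _ in range(height)]
    for r0, c0 in heads:
        weights = {}
        for r in range(max(0, r0 - radius), min(height, r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(width, c0 + radius + 1)):
                dist2 = (r - r0) ** 2 + (c - c0) ** 2
                weights[(r, c)] = math.exp(-dist2 / (2 * sigma ** 2))
        total = sum(weights.values())
        for (r, c), w in weights.items():
            density[r][c] += w / total
    return density


# Hypothetical head annotations on a 12*16 image, including one at a corner.
heads = [(5, 5), (6, 7), (0, 0)]
density_map = gaussian_density_map(heads, height=12, width=16)
total_mass = sum(sum(row) for row in density_map)
print(round(total_mass, 6))
```

Normalising each blob is what makes the later counting step (summing the density map) return the number of annotated heads.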
As shown in Figure 2, the multi-scale multi-column convolutional neural network may include multiple columns of convolutional neural networks; this embodiment takes three parallel columns as an example. Each column includes a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, a fifth convolutional layer, a deconvolution layer, a sixth convolutional layer and a seventh convolutional layer. The convolution kernel size of the first convolutional layer differs from that of the other convolutional layers; the second, third, fourth, fifth and sixth convolutional layers have the same convolution kernel size; and the third, fourth, fifth and sixth convolutional layers have the same number of convolution kernels. The activation function of each convolutional layer is the ReLU function.
The pooling layers between the first, second, third and fourth convolutional layers use max pooling over a 2*2 region with a stride of 2. The pooling layer between the fourth and fifth convolutional layers uses max pooling over a 3*3 region with a stride of 1, so that the feature map output by the fourth convolutional layer keeps the same size after pooling.
The feature map output by the fourth convolutional layer and the feature map output by the fifth convolutional layer are concatenated along the channel dimension and input to the deconvolution layer; the feature map output by the deconvolution layer and the feature map output by the third convolutional layer are concatenated along the channel dimension and input to the sixth convolutional layer; and the eighth convolutional layer outputs the estimated density map of the image to be tested as the output of each column of the convolutional neural network.
The estimated density maps of the current crowd image output by the columns are concatenated along the channel dimension and passed through a final convolutional layer with a 1*1 convolution kernel; the feature map output by this final layer is mapped to the target estimated density map of the current crowd image, which serves as the network output of the multi-scale multi-column convolutional neural network model.
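A 1*1 convolution over channel-concatenated maps is a per-pixel weighted sum of the column outputs. The sketch below shows only that arithmetic; the function name `fuse_columns_1x1`, the fixed weights and the toy maps are hypothetical, whereas in the trained model the weights and bias would be learned.

```python
def fuse_columns_1x1(column_maps, weights, bias=0.0):
    """Fuse per-column density maps with a single 1*1 convolution kernel.

    column_maps: one 2-D map (nested list) per column, all the same size.
    weights:     one scalar per column (one input channel each).
    """
    height = len(column_maps[0])
    width = len(column_maps[0][0])
    return [
        [
            bias + sum(w * m[r][c] for w, m in zip(weights, column_maps))
            for c in range(width)
        ]
        for r in range(height)
    ]


# Three hypothetical 2*2 column outputs and hypothetical fusion weights.
column_a = [[1.0, 0.0], [0.0, 2.0]]
column_b = [[2.0, 0.0], [0.0, 2.0]]
column_c = [[0.0, 4.0], [0.0, 2.0]]

fused = fuse_columns_1x1([column_a, column_b, column_c], weights=[0.5, 0.25, 0.25])
print(fused)
```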
Step S102: input the image to be tested into each column of the convolutional neural network separately, process the image with the convolutional layers of each column, and fuse the feature maps output by preselected convolutional layers in each column, so as to obtain the estimated density map output by each column;
The image to be tested is input into the target multi-scale multi-column convolutional neural network model and fed into each of its columns separately. The convolutional layers and pooling layers of each column process the image. Between the fourth and fifth convolutional layers of each column, max pooling over a 3*3 region with a stride of 1 is used so that the feature map size is the same before and after pooling, which makes it possible to concatenate the feature maps of the two convolutions along the channel dimension. After the fifth convolutional layer, a deconvolution layer upsamples the preceding feature maps, and the result is concatenated along the channel dimension with the feature map produced by the third convolutional layer.
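The data flow through one column can be followed by tracing feature-map shapes. This is a sketch under several assumptions that the text does not state: every convolution preserves spatial size, the 3*3 pooling is padded so that it preserves size, the deconvolution upsamples by a factor of 2, and the filter counts (`c1`, `c2`, `c`, `c_deconv`) are placeholders. The text lists seven convolutional layers but names an eighth as the output layer; the sketch simply treats the last listed layer as the output.

```python
def trace_column(height, width, c1=16, c2=32, c=64, c_deconv=64):
    """Return (channels, height, width) after each stage of one column."""
    if height % 8 or width % 8:
        raise ValueError("this sketch assumes sizes divisible by 8")
    shapes = {}
    h, w = height, width
    shapes["conv1"] = (c1, h, w)
    h, w = h // 2, w // 2            # 2*2 max pooling, stride 2
    shapes["conv2"] = (c2, h, w)
    h, w = h // 2, w // 2            # 2*2 max pooling, stride 2
    shapes["conv3"] = (c, h, w)
    h, w = h // 2, w // 2            # 2*2 max pooling, stride 2
    shapes["conv4"] = (c, h, w)
    # 3*3 max pooling, stride 1: size unchanged (padding assumed)
    shapes["conv5"] = (c, h, w)
    shapes["concat_conv4_conv5"] = (2 * c, h, w)
    h, w = 2 * h, 2 * w              # deconvolution, x2 upsampling (assumed)
    shapes["deconv"] = (c_deconv, h, w)
    shapes["concat_deconv_conv3"] = (c_deconv + c, h, w)
    shapes["conv6"] = (c, h, w)
    shapes["density_map"] = (1, h, w)
    return shapes


shapes = trace_column(768, 1024)
for name, shape in shapes.items():
    print(name, shape)
```

Under these assumptions the deconvolution output matches the spatial size of the third convolutional layer's output, which is what channel-wise concatenation requires; the resulting output resolution is a consequence of the assumptions and is not stated in the text.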
Step S103: fuse the estimated density maps output by the columns of the convolutional neural network to obtain the target estimated density map of the image to be tested;
Step S104: calculate the number of people in the image to be tested according to the target estimated density map of the image to be tested.
The method provided in this embodiment uses a multi-scale multi-column convolutional neural network to test the image to be tested. Compared with a multi-column convolutional neural network, the multi-scale multi-column convolutional neural network increases the number of layers in each column and introduces the multi-scale idea, combining the feature maps extracted by earlier convolutional layers with those extracted by later convolutional layers. This improves the performance of the neural network used for dense crowd counting and the accuracy of predictions on dense crowd images.
Based on the above embodiment, in this embodiment the second part of the Shanghai tech data set may be selected as the crowd image data set, and the density maps of the second-part images are used to train the multi-scale multi-column convolutional neural network model. Please refer to Figure 3, which is a flowchart of a second specific embodiment of the dense crowd counting method provided by the present invention. The specific steps are as follows:
Step S301: filter the crowd images of the second part of the Shanghai tech data set with a Gaussian filter, obtain the density maps of the second-part crowd images, and construct the target training set;
In this embodiment the second part of the Shanghai tech data set may be selected as the crowd image data set (Figure PCTCN2020075795-appb-000010), where X_i is the i-th image of the crowd image data set, of size 768*1024; Y_i is the head-coordinate point map corresponding to the i-th image, also of size 768*1024; and N is the total number of images in the crowd image data set.
The Shanghai tech data set contains 1198 annotated images and 330,165 head-center annotations, and is divided into two parts. The first part consists of 482 images crawled at random from the Internet, of which 300 are used for training and 182 for testing. The second part consists of 716 images taken on the streets of Shanghai, of which 400 are used for training and 316 for testing.
Step S302: train the multi-scale multi-column convolutional neural network model on the target training set to obtain the trained target multi-scale multi-column convolutional neural network model;
Step S303: input the image T to be tested into the target multi-scale multi-column convolutional neural network model, wherein the target model includes multiple parallel columns of convolutional neural networks, and each column includes multiple convolutional layers with different convolution kernel sizes and numbers;
Step S304: after the image T to be tested is input into the target multi-scale multi-column convolutional neural network model, output the estimated density map of the image T (Figure PCTCN2020075795-appb-000011);
Step S305: compute the sum of all pixel values in the estimated density map (Figure PCTCN2020075795-appb-000012) to obtain the number of people in the image to be tested (Figure PCTCN2020075795-appb-000013).
The multi-scale multi-column convolutional neural network model provided in this embodiment was compared with the multi-column convolutional neural network model on crowd counting over the same data set. As Table 1 shows, both the mean absolute error (MAE) and the mean squared error (MSE) of the counts produced by the network model proposed in this embodiment are smaller than those of the prior-art network model, indicating better performance.
Table 1: Comparison of crowd counting results
Figure PCTCN2020075795-appb-000014
Figure PCTCN2020075795-appb-000015
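The two metrics named in the comparison can be computed from per-image true and predicted counts. The sketch below uses hypothetical counts, not the values of Table 1, and the function names `mean_absolute_error` and `mean_squared_error` are illustrative. Some crowd-counting work reports the square root of the second quantity under the name MSE; the text does not say which convention the table follows, so the plain mean of squared errors is shown here.

```python
def mean_absolute_error(true_counts, predicted_counts):
    """Average absolute difference between true and predicted counts."""
    pairs = list(zip(true_counts, predicted_counts))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


def mean_squared_error(true_counts, predicted_counts):
    """Average squared difference between true and predicted counts."""
    pairs = list(zip(true_counts, predicted_counts))
    return sum((t - p) ** 2 for t, p in pairs) / len(pairs)


# Hypothetical counts for four test images.
true_counts = [10, 20, 30, 40]
predicted_counts = [12, 18, 33, 39]

mae = mean_absolute_error(true_counts, predicted_counts)
mse = mean_squared_error(true_counts, predicted_counts)
print(mae, mse)
```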
Please refer to Figure 4, which is a structural block diagram of an apparatus for counting dense crowds provided by an embodiment of the present invention. The apparatus may include:
an input module 100, configured to input the image to be tested into a pre-trained target multi-scale multi-column convolutional neural network model, wherein the target model includes multiple parallel columns of convolutional neural networks, and each column includes multiple convolutional layers with different convolution kernel sizes and numbers;
a processing module 200, configured to input the image to be tested into each column of the convolutional neural network separately, process the image with the convolutional layers of each column, and fuse the feature maps output by preselected convolutional layers in each column, so as to obtain the estimated density map output by each column;
an output module 300, configured to fuse the estimated density maps output by the columns to obtain the target estimated density map of the image to be tested;
a calculation module 400, configured to calculate the number of people in the image to be tested according to the target estimated density map of the image to be tested.
The dense crowd counting apparatus of this embodiment implements the dense crowd counting method described above, so its specific implementation can be found in the method embodiments. For example, the input module 100, the processing module 200, the output module 300 and the calculation module 400 implement steps S101, S102, S103 and S104 of the method, respectively. Their specific implementation can therefore be understood from the descriptions of the corresponding embodiments and is not repeated here.
A specific embodiment of the present invention also provides a device for counting dense crowds, including: a memory for storing a computer program; and a processor for implementing the steps of the above dense crowd counting method when executing the computer program.
A specific embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above dense crowd counting method are implemented.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and the parts that are the same or similar can be cross-referenced between embodiments. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; see the description of the method for the relevant details.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of their function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may implement the described functions in different ways for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.
The steps of the method or algorithm described in the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The dense crowd counting method, apparatus, device and computer-readable storage medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method and its core idea. It should be noted that a person of ordinary skill in the art can make improvements and modifications to the present invention without departing from its principle, and these improvements and modifications also fall within the protection scope of the claims of the present invention.

Claims (10)

  1. A method for counting dense crowds, characterized in that it includes:
    inputting an image to be tested into a pre-trained target multi-scale multi-column convolutional neural network model, wherein the target multi-scale multi-column convolutional neural network model includes multiple parallel columns of convolutional neural networks, and each column includes multiple convolutional layers with different convolution kernel sizes and numbers;
    inputting the image to be tested into each column of the convolutional neural network separately, processing the image with the convolutional layers of each column, and fusing the feature maps output by preselected convolutional layers in each column, so as to obtain the estimated density map output by each column;
    fusing the estimated density maps output by the columns of the convolutional neural network to obtain the target estimated density map of the image to be tested;
    calculating the number of people in the image to be tested according to the target estimated density map of the image to be tested.
  2. The method according to claim 1, characterized in that, before the image to be tested is input into the pre-trained target multi-scale multi-column convolutional neural network model, the method includes:
    filtering a pre-created crowd image data set with a Gaussian filter and obtaining the density map of each image in the crowd image data set, thereby constructing a target training set;
    training the multi-scale multi-column convolutional neural network model on the target training set to obtain the trained target multi-scale multi-column convolutional neural network model.
  3. The method according to claim 2, characterized in that filtering the pre-created crowd image data set with a Gaussian filter and obtaining the density map of each image in the crowd image data set, thereby constructing the target training set, includes:
    obtaining a pre-collected crowd image data set (Figure PCTCN2020075795-appb-100001), where X_i is the i-th image of the crowd image data set, of size m*n; Y_i is the head-coordinate point map corresponding to the i-th image, of size m*n; and N is the total number of images in the crowd image data set;
    filtering each image X_i in the crowd image data set (Figure PCTCN2020075795-appb-100002) with a Gaussian filter to obtain the density map M_i of each image X_i, and constructing the target training set (Figure PCTCN2020075795-appb-100003) from the density maps M_i of the images X_i.
  4. The method according to claim 2, characterized in that training the multi-scale multi-column convolutional neural network model with the target training set comprises:
    Inputting the current crowd image in the target training set into each column of convolutional neural network of the multi-scale multi-column convolutional neural network model;
    wherein the columns of convolutional neural networks in the multi-scale multi-column convolutional neural network model are parallel to one another, and the columns have the same network structure except for the size and number of their convolution kernels;
    Concatenating, along the channel dimension, the estimated density maps of the current crowd image output by the columns of convolutional neural networks, passing the result through an overall convolutional layer with a convolution kernel size of 1*1, and mapping the feature map output by the overall convolutional layer to the target estimated density map of the current crowd image, so that the target estimated density map of the current crowd image serves as the network output of the multi-scale multi-column convolutional neural network model.
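The fusion step of claim 4 amounts to a per-pixel weighted sum: concatenating K single-channel maps along the channel dimension and applying a 1*1 convolution with one output channel computes, at every pixel, a linear combination of the K column outputs. The sketch below shows this under several assumptions that the claim does not state: three columns, one output channel per column, no activation after the 1*1 layer, and hand-picked weights standing in for learned ones. The name `fuse_columns` is hypothetical.

```python
def fuse_columns(column_maps, weights, bias=0.0):
    """Apply a 1*1 convolution across channel-stacked column outputs.

    column_maps: list of K density maps (each a list of rows) of equal size.
    weights: K scalars, one per input channel (hypothetical learned values).
    """
    if len(column_maps) != len(weights):
        raise ValueError("one weight per column is required")
    height, width = len(column_maps[0]), len(column_maps[0][0])
    fused = [[bias] * width for _ in range(height)]
    for channel, w in zip(column_maps, weights):
        for y in range(height):
            for x in range(width):
                # A 1*1 kernel looks at a single pixel, so the output at
                # (y, x) depends only on the K channel values at (y, x).
                fused[y][x] += w * channel[y][x]
    return fused

# Estimated density maps from three hypothetical columns (2*2 for brevity).
col_a = [[0.0, 0.2], [0.4, 0.0]]
col_b = [[0.1, 0.1], [0.3, 0.1]]
col_c = [[0.0, 0.3], [0.5, 0.0]]
target = fuse_columns([col_a, col_b, col_c], weights=[0.5, 0.25, 0.25])
```

The fused map has the same height and width as each column output; only the channel dimension is collapsed.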
  5. The method according to claim 4, characterized in that each column of convolutional neural network of the multi-scale multi-column convolutional neural network model comprises:
    a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, a fifth convolutional layer, a deconvolutional layer, a sixth convolutional layer, and a seventh convolutional layer;
    wherein the convolution kernel size of the first convolutional layer differs from that of the other convolutional layers; the second, third, fourth, fifth, and sixth convolutional layers have the same convolution kernel size; and the third, fourth, fifth, and sixth convolutional layers have the same number of convolution kernels;
    the pooling layers between the first, second, third, and fourth convolutional layers use max pooling over a 2*2 region with a stride of 2;
    the pooling layer between the fourth convolutional layer and the fifth convolutional layer uses max pooling over a 3*3 region with a stride of 1, so that the feature map obtained by pooling the output of the fourth convolutional layer has the same size as the feature map output by the fourth convolutional layer;
    the activation function of each convolutional layer is the ReLU function;
    the feature map output by the fourth convolutional layer and the feature map output by the fifth convolutional layer are concatenated along the channel dimension and input to the deconvolutional layer; the feature map output by the deconvolutional layer and the feature map output by the third convolutional layer are concatenated along the channel dimension and input to the sixth convolutional layer; and the eighth convolutional layer outputs the estimated density map of the image to be tested as the output of each column of convolutional neural network.
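To see how the feature-map sizes in claim 5 line up, the following sketch traces (channels, height, width) through one column. Several details are assumptions, not statements from the claim: stride-1 "same"-padded convolutions, padding of 1 on the 3*3 pooling, a deconvolution that doubles height and width so its output matches the third convolutional layer, a single-channel output, and the channel counts `c1`, `c2`, `c`. The claim names an "eighth" convolutional layer as the output although only seven are listed; the sketch assumes the last listed (seventh) layer produces the density map.

```python
def pool_out(size, kernel, stride, padding=0):
    """Output length of a pooling window along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

def column_shapes(height, width, c1=16, c2=32, c=64):
    """Trace (channels, height, width) of each layer output in one column."""
    shapes = {}
    h, w = height, width
    shapes["conv1"] = (c1, h, w)
    h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)      # 2*2 max pool, stride 2
    shapes["conv2"] = (c2, h, w)
    h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)
    shapes["conv3"] = (c, h, w)
    h3, w3 = h, w                                    # kept for the skip concat
    h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)
    shapes["conv4"] = (c, h, w)
    # 3*3 max pool, stride 1; padding 1 (assumed) keeps the size unchanged.
    h, w = pool_out(h, 3, 1, padding=1), pool_out(w, 3, 1, padding=1)
    shapes["conv5"] = (c, h, w)
    # conv4 and conv5 outputs are concatenated along the channel axis.
    shapes["deconv_in"] = (2 * c, h, w)
    h, w = 2 * h, 2 * w                              # assumed x2 upsampling
    if (h, w) != (h3, w3):
        raise ValueError("deconv output does not match conv3 output; "
                         "try an input size divisible by 8")
    shapes["deconv"] = (c, h, w)
    # deconv output and conv3 output are concatenated along the channel axis.
    shapes["conv6_in"] = (2 * c, h, w)
    shapes["conv6"] = (c, h, w)
    shapes["conv7"] = (1, h, w)                      # assumed 1-channel map
    return shapes

shapes = column_shapes(64, 96)
for name, shape in shapes.items():
    print(name, shape)
```

Under these assumptions the column's density map is one quarter of the input height and width, so ground-truth density maps would need to be resized to the same resolution before a pixel-wise loss could be computed.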
  6. The method according to any one of claims 1 to 5, characterized in that calculating the number of people in the image to be tested according to the target estimated density map of the image to be tested comprises:
    Inputting the image to be tested T into the target multi-scale multi-column convolutional neural network model to obtain the estimated density map of the image to be tested T
    Figure PCTCN2020075795-appb-100004
    and then calculating the sum of all pixel values in the estimated density map
    Figure PCTCN2020075795-appb-100005
    to obtain the number of people in the image to be tested
    Figure PCTCN2020075795-appb-100006
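The counting step of claim 6 is a plain sum over the estimated density map. A minimal sketch follows, with made-up pixel values and a hypothetical function name; rounding the sum to an integer for display is an assumption, since the claim only specifies the sum.

```python
def count_people(density):
    """Sum every pixel of the estimated density map to get the crowd count."""
    return sum(sum(row) for row in density)

# Made-up estimated density map for an image to be tested.
estimated = [
    [0.00, 0.10, 0.25, 0.05],
    [0.10, 0.60, 0.90, 0.20],
    [0.05, 0.30, 0.40, 0.05],
]
people = count_people(estimated)
print(round(people))
```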
  7. A dense crowd counting device, characterized in that it comprises:
    an input module, configured to input an image to be tested into a pre-trained target multi-scale multi-column convolutional neural network model; wherein the target multi-scale multi-column convolutional neural network model comprises multiple parallel columns of convolutional neural networks, and each column of convolutional neural network comprises multiple convolutional layers with different convolution kernel sizes and numbers;
    a processing module, configured to input the image to be tested into each column of convolutional neural network, process the image to be tested with the convolutional layers of each column of convolutional neural network, and fuse the feature maps output by preselected convolutional layers of each column of convolutional neural network, so as to obtain the estimated density map output by each column of convolutional neural network;
    an output module, configured to fuse the estimated density maps output by the columns of convolutional neural networks to obtain the target estimated density map of the image to be tested;
    a calculation module, configured to calculate the number of people in the image to be tested according to the target estimated density map of the image to be tested.
  8. The device according to claim 7, characterized in that it further comprises, preceding the output module:
    a training module, configured to filter a pre-created crowd image data set with a Gaussian filter and obtain the density map of each image in the crowd image data set, thereby constructing a target training set;
    and to train a multi-scale multi-column convolutional neural network model with the target training set to obtain the trained target multi-scale multi-column convolutional neural network model.
  9. A dense crowd counting apparatus, characterized in that it comprises:
    a memory, configured to store a computer program;
    a processor, configured to implement, when executing the computer program, the steps of the dense crowd counting method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the dense crowd counting method according to any one of claims 1 to 7.
PCT/CN2020/075795 2019-02-21 2020-02-19 Dense crowd counting method, apparatus and device, and storage medium WO2020169043A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910129612.3A CN109858461B (en) 2019-02-21 2019-02-21 Method, device, equipment and storage medium for counting dense population
CN201910129612.3 2019-02-21

Publications (1)

Publication Number Publication Date
WO2020169043A1 true WO2020169043A1 (en) 2020-08-27

Family

ID=66898471

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/075795 WO2020169043A1 (en) 2019-02-21 2020-02-19 Dense crowd counting method, apparatus and device, and storage medium

Country Status (2)

Country Link
CN (1) CN109858461B (en)
WO (1) WO2020169043A1 (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101190A (en) * 2020-09-11 2020-12-18 西安电子科技大学 Remote sensing image classification method, storage medium and computing device
CN112132023A (en) * 2020-09-22 2020-12-25 上海应用技术大学 Crowd counting method based on multi-scale context enhanced network
CN112396000A (en) * 2020-11-19 2021-02-23 中山大学 Method for constructing multi-mode dense prediction depth information transmission model
CN112699741A (en) * 2020-12-10 2021-04-23 广州广电运通金融电子股份有限公司 Method, system and equipment for calculating internal congestion degree of bus
CN112733714A (en) * 2021-01-11 2021-04-30 北京大学 Automatic crowd counting image identification method based on VGG network
CN112818849A (en) * 2021-01-31 2021-05-18 南京工业大学 Crowd density detection algorithm based on context attention convolutional neural network of counterstudy
CN112861795A (en) * 2021-03-12 2021-05-28 云知声智能科技股份有限公司 Method and device for detecting salient target of remote sensing image based on multi-scale feature fusion
CN112966600A (en) * 2021-03-04 2021-06-15 上海应用技术大学 Adaptive multi-scale context aggregation method for crowded crowd counting
CN113139489A (en) * 2021-04-30 2021-07-20 广州大学 Crowd counting method and system based on background extraction and multi-scale fusion network
CN113205078A (en) * 2021-05-31 2021-08-03 上海应用技术大学 Multi-branch-based progressive attention-enhancing crowd counting method
CN113283356A (en) * 2021-05-31 2021-08-20 上海应用技术大学 Multi-level attention scale perception crowd counting method
CN113468995A (en) * 2021-06-22 2021-10-01 之江实验室 Crowd counting method based on density grade perception
CN113516029A (en) * 2021-04-28 2021-10-19 上海科技大学 Image crowd counting method, device, medium and terminal based on partial annotation
CN113687326A (en) * 2021-07-13 2021-11-23 广州杰赛科技股份有限公司 Vehicle-mounted radar echo noise reduction method, device, equipment and medium
CN113807274A (en) * 2021-09-23 2021-12-17 山东建筑大学 Crowd counting method and system based on image inverse perspective transformation
CN114120233A (en) * 2021-11-29 2022-03-01 上海应用技术大学 Training method of lightweight pyramid hole convolution aggregation network for crowd counting
CN114255203A (en) * 2020-09-22 2022-03-29 中国农业大学 Fry quantity estimation method and system
CN114463694A (en) * 2022-01-06 2022-05-10 中山大学 Semi-supervised crowd counting method and device based on pseudo label
CN114639070A (en) * 2022-03-15 2022-06-17 福州大学 Crowd movement flow analysis method integrating attention mechanism
CN114973112A (en) * 2021-02-19 2022-08-30 四川大学 Scale-adaptive dense crowd counting method based on antagonistic learning network
CN116311083A (en) * 2023-05-19 2023-06-23 华东交通大学 Crowd counting model training method and system
CN116704266A (en) * 2023-07-28 2023-09-05 国网浙江省电力有限公司信息通信分公司 Power equipment fault detection method, device, equipment and storage medium
CN117405570A (en) * 2023-12-13 2024-01-16 长沙思辰仪器科技有限公司 Automatic detection method and system for oil particle size counter
CN117670892A (en) * 2023-12-07 2024-03-08 百鸟数据科技(北京)有限责任公司 Aquatic bird density estimation method and device, computer equipment and storage medium
CN114639070B (en) * 2022-03-15 2024-06-04 福州大学 Crowd movement flow analysis method integrating attention mechanism

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109858461B (en) * 2019-02-21 2023-06-16 苏州大学 Method, device, equipment and storage medium for counting dense population
CN110674704A (en) * 2019-09-05 2020-01-10 同济大学 Crowd density estimation method and device based on multi-scale expansion convolutional network
CN110889360A (en) * 2019-11-20 2020-03-17 山东师范大学 Crowd counting method and system based on switching convolutional network
CN110956122B (en) * 2019-11-27 2022-08-02 深圳市商汤科技有限公司 Image processing method and device, processor, electronic device and storage medium
CN111062274B (en) * 2019-12-02 2023-11-28 汇纳科技股份有限公司 Context-aware embedded crowd counting method, system, medium and electronic equipment
CN111126177B (en) * 2019-12-05 2023-05-09 杭州飞步科技有限公司 Method and device for counting number of people
CN111178235A (en) * 2019-12-27 2020-05-19 卓尔智联(武汉)研究院有限公司 Target quantity determination method, device, equipment and storage medium
CN113496150B (en) * 2020-03-20 2023-03-21 长沙智能驾驶研究院有限公司 Dense target detection method and device, storage medium and computer equipment
CN111523470B (en) * 2020-04-23 2022-11-18 苏州浪潮智能科技有限公司 Pedestrian re-identification method, device, equipment and medium
CN111626134B (en) * 2020-04-28 2023-04-21 上海交通大学 Dense crowd counting method, system and terminal based on hidden density distribution
CN111783934A (en) * 2020-05-15 2020-10-16 北京迈格威科技有限公司 Convolutional neural network construction method, device, equipment and medium
CN111640101B (en) * 2020-05-29 2022-04-29 苏州大学 Ghost convolution characteristic fusion neural network-based real-time traffic flow detection system and method
CN111652152A (en) * 2020-06-04 2020-09-11 上海眼控科技股份有限公司 Crowd density detection method and device, computer equipment and storage medium
CN111723742A (en) * 2020-06-19 2020-09-29 苏州大学 Crowd density analysis method, system and device and computer readable storage medium
CN111950443B (en) * 2020-08-10 2023-12-29 北京师范大学珠海分校 Dense crowd counting method of multi-scale convolutional neural network
US20240005649A1 (en) * 2020-09-07 2024-01-04 Intel Corporation Poly-scale kernel-wise convolution for high-performance visual recognition applications
CN112712518B (en) * 2021-01-13 2024-01-09 中国农业大学 Fish counting method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018052586A1 (en) * 2016-09-14 2018-03-22 Konica Minolta Laboratory U.S.A., Inc. Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks
CN109101930A (en) * 2018-08-18 2018-12-28 华中科技大学 A kind of people counting method and system
CN109214337A (en) * 2018-09-05 2019-01-15 苏州大学 A kind of Demographics' method, apparatus, equipment and computer readable storage medium
CN109271960A (en) * 2018-10-08 2019-01-25 燕山大学 A kind of demographic method based on convolutional neural networks
CN109858461A (en) * 2019-02-21 2019-06-07 苏州大学 A kind of method, apparatus, equipment and storage medium that dense population counts

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SIQI TANG, WEI TAO, LIANGLIANG ZHANG, ZHISONG PAN: "A Deep Crowd Counting Algorithm Based on Multi-Column Feature Map Fusion", JOURNAL OF ZHENGZHOU UNIVERSITY (NATURAL SCIENCE EDITION), vol. 50, no. 2, 30 June 2018 (2018-06-30), pages 69 - 74, XP055729523, DOI: 10.13705/j.issn.1671-6841.2017204 *

Cited By (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101190A (en) * 2020-09-11 2020-12-18 西安电子科技大学 Remote sensing image classification method, storage medium and computing device
CN112101190B (en) * 2020-09-11 2023-11-03 西安电子科技大学 Remote sensing image classification method, storage medium and computing device
CN112132023A (en) * 2020-09-22 2020-12-25 上海应用技术大学 Crowd counting method based on multi-scale context enhanced network
CN112132023B (en) * 2020-09-22 2024-05-17 上海应用技术大学 Crowd counting method based on multi-scale context enhancement network
CN114255203A (en) * 2020-09-22 2022-03-29 中国农业大学 Fry quantity estimation method and system
CN114255203B (en) * 2020-09-22 2024-04-09 中国农业大学 Fry quantity estimation method and system
CN112396000A (en) * 2020-11-19 2021-02-23 中山大学 Method for constructing multi-mode dense prediction depth information transmission model
CN112396000B (en) * 2020-11-19 2023-09-05 中山大学 Method for constructing multi-mode dense prediction depth information transmission model
CN112699741A (en) * 2020-12-10 2021-04-23 广州广电运通金融电子股份有限公司 Method, system and equipment for calculating internal congestion degree of bus
CN112733714B (en) * 2021-01-11 2024-03-01 北京大学 VGG network-based automatic crowd counting image recognition method
CN112733714A (en) * 2021-01-11 2021-04-30 北京大学 Automatic crowd counting image identification method based on VGG network
CN112818849A (en) * 2021-01-31 2021-05-18 南京工业大学 Crowd density detection algorithm based on context attention convolutional neural network of counterstudy
CN112818849B (en) * 2021-01-31 2024-03-08 南京工业大学 Crowd density detection algorithm based on context attention convolutional neural network for countermeasure learning
CN114973112B (en) * 2021-02-19 2024-04-05 四川大学 Scale self-adaptive dense crowd counting method based on countermeasure learning network
CN114973112A (en) * 2021-02-19 2022-08-30 四川大学 Scale-adaptive dense crowd counting method based on antagonistic learning network
CN112966600B (en) * 2021-03-04 2024-04-16 上海应用技术大学 Self-adaptive multi-scale context aggregation method for crowded population counting
CN112966600A (en) * 2021-03-04 2021-06-15 上海应用技术大学 Adaptive multi-scale context aggregation method for crowded crowd counting
CN112861795A (en) * 2021-03-12 2021-05-28 云知声智能科技股份有限公司 Method and device for detecting salient target of remote sensing image based on multi-scale feature fusion
CN113516029A (en) * 2021-04-28 2021-10-19 上海科技大学 Image crowd counting method, device, medium and terminal based on partial annotation
CN113516029B (en) * 2021-04-28 2023-11-07 上海科技大学 Image crowd counting method, device, medium and terminal based on partial annotation
CN113139489B (en) * 2021-04-30 2023-09-05 广州大学 Crowd counting method and system based on background extraction and multi-scale fusion network
CN113139489A (en) * 2021-04-30 2021-07-20 广州大学 Crowd counting method and system based on background extraction and multi-scale fusion network
CN113205078B (en) * 2021-05-31 2024-04-16 上海应用技术大学 Crowd counting method based on multi-branch progressive attention-strengthening
CN113205078A (en) * 2021-05-31 2021-08-03 上海应用技术大学 Multi-branch-based progressive attention-enhancing crowd counting method
CN113283356A (en) * 2021-05-31 2021-08-20 上海应用技术大学 Multi-level attention scale perception crowd counting method
CN113283356B (en) * 2021-05-31 2024-04-05 上海应用技术大学 Multistage attention scale perception crowd counting method
CN113468995A (en) * 2021-06-22 2021-10-01 之江实验室 Crowd counting method based on density grade perception
CN113687326A (en) * 2021-07-13 2021-11-23 广州杰赛科技股份有限公司 Vehicle-mounted radar echo noise reduction method, device, equipment and medium
CN113687326B (en) * 2021-07-13 2024-01-05 广州杰赛科技股份有限公司 Vehicle-mounted radar echo noise reduction method, device, equipment and medium
CN113807274B (en) * 2021-09-23 2023-07-04 山东建筑大学 Crowd counting method and system based on image anti-perspective transformation
CN113807274A (en) * 2021-09-23 2021-12-17 山东建筑大学 Crowd counting method and system based on image inverse perspective transformation
CN114120233B (en) * 2021-11-29 2024-04-16 上海应用技术大学 Training method of lightweight pyramid cavity convolution aggregation network for crowd counting
CN114120233A (en) * 2021-11-29 2022-03-01 上海应用技术大学 Training method of lightweight pyramid hole convolution aggregation network for crowd counting
CN114463694B (en) * 2022-01-06 2024-04-05 中山大学 Pseudo-label-based semi-supervised crowd counting method and device
CN114463694A (en) * 2022-01-06 2022-05-10 中山大学 Semi-supervised crowd counting method and device based on pseudo label
CN114639070A (en) * 2022-03-15 2022-06-17 福州大学 Crowd movement flow analysis method integrating attention mechanism
CN114639070B (en) * 2022-03-15 2024-06-04 福州大学 Crowd movement flow analysis method integrating attention mechanism
CN116311083B (en) * 2023-05-19 2023-09-05 华东交通大学 Crowd counting model training method and system
CN116311083A (en) * 2023-05-19 2023-06-23 华东交通大学 Crowd counting model training method and system
CN116704266B (en) * 2023-07-28 2023-10-31 国网浙江省电力有限公司信息通信分公司 Power equipment fault detection method, device, equipment and storage medium
CN116704266A (en) * 2023-07-28 2023-09-05 国网浙江省电力有限公司信息通信分公司 Power equipment fault detection method, device, equipment and storage medium
CN117670892A (en) * 2023-12-07 2024-03-08 百鸟数据科技(北京)有限责任公司 Aquatic bird density estimation method and device, computer equipment and storage medium
CN117405570B (en) * 2023-12-13 2024-03-08 长沙思辰仪器科技有限公司 Automatic detection method and system for oil particle size counter
CN117405570A (en) * 2023-12-13 2024-01-16 长沙思辰仪器科技有限公司 Automatic detection method and system for oil particle size counter

Also Published As

Publication number Publication date
CN109858461A (en) 2019-06-07
CN109858461B (en) 2023-06-16

Similar Documents

Publication Publication Date Title
WO2020169043A1 (en) Dense crowd counting method, apparatus and device, and storage medium
WO2020215985A1 (en) Medical image segmentation method and device, electronic device and storage medium
Wang et al. A deep network solution for attention and aesthetics aware photo cropping
US20210035304A1 (en) Training method for image semantic segmentation model and server
Ma et al. Image retargeting quality assessment: A study of subjective scores and objective metrics
CN110084155B (en) Method, device and equipment for counting dense people and storage medium
WO2019119505A1 (en) Face recognition method and device, computer device and storage medium
CN109214337B (en) Crowd counting method, device, equipment and computer readable storage medium
CN109189991A (en) Repeat video frequency identifying method, device, terminal and computer readable storage medium
WO2020192442A1 (en) Method for generating classifier using a small number of annotated images
CN110879982A (en) Crowd counting system and method
CN109558902A (en) A kind of fast target detection method
CN107481218B (en) Image aesthetic feeling evaluation method and device
WO2021051547A1 (en) Violent behavior detection method and system
CN106372111A (en) Local feature point screening method and system
WO2022205502A1 (en) Image classification model construction method, image classification method, and storage medium
CN110211685B (en) Sugar network screening network structure model based on complete attention mechanism
WO2014036813A1 (en) Method and device for extracting image features
CN111986180A (en) Face forged video detection method based on multi-correlation frame attention mechanism
CN111639230B (en) Similar video screening method, device, equipment and storage medium
CN108875505A (en) Pedestrian neural network based recognition methods and device again
CN109614990A (en) A kind of object detecting device
CN111461211B (en) Feature extraction method for lightweight target detection and corresponding detection method
WO2017202086A1 (en) Image screening method and device
Burrell Using the Gamma‐Poisson model to predict library circulations

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20759601

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20759601

Country of ref document: EP

Kind code of ref document: A1