CN110414301A - Train carriage crowd density estimation method based on double cameras - Google Patents

Train carriage crowd density estimation method based on double cameras

Info

Publication number
CN110414301A
CN110414301A
Authority
CN
China
Prior art keywords
layers
crowd density
camera
feature vector
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810408662.0A
Other languages
Chinese (zh)
Other versions
CN110414301B (en)
Inventor
陈汉嵘
谢晓华
韦宝典
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN201810408662.0A priority Critical patent/CN110414301B/en
Publication of CN110414301A publication Critical patent/CN110414301A/en
Application granted granted Critical
Publication of CN110414301B publication Critical patent/CN110414301B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53Recognition of crowd images, e.g. recognition of crowd congestion
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a train carriage crowd density estimation method based on double cameras, comprising: proposing a multi-view crowd density estimation network consisting of two parts, one part being parameter-sharing convolutional neural networks and the other part fully connected layers, the network being able to distinguish the crowd density grade of the current train carriage. In the model training stage, the model is iteratively optimized with samples labelled with 5 density grades; in the model application stage, frames are sampled and estimated according to the actual operating schedule of the subway. The present invention estimates crowd density with a deep learning method, using features learned automatically by a convolutional neural network in place of previously hand-designed features, so as to improve the accuracy and robustness of crowd density estimation.

Description

Train carriage crowd density estimation method based on double cameras
Technical field
The present invention relates to the technical field of crowd density estimation, and in particular to a train carriage crowd density estimation method based on double cameras.
Background art
Existing crowd density estimation techniques still have many shortcomings. Pixel-based methods are simple and easy to implement, but are only suitable for scenes with low crowd density. Methods based on texture analysis perform fairly well, but their computation is cumbersome and in practice they often cannot run in real time. Methods based on target detection give reliable results in moderately crowded scenes, but lose effectiveness in scenes where the degree of overlap among people is high.
Existing crowd density estimation techniques mainly fall into the following classes:
1) Methods based on pixel statistics [1]. The pixels of the total crowd area and of the crowd edges are counted, and crowd density is estimated from the linear relationship between these pixel features and the total number of people. Foreground, background and edge pixel counts are obtained from the image through background subtraction and edge detection. Such methods are mainly applied to scenes where the crowd distribution is sparse.
2) Methods based on texture analysis [2]. Image texture features are extracted by means of gray-level co-occurrence matrices and wavelet packet decomposition, and classifiers such as support vector machines, AdaBoost and neural networks are then trained on these features. Such methods are mainly applied to scenes where the crowd distribution is dense.
3) Methods based on target detection [3]. Candidate heads are detected with detectors based on Haar-like features and the Haar wavelet transform, an SVM classifier decides whether each candidate is a head, and the density of the whole crowd is finally estimated from the detections.
Summary of the invention
The main object of the present invention is to propose a train carriage crowd density estimation method based on double cameras, so as to overcome the above problems.
To achieve the above object, the train carriage crowd density estimation method based on double cameras comprises the following steps:
S10, preparing training samples: a neural network comprising 4 parameter-sharing convolutional layers and 5 fully connected layers is built; video frames of two different views of the same carriage at the same moment are input, and the network is trained on samples labelled with density grades, wherein the convolutional layers extract feature vectors from the video and the fully connected layers classify the feature vectors extracted by the convolutional layers according to density grade;
S20, neural network training: the neural network is trained by iterative optimization;
S30, application stage: the captured video frames of the current carriage from the two cameras are respectively input to the optimized neural network, and the image classification result of the current carriage is obtained.
Preferably, the 4 convolutional layers in S10 comprise a first Conv layer, a second Conv layer, a third Conv layer and a fourth Conv layer. The first Conv layer has 16 convolution kernels of size 9 × 9 with stride 1; the input image passes through the first Conv layer to produce 16 feature maps, which, after a ReLU (rectified linear unit) layer and a max-pooling layer, yield an output of size 288 × 464 × 16.
Preferably, the second Conv layer has 32 convolution kernels of size 7 × 7 with stride 1; the input produces 32 feature maps through the second Conv layer, followed by ReLU and max-pooling layers, yielding an output of size 144 × 232 × 32.
Preferably, the third Conv layer has 16 convolution kernels of size 7 × 7 with stride 1; the input produces 16 feature maps through the third Conv layer, followed by ReLU and max-pooling layers, yielding an output of size 72 × 116 × 16.
Preferably, the fourth Conv layer has 8 convolution kernels of size 7 × 7 with stride 1; the input produces 8 feature maps through the fourth Conv layer, followed by ReLU and max-pooling layers, yielding an output of size 36 × 58 × 8.
Preferably, the 5 fully connected layers comprise FC5, FC6, FC7, FC8 and a Softmax layer. The two 36 × 58 × 8 feature maps output by the fourth Conv layer for the two cameras are input to fully connected layers FC5_0 and FC5_1 respectively, yielding two 1024-dimensional feature vectors; the two vectors are input to FC6_0 and FC6_1 respectively, yielding two 512-dimensional feature vectors, which are then added element-wise to obtain a new 512-dimensional feature vector; the new 512-dimensional feature vector is input to FC7 to obtain a 256-dimensional feature vector; the 256-dimensional feature vector is input to FC8 to obtain a 128-dimensional feature vector; finally, the 128-dimensional feature vector is input to the Softmax layer to obtain a 5-dimensional probability vector.
Preferably, the density grades comprise ex-low, low, medium, high and ex-high; the sample label of ex-low is [1,0,0,0,0], the sample label of low is [0,1,0,0,0], the sample label of medium is [0,0,1,0,0], the sample label of high is [0,0,0,1,0], and the sample label of ex-high is [0,0,0,0,1]. The crowd density grade of an image is determined from the magnitudes of the output values of the last layer.
Preferably, S20 specifically comprises:
S201, setting each batch of the neural network to a predetermined size, with each iteration taking that number of samples as input;
S202, initializing the parameters of the convolutional layers of the neural network with Gaussian initialization and the parameters of the fully connected layers with the Xavier method, wherein Gaussian initialization draws W ~ N(μ, σ²) with μ = 0 and σ² = 0.01, and Xavier initialization draws the parameters from a uniform distribution, W ~ U[−√(6/(n+m)), √(6/(n+m))], where n denotes the input dimension of the layer in which the parameter lies and m denotes its output dimension;
S203, optimizing the training with the Adam algorithm, whose formulas are as follows:
m_t = β₁·m_{t−1} + (1 − β₁)·g_t
v_t = β₂·v_{t−1} + (1 − β₂)·g_t²
m̂_t = m_t / (1 − β₁ᵗ), v̂_t = v_t / (1 − β₂ᵗ)
Parameter update rule:
θ_t = θ_{t−1} − α·m̂_t / (√v̂_t + ε)
In the above formulas, g_t denotes the gradient at the t-th iteration and α the learning rate; β₁ and β₂ are hyperparameters, usually set to 0.9 and 0.999; ε is a very small value that prevents the denominator from being zero, typically set to 10⁻⁸; m_t can be regarded approximately as the expectation of g_t, and v_t as the expectation of g_t²; m̂_t and v̂_t are the bias-corrected estimates of m_t and v_t;
S204, iteratively training the neural network with the Softmax loss function until the optimum is reached, the Softmax loss function being as follows:
L = −(1/N)·Σᵢ log(e^{f_{yᵢ}} / Σ_{k=1}^{K} e^{f_k}) + λ·R(W)
where the left term is the cross-entropy cost function, N is the number of samples in the batch, [f₁, f₂, …, f_K] is the output vector of the network with K = 5 in this task, yᵢ denotes the density class of the i-th sample in the current iteration, the right term R(W) = Σ W² is the regularization term, W denotes the network parameters, and λ is a hyperparameter set to 0.0002.
Preferably, in S30 the two video frames are fused by weighting, and the calculation formula for the image classification result of the current carriage is as follows:
Class = argmax{ [F(X₁; θ) + F(X₂; θ)] / 2 }
where F(Xᵢ; θ) is the output of the network model, X₁ and X₂ are the images input by the two cameras, and θ are the parameters of the converged model.
The present invention is a multi-view crowd density estimation method based on deep learning. Features learned automatically by a convolutional neural network replace previously hand-designed features, yielding a more robust model, and a dual-camera input is proposed for the special environment of a subway carriage, so that severe occlusion in extreme cases can be handled and the accuracy of crowd density estimation is improved.
Detailed description of the invention
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from the structures shown in these drawings without creative effort.
Fig. 1 is the method flow chart of the train carriage crowd density estimation method based on double cameras of the present invention;
The realization of the objects, functional characteristics and advantages of the present invention will be further described with reference to the accompanying drawings in combination with the embodiments.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that if directional indications (such as up, down, left, right, front, rear, ...) are involved in the embodiments of the present invention, the directional indications are only used to explain the relative positional relationships, motion conditions and the like among the components in a certain particular pose (as shown in the drawings); if the particular pose changes, the directional indications change accordingly.
In addition, if descriptions such as "first" and "second" are involved in the embodiments of the present invention, these descriptions are used for descriptive purposes only and are not to be understood as indicating or implying their relative importance or implicitly indicating the number of the technical features indicated; thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. The technical solutions of the embodiments may also be combined with each other, but only on the basis that they can be realized by those of ordinary skill in the art; when the combination of technical solutions is contradictory or cannot be realized, it should be considered that such a combination does not exist and is not within the protection scope claimed by the present invention.
As shown in Fig. 1, Fig. 1 is the method flow chart of the train carriage crowd density estimation method based on double cameras of the present invention, in which (a) is the training stage and (b) is the application stage. In the training stage, the model parameters are iteratively optimized with the back-propagation (BP) algorithm; in the test stage, classification accuracy is improved by fusing the probability vectors.
The train carriage crowd density estimation method based on double cameras comprises the following steps:
S10, preparing training samples: a neural network comprising 4 parameter-sharing convolutional layers and 5 fully connected layers is built; video frames of two different views of the same carriage at the same moment are input, and the network is trained on samples labelled with density grades, wherein the convolutional layers extract feature vectors from the video and the fully connected layers classify the feature vectors extracted by the convolutional layers according to density grade;
S20, neural network training: the neural network is trained by iterative optimization;
S30, application stage: the captured video frames of the current carriage from the two cameras are respectively input to the optimized neural network, and the image classification result of the current carriage is obtained.
Preferably, the 4 convolutional layers in S10 comprise a first Conv layer, a second Conv layer, a third Conv layer and a fourth Conv layer. The first Conv layer has 16 convolution kernels of size 9 × 9 with stride 1; the input image passes through the first Conv layer to produce 16 feature maps, which, after a ReLU (rectified linear unit) layer and a max-pooling layer, yield an output of size 288 × 464 × 16.
Preferably, the second Conv layer has 32 convolution kernels of size 7 × 7 with stride 1; the input produces 32 feature maps through the second Conv layer, followed by ReLU and max-pooling layers, yielding an output of size 144 × 232 × 32.
Preferably, the third Conv layer has 16 convolution kernels of size 7 × 7 with stride 1; the input produces 16 feature maps through the third Conv layer, followed by ReLU and max-pooling layers, yielding an output of size 72 × 116 × 16.
Preferably, the fourth Conv layer has 8 convolution kernels of size 7 × 7 with stride 1; the input produces 8 feature maps through the fourth Conv layer, followed by ReLU and max-pooling layers, yielding an output of size 36 × 58 × 8.
Preferably, the 5 fully connected layers comprise FC5, FC6, FC7, FC8 and a Softmax layer. The two 36 × 58 × 8 feature maps output by the fourth Conv layer for the two cameras are input to fully connected layers FC5_0 and FC5_1 respectively, yielding two 1024-dimensional feature vectors; the two vectors are input to FC6_0 and FC6_1 respectively, yielding two 512-dimensional feature vectors, which are then added element-wise to obtain a new 512-dimensional feature vector; the new 512-dimensional feature vector is input to FC7 to obtain a 256-dimensional feature vector; the 256-dimensional feature vector is input to FC8 to obtain a 128-dimensional feature vector; finally, the 128-dimensional feature vector is input to the Softmax layer to obtain a 5-dimensional probability vector.
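For illustration only (not part of the claimed method), the following is a minimal PyTorch sketch of the embodiment described above. The 576 × 928 input resolution, the three input channels, "same" padding, 2 × 2 max-pooling, and the ReLU activations after the fully connected layers are assumptions inferred from the stated feature-map sizes, not details given in the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualViewDensityNet(nn.Module):
    """Sketch of the dual-camera density classifier: a parameter-sharing
    convolutional trunk (Conv1-Conv4) and per-view FC heads fused at FC6."""

    def __init__(self, num_classes: int = 5):
        super().__init__()
        # Shared trunk; comments give the output sizes stated in the text.
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 9, padding=4), nn.ReLU(), nn.MaxPool2d(2),   # 288x464x16
            nn.Conv2d(16, 32, 7, padding=3), nn.ReLU(), nn.MaxPool2d(2),  # 144x232x32
            nn.Conv2d(32, 16, 7, padding=3), nn.ReLU(), nn.MaxPool2d(2),  # 72x116x16
            nn.Conv2d(16, 8, 7, padding=3), nn.ReLU(), nn.MaxPool2d(2),   # 36x58x8
        )
        flat = 36 * 58 * 8
        # FC5_0/FC5_1 and FC6_0/FC6_1: separate weights per view, per the text.
        self.fc5 = nn.ModuleList([nn.Linear(flat, 1024) for _ in range(2)])
        self.fc6 = nn.ModuleList([nn.Linear(1024, 512) for _ in range(2)])
        self.fc7 = nn.Linear(512, 256)
        self.fc8 = nn.Linear(256, 128)
        self.out = nn.Linear(128, num_classes)  # softmax applied by the loss

    def forward(self, x1: torch.Tensor, x2: torch.Tensor) -> torch.Tensor:
        views = []
        for i, x in enumerate((x1, x2)):
            h = self.conv(x).flatten(1)          # shared convolutional features
            h = F.relu(self.fc5[i](h))
            views.append(F.relu(self.fc6[i](h)))
        h = views[0] + views[1]                  # element-wise addition of the two views
        h = F.relu(self.fc7(h))
        h = F.relu(self.fc8(h))
        return self.out(h)                       # logits [f1, ..., f5]
```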
Preferably, the density grades comprise ex-low, low, medium, high and ex-high; the sample label of ex-low is [1,0,0,0,0], the sample label of low is [0,1,0,0,0], the sample label of medium is [0,0,1,0,0], the sample label of high is [0,0,0,1,0], and the sample label of ex-high is [0,0,0,0,1]. The crowd density grade of an image is determined from the magnitudes of the output values of the last layer.
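As an illustrative sketch of this label encoding (the grade names as Python strings are an assumption of this example):

```python
import torch

# The five density grades in label order, encoded one-hot as described above.
GRADES = ["ex-low", "low", "medium", "high", "ex-high"]

def one_hot(grade: str) -> torch.Tensor:
    label = torch.zeros(len(GRADES))
    label[GRADES.index(grade)] = 1.0
    return label

print(one_hot("medium"))  # tensor([0., 0., 1., 0., 0.])
```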
Preferably, S20 specifically comprises:
S201, setting each batch of the neural network to a predetermined size, with each iteration taking that number of samples as input;
S202, initializing the parameters of the convolutional layers of the neural network with Gaussian initialization and the parameters of the fully connected layers with the Xavier method, wherein Gaussian initialization draws W ~ N(μ, σ²) with μ = 0 and σ² = 0.01, and Xavier initialization draws the parameters from a uniform distribution, W ~ U[−√(6/(n+m)), √(6/(n+m))], where n denotes the input dimension of the layer in which the parameter lies and m denotes its output dimension;
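A minimal sketch of S202 in PyTorch follows; initializing the biases to zero is an assumption, since the text only specifies the weight initialization.

```python
import math
import torch.nn as nn

def init_weights(module: nn.Module) -> None:
    # S202: Gaussian N(0, sigma^2 = 0.01) for convolutional layers,
    # Xavier uniform U[-sqrt(6/(n+m)), sqrt(6/(n+m))] for fully connected layers.
    if isinstance(module, nn.Conv2d):
        nn.init.normal_(module.weight, mean=0.0, std=math.sqrt(0.01))
        if module.bias is not None:
            nn.init.zeros_(module.bias)  # assumption: zero biases
    elif isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)  # n, m = fan-in, fan-out
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Usage: model.apply(init_weights)
```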
S203, optimizing the training with the Adam algorithm, whose formulas are as follows:
m_t = β₁·m_{t−1} + (1 − β₁)·g_t
v_t = β₂·v_{t−1} + (1 − β₂)·g_t²
m̂_t = m_t / (1 − β₁ᵗ), v̂_t = v_t / (1 − β₂ᵗ)
Parameter update rule:
θ_t = θ_{t−1} − α·m̂_t / (√v̂_t + ε)
In the above formulas, g_t denotes the gradient at the t-th iteration and α the learning rate; β₁ and β₂ are hyperparameters, usually set to 0.9 and 0.999; ε is a very small value that prevents the denominator from being zero, typically set to 10⁻⁸; m_t can be regarded approximately as the expectation of g_t, and v_t as the expectation of g_t²; m̂_t and v̂_t are the bias-corrected estimates of m_t and v_t;
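The sketch below implements one Adam step directly from these formulas; the learning rate value is an assumption, since the text does not give one.

```python
import torch

def adam_step(theta, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update per S203 for a parameter tensor theta with gradient g."""
    m = beta1 * m + (1 - beta1) * g        # m_t: running mean of gradients
    v = beta2 * v + (1 - beta2) * g * g    # v_t: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)           # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)           # bias-corrected second moment
    theta = theta - lr * m_hat / (v_hat.sqrt() + eps)
    return theta, m, v
```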
S204, iteratively training the neural network with the Softmax loss function until the optimum is reached, the Softmax loss function being as follows:
L = −(1/N)·Σᵢ log(e^{f_{yᵢ}} / Σ_{k=1}^{K} e^{f_k}) + λ·R(W)
where the left term is the cross-entropy cost function, N is the number of samples in the batch, [f₁, f₂, …, f_K] is the output vector of the network with K = 5 in this task, yᵢ denotes the density class of the i-th sample in the current iteration, the right term R(W) = Σ W² is the regularization term, W denotes the network parameters, and λ is a hyperparameter set to 0.0002.
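Putting S201-S204 together, a training-loop sketch follows. The learning rate and the batch source `loader` are assumptions; `torch.optim.Adam`'s `weight_decay` is used here to approximate the λ·R(W) regularization term, and `nn.CrossEntropyLoss` applies the softmax cross-entropy, taking class indices 0-4 rather than the one-hot labels above.

```python
import torch
import torch.nn as nn

model = DualViewDensityNet()       # from the architecture sketch above
model.apply(init_weights)          # S202
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,   # lr is assumed
                             betas=(0.9, 0.999), eps=1e-8,  # S203
                             weight_decay=2e-4)             # ~ lambda = 0.0002
criterion = nn.CrossEntropyLoss()  # softmax cross-entropy over the 5 grades

for x1, x2, labels in loader:      # labels: class indices 0..4 (assumed loader)
    optimizer.zero_grad()
    logits = model(x1, x2)
    loss = criterion(logits, labels)
    loss.backward()                # back-propagation (BP) training
    optimizer.step()               # S204: iterate until converged
```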
Preferably, in S30 the two video frames are fused by weighting, and the calculation formula for the image classification result of the current carriage is as follows:
Class = argmax{ [F(X₁; θ) + F(X₂; θ)] / 2 }
where F(Xᵢ; θ) is the output of the network model, X₁ and X₂ are the images input by the two cameras, and θ are the parameters of the converged model.
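A sketch of this fusion follows; p1 and p2 stand for the two cameras' 5-dimensional probability outputs F(X₁; θ) and F(X₂; θ), and the returned index maps to the density grade (0 = ex-low, …, 4 = ex-high).

```python
import torch

@torch.no_grad()
def fuse_and_classify(p1: torch.Tensor, p2: torch.Tensor) -> torch.Tensor:
    # Class = argmax{[F(X1; theta) + F(X2; theta)] / 2}
    return torch.argmax((p1 + p2) / 2, dim=-1)
```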
In the embodiments of the present invention, a classification neural network with four convolutional layers and five fully connected layers is proposed. Pictures of subway crowds and their classification labels are input to the network, and the network parameters are optimized by continuous iterative training with the Softmax loss function; the present invention also proposes multi-camera input to solve the occlusion problem, the final prediction being a fusion of the results for each input, which improves classification accuracy.
Compared with previous crowd density estimation techniques, the present invention has the following advantages:
1. It achieves satisfactory results in subway carriages from sparse to extremely crowded environments, with better robustness;
2. Training is completed end to end; compared with traditional methods there is no cumbersome computation, and real-time performance can be reached in practical applications.
The basis of the present invention is to estimate crowd density grades accurately in densely populated places such as subway carriages, with robustness, real-time performance and other advantages. Therefore, any application technology based on the crowd density grade classification proposed by the present invention, such as intelligent video surveillance, is included within the present invention.

Claims (9)

1. A train carriage crowd density estimation method based on double cameras, characterized by comprising the following steps:
S10, preparing training samples: building a neural network comprising 4 parameter-sharing convolutional layers and 5 fully connected layers, inputting video frames of two different views of the same carriage at the same moment, and training on samples labelled with density grades, wherein the convolutional layers are used to extract feature vectors from the video, and the fully connected layers are used to classify the feature vectors extracted by the convolutional layers according to density grade;
S20, neural network training: training the neural network by iterative optimization;
S30, application stage: respectively inputting the captured video frames of the current carriage from the two cameras into the optimized neural network to obtain the image classification result of the current carriage.
2. The train carriage crowd density estimation method based on double cameras according to claim 1, characterized in that the 4 convolutional layers in S10 comprise a first Conv layer, a second Conv layer, a third Conv layer and a fourth Conv layer; the first Conv layer has 16 convolution kernels of size 9 × 9 with stride 1; the input image passes through the first Conv layer to produce 16 feature maps, which, after a ReLU (rectified linear unit) layer and a max-pooling layer, yield an output of size 288 × 464 × 16.
3. The train carriage crowd density estimation method based on double cameras according to claim 1, characterized in that the second Conv layer has 32 convolution kernels of size 7 × 7 with stride 1; the input produces 32 feature maps through the second Conv layer, followed by ReLU and max-pooling layers, yielding an output of size 144 × 232 × 32.
4. The train carriage crowd density estimation method based on double cameras according to claim 1, characterized in that the third Conv layer has 16 convolution kernels of size 7 × 7 with stride 1; the input produces 16 feature maps through the third Conv layer, followed by ReLU and max-pooling layers, yielding an output of size 72 × 116 × 16.
5. The train carriage crowd density estimation method based on double cameras according to claim 1, characterized in that the fourth Conv layer has 8 convolution kernels of size 7 × 7 with stride 1; the input produces 8 feature maps through the fourth Conv layer, followed by ReLU and max-pooling layers, yielding an output of size 36 × 58 × 8.
6. The train carriage crowd density estimation method based on double cameras according to claim 1, characterized in that the 5 fully connected layers comprise FC5, FC6, FC7, FC8 and a Softmax layer; the two 36 × 58 × 8 feature maps output by the fourth Conv layer for the two cameras are respectively input to fully connected layers FC5_0 and FC5_1 to obtain two 1024-dimensional feature vectors; the two vectors are respectively input to FC6_0 and FC6_1 to obtain two 512-dimensional feature vectors, which are then added element-wise to obtain a new 512-dimensional feature vector; the new 512-dimensional feature vector is input to FC7 to obtain a 256-dimensional feature vector; the 256-dimensional feature vector is input to FC8 to obtain a 128-dimensional feature vector; finally, the 128-dimensional feature vector is input to the Softmax layer to obtain a 5-dimensional probability vector.
7. The train carriage crowd density estimation method based on double cameras according to claim 1, characterized in that the density grades comprise ex-low, low, medium, high and ex-high; the sample label of ex-low is [1,0,0,0,0], the sample label of low is [0,1,0,0,0], the sample label of medium is [0,0,1,0,0], the sample label of high is [0,0,0,1,0], and the sample label of ex-high is [0,0,0,0,1]; the crowd density grade of an image is determined from the magnitudes of the output values of the last layer.
8. The train carriage crowd density estimation method based on double cameras according to claim 1, characterized in that S20 specifically comprises:
S201, setting each batch of the neural network to a predetermined size, with each iteration taking that number of samples as input;
S202, initializing the parameters of the convolutional layers of the neural network with Gaussian initialization and the parameters of the fully connected layers with the Xavier method, wherein Gaussian initialization draws W ~ N(μ, σ²) with μ = 0 and σ² = 0.01, and Xavier initialization draws the parameters from a uniform distribution, W ~ U[−√(6/(n+m)), √(6/(n+m))], where n denotes the input dimension of the layer in which the parameter lies and m denotes its output dimension;
S203, optimizing the training with the Adam algorithm, whose formulas are as follows:
m_t = β₁·m_{t−1} + (1 − β₁)·g_t
v_t = β₂·v_{t−1} + (1 − β₂)·g_t²
m̂_t = m_t / (1 − β₁ᵗ), v̂_t = v_t / (1 − β₂ᵗ)
parameter update rule:
θ_t = θ_{t−1} − α·m̂_t / (√v̂_t + ε)
in the above formulas, g_t denotes the gradient at the t-th iteration and α the learning rate; β₁ and β₂ are hyperparameters, usually set to 0.9 and 0.999; ε is a very small value that prevents the denominator from being zero, typically set to 10⁻⁸; m_t can be regarded approximately as the expectation of g_t, and v_t as the expectation of g_t²; m̂_t and v̂_t are the bias-corrected estimates of m_t and v_t;
S204, iteratively training the neural network with the Softmax loss function until the optimum is reached, the Softmax loss function being as follows:
L = −(1/N)·Σᵢ log(e^{f_{yᵢ}} / Σ_{k=1}^{K} e^{f_k}) + λ·R(W)
where the left term is the cross-entropy cost function, N is the number of samples in the batch, [f₁, f₂, …, f_K] is the output vector of the network with K = 5 in this task, yᵢ denotes the density class of the i-th sample in the current iteration, the right term R(W) = Σ W² is the regularization term, W denotes the network parameters, and λ is a hyperparameter set to 0.0002.
9. The train carriage crowd density estimation method based on double cameras according to claim 1, characterized in that in S30 the two video frames are fused by weighting, and the calculation formula for the image classification result of the current carriage is as follows:
Class = argmax{ [F(X₁; θ) + F(X₂; θ)] / 2 }
where F(Xᵢ; θ) is the output of the network model, X₁ and X₂ are the images input by the two cameras, and θ are the parameters of the converged model.
CN201810408662.0A 2018-04-28 2018-04-28 Train carriage crowd density estimation method based on double cameras Active CN110414301B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810408662.0A CN110414301B (en) 2018-04-28 2018-04-28 Train carriage crowd density estimation method based on double cameras

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810408662.0A CN110414301B (en) 2018-04-28 2018-04-28 Train carriage crowd density estimation method based on double cameras

Publications (2)

Publication Number Publication Date
CN110414301A true CN110414301A (en) 2019-11-05
CN110414301B CN110414301B (en) 2023-06-23

Family

ID=68357852

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810408662.0A Active CN110414301B (en) 2018-04-28 2018-04-28 Train carriage crowd density estimation method based on double cameras

Country Status (1)

Country Link
CN (1) CN110414301B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103258232A (en) * 2013-04-12 2013-08-21 中国民航大学 Method for estimating number of people in public place based on two cameras
CN104992223A (en) * 2015-06-12 2015-10-21 安徽大学 Intensive population estimation method based on deep learning
CN106909924A (en) * 2017-02-18 2017-06-30 北京工业大学 A kind of remote sensing image method for quickly retrieving based on depth conspicuousness
CN107220657A (en) * 2017-05-10 2017-09-29 中国地质大学(武汉) A kind of method of high-resolution remote sensing image scene classification towards small data set
CN107133220A (en) * 2017-06-07 2017-09-05 东南大学 Name entity recognition method in a kind of Geography field
CN107316295A (en) * 2017-07-02 2017-11-03 苏州大学 A kind of fabric defects detection method based on deep neural network
CN107560849A (en) * 2017-08-04 2018-01-09 华北电力大学 A kind of Wind turbines Method for Bearing Fault Diagnosis of multichannel depth convolutional neural networks
CN107944386A (en) * 2017-11-22 2018-04-20 天津大学 Visual scene recognition methods based on convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
谭智勇 et al., "A crowd density estimation method based on a deep convolutional neural network", 《计算机应用与软件》 (Computer Applications and Software) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158780A (en) * 2021-03-09 2021-07-23 中国科学院深圳先进技术研究院 Regional crowd density estimation method, electronic device and storage medium
CN113158780B (en) * 2021-03-09 2023-10-27 中国科学院深圳先进技术研究院 Regional crowd density estimation method, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110414301B (en) 2023-06-23

Similar Documents

Publication Publication Date Title
CN109389055B (en) Video classification method based on mixed convolution and attention mechanism
CN108520535B (en) Object classification method based on depth recovery information
Li et al. Building-a-nets: Robust building extraction from high-resolution remote sensing images with adversarial networks
Sun et al. Lattice long short-term memory for human action recognition
Tao et al. Smoke detection based on deep convolutional neural networks
CN110135243B (en) Pedestrian detection method and system based on two-stage attention mechanism
Fu et al. Fast crowd density estimation with convolutional neural networks
CN108510012A (en) A fast target detection method based on multi-scale feature maps
CN112507777A (en) Optical remote sensing image ship detection and segmentation method based on deep learning
CN109886225A (en) An online image gesture detection and recognition method based on deep learning
CN108960059A (en) A video action recognition method and device
CN108537743A (en) A face image enhancement method based on a generative adversarial network
CN109919011A (en) An action video recognition method based on multi-duration information
CN112288627A (en) Recognition-oriented low-resolution face image super-resolution method
CN113379771B (en) Hierarchical human body analysis semantic segmentation method with edge constraint
CN107818307A (en) A multi-label video event detection method based on LSTM networks
Desai et al. Next frame prediction using ConvLSTM
CN112906520A (en) Gesture coding-based action recognition method and device
CN114627269A (en) A virtual reality security monitoring platform based on deep-learning target detection
CN113255464A (en) Airplane action recognition method and system
Wu et al. Spatial-temporal graph network for video crowd counting
CN112084952A (en) Video point location tracking method based on self-supervision training
CN116977674A (en) Image matching method, related device, storage medium and program product
Liu et al. Axial assembled correspondence network for few-shot semantic segmentation
Liu et al. Online human action recognition with spatial and temporal skeleton features using a distributed camera network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant