CN110414301A - Train carriage crowd density estimation method based on dual cameras - Google Patents
Train carriage crowd density estimation method based on dual cameras
- Publication number
- CN110414301A CN201810408662.0A
- Authority
- CN
- China
- Prior art keywords
- layers
- crowd density
- camera
- feature vector
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The present invention discloses a train carriage crowd density estimation method based on dual cameras, comprising: proposing a multi-view crowd density estimation network consisting of two parts, one part being parameter-sharing convolutional neural networks and the other part being fully connected layers, the network distinguishing the crowd density level of the current train carriage. In the model training stage, the network is iteratively optimized with samples carrying 5 density-level labels; in the model application stage, frames are sampled and estimated periodically according to the actual operation of the subway. The present invention estimates crowd density with a deep learning method, using the automatically learned features of convolutional neural networks in place of previously hand-designed features, so as to improve the accuracy and robustness of crowd density estimation.
Description
Technical field
The present invention relates to the technical field of crowd density estimation, and in particular to a train carriage crowd density estimation method based on dual cameras.
Background art
Existing crowd density estimation techniques still have many shortcomings. Pixel-based methods are simple and easy to implement, but are only applicable to scenes with low crowd density. Methods based on texture analysis perform fairly well, but their computation is cumbersome, and in practice they often cannot run in real time. Methods based on object detection give reliable results in moderately crowded scenes, but lose effectiveness in scenes with a high degree of crowd overlap.
Existing crowd density estimation techniques mainly fall into the following categories:
1) Methods based on pixel statistics [1]. The pixels of the total crowd area and of the crowd edges are counted, and crowd density is estimated from the linear relationship between these pixel features and the total number of people. Foreground, background and edge pixel counts in the image are obtained through background subtraction and edge detection. Such methods are mainly applicable to scenes with sparse crowd distribution.
2) Methods based on texture analysis [2]. Image texture features are extracted with gray-level co-occurrence matrices and wavelet packet decomposition, and classifiers such as support vector machines, AdaBoost and neural networks are then trained on these features. Such methods are mainly applicable to scenes with relatively dense crowd distribution.
3) Methods based on object detection [3]. A head detector based on Haar-like features and the Haar wavelet transform generates candidate regions, an SVM classifier determines whether each candidate is a head, and the density of the whole crowd is finally estimated.
Summary of the invention
The main object of the present invention is to propose a train carriage crowd density estimation method based on dual cameras, intended to overcome the above problems.
To achieve the above object, a train carriage crowd density estimation method based on dual cameras comprises the following steps:
S10, preparing training samples: establish a neural network comprising 4 parameter-sharing convolutional layers and 5 fully connected layers, input video frames of two different views of the same carriage at the same moment, and train with samples carrying density-level labels, wherein the convolutional layers extract feature vectors from the video and the fully connected layers classify the feature vectors extracted by the convolutional layers by density level;
S20, neural network training: iteratively optimize and train the neural network over several rounds;
S30, application stage: capture the video frames of the current carriage shot by the two cameras and input them separately into the optimized neural network to obtain the image classification result for the current carriage.
Preferably, the 4 convolutional layers in S10 comprise a first Conv layer, a second Conv layer, a third Conv layer and a fourth Conv layer. The first Conv layer has a kernel size of 9 × 9, a stride of 1 and 16 kernels; the input image passes through the first Conv layer to produce 16 feature maps, and after a ReLU (rectified linear unit) layer and a Max-pooling layer, feature maps of size 288 × 464 × 16 are output (the stated sizes imply an input resolution of 576 × 928).
Preferably, the second Conv layer has a kernel size of 7 × 7, a stride of 1 and 32 kernels; the input passes through the second Conv layer to produce 32 feature maps, and after a ReLU layer and a Max-pooling layer, feature maps of size 144 × 232 × 32 are output.
Preferably, the third Conv layer has a kernel size of 7 × 7, a stride of 1 and 16 kernels; the input passes through the third Conv layer to produce 16 feature maps, and after a ReLU layer and a Max-pooling layer, feature maps of size 72 × 116 × 16 are output.
Preferably, the fourth Conv layer has a kernel size of 7 × 7, a stride of 1 and 8 kernels; the input passes through the fourth Conv layer to produce 8 feature maps, and after a ReLU layer and a Max-pooling layer, feature maps of size 36 × 58 × 8 are output.
Preferably, the 5 fully connected layers comprise FC5, FC6, FC7, FC8 and a Softmax layer. The fourth Conv layer outputs two 36 × 58 × 8 feature maps, one for each camera, which are input separately into fully connected layers FC5_0 and FC5_1 to obtain two 1024-dimensional feature vectors; the two vectors are input separately into FC6_0 and FC6_1 to obtain two 512-dimensional feature vectors, which are then added element-wise to obtain a new 512-dimensional feature vector; the new 512-dimensional feature vector is input into FC7 to obtain a 256-dimensional feature vector; the 256-dimensional feature vector is then input into FC8 to obtain a 128-dimensional feature vector; finally, the 128-dimensional feature vector is input into the Softmax layer to obtain a 5-dimensional probability vector.
Preferably, the density levels comprise ex-low, low, medium, high and ex-high; the sample label of ex-low is [1,0,0,0,0], the sample label of low is [0,1,0,0,0], the sample label of medium is [0,0,1,0,0], the sample label of high is [0,0,0,1,0], and the sample label of ex-high is [0,0,0,0,1]. The crowd density level of an image is determined by the largest output value of the final layer.
Preferably, S20 specifically comprises:
S201, set each batch of the neural network to a predetermined size, each iteration inputting that number of samples;
S202, initialize the parameters of the convolutional layers of the neural network with Gaussian initialization and the parameters of the fully connected layers with the Xavier method. The Gaussian initialization is W ~ N(μ, σ²) with μ = 0 and σ² = 0.01; the Xavier method initializes the parameters in a uniformly distributed manner, namely W ~ U[−√(6/(n+m)), √(6/(n+m))], where n denotes the input dimension of the layer in which the parameter lies and m denotes the output dimension;
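A short sketch of this initialization scheme, reusing the illustrative DualViewDensityNet class from the sketch above; zero-initialized biases are an assumption, since the text only specifies weight initialization:

```python
import math
import torch.nn as nn

def init_weights(module: nn.Module) -> None:
    if isinstance(module, nn.Conv2d):
        # Gaussian W ~ N(0, 0.01): variance 0.01 means std = 0.1.
        nn.init.normal_(module.weight, mean=0.0, std=math.sqrt(0.01))
        nn.init.zeros_(module.bias)
    elif isinstance(module, nn.Linear):
        # Xavier uniform: U[-sqrt(6/(n+m)), sqrt(6/(n+m))].
        nn.init.xavier_uniform_(module.weight)
        nn.init.zeros_(module.bias)

model = DualViewDensityNet()
model.apply(init_weights)  # recursively applies to every submodule
```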
S203, optimize training with the Adam algorithm, whose parameter update rule is:

m_t = β1·m_{t−1} + (1 − β1)·g_t
v_t = β2·v_{t−1} + (1 − β2)·g_t²
m̂_t = m_t / (1 − β1^t),  v̂_t = v_t / (1 − β2^t)
θ_t = θ_{t−1} − α·m̂_t / (√v̂_t + ϵ)

where g_t denotes the gradient at the t-th iteration, α is the learning rate, and β1 and β2 are hyperparameters conventionally set to 0.9 and 0.999; ϵ is a very small value preventing the denominator from being zero, typically set to 10⁻⁸; m_t can be regarded approximately as the expectation of g_t and v_t as the expectation of g_t², while m̂_t and v̂_t are bias-corrected estimates of m_t and v_t;
S204, train the neural network iteratively with the Softmax loss function until it is optimal. The Softmax loss function is:

L(W) = −(1/N)·Σ_{i=1..N} log( e^{f_{y_i}} / Σ_{j=1..K} e^{f_j} ) + λ·R(W)

where the left term is the cross-entropy cost function over the N samples of the iteration, [f_1, f_2, …, f_K] is the output vector of the network (K = 5 in this task), y_i denotes the density class of the i-th sample in the iteration, the right term R(W) is a regularization term over the network parameters W (the standard sum of squared weights, R(W) = Σ W²), and λ is a hyperparameter set to 0.0002.
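Putting S201 through S204 together, a hedged sketch of one training iteration, reusing the illustrative definitions above; expressing λ·R(W) through Adam's weight_decay argument, as well as the batch size and learning rate, are assumptions for illustration:

```python
import torch
import torch.nn as nn

model = DualViewDensityNet()
model.apply(init_weights)

# S203: Adam with beta1 = 0.9, beta2 = 0.999, eps = 1e-8;
# weight_decay = 2e-4 stands in for the lambda * R(W) term of S204.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                             betas=(0.9, 0.999), eps=1e-8, weight_decay=2e-4)
criterion = nn.CrossEntropyLoss()  # softmax + cross-entropy over the 5 classes

def train_step(x1, x2, labels):
    """x1, x2: one batch of frames per camera; labels: class indices 0-4."""
    optimizer.zero_grad()
    loss = criterion(model(x1, x2), labels)
    loss.backward()   # back-propagation (BP)
    optimizer.step()  # Adam parameter update
    return loss.item()
```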
Preferably, in S30 the two video frames are fused by weighting, and the calculation formula for the image classification result of the current carriage is as follows:

Class = argmax { [F(X1; θ) + F(X2; θ)] / 2 }

where F(Xi; θ) is the output of the network model, X1 and X2 are the images input from the two cameras respectively, and θ are the parameters of the converged model.
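A minimal sketch of this test-time fusion rule. The text does not spell out how F(Xi; θ) is evaluated per camera on the two-branch network, so the two 5-dimensional probability vectors are taken as given here; the function name is illustrative:

```python
import torch

@torch.no_grad()
def classify_carriage(p1: torch.Tensor, p2: torch.Tensor) -> torch.Tensor:
    """Class = argmax{[F(X1; theta) + F(X2; theta)] / 2}.

    p1, p2: per-camera probability vectors of shape (batch, 5).
    Returns the density-level index 0-4 for each sample in the batch.
    """
    return torch.argmax((p1 + p2) / 2.0, dim=1)
```

Since argmax is invariant to the division by 2, the averaging matters only if the fused probabilities themselves are reported.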
The present invention is a deep-learning-based multi-view crowd density estimation method. The automatically learned features of convolutional neural networks replace previously hand-designed features, yielding a more robust model; and for the particular environment of a subway carriage, a dual-camera input is proposed, so that severe occlusion under extreme conditions can be handled and the accuracy of crowd density estimation is improved.
Brief description of the drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from the structures shown in these drawings without creative effort.
Fig. 1 is the flow diagram of the train carriage crowd density estimation method based on dual cameras of the present invention.

The realization of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings in combination with the embodiments.
Specific embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below in combination with the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that if directional indications (such as up, down, left, right, front, rear, ...) are involved in the embodiments of the present invention, they are only used to explain the relative positional relationships, motion conditions and the like among the components in a particular pose (as shown in the drawings); if the particular pose changes, the directional indication changes accordingly.
In addition, if descriptions such as "first" and "second" are involved in the embodiments of the present invention, they are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. Furthermore, the technical solutions of the embodiments can be combined with each other, but only on the basis that they can be realized by those of ordinary skill in the art; when a combination of technical solutions is contradictory or cannot be realized, such a combination is deemed not to exist and does not fall within the protection scope claimed by the present invention.
As shown in Fig. 1, which is the flow diagram of the train carriage crowd density estimation method based on dual cameras of the present invention, (a) is the training stage and (b) is the application stage. In the training stage the model parameters are iteratively optimized with the back-propagation (BP) algorithm; in the test phase, classification accuracy is improved by the method of probability vector fusion.
A train carriage crowd density estimation method based on dual cameras comprises the following steps:
S10, preparing training samples: establish a neural network comprising 4 parameter-sharing convolutional layers and 5 fully connected layers, input video frames of two different views of the same carriage at the same moment, and train with samples carrying density-level labels, wherein the convolutional layers extract feature vectors from the video and the fully connected layers classify the feature vectors extracted by the convolutional layers by density level;
S20, neural network training: iteratively optimize and train the neural network over several rounds;
S30, application stage: capture the video frames of the current carriage shot by the two cameras and input them separately into the optimized neural network to obtain the image classification result for the current carriage.
Preferably, the 4 convolutional layers in S10 comprise a first Conv layer, a second Conv layer, a third Conv layer and a fourth Conv layer. The first Conv layer has a kernel size of 9 × 9, a stride of 1 and 16 kernels; the input image passes through the first Conv layer to produce 16 feature maps, and after a ReLU (rectified linear unit) layer and a Max-pooling layer, feature maps of size 288 × 464 × 16 are output.
Preferably, the second Conv layer has a kernel size of 7 × 7, a stride of 1 and 32 kernels; the input passes through the second Conv layer to produce 32 feature maps, and after a ReLU layer and a Max-pooling layer, feature maps of size 144 × 232 × 32 are output.
Preferably, the third Conv layer has a kernel size of 7 × 7, a stride of 1 and 16 kernels; the input passes through the third Conv layer to produce 16 feature maps, and after a ReLU layer and a Max-pooling layer, feature maps of size 72 × 116 × 16 are output.
Preferably, the fourth Conv layer has a kernel size of 7 × 7, a stride of 1 and 8 kernels; the input passes through the fourth Conv layer to produce 8 feature maps, and after a ReLU layer and a Max-pooling layer, feature maps of size 36 × 58 × 8 are output.
Preferably, the 5 fully connected layers comprise FC5, FC6, FC7, FC8 and a Softmax layer. The fourth Conv layer outputs two 36 × 58 × 8 feature maps, one for each camera, which are input separately into fully connected layers FC5_0 and FC5_1 to obtain two 1024-dimensional feature vectors; the two vectors are input separately into FC6_0 and FC6_1 to obtain two 512-dimensional feature vectors, which are then added element-wise to obtain a new 512-dimensional feature vector; the new 512-dimensional feature vector is input into FC7 to obtain a 256-dimensional feature vector; the 256-dimensional feature vector is then input into FC8 to obtain a 128-dimensional feature vector; finally, the 128-dimensional feature vector is input into the Softmax layer to obtain a 5-dimensional probability vector.
Preferably, the density levels comprise ex-low, low, medium, high and ex-high; the sample label of ex-low is [1,0,0,0,0], the sample label of low is [0,1,0,0,0], the sample label of medium is [0,0,1,0,0], the sample label of high is [0,0,0,1,0], and the sample label of ex-high is [0,0,0,0,1]. The crowd density level of an image is determined by the largest output value of the final layer.
Preferably, S20 specifically comprises:
S201, set each batch of the neural network to a predetermined size, each iteration inputting that number of samples;
S202, initialize the parameters of the convolutional layers of the neural network with Gaussian initialization and the parameters of the fully connected layers with the Xavier method. The Gaussian initialization is W ~ N(μ, σ²) with μ = 0 and σ² = 0.01; the Xavier method initializes the parameters in a uniformly distributed manner, namely W ~ U[−√(6/(n+m)), √(6/(n+m))], where n denotes the input dimension of the layer in which the parameter lies and m denotes the output dimension;
S203, optimize training with the Adam algorithm, whose parameter update rule is:

m_t = β1·m_{t−1} + (1 − β1)·g_t
v_t = β2·v_{t−1} + (1 − β2)·g_t²
m̂_t = m_t / (1 − β1^t),  v̂_t = v_t / (1 − β2^t)
θ_t = θ_{t−1} − α·m̂_t / (√v̂_t + ϵ)

where g_t denotes the gradient at the t-th iteration, α is the learning rate, and β1 and β2 are hyperparameters conventionally set to 0.9 and 0.999; ϵ is a very small value preventing the denominator from being zero, typically set to 10⁻⁸; m_t can be regarded approximately as the expectation of g_t and v_t as the expectation of g_t², while m̂_t and v̂_t are bias-corrected estimates of m_t and v_t;
S204, train the neural network iteratively with the Softmax loss function until it is optimal. The Softmax loss function is:

L(W) = −(1/N)·Σ_{i=1..N} log( e^{f_{y_i}} / Σ_{j=1..K} e^{f_j} ) + λ·R(W)

where the left term is the cross-entropy cost function over the N samples of the iteration, [f_1, f_2, …, f_K] is the output vector of the network (K = 5 in this task), y_i denotes the density class of the i-th sample in the iteration, the right term R(W) is a regularization term over the network parameters W (the standard sum of squared weights, R(W) = Σ W²), and λ is a hyperparameter set to 0.0002.
Preferably, in S30 the two video frames are fused by weighting, and the calculation formula for the image classification result of the current carriage is as follows:

Class = argmax { [F(X1; θ) + F(X2; θ)] / 2 }

where F(Xi; θ) is the output of the network model, X1 and X2 are the images input from the two cameras respectively, and θ are the parameters of the converged model.
In the embodiments of the present invention, a classification neural network comprising 4 convolutional layers is proposed. Subway crowd images and their classification labels are input into the network, and the network parameters are optimized by continuous iterative training with the Softmax loss function. The present invention also proposes multi-camera input to solve the occlusion problem; the final prediction result is the fusion of the results for each input, which improves classification accuracy.
Compared with previous crowd density estimation techniques, the present invention has the following advantages:

1. It achieves satisfactory results in subway carriages from sparse to extremely crowded environments, and therefore has better robustness;

2. Training is completed end to end; compared with traditional methods there is no cumbersome computation process, and real-time performance can be reached in practical applications.
The basis of the present invention is the accurate estimation of crowd density levels in densely populated places such as subway carriages, with robustness and real-time performance. Therefore, any application technology based on the crowd density level classification proposed by the present invention, such as intelligent video surveillance, is encompassed within the present invention.
Claims (9)
1. A train carriage crowd density estimation method based on dual cameras, characterized by comprising the following steps:
S10, preparing training samples: establishing a neural network comprising 4 parameter-sharing convolutional layers and 5 fully connected layers, inputting video frames of two different views of the same carriage at the same moment, and training with samples carrying density-level labels, wherein the convolutional layers are used to extract feature vectors from the video and the fully connected layers are used to classify the feature vectors extracted by the convolutional layers by density level;
S20, neural network training: iteratively optimizing and training the neural network over several rounds;
S30, application stage: capturing the video frames of the current carriage shot by the two cameras and inputting them separately into the optimized neural network to obtain the image classification result for the current carriage.
2. The train carriage crowd density estimation method based on dual cameras according to claim 1, characterized in that the 4 convolutional layers in S10 comprise a first Conv layer, a second Conv layer, a third Conv layer and a fourth Conv layer; the first Conv layer has a kernel size of 9 × 9, a stride of 1 and 16 kernels; the input image passes through the first Conv layer to produce 16 feature maps, and after a ReLU layer and a Max-pooling layer, feature maps of size 288 × 464 × 16 are output.
3. The train carriage crowd density estimation method based on dual cameras according to claim 1, characterized in that the second Conv layer has a kernel size of 7 × 7, a stride of 1 and 32 kernels; the input passes through the second Conv layer to produce 32 feature maps, and after a ReLU layer and a Max-pooling layer, feature maps of size 144 × 232 × 32 are output.
4. The train carriage crowd density estimation method based on dual cameras according to claim 1, characterized in that the third Conv layer has a kernel size of 7 × 7, a stride of 1 and 16 kernels; the input passes through the third Conv layer to produce 16 feature maps, and after a ReLU layer and a Max-pooling layer, feature maps of size 72 × 116 × 16 are output.
5. The train carriage crowd density estimation method based on dual cameras according to claim 1, characterized in that the fourth Conv layer has a kernel size of 7 × 7, a stride of 1 and 8 kernels; the input passes through the fourth Conv layer to produce 8 feature maps, and after a ReLU layer and a Max-pooling layer, feature maps of size 36 × 58 × 8 are output.
6. The train carriage crowd density estimation method based on dual cameras according to claim 1, characterized in that the 5 fully connected layers comprise FC5, FC6, FC7, FC8 and a Softmax layer; the fourth Conv layer outputs two 36 × 58 × 8 feature maps, one for each camera, which are input separately into fully connected layers FC5_0 and FC5_1 to obtain two 1024-dimensional feature vectors; the two vectors are input separately into FC6_0 and FC6_1 to obtain two 512-dimensional feature vectors, which are then added element-wise to obtain a new 512-dimensional feature vector; the new 512-dimensional feature vector is input into FC7 to obtain a 256-dimensional feature vector; the 256-dimensional feature vector is then input into FC8 to obtain a 128-dimensional feature vector; finally, the 128-dimensional feature vector is input into the Softmax layer to obtain a 5-dimensional probability vector.
7. The train carriage crowd density estimation method based on dual cameras according to claim 1, characterized in that the density levels comprise ex-low, low, medium, high and ex-high; the sample label of ex-low is [1,0,0,0,0], the sample label of low is [0,1,0,0,0], the sample label of medium is [0,0,1,0,0], the sample label of high is [0,0,0,1,0], and the sample label of ex-high is [0,0,0,0,1]; the crowd density level of an image is determined by the largest output value of the final layer.
8. The train carriage crowd density estimation method based on dual cameras according to claim 1, characterized in that S20 specifically comprises:

S201, setting each batch of the neural network to a predetermined size, each iteration inputting that number of samples;

S202, initializing the parameters of the convolutional layers of the neural network with Gaussian initialization and the parameters of the fully connected layers with the Xavier method, the Gaussian initialization being W ~ N(μ, σ²) with μ = 0 and σ² = 0.01, and the Xavier method initializing the parameters in a uniformly distributed manner, namely W ~ U[−√(6/(n+m)), √(6/(n+m))], where n denotes the input dimension of the layer in which the parameter lies and m denotes the output dimension;

S203, optimizing training with the Adam algorithm, whose parameter update rule is:

m_t = β1·m_{t−1} + (1 − β1)·g_t
v_t = β2·v_{t−1} + (1 − β2)·g_t²
m̂_t = m_t / (1 − β1^t),  v̂_t = v_t / (1 − β2^t)
θ_t = θ_{t−1} − α·m̂_t / (√v̂_t + ϵ)

where g_t denotes the gradient at the t-th iteration, α is the learning rate, β1 and β2 are hyperparameters conventionally set to 0.9 and 0.999, ϵ is a very small value preventing the denominator from being zero, typically set to 10⁻⁸, m_t can be regarded approximately as the expectation of g_t and v_t as the expectation of g_t², and m̂_t and v̂_t are bias-corrected estimates of m_t and v_t;

S204, training the neural network iteratively with the Softmax loss function until it is optimal, the Softmax loss function being:

L(W) = −(1/N)·Σ_{i=1..N} log( e^{f_{y_i}} / Σ_{j=1..K} e^{f_j} ) + λ·R(W)

where the left term is the cross-entropy cost function over the N samples of the iteration, [f_1, f_2, …, f_K] is the output vector of the network, K = 5 in this task, y_i denotes the density class of the i-th sample in the iteration, the right term R(W) = Σ W² is a regularization term over the network parameters W, and λ is a hyperparameter set to 0.0002.
9. The train carriage crowd density estimation method based on dual cameras according to claim 1, characterized in that in S30 the two video frames are fused by weighting, and the calculation formula for the image classification result of the current carriage is as follows:

Class = argmax { [F(X1; θ) + F(X2; θ)] / 2 }

wherein F(Xi; θ) is the output of the network model, X1 and X2 are the images input from the two cameras respectively, and θ are the parameters of the converged model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810408662.0A CN110414301B (en) | 2018-04-28 | 2018-04-28 | Train carriage crowd density estimation method based on double cameras |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810408662.0A CN110414301B (en) | 2018-04-28 | 2018-04-28 | Train carriage crowd density estimation method based on double cameras |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110414301A true CN110414301A (en) | 2019-11-05 |
CN110414301B CN110414301B (en) | 2023-06-23 |
Family
ID=68357852
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810408662.0A Active CN110414301B (en) | 2018-04-28 | 2018-04-28 | Train carriage crowd density estimation method based on double cameras |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110414301B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103258232A (en) * | 2013-04-12 | 2013-08-21 | 中国民航大学 | Method for estimating number of people in public place based on two cameras |
CN104992223A (en) * | 2015-06-12 | 2015-10-21 | 安徽大学 | Intensive population estimation method based on deep learning |
CN106909924A (en) * | 2017-02-18 | 2017-06-30 | 北京工业大学 | A kind of remote sensing image method for quickly retrieving based on depth conspicuousness |
CN107220657A (en) * | 2017-05-10 | 2017-09-29 | 中国地质大学(武汉) | A kind of method of high-resolution remote sensing image scene classification towards small data set |
CN107133220A (en) * | 2017-06-07 | 2017-09-05 | 东南大学 | Name entity recognition method in a kind of Geography field |
CN107316295A (en) * | 2017-07-02 | 2017-11-03 | 苏州大学 | A kind of fabric defects detection method based on deep neural network |
CN107560849A (en) * | 2017-08-04 | 2018-01-09 | 华北电力大学 | A kind of Wind turbines Method for Bearing Fault Diagnosis of multichannel depth convolutional neural networks |
CN107944386A (en) * | 2017-11-22 | 2018-04-20 | 天津大学 | Visual scene recognition methods based on convolutional neural networks |
Non-Patent Citations (1)
Title |
---|
Tan Zhiyong et al., "A Crowd Density Estimation Method Based on Deep Convolutional Neural Networks", Computer Applications and Software (《计算机应用与软件》) *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113158780A (en) * | 2021-03-09 | 2021-07-23 | 中国科学院深圳先进技术研究院 | Regional crowd density estimation method, electronic device and storage medium |
CN113158780B (en) * | 2021-03-09 | 2023-10-27 | 中国科学院深圳先进技术研究院 | Regional crowd density estimation method, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110414301B (en) | 2023-06-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109389055B (en) | Video classification method based on mixed convolution and attention mechanism | |
CN108520535B (en) | Object classification method based on depth recovery information | |
Li et al. | Building-a-nets: Robust building extraction from high-resolution remote sensing images with adversarial networks | |
Sun et al. | Lattice long short-term memory for human action recognition | |
Tao et al. | Smoke detection based on deep convolutional neural networks | |
CN110135243B (en) | Pedestrian detection method and system based on two-stage attention mechanism | |
Fu et al. | Fast crowd density estimation with convolutional neural networks | |
CN108510012A (en) | A kind of target rapid detection method based on Analysis On Multi-scale Features figure | |
CN112507777A (en) | Optical remote sensing image ship detection and segmentation method based on deep learning | |
CN109886225A (en) | A kind of image gesture motion on-line checking and recognition methods based on deep learning | |
CN108960059A (en) | A kind of video actions recognition methods and device | |
CN108537743A (en) | A kind of face-image Enhancement Method based on generation confrontation network | |
CN109919011A (en) | A kind of action video recognition methods based on more duration informations | |
CN112288627A (en) | Recognition-oriented low-resolution face image super-resolution method | |
CN113379771B (en) | Hierarchical human body analysis semantic segmentation method with edge constraint | |
CN107818307A (en) | A kind of multi-tag Video Events detection method based on LSTM networks | |
Desai et al. | Next frame prediction using ConvLSTM | |
CN112906520A (en) | Gesture coding-based action recognition method and device | |
CN114627269A (en) | Virtual reality security protection monitoring platform based on degree of depth learning target detection | |
CN113255464A (en) | Airplane action recognition method and system | |
Wu et al. | Spatial-temporal graph network for video crowd counting | |
CN112084952A (en) | Video point location tracking method based on self-supervision training | |
CN116977674A (en) | Image matching method, related device, storage medium and program product | |
Liu et al. | Axial assembled correspondence network for few-shot semantic segmentation | |
Liu et al. | Online human action recognition with spatial and temporal skeleton features using a distributed camera network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||