CN109255381A - Image classification method based on a second-order VLAD sparse adaptive deep network - Google Patents
Image classification method based on a second-order VLAD sparse adaptive deep network
- Publication number: CN109255381A (application CN201811038736.2A)
- Authority
- CN
- China
- Prior art keywords
- vlad
- order
- saso
- feature
- vladnet
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/24—Pattern recognition; classification techniques
- G06N3/045—Neural networks; combinations of networks
- G06N3/08—Neural networks; learning methods
Abstract
The present invention proposes an image classification method based on a second-order VLAD sparse adaptive deep network, and belongs to the field of image classification and deep learning. The method first extracts convolutional features from multiple convolutional layers, then computes the corresponding SASO-VLAD encoding on each convolutional feature, and finally aggregates all SASO-VLAD encodings to construct the final multi-path feature encoding network. On the basis of the existing end-to-end VLAD encoding model, the method uses a new encoding scheme, sparse adaptive soft-assignment coding, to obtain the weight coefficients, and uses the concatenation of first-order and second-order VLAD encodings as the final feature representation. Compared with the NetVLAD model, the sparsity strategy and the second-order representation of the invention effectively improve image classification, and the multi-path network, which trains multiple feature encoding networks on low-, mid- and high-level features simultaneously, represents image features more powerfully than a single-level feature encoding network.
Description
Technical field
The invention belongs to the field of image classification and deep learning, and in particular relates to an image classification method based on a second-order VLAD sparse adaptive deep network.
Background art
Deep learning models have achieved excellent performance in computer vision, with main application directions including visual classification, super-resolution imaging, semantic segmentation, object detection and visual tracking. Compared with traditional statistical learning methods, deep learning models have two major advantages: (1) end-to-end training can obtain weights better suited to a given computer vision task; (2) the deep structural features learned from large-scale image datasets describe the original images better. Compared with traditional hand-crafted features (such as SIFT or HOG), deep feature methods can significantly improve performance.
Given the great advantages of end-to-end models and deep features, some recent work embeds the domain knowledge of conventional statistical learning methods into deep neural networks and trains the whole model end to end. These newly constructed neural networks not only inherit domain-specific knowledge, but also make all parameters better suited to the final application task.
Feature coding is a popular statistical learning approach to visual classification. In the traditional feature coding framework, the feature coding method is the core component connecting feature extraction and feature pooling, and it strongly influences classification performance. Popular feature coding methods include hard coding, soft coding, convolutional sparse coding, locality-constrained coding, and vector of locally aggregated descriptors (VLAD) coding. In traditional feature coding methods, the parameters of all algorithm components (feature extraction, dictionary learning, feature coding and classifier training) are learned independently of one another, so the learned parameters may not be optimal for image classification. In addition, the SIFT (scale-invariant feature transform) features used in traditional feature coding methods cannot represent images well. Recently, the traditional VLAD coding model was extended into an end-to-end model called NetVLAD. The NetVLAD layer is jointly trained with a deep CNN, yielding excellent image classification and image retrieval results; in addition, the NetVLAD model has demonstrated its effectiveness in action classification. However, the existing NetVLAD model uses only first-order aggregated information over the spatial scale, and the discriminative ability of end-to-end feature encoding networks has not yet been sufficiently studied.
Summary of the invention
To overcome the shortcoming of the existing NetVLAD model that the discriminative ability of the end-to-end feature encoding network has not been sufficiently studied, the present invention proposes an image classification method based on a second-order VLAD sparse adaptive deep network. On the basis of the existing NetVLAD model, the method uses a new encoding scheme, sparse adaptive soft-assignment coding (SASAC), to obtain the weight coefficients; represents the end-to-end sparse adaptive second-order VLAD model (SASO-VLADNet) jointly with first-order and second-order VLAD encodings; extracts convolutional features from multiple convolutional layers; generates the final feature encoding through a multi-path feature encoding network (M-SASO-VLADNet) composed of multiple SASO-VLADNets; and finally outputs the classification loss through a fully connected layer and a loss layer.
The object of the present invention is achieved by the following technical solution.
An image classification method based on a second-order VLAD sparse adaptive deep network uses an end-to-end trained multi-path feature encoding network. It first extracts nonlinear convolutional features from the activation functions following multiple convolutional layers, then computes the corresponding sparse adaptive second-order vector of locally aggregated descriptors (SASO-VLAD) encoding on each convolutional feature, and finally aggregates all SASO-VLAD encodings to construct the final multi-path feature encoding network (M-SASO-VLADNet), whose classification loss is output through a fully connected layer and a loss layer. The SASO-VLAD encoding obtains sparse weight coefficients with sparse adaptive soft-assignment coding (SASAC), and uses first-order and second-order VLAD encodings jointly to represent the end-to-end sparse adaptive second-order VLAD model (SASO-VLADNet).
Further, in the new sparse adaptive soft-assignment coding (SASAC) method, the SASAC layer is a variant of the multidimensional Gaussian probability density function, and all of its parameters, including the dictionary and the variance parameters, are learned adaptively in an end-to-end manner. The SASAC layer retains only the T largest probabilities and forces the other, small probabilities to zero so as to obtain sparse weight coefficients.
Further, the end-to-end SASO-VLAD constitutes the SASO-VLADNet layer, which is built in the following steps:
Step 3.1: the CNN feature F_i of a specific convolutional layer passes through the SASAC layer and a dimensionality reduction layer, whose outputs are multiplied to obtain the first-order statistic ξ1(F_i);
Step 3.2: ξ1(F_i) is passed through an average pooling layer and then L2-normalized; ξ1(F_i) also passes through a second-order layer to obtain the second-order statistic ξ2(F_i), which is then L2-normalized; the two normalized outputs are concatenated and L2-normalized to obtain the final output. The dimensionality reduction method is the affine subspace method.
Further, the SASAC layers of expression formula are as follows:
Wherein | | | |2The L2 norm of representation vector,The specific convolutional layer feature of i-th of image of representative model
Descriptor set, a shared M descriptor in this descriptor set, fij∈RD×1It is FiJ-th of descriptor, D representation vector dimension
Degree, ak∈RD×1,bk∈RD×1,vk∈ R, (k=1,2 ..., K) it is f respectivelyijWeight, fijBiasing and it is normalized partially
Set, these parameters be all in SASO-VLADNet can training parameter.These parameters one share K group, and k indicates a certain group specific
The index of parameter.K' indicates to meet set ST(fij) condition several groups parameter index.
ST(fij) it is the set for meeting following condition:
WhereinIt is ST(fij) complementary set, Card (ST(fij)) it is ST(fij) first prime number.
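The formula itself appears only as an image in the original filing and is not reproduced in this text. As a hedged reconstruction, assuming the Gaussian-variant soft-assignment form the surrounding description names (weight a_k, bias b_k, normalization bias v_k, truncation to the T largest scores), the SASAC coefficient could take a form such as:

```latex
\lambda_{ij}(k) =
\begin{cases}
\dfrac{\exp\!\left(-\left\|a_k \odot (f_{ij}-b_k)\right\|_2^2 + v_k\right)}
      {\sum_{k' \in S_T(f_{ij})} \exp\!\left(-\left\|a_{k'} \odot (f_{ij}-b_{k'})\right\|_2^2 + v_{k'}\right)},
 & k \in S_T(f_{ij}),\\[4pt]
0, & k \in \overline{S_T}(f_{ij}),
\end{cases}
```

where ⊙ is the element-wise product and S_T(f_ij) would collect the indices of the T largest unnormalized scores, so that Card(S_T(f_ij)) = T. This is an illustrative reading consistent with the description, not the verbatim patented formula.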
Further, the activation function can be one of the sigmoid, tanh and ReLU functions.
Further, the expression of the first-order statistic ξ1(F_i) is as follows:
where F_i denotes the descriptor set of the specific convolutional-layer feature of the i-th image, containing M descriptors in total, f_ij ∈ R^(D×1) is the j-th descriptor of F_i, D denotes the vector dimension, λ_ij(k) is the code coefficient of the SASAC layer described above, and U_k, μ_k are the dimensionality reduction matrix and bias in the first-order statistic; there are K groups of dimensionality reduction matrices and biases in total, k denotes the index of a specific group, and (U_k f_ij + μ_k) denotes the k-th affine subspace layer. The dimensionality reduction matrices and biases are all trainable parameters of SASO-VLADNet.
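The expression is again elided in this text. Given the components named above (SASAC coefficients λ_ij(k), affine subspace layers U_k f_ij + μ_k, M descriptors, K groups), one plausible reconstruction of the first-order statistic is the concatenation over the K groups:

```latex
\xi_1(F_i) = \left[\;\sum_{j=1}^{M} \lambda_{ij}(1)\,(U_1 f_{ij} + \mu_1)\;;\;\dots\;;\;\sum_{j=1}^{M} \lambda_{ij}(K)\,(U_K f_{ij} + \mu_K)\;\right] \in \mathbb{R}^{KP \times 1},
```

with the average pooling and L2 normalization described in step 3.2 applied afterwards. This is a sketch consistent with the text, not the verbatim formula.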
Further, the second-order statistic ξ2(F_i) uses the covariance matrix to obtain the interaction features between channels. The expression of ξ2(F_i) is as follows:
where vec is the vectorization operation that converts a matrix into the corresponding column vector.
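The formula is elided here as well. Since the text states that ξ2(F_i) captures channel interactions through the covariance matrix of the first-order features and is then vectorized, one consistent reconstruction is:

```latex
\xi_2(F_i) = \mathrm{vec}\!\left(\frac{1}{M}\sum_{j=1}^{M}\left(\phi_{ij}-\bar{\phi}_i\right)\left(\phi_{ij}-\bar{\phi}_i\right)^{\mathsf{T}}\right),
\qquad
\bar{\phi}_i = \frac{1}{M}\sum_{j=1}^{M}\phi_{ij},
```

where φ_ij denotes the (hypothetical) per-descriptor first-order feature entering the second-order layer. Illustrative only.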
Further, the forward pass of the SASO-VLADNet model first computes the final loss of the deep network; the gradients of the loss with respect to each parameter are then back-propagated to the input to update the SASO-VLADNet layers. The output classification loss is the standard softmax loss.
Further, the multi-path feature encoding network (M-SASO-VLADNet) trains multiple feature encoding networks simultaneously using convolutional features of multiple levels: low, middle and high.
Further, the parameter update steps of the complete model include:
Step 1: obtain the initialization parameters of each SASO-VLADNet layer;
Step 2: initialize the weights of the final fully connected layer from each SASO-VLADNet encoding and the final softmax classifier;
Step 3: using the above initialization parameters and end-to-end training, the gradient information of the softmax classifier is used to update the parameters of each layer in M-SASO-VLADNet until the classifier loss curve converges.
Compared with the prior art, the image classification method based on a second-order VLAD sparse adaptive deep network proposed by the present invention has the following beneficial effects:
compared with the NetVLAD model, the sparsity strategy and the second-order representation of the invention effectively improve image classification performance, and the multi-path network, which trains multiple feature encoding networks on low-, mid- and high-level features simultaneously, represents image features more powerfully than a single-level feature encoding network.
Brief description of the drawings
Fig. 1 is a flow diagram of the method of the present invention;
Fig. 2 is the network structure of the SASO-VLADNet layer in the method of the present invention;
Fig. 3 is the network structure of M-SASO-VLADNet in the method of the present invention.
Specific embodiments
To clearly illustrate the objectives, technical solutions and advantages of the present invention, the invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be noted that any process or symbol not described in detail below can be implemented or understood by those skilled in the art with reference to the prior art. It should be understood that the specific embodiments described here only explain the present invention and are not to be construed as limiting its scope of patent protection, which is defined by the appended claims. In addition, the technical features involved in the embodiments of the invention described below can be combined with each other as long as they do not conflict.
As shown in Fig. 1, an image classification method based on a second-order VLAD sparse adaptive deep network includes the following steps:
Step 1: preprocess the image with a deep convolutional neural network, choose L = 4 specific convolutional layers, and extract the feature of each convolutional layer after its activation function as the L = 4 input vectors.
Specifically, the single-level features of SASO-VLADNet and the multi-level features of M-SASO-VLADNet are extracted with a VGG-VD network. For SASO-VLADNet, the extracted single-level feature is the feature of the relu5_3 convolutional layer of the VGG-VD network; for M-SASO-VLADNet, the extracted multi-level features are the features of the four convolutional layers relu5_1, relu5_2, relu5_3 and pool5 of the VGG-VD network. All images are resized to 448 × 448 pixels, images are augmented with random cropping and random mirroring, and deep CNN feature extraction is implemented with the flexible and efficient deep learning library MXNet.
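As a minimal sketch of the augmentation step above (pure NumPy; the patent uses MXNet, and the resize to 448 × 448 is assumed to have already happened), random cropping and random mirroring can look like this:

```python
import numpy as np

def augment(img, crop=448, seed=None):
    """Random-crop plus random-mirror augmentation (illustrative sketch).

    img : (H, W, 3) array with H, W >= crop, assumed already resized."""
    rng = np.random.default_rng(seed)
    h, w, _ = img.shape
    top = rng.integers(0, h - crop + 1)    # random crop origin
    left = rng.integers(0, w - crop + 1)
    out = img[top:top + crop, left:left + crop]
    if rng.random() < 0.5:                 # random horizontal mirror
        out = out[:, ::-1]
    return out

img = np.zeros((480, 480, 3), dtype=np.uint8)   # stand-in for a resized input image
patch = augment(img, crop=448, seed=0)
```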
Specifically, the activation function is one of the sigmoid, tanh and ReLU functions.
Step 2: as shown in Fig. 2, the SASO-VLADNet encoding of a specific convolutional-layer feature (one of relu5_1, relu5_2, relu5_3 and pool5) is computed as follows:
Step 2.1: the feature F_i of a specific convolutional layer (one of relu5_1, relu5_2, relu5_3 and pool5) passes through the sparse adaptive soft-assignment coding (SASAC) layer and the dimensionality reduction layer, whose outputs are multiplied to obtain the first-order statistic ξ1(F_i);
Step 2.2: ξ1(F_i) is passed through an average pooling layer and then L2-normalized; ξ1(F_i) also passes through the second-order layer to obtain the second-order statistic ξ2(F_i), and ξ2(F_i) is L2-normalized; the two normalized outputs are concatenated and L2-normalized to obtain the output of the SASO-VLADNet layer.
Specifically, for SASO-VLADNet, the front-end deep CNN is initialized with a VGG-VD network pre-trained on the large-scale ImageNet dataset; then a specific CNN feature (one of relu5_1, relu5_2, relu5_3 and pool5) is used to learn the initialization dictionary. The initialization dictionary is obtained with the K-means algorithm of the VLFeat library. In the SASO-VLADNet model, choosing K = 128 generally gives good enough performance, so K = 128 is set.
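A minimal stand-in for the dictionary initialization described above (plain Lloyd iterations instead of the VLFeat K-means; the toy sizes are assumptions, the patent uses K = 128 on 512-dimensional descriptors):

```python
import numpy as np

def kmeans_dictionary(X, K, iters=20, seed=0):
    """Initialise a K-word dictionary from convolutional descriptors with
    plain Lloyd iterations (illustrative stand-in for VLFeat K-means).

    X : (M, D) descriptor matrix; returns (K, D) cluster centres."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(iters):
        # Assign each descriptor to its nearest centre.
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        # Move each centre to the mean of its members (keep empty clusters).
        for k in range(K):
            if np.any(labels == k):
                centres[k] = X[labels == k].mean(axis=0)
    return centres

rng = np.random.default_rng(5)
descriptors = rng.normal(size=(500, 16))
dictionary = kmeans_dictionary(descriptors, K=8)
```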
Step 3: F_i denotes the descriptor set of the specific convolutional-layer feature of the i-th image, f_ij ∈ R^(D×1) is the j-th descriptor of F_i, and D denotes the vector dimension, i.e. the number of channels of the convolutional feature. For the VGG-VD network, the number of channels of the last several convolutional layers is 512, so D = 512 in SASO-VLADNet.
The expression of the newly constructed SASAC layer in the SASO-VLADNet layer is as follows:
where ‖·‖_2 denotes the L2 norm of a vector, F_i denotes the descriptor set of the specific convolutional-layer feature of the i-th image, containing M descriptors in total, f_ij ∈ R^(D×1) is the j-th descriptor of F_i, D denotes the vector dimension, and a_k ∈ R^(D×1), b_k ∈ R^(D×1), v_k ∈ R (k = 1, 2, …, K) are, respectively, the weight of f_ij, the bias of f_ij and the normalization bias; these parameters are all trainable parameters of SASO-VLADNet. There are K groups of these parameters in total, k denotes the index of a specific group of parameters, and k′ indexes the groups of parameters that satisfy the set S_T(f_ij).
S_T(f_ij) is the set satisfying the following condition:
where S̄_T(f_ij) is the complement of S_T(f_ij), and Card(S_T(f_ij)) is the number of elements of S_T(f_ij).
Specifically, the SASAC layer keeps the T largest values. T can be neither too large nor too small, and the specific value of T is determined by cross-validation. According to the relevant experiments, T = 5 is generally set for simplicity.
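A hedged sketch of the SASAC assignment with top-T sparsification. The Gaussian-density-like score inside `sasac_assign` is an assumed form, since the exact formula is not reproduced in this text; only the keep-T-largest, zero-the-rest, renormalize behaviour is taken directly from the description:

```python
import numpy as np

def sasac_assign(f, A, B, v, T=5):
    """Sparse adaptive soft assignment (illustrative sketch).

    f : (D,) descriptor; A, B : (K, D) per-cluster weight / bias
    parameters; v : (K,) normalization biases.  Only the T largest
    probabilities are kept and renormalized; the rest are forced to zero."""
    scores = -np.sum((A * (f - B)) ** 2, axis=1) + v   # assumed log-domain scores
    probs = np.exp(scores - scores.max())              # numerically stable softmax numerator
    keep = np.argsort(probs)[-T:]                      # indices of the T largest
    sparse = np.zeros_like(probs)
    sparse[keep] = probs[keep]
    return sparse / sparse.sum()

rng = np.random.default_rng(0)
D, K = 8, 16
lam = sasac_assign(rng.normal(size=D), rng.normal(size=(K, D)),
                   rng.normal(size=(K, D)), rng.normal(size=K), T=5)
```

The output behaves like the SASAC coefficients described above: exactly T non-zero weights that sum to one.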
Step 4: perform dimensionality reduction with the affine subspace method.
The affine subspace layer in SASO-VLADNet is: R_k = U_k(f_ij - c_k) = (U_k f_ij + μ_k),
where μ_k = -U_k c_k ∈ R^(P×1), U_k ∈ R^(P×D) (k = 1, 2, …, K) is the dimensionality reduction projection matrix of the affine subspace method, and P is the subspace dimension. P determines the final feature length; in order to represent features in a relatively small dimension with good enough performance, P = 128 is generally set.
The expression of the first-order statistic ξ1(F_i) is as follows:
Specifically, (U_k f_ij + μ_k) can be regarded as a 1 × 1 convolutional layer with convolution weight U_k and bias μ_k, so the affine subspace layer can be trained efficiently end to end with conventional CNN training methods.
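The identity above, that the affine subspace map U_k(f_ij - c_k) equals a 1 × 1 convolution with weight U_k and bias μ_k = -U_k c_k, can be checked numerically. The dimensions D = 512 and P = 128 follow the text; the random data is illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
D, P = 512, 128                      # input / subspace dimensions from the text
U = rng.normal(size=(P, D)) * 0.01   # dimensionality-reduction projection matrix
c = rng.normal(size=D)               # cluster centre
mu = -U @ c                          # bias of the equivalent 1x1 convolution

f = rng.normal(size=D)               # one convolutional descriptor
r_affine = U @ (f - c)               # affine-subspace form
r_conv = U @ f + mu                  # 1x1-convolution form (weight U, bias mu)
```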
Step 5: the expression of the second-order statistic ξ2(F_i) is as follows:
where vec is the vectorization operation that converts a matrix into the corresponding column vector.
Specifically, the interaction between feature channels is obtained with the covariance matrix of the first-order features; since the second-order statistic is differentiable, the second-order statistic layer can be trained in an end-to-end manner.
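A small sketch of the second-order statistic as described: the covariance of the first-order features, vectorized and then L2-normalized. The exact formula is elided in this text, so this is one plausible reading with assumed toy dimensions:

```python
import numpy as np

def second_order_encoding(X):
    """Second-order statistic of a set of first-order features (sketch).

    X : (M, P) matrix of M per-descriptor first-order features.
    Returns the vectorised covariance matrix, L2-normalised."""
    mean = X.mean(axis=0)
    cov = (X - mean).T @ (X - mean) / X.shape[0]   # (P, P) channel interactions
    v = cov.reshape(-1)                            # vec(.) of the covariance matrix
    return v / np.linalg.norm(v)

rng = np.random.default_rng(2)
enc = second_order_encoding(rng.normal(size=(100, 16)))
```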
Step 6: the affine subspace layer and the second-order statistic layer can be trained with existing end-to-end methods, but the SASAC layer is a brand-new network layer, so its specific back-propagation functions are given here for end-to-end training:
Step 6.1: for each k (k = 1, 2, …, K), the expression of the SASAC layer is equivalent to three expressions:
The second of the equivalent expressions of the SASAC layer can be regarded as a variant of the max pooling layer, which keeps the T largest values and forces the remaining values to 0; the third expression is a normalization layer that yields the normalized weight coefficients.
Step 6.2: for each k, the gradient of the final classification loss J with respect to the output of the SASAC layer is given; based on the chain rule, the gradient expressions of γ_ij(k) and β_ij(k) are obtained as follows:
Step 6.3: based on β_ij(k) (k = 1, 2, …, K) and the second of the equivalent expressions of the SASAC layer, the gradient expression of the loss J with respect to f_ij is obtained:
Step 6.4: based on β_ij(k) (k = 1, 2, …, K) and the second of the equivalent expressions of the SASAC layer, the gradient expressions of the loss J with respect to a_k, b_k and v_k are obtained:
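The gradient formulas themselves are elided in this text, but the top-T truncation plus normalization steps of the equivalent form can be differentiated like max pooling and verified with finite differences. The helper names below are assumptions, and the backward pass assumes the kept index set does not change under the perturbation:

```python
import numpy as np

def topT_normalise(g, T):
    """Forward pass of the max-pooling variant plus normalization step:
    keep the T largest entries of g, zero the rest, renormalize."""
    keep = np.argsort(g)[-T:]
    b = np.zeros_like(g)
    b[keep] = g[keep]
    return b / b.sum(), keep

def topT_normalise_grad(g, T, dJ_dlam):
    """Backward pass, treating the kept index set as fixed (the truncation
    is piecewise differentiable, exactly like max pooling)."""
    _, keep = topT_normalise(g, T)
    s = g[keep].sum()
    dJ_dg = np.zeros_like(g)
    for j in keep:
        # d(lam_i)/d(g_j) = (delta_ij * s - g_i) / s^2 for i, j in the kept set
        dJ_dg[j] = (dJ_dlam[j] * s - dJ_dlam[keep] @ g[keep]) / s ** 2
    return dJ_dg

rng = np.random.default_rng(3)
g = rng.uniform(1.0, 2.0, size=10)      # positive scores, as after an exp(.)
w = rng.normal(size=10)                 # arbitrary upstream gradient: J = w . lam
analytic = topT_normalise_grad(g, 5, w)

eps = 1e-6
numeric = np.zeros_like(g)
for i in range(10):
    gp, gm = g.copy(), g.copy()
    gp[i] += eps
    gm[i] -= eps
    numeric[i] = (w @ topT_normalise(gp, 5)[0] - w @ topT_normalise(gm, 5)[0]) / (2 * eps)
```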
Step 7: after a preprocessed image is input, the convolutional feature F_i of the specific convolutional layer of the i-th picture is obtained. The final expression of the SASO-VLAD (sparse adaptive second-order vector of locally aggregated descriptors) representation of F_i is as follows:
where L2norm is the L2-norm normalization of a vector, and a_k, b_k, v_k, U_k, μ_k (k = 1, 2, …, K) are trainable parameters of SASO-VLADNet.
Specifically, the parameters a_k, b_k, v_k, U_k, μ_k (k = 1, 2, …, K) are learned in an end-to-end manner.
In the parameter update process of SASO-VLADNet, the final loss of the deep network is first computed by the forward pass; the gradients of the loss with respect to each parameter are then back-propagated to the input to update the whole SASO-VLADNet model.
Step 8: after the L = 4 SASO-VLADNet encodings (the encodings generated from the relu5_1, relu5_2, relu5_3 and pool5 convolutional features) are obtained, the 4 encodings are concatenated to obtain the final M-SASO-VLADNet encoding, as shown in Fig. 3. The M-SASO-VLADNet encoding passes through the final fully connected layer and loss layer to obtain the classification loss; the loss layer is the standard softmax loss, written as:
where C is the number of classes, 1{·} is the indicator function (1{a true statement} = 1, 1{a false statement} = 0), y_i denotes the class label of the i-th image, and ρ_ic is the overall prediction score of the L = 4 SASO-VLADNets (the 4 SASO-VLADNet encodings generated from relu5_1, relu5_2, relu5_3 and pool5):
where the weight and the bias of the l-th (l = 1, 2, …, L) fully connected (FC) layer appear.
Specifically, ρ_ic can be further expressed as: ρ_ic = (G_c)^T [ξ(F_i^(1)); ξ(F_i^(2)); …; ξ(F_i^(L))] + (B_c)^T
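As an illustrative sketch of this scoring and loss step (the names `G`, `B`, `multipath_scores` and the toy dimensions are assumptions, not from the patent), the concatenated encodings can be scored by one fully connected layer and fed to a standard softmax loss:

```python
import numpy as np

def multipath_scores(encodings, G, B):
    """Prediction scores rho_ic = (G_c)^T [xi(F^(1)); ...; xi(F^(L))] + (B_c)^T,
    implemented as one fully connected layer over the concatenated codes."""
    z = np.concatenate(encodings)        # concatenated M-SASO-VLADNet encoding
    return G.T @ z + B                   # (C,) class scores

def softmax_loss(rho, y):
    """Standard softmax (cross-entropy) loss for one sample."""
    rho = rho - rho.max()                # shift for numerical stability
    log_probs = rho - np.log(np.exp(rho).sum())
    return -log_probs[y]

rng = np.random.default_rng(4)
L_paths, dim, C = 4, 32, 10              # L = 4 paths as in the text; toy sizes otherwise
encodings = [rng.normal(size=dim) for _ in range(L_paths)]
G = rng.normal(size=(L_paths * dim, C)) * 0.1
B = rng.normal(size=C)
rho = multipath_scores(encodings, G, B)
loss = softmax_loss(rho, y=3)
```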
The trained SASO-VLADNet and M-SASO-VLADNet were tested for image classification performance on an object image dataset (the Caltech256 dataset), on fine-grained image datasets (the CUB200 and Stanford Cars datasets) and on a texture image dataset. Compared with the NetVLAD model, SASO-VLADNet improves the image recognition rate by 2-4%, and the proposed multi-path network (M-SASO-VLADNet) improves the image recognition rate by about 1% over the proposed single-path network (SASO-VLADNet).
Step 9: the complete parameter update steps of the second-order VLAD sparse adaptive deep network include:
Step 9.1: obtain the initialization parameters of each SASO-VLADNet layer;
Step 9.2: initialize the weights of the final fully connected layer from each SASO-VLADNet encoding and the final softmax classifier;
Step 9.3: using the above initialization parameters and end-to-end training, the gradient information of the softmax classifier is used to update the parameters of each layer in M-SASO-VLADNet until the classifier loss curve converges.
Claims (10)
1. An image classification method based on a second-order VLAD sparse adaptive deep network, characterized in that a multi-path feature encoding network trained end to end is used: nonlinear convolutional features are first extracted from the activation functions following multiple convolutional layers; the corresponding sparse adaptive second-order vector of locally aggregated descriptors (SASO-VLAD) encoding is then computed on each convolutional feature; finally, all SASO-VLAD encodings are aggregated to construct the final multi-path feature encoding network M-SASO-VLADNet, and the classification loss is output through a fully connected layer and a loss layer; the SASO-VLAD encoding obtains sparse weight coefficients with sparse adaptive soft-assignment coding SASAC, and uses first-order and second-order VLAD encodings jointly to represent the end-to-end sparse adaptive second-order VLAD model SASO-VLADNet.
2. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 1, characterized in that, in the new sparse adaptive soft-assignment coding SASAC method, the SASAC layer is a variant of the multidimensional Gaussian probability density function, and all parameters, including the dictionary and the variance parameters, are learned adaptively in an end-to-end manner; the SASAC layer retains only the T largest probabilities and forces the other, small probabilities to zero so as to obtain sparse weight coefficients.
3. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 1, characterized in that the end-to-end SASO-VLAD constitutes the SASO-VLADNet layer, which is built in the following steps:
Step 3.1: the CNN feature F_i of a specific convolutional layer passes through the SASAC layer and a dimensionality reduction layer, whose outputs are multiplied to obtain the first-order statistic ξ1(F_i);
Step 3.2: ξ1(F_i) is passed through an average pooling layer and then L2-normalized; ξ1(F_i) also passes through a second-order layer to obtain the second-order statistic ξ2(F_i), which is then L2-normalized; the two normalized outputs are concatenated and L2-normalized to obtain the final output;
the dimensionality reduction method is the affine subspace method.
4. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 2 or 3, characterized in that the expression of the SASAC layer is as follows:
where ‖·‖_2 denotes the L2 norm of a vector, F_i denotes the descriptor set of the specific convolutional-layer feature of the i-th image, containing M descriptors in total, f_ij ∈ R^(D×1) is the j-th descriptor of F_i, D denotes the vector dimension, and a_k ∈ R^(D×1), b_k ∈ R^(D×1), v_k ∈ R (k = 1, 2, …, K) are, respectively, the weight of f_ij, the bias of f_ij and the normalization bias; these parameters are all trainable parameters of SASO-VLADNet; there are K groups of these parameters in total, k denotes the index of a specific group of parameters, and k′ indexes the groups of parameters that satisfy the set S_T(f_ij);
S_T(f_ij) is the set satisfying the following condition:
where S̄_T(f_ij) is the complement of S_T(f_ij), and Card(S_T(f_ij)) is the number of elements of S_T(f_ij).
5. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 1 or 2, characterized in that the activation function is one of the sigmoid, tanh and ReLU functions.
6. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 4, characterized in that the expression of the first-order statistic ξ1(F_i) is as follows:
where F_i denotes the descriptor set of the specific convolutional-layer feature of the i-th image, containing M descriptors in total, f_ij ∈ R^(D×1) is the j-th descriptor of F_i, D denotes the vector dimension, λ_ij(k) is the code coefficient of the SASAC layer, U_k, μ_k are the dimensionality reduction matrix and bias in the first-order statistic, of which there are K groups in total, k denotes the index of a specific group of dimensionality reduction matrix and bias, and (U_k f_ij + μ_k) denotes the k-th affine subspace layer; the dimensionality reduction matrices and biases are all trainable parameters of SASO-VLADNet.
7. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 4, characterized in that the second-order statistic ξ2(F_i) uses the covariance matrix to obtain the interaction features between channels, and the expression of ξ2(F_i) is as follows:
where vec is the vectorization operation that converts a matrix into the corresponding column vector.
8. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 4, characterized in that the forward pass of the SASO-VLADNet model first computes the final loss of the deep network, and the gradients of the loss with respect to each parameter are then back-propagated to the input to update the SASO-VLADNet layers; the output classification loss is the standard softmax loss.
9. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 1, characterized in that the multi-path feature encoding network trains multiple feature encoding networks simultaneously using convolutional features of multiple levels: low, middle and high.
10. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 1, characterized in that the parameter update steps of the complete model include:
Step 1: obtain the initialization parameters of each SASO-VLADNet layer;
Step 2: initialize the weights of the final fully connected layer from each SASO-VLADNet encoding and the final softmax classifier;
Step 3: using the above initialization parameters and end-to-end training, the gradient information of the softmax classifier is used to update the parameters of each layer in M-SASO-VLADNet until the classifier loss curve converges.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811038736.2A CN109255381B (en) | 2018-09-06 | 2018-09-06 | Image classification method based on second-order VLAD sparse adaptive depth network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109255381A true CN109255381A (en) | 2019-01-22 |
CN109255381B CN109255381B (en) | 2022-03-29 |
Family
ID=65047079
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811038736.2A Active CN109255381B (en) | 2018-09-06 | 2018-09-06 | Image classification method based on second-order VLAD sparse adaptive depth network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109255381B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103336795A (en) * | 2013-06-09 | 2013-10-02 | 华中科技大学 | Video indexing method based on multiple features |
CN104408479A (en) * | 2014-11-28 | 2015-03-11 | 电子科技大学 | Massive image classification method based on deep vector of locally aggregated descriptors (VLAD) |
CN108460764A (en) * | 2018-03-31 | 2018-08-28 | 华南理工大学 | The ultrasonoscopy intelligent scissor method enhanced based on automatic context and data |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103336795A (en) * | 2013-06-09 | 2013-10-02 | Huazhong University of Science and Technology | Video indexing method based on multiple features |
CN104408479A (en) * | 2014-11-28 | 2015-03-11 | University of Electronic Science and Technology of China | Massive image classification method based on deep vector of locally aggregated descriptors (VLAD) |
CN108460764A (en) * | 2018-03-31 | 2018-08-28 | South China University of Technology | Intelligent ultrasound image segmentation method based on auto-context and data augmentation |
Non-Patent Citations (1)
Title |
---|
CHEN ET AL.: "A novel localized and second order feature coding network for image recognition", Pattern Recognition * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109784420A (en) * | 2019-01-29 | 2019-05-21 | Shenzhen SenseTime Technology Co., Ltd. | Image processing method and apparatus, computer device, and storage medium |
CN109901207A (en) * | 2019-03-15 | 2019-06-18 | Wuhan University | High-precision outdoor positioning method based on the BeiDou satellite system and feature combination |
CN110135460A (en) * | 2019-04-16 | 2019-08-16 | Guangdong University of Technology | Image information enhancement method based on a VLAD convolution module |
CN110209859A (en) * | 2019-05-10 | 2019-09-06 | Tencent Technology (Shenzhen) Co., Ltd. | Place recognition method and apparatus, model training method, and electronic device |
CN110209859B (en) * | 2019-05-10 | 2022-12-27 | Tencent Technology (Shenzhen) Co., Ltd. | Place recognition method and apparatus, model training method, and electronic device |
CN110991480A (en) * | 2019-10-31 | 2020-04-10 | Shanghai Jiao Tong University | Sparse coding method based on an attention mechanism |
CN111967528A (en) * | 2020-08-27 | 2020-11-20 | Peking University | Image recognition method using deep learning network architecture search based on sparse coding |
CN111967528B (en) * | 2020-08-27 | 2023-12-26 | Peking University | Image recognition method using deep learning network architecture search based on sparse coding |
CN113139587A (en) * | 2021-03-31 | 2021-07-20 | Hangzhou Dianzi University | Biquadratic pooling model for adaptive interactive structure learning |
CN113139587B (en) * | 2021-03-31 | 2024-02-06 | Hangzhou Dianzi University | Biquadratic pooling model for adaptive interactive structure learning |
Also Published As
Publication number | Publication date |
---|---|
CN109255381B (en) | 2022-03-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109255381A (en) | Image classification method based on second-order VLAD sparse adaptive depth network | |
Morgado et al. | Semantically consistent regularization for zero-shot recognition | |
CN112308158B (en) | Multi-source domain adaptation model and method based on partial feature alignment |
CN109299342B (en) | Cross-modal retrieval method based on a cycle-consistent generative adversarial network |
CN110298037B (en) | Text matching and recognition method using convolutional neural networks with an enhanced attention mechanism |
Chen et al. | Sca-cnn: Spatial and channel-wise attention in convolutional networks for image captioning | |
CN107526785B (en) | Text classification method and device | |
Liao et al. | Learning deep parsimonious representations | |
CN108121975B (en) | Face recognition method combining original data and generated data | |
Bradley et al. | Differential sparse coding | |
CN109063666A (en) | Lightweight face recognition method and system based on depthwise separable convolution |
CN111414461B (en) | Intelligent question-answering method and system fusing knowledge base and user modeling | |
CN113326731B (en) | Cross-domain pedestrian re-identification method based on momentum network guidance | |
US11748919B2 (en) | Method of image reconstruction for cross-modal communication system and device thereof | |
CN110378208B (en) | Behavior identification method based on deep residual error network | |
CN111444367A (en) | Image title generation method based on global and local attention mechanism | |
CN113239784A (en) | Pedestrian re-identification system and method based on space sequence feature learning | |
CN112749274B (en) | Chinese text classification method based on attention mechanism and interference word deletion | |
CN112784929B (en) | Small sample image classification method and device based on double-element group expansion | |
CN113688894B (en) | Fine-grained image classification method fusing multi-granularity features |
CN114896434B (en) | Hash code generation method and device based on center similarity learning | |
CN109815496A (en) | Carrier-generation text steganography method and device based on a capacity-adaptive shortening mechanism |
CN109033294A (en) | Hybrid recommendation method incorporating content information |
Li et al. | Embedded stacked group sparse autoencoder ensemble with L1 regularization and manifold reduction | |
CN113627543A (en) | Adversarial attack detection method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||