CN109255381A - Image classification method based on a second-order VLAD sparse adaptive deep network - Google Patents

Image classification method based on a second-order VLAD sparse adaptive deep network

Info

Publication number: CN109255381A
Application number: CN201811038736.2A
Authority: CN (China)
Prior art keywords: VLAD, second-order, SASO, feature, VLADNet
Legal status: Granted; Active
Other languages: Chinese (zh)
Other versions: CN109255381B
Inventors: 王倩倩, 陈博恒, 刘娇蛟, 马碧云
Current and original assignee: South China University of Technology (SCUT)
Application filed by South China University of Technology (SCUT)
Priority to CN201811038736.2A
Publication of CN109255381A; application granted; publication of CN109255381B

Classifications

    • G06F18/24 - Physics; Computing; Electric digital data processing; Pattern recognition; Analysing; Classification techniques
    • G06N3/045 - Physics; Computing; Computing arrangements based on specific computational models; Computing arrangements based on biological models; Neural networks; Architecture, e.g. interconnection topology; Combinations of networks
    • G06N3/08 - Physics; Computing; Computing arrangements based on specific computational models; Computing arrangements based on biological models; Neural networks; Learning methods


Abstract

The present invention proposes an image classification method based on a second-order VLAD sparse adaptive deep network, belonging to the fields of image classification and deep learning. The method first extracts convolutional features from multiple convolutional layers, then computes the corresponding SASO-VLAD encoding for each convolutional feature, and finally aggregates all SASO-VLAD encodings to construct the final multi-path feature encoding network. On the basis of the existing end-to-end VLAD encoding model, the method uses a new sparse adaptive soft-assignment coding scheme to obtain the weight coefficients, and uses the concatenation of first-order and second-order VLAD encodings as the final feature representation. Compared with the NetVLAD model, the sparsity strategy and the second-order representation of the invention effectively improve image classification, and the multi-path network trains multiple feature encoding networks simultaneously on low-, mid-, and high-level features, giving a stronger representation of image features than a single-level feature encoding network.

Description

Image classification method based on a second-order VLAD sparse adaptive deep network
Technical field
The invention belongs to the fields of image classification and deep learning, and in particular relates to an image classification method based on a second-order VLAD sparse adaptive deep network.
Background technique
Deep learning models have achieved excellent performance in computer vision; the main application directions include visual classification, super-resolution imaging, semantic segmentation, object detection, and visual tracking. Compared with traditional statistical learning methods, deep learning models have two major advantages: (1) end-to-end training yields weights that are better suited to a given computer vision task; and (2) the deep structural features learned from large-scale image datasets describe the original images better. Compared with traditional hand-crafted feature methods (such as SIFT or HOG features), deep-feature methods can significantly improve performance.
In view of the great advantages of end-to-end models and deep features, recent work has embedded the domain knowledge of traditional statistical learning methods into deep neural networks and trained the entire model in an end-to-end manner. These newly constructed neural networks not only inherit domain-specific knowledge but also make all parameters better suited to the final application task.
Feature encoding is a popular statistical learning method for visual classification. In the traditional feature encoding framework, the feature encoding method is the core component connecting feature extraction and feature pooling, and it has a large influence on visual classification performance. Popular feature encoding methods include hard coding, soft coding, convolutional sparse coding, locality-constrained coding, vector of locally aggregated descriptors (VLAD) coding, and so on. In traditional feature encoding methods, all algorithm components (feature extraction, dictionary learning, feature encoding, and classifier training) are independent of one another, so the learned parameters may not be optimal for image classification. In addition, the SIFT (scale-invariant feature transform) features used in traditional feature encoding methods cannot represent images well. Recently, traditional VLAD coding has been extended into an end-to-end model called NetVLAD. The NetVLAD layer is jointly trained with a deep CNN, yielding outstanding image classification and image retrieval results; in addition, the NetVLAD model has demonstrated its effectiveness in action classification. However, the existing NetVLAD model uses only first-order aggregation information at a single spatial scale, and the discriminative ability of end-to-end feature encoding networks has not yet been fully studied.
Summary of the invention
To overcome the shortcoming of the existing NetVLAD model, namely that the discriminative ability of end-to-end feature encoding networks has not yet been fully studied, the present invention proposes an image classification method based on a second-order VLAD sparse adaptive deep network. On the basis of the existing NetVLAD model, the method uses a new coding scheme, sparse adaptive soft-assignment coding (SASAC), to obtain the weight coefficients, and represents features jointly with first-order and second-order VLAD encodings in an end-to-end sparse adaptive second-order VLAD model (SASO-VLADNet). Convolutional features are extracted from multiple convolutional layers, the final feature encoding is generated by a multi-path feature encoding network (M-SASO-VLADNet) composed of multiple SASO-VLADNets, and a fully connected layer and a loss layer finally output the classification loss.
The object of the present invention is achieved by the following technical solution.
An image classification method based on a second-order VLAD sparse adaptive deep network, which uses an end-to-end-trained multi-path feature encoding network. First, nonlinear convolutional features are extracted from the activation functions following multiple convolutional layers; then the corresponding sparse adaptive second-order vector of locally aggregated descriptors (SASO-VLAD) encoding is computed for each convolutional feature; finally, all SASO-VLAD encodings are aggregated to construct the final multi-path feature encoding network (M-SASO-VLADNet), and a fully connected layer and a loss layer output the classification loss. The SASO-VLAD encoding uses sparse adaptive soft-assignment coding (SASAC) to obtain sparse weight coefficients, and the first-order and second-order VLAD encodings jointly represent the end-to-end sparse adaptive second-order VLAD model (SASO-VLADNet).
Further, in the new sparse adaptive soft-assignment coding (SASAC) method, the SASAC layer is a variant of a multidimensional Gaussian probability density function, and all of its parameters, including the dictionary and variance parameters, are learned adaptively in an end-to-end manner. The SASAC layer retains only the T largest probabilities and forces the other small probabilities to zero so as to obtain sparse weight coefficients.
Further, the end-to-end SASO-VLAD constitutes the SASO-VLADNet layer, which is built as follows:
Step 3.1: a specific convolutional-layer CNN feature F_i passes through the SASAC layer and the dimensionality reduction layer, and their outputs are multiplied to obtain the first-order statistics ξ_1(F_i);
Step 3.2: ξ_1(F_i) passes through an average pooling layer and is then L2-normalized; ξ_1(F_i) also passes through the second-order layer to obtain the second-order statistics ξ_2(F_i), which is L2-normalized; the two normalized outputs are concatenated and L2-normalized to obtain the final output. The dimensionality reduction method is the affine subspace method.
Further, the expression of the SASAC layer is:

λ_ij(k) = exp(−‖a_k ⊙ f_ij + b_k‖₂² + v_k) / Σ_{k′ ∈ S_T(f_ij)} exp(−‖a_{k′} ⊙ f_ij + b_{k′}‖₂² + v_{k′}) for k ∈ S_T(f_ij), and λ_ij(k) = 0 otherwise,

where ‖·‖₂ denotes the L2 norm of a vector, ⊙ denotes element-wise multiplication, F_i represents the descriptor set of a specific convolutional-layer feature of the i-th image (this descriptor set contains M descriptors in total), f_ij ∈ R^{D×1} is the j-th descriptor of F_i, D denotes the vector dimension, and a_k ∈ R^{D×1}, b_k ∈ R^{D×1}, v_k ∈ R (k = 1, 2, …, K) are respectively the weight of f_ij, the bias of f_ij, and the normalization bias; these parameters are all trainable parameters of SASO-VLADNet. There are K groups of these parameters in total, k denotes the index of a specific group, and k′ denotes the indices of the groups of parameters that satisfy the condition of the set S_T(f_ij).
S_T(f_ij) is the set satisfying the following condition: Card(S_T(f_ij)) = T, and every score exp(−‖a_k ⊙ f_ij + b_k‖₂² + v_k) with k ∈ S_T(f_ij) is no smaller than any score with index in the complement of S_T(f_ij); that is, S_T(f_ij) indexes the T largest scores. Here Card(S_T(f_ij)) is the number of elements of S_T(f_ij).
Further, the activation function may be one of the sigmoid, tanh, and ReLU functions.
Further, the expression of the first-order statistics ξ_1(F_i) is:

ξ_1(F_i) = [Σ_{j=1}^{M} λ_ij(1)(U_1 f_ij + μ_1); Σ_{j=1}^{M} λ_ij(2)(U_2 f_ij + μ_2); …; Σ_{j=1}^{M} λ_ij(K)(U_K f_ij + μ_K)],

where F_i represents the descriptor set of a specific convolutional-layer feature of the i-th image (this descriptor set contains M descriptors in total), f_ij ∈ R^{D×1} is the j-th descriptor of F_i, D denotes the vector dimension, λ_ij(k) is the coding coefficient of the SASAC layer described above, and U_k, μ_k are the dimensionality reduction matrices and biases in the first-order statistics. There are K groups of dimensionality reduction matrices and biases in total, k denotes the index of a specific group, and (U_k f_ij + μ_k) denotes the k-th affine subspace layer. The dimensionality reduction matrices and biases are all trainable parameters of SASO-VLADNet.
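A minimal NumPy sketch of this first-order aggregation (assuming the first-order expression described above; function names and shapes are invented for the example):

```python
import numpy as np

def first_order_vlad(F, Lam, U, mu):
    """First-order statistics xi_1(F_i).

    F: (M, D) descriptor set; Lam: (M, K) SASAC weights lambda_ij(k);
    U: (K, P, D) projection matrices; mu: (K, P) biases.
    Returns the concatenated (K*P,) first-order encoding.
    """
    K = U.shape[0]
    parts = []
    for k in range(K):
        proj = F @ U[k].T + mu[k]                           # (M, P) affine subspace
        parts.append((Lam[:, k:k + 1] * proj).sum(axis=0))  # weighted sum over j
    return np.concatenate(parts)

# sanity check: one group, identity projection, unit weights -> column sums
F = np.arange(6.0).reshape(2, 3)
xi1 = first_order_vlad(F, np.ones((2, 1)), np.eye(3)[None], np.zeros((1, 3)))
assert np.allclose(xi1, [3.0, 5.0, 7.0])
```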
Further, the second-order statistics ξ_2(F_i) use a covariance matrix to obtain the interaction features between channels. The expression of ξ_2(F_i) is:

ξ_2(F_i) = vec(Σ_{k=1}^{K} Σ_{j=1}^{M} λ_ij(k)(U_k f_ij + μ_k)(U_k f_ij + μ_k)^T),

where vec is the vectorization operation that converts a matrix into the corresponding column vector.
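This channel-interaction pooling can be sketched as follows (a hypothetical NumPy rendering with invented names and shapes; the exact form of the patented layer may differ):

```python
import numpy as np

def second_order_vlad(F, Lam, U, mu):
    """Second-order statistics xi_2(F_i): weighted outer products of the
    projected descriptors, vectorized into a (P*P,) column vector."""
    K, P, _ = U.shape
    S = np.zeros((P, P))
    for k in range(K):
        proj = F @ U[k].T + mu[k]                      # (M, P) affine subspace
        S += (Lam[:, k][:, None] * proj).T @ proj      # sum_j lambda * r r^T
    return S.reshape(-1)                               # vec(.)

# sanity check: orthonormal descriptors give vec of the identity matrix
F = np.eye(2)
xi2 = second_order_vlad(F, np.ones((2, 1)), np.eye(2)[None], np.zeros((1, 2)))
assert np.allclose(xi2, [1.0, 0.0, 0.0, 1.0])
```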
Further, the forward pass of the SASO-VLADNet model first computes the final loss of the deep network, and the gradients of the loss with respect to each parameter are then back-propagated to the input to update the SASO-VLADNet layers. The output classification loss is the standard softmax loss.
Further, the multi-path feature encoding network (M-SASO-VLADNet) trains multiple feature encoding networks simultaneously using convolutional features of multiple levels (low, middle, and high).
Further, the parameter update steps of the complete model include:
Step 1: obtain initialization parameters for each SASO-VLADNet layer;
Step 2: initialize the weights of the final fully connected layer from each SASO-VLADNet encoding and the final softmax classifier;
Step 3: using the above initialization parameters and end-to-end training, the gradient information of the softmax classifier is used to update the parameters of every layer in M-SASO-VLADNet until the classifier loss curve converges.
Compared with the prior art, the proposed image classification method based on a second-order VLAD sparse adaptive deep network has the following advantages:
Compared with the NetVLAD model, the sparsity strategy and the second-order representation of the invention effectively improve image classification performance, and the multi-path network trains multiple feature encoding networks simultaneously on low-, mid-, and high-level features, giving a stronger representation of image features than a single-level feature encoding network.
Description of the drawings
Fig. 1 is a flow diagram of the method of the invention;
Fig. 2 is the network structure of the SASO-VLADNet layer in the method of the invention;
Fig. 3 is the network structure of M-SASO-VLADNet in the method of the invention.
Specific embodiment
To clearly illustrate the objectives, technical solutions, and advantages of the present invention, the invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be noted that any process or symbol not described in detail below can be implemented or understood by those skilled in the art with reference to the prior art. It should be understood that the specific embodiments described here only explain the invention and are not to be construed as limiting its scope of patent protection; the scope of patent protection of the invention is defined by the appended claims. In addition, the technical features involved in the embodiments of the invention described below can be combined with each other as long as they do not conflict.
As shown in Fig. 1, an image classification method based on a second-order VLAD sparse adaptive deep network comprises the following steps:
Step 1: preprocess the image with a deep convolutional neural network, choose L = 4 specific convolutional layers, and extract the features of each convolutional layer after its activation function as the L = 4 input vectors;
Specifically, the single-level feature of SASO-VLADNet and the multi-level features of M-SASO-VLADNet are extracted with a VGG-VD network. For SASO-VLADNet, the extracted single-level feature is that of the relu5_3 convolutional layer of the VGG-VD network; for M-SASO-VLADNet, the extracted multi-level features are those of the four convolutional layers relu5_1, relu5_2, relu5_3, and pool5 of the VGG-VD network. All images are resized to 448 × 448 pixels, image augmentation uses random cropping and random mirroring, and deep CNN feature extraction is implemented with the flexible and efficient deep learning library MXNet.
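The random cropping and random mirroring mentioned above can be sketched minimally in NumPy (the function name and shapes are invented; a real pipeline would also resize to 448 × 448 and normalize the image):

```python
import numpy as np

def augment(img, out=448, rng=None):
    """Randomly crop an (H, W, 3) image to out x out pixels and randomly mirror it."""
    if rng is None:
        rng = np.random.default_rng()
    h, w, _ = img.shape
    y = rng.integers(0, h - out + 1)       # top-left corner of the crop
    x = rng.integers(0, w - out + 1)
    crop = img[y:y + out, x:x + out]
    if rng.random() < 0.5:                 # horizontal mirror with probability 0.5
        crop = crop[:, ::-1]
    return crop

sample = augment(np.zeros((480, 500, 3)), rng=np.random.default_rng(0))
assert sample.shape == (448, 448, 3)
```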
Specifically, the activation function is one of the sigmoid, tanh, and ReLU functions.
Step 2: as shown in Fig. 2, the SASO-VLADNet encoding of a specific convolutional-layer feature (one of relu5_1, relu5_2, relu5_3, and pool5) is computed as follows:
Step 2.1: the feature F_i of a specific convolutional layer (one of relu5_1, relu5_2, relu5_3, and pool5) passes through the sparse adaptive soft-assignment coding (SASAC) layer and the dimensionality reduction layer, and their outputs are multiplied to obtain the first-order statistics ξ_1(F_i);
Step 2.2: ξ_1(F_i) passes through an average pooling layer and is then L2-normalized; ξ_1(F_i) also passes through the second-order layer to obtain the second-order statistics ξ_2(F_i), which is L2-normalized; the two normalized outputs are concatenated and L2-normalized to obtain the output of the SASO-VLADNet layer.
Specifically, SASO-VLADNet is initialized from VGG-VD, a front-end deep CNN pre-trained on the large-scale ImageNet dataset. Then a specific CNN feature (one of relu5_1, relu5_2, relu5_3, and pool5) is used to learn the initialization dictionary, which is obtained with the K-means algorithm of the VLFeat library. In the SASO-VLADNet model, choosing K = 128 generally yields good enough performance, so K = 128 is used.
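For illustration, the dictionary initialization can be sketched with plain Lloyd K-means in NumPy (a hypothetical stand-in for the VLFeat K-means initializer mentioned above; the data and K are toy values):

```python
import numpy as np

def kmeans_dictionary(X, K, iters=20, seed=0):
    """Learn a K-word dictionary from descriptors X of shape (N, D)."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), K, replace=False)]            # random initial centers
    for _ in range(iters):
        # assign every descriptor to its nearest center
        labels = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for k in range(K):                                  # update the centers
            pts = X[labels == k]
            if len(pts):
                C[k] = pts.mean(axis=0)
    return C

# two well-separated blobs -> centers near (0, 0) and (10, 10)
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(10, 0.1, (10, 2))])
C = kmeans_dictionary(X, K=2)
assert C.shape == (2, 2)
```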
Step 3: F_i represents the descriptor set of a specific convolutional-layer feature of the i-th image, f_ij ∈ R^{D×1} is the j-th descriptor of F_i, and D denotes the vector dimension, i.e. the number of channels of the convolutional feature. For the VGG-VD network, the last few convolutional layers have 512 channels, so D = 512 in SASO-VLADNet.
The expression of the newly constructed SASAC layer in the SASO-VLADNet layer is:

λ_ij(k) = exp(−‖a_k ⊙ f_ij + b_k‖₂² + v_k) / Σ_{k′ ∈ S_T(f_ij)} exp(−‖a_{k′} ⊙ f_ij + b_{k′}‖₂² + v_{k′}) for k ∈ S_T(f_ij), and λ_ij(k) = 0 otherwise,

where ‖·‖₂ denotes the L2 norm of a vector, ⊙ denotes element-wise multiplication, F_i represents the descriptor set of a specific convolutional-layer feature of the i-th image (this descriptor set contains M descriptors in total), f_ij ∈ R^{D×1} is the j-th descriptor of F_i, D denotes the vector dimension, and a_k ∈ R^{D×1}, b_k ∈ R^{D×1}, v_k ∈ R (k = 1, 2, …, K) are respectively the weight of f_ij, the bias of f_ij, and the normalization bias; these parameters are all trainable parameters of SASO-VLADNet. There are K groups of these parameters in total, k denotes the index of a specific group, and k′ denotes the indices of the groups of parameters that satisfy the condition of the set S_T(f_ij).
S_T(f_ij) is the set satisfying the following condition: Card(S_T(f_ij)) = T, and every score exp(−‖a_k ⊙ f_ij + b_k‖₂² + v_k) with k ∈ S_T(f_ij) is no smaller than any score with index in the complement of S_T(f_ij); that is, S_T(f_ij) indexes the T largest scores. Here Card(S_T(f_ij)) is the number of elements of S_T(f_ij).
Specifically, the SASAC layer keeps the T largest values. T should be neither too large nor too small, and the specific value of T is determined by cross-validation. Experiments show that, for simplicity, setting T = 5 generally works well.
Step 4: perform dimensionality reduction with the affine subspace method.
The affine subspace layer in SASO-VLADNet is: R_k = U_k(f_ij − c_k) = (U_k f_ij + μ_k),
where μ_k = −U_k c_k ∈ R^{P×1} and U_k ∈ R^{P×D} (k = 1, 2, …, K) are the dimensionality reduction projection matrices of the affine subspace method, and P is the subspace dimension. P determines the final feature length; to obtain a good enough representation with a relatively small dimension, P = 128 is generally used.
The expression of the first-order statistics ξ_1(F_i) is:

ξ_1(F_i) = [Σ_{j=1}^{M} λ_ij(1)(U_1 f_ij + μ_1); Σ_{j=1}^{M} λ_ij(2)(U_2 f_ij + μ_2); …; Σ_{j=1}^{M} λ_ij(K)(U_K f_ij + μ_K)].

Specifically, (U_k f_ij + μ_k) can be regarded as a 1 × 1 convolutional layer with convolution weight U_k and bias μ_k, so the affine subspace layer can be trained efficiently end to end with conventional CNN training methods.
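The stated equivalence between the affine subspace layer and a 1 × 1 convolution can be checked numerically; the NumPy sketch below uses invented shapes and random values:

```python
import numpy as np

# The affine map U_k f + mu_k applied at every spatial position of a
# (D, H, W) feature map is exactly a 1 x 1 convolution with weight U_k
# and bias mu_k, which is why the layer trains like an ordinary CNN layer.
rng = np.random.default_rng(2)
D, P, H, W = 5, 3, 4, 4
U, mu = rng.normal(size=(P, D)), rng.normal(size=P)
fmap = rng.normal(size=(D, H, W))

# per-position affine subspace map
out_affine = np.stack([U @ fmap[:, i, j] + mu
                       for i in range(H) for j in range(W)],
                      axis=1).reshape(P, H, W)
# the same computation as one tensor contraction (how a 1 x 1 conv evaluates)
out_conv = np.tensordot(U, fmap, axes=([1], [0])) + mu[:, None, None]
assert np.allclose(out_affine, out_conv)
```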
Step 5: the expression of the second-order statistics ξ_2(F_i) is:

ξ_2(F_i) = vec(Σ_{k=1}^{K} Σ_{j=1}^{M} λ_ij(k)(U_k f_ij + μ_k)(U_k f_ij + μ_k)^T),

where vec is the vectorization operation that converts a matrix into the corresponding column vector.
Specifically, the covariance matrix of the first-order features is used to represent the interactions between feature channels; since the second-order statistics are differentiable, the second-order statistics layer can be trained in an end-to-end manner.
Step 6: the affine subspace layer and the second-order statistics layer can be trained with existing end-to-end methods, but the SASAC layer is a brand-new network layer, so its specific back-propagation functions are given here for end-to-end training:
Step 6.1: for each k (k = 1, 2, …, K), the expression of the SASAC layer is equivalent to three expressions:

γ_ij(k) = exp(−‖a_k ⊙ f_ij + b_k‖₂² + v_k),
β_ij(k) = γ_ij(k) if k ∈ S_T(f_ij), and β_ij(k) = 0 otherwise,
λ_ij(k) = β_ij(k) / Σ_{k′=1}^{K} β_ij(k′).

The second expression of the equivalent SASAC form can be regarded as a variant of the max pooling layer: it keeps the T largest values and forces the remaining values to 0. The third expression is a normalization layer that yields the normalized weight coefficients.
Step 6.2: for each k, let the gradient of the final classification loss J with respect to the SASAC-layer output be ∂J/∂λ_ij(k). Based on the chain rule, the partial-derivative expressions for β_ij(k) and γ_ij(k) are:

∂J/∂β_ij(k) = (∂J/∂λ_ij(k) − Σ_{k′} (∂J/∂λ_ij(k′)) λ_ij(k′)) / Σ_{k′} β_ij(k′),
∂J/∂γ_ij(k) = ∂J/∂β_ij(k) if k ∈ S_T(f_ij), and 0 otherwise.

Step 6.3: based on β_ij(k) (k = 1, 2, …, K) and the second expression of the equivalent SASAC form, the partial derivative of the loss J with respect to f_ij is:

∂J/∂f_ij = Σ_{k ∈ S_T(f_ij)} (∂J/∂γ_ij(k)) γ_ij(k) (−2 a_k ⊙ (a_k ⊙ f_ij + b_k)).

Step 6.4: based on β_ij(k) (k = 1, 2, …, K) and the second expression of the equivalent SASAC form, the partial derivatives of the loss J with respect to a_k, b_k, and v_k are:

∂J/∂a_k = Σ_{i,j: k ∈ S_T(f_ij)} (∂J/∂γ_ij(k)) γ_ij(k) (−2 f_ij ⊙ (a_k ⊙ f_ij + b_k)),
∂J/∂b_k = Σ_{i,j: k ∈ S_T(f_ij)} (∂J/∂γ_ij(k)) γ_ij(k) (−2 (a_k ⊙ f_ij + b_k)),
∂J/∂v_k = Σ_{i,j: k ∈ S_T(f_ij)} (∂J/∂γ_ij(k)) γ_ij(k).
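The backward pass of the SASAC layer can be sanity-checked against numerical differentiation. The sketch below is a hypothetical NumPy implementation assuming the Gaussian-variant score exp(−‖a_k ⊙ f + b_k‖₂² + v_k) and a toy loss J = wᵀλ; all names, shapes, and values are invented for the example:

```python
import numpy as np

def sasac_forward(f, A, B, v, T):
    g = np.exp(-np.sum((A * f + B) ** 2, axis=1) + v)   # gamma_ij(k)
    mask = np.zeros_like(g)
    mask[np.argsort(g)[-T:]] = 1.0                      # top-T "max pooling" variant
    beta = g * mask                                     # beta_ij(k)
    return g, mask, beta, beta / beta.sum()             # ..., lambda_ij(k)

def sasac_grad_f(dJ_dlam, f, A, B, v, T):
    """Gradient of the loss w.r.t. the descriptor f, by the chain rule
    through the normalization, the top-T mask, and the exponential score."""
    g, mask, beta, lam = sasac_forward(f, A, B, v, T)
    dJ_dbeta = (dJ_dlam - dJ_dlam @ lam) / beta.sum()   # through lambda = beta / sum
    dJ_dg = dJ_dbeta * mask                             # through the top-T mask
    return (dJ_dg * g) @ (-2.0 * A * (A * f + B))       # through the exponential

rng = np.random.default_rng(3)
D, K, T = 4, 6, 3
f, w = rng.normal(size=D), rng.normal(size=K)
A, B, v = rng.normal(size=(K, D)), rng.normal(size=(K, D)), rng.normal(size=K)
grad = sasac_grad_f(w, f, A, B, v, T)

eps, num = 1e-6, np.zeros(D)                            # central differences
for d in range(D):
    e = np.zeros(D); e[d] = eps
    num[d] = (w @ sasac_forward(f + e, A, B, v, T)[3]
              - w @ sasac_forward(f - e, A, B, v, T)[3]) / (2 * eps)
assert np.allclose(grad, num, rtol=1e-3, atol=1e-6)
```

The check holds at generic points where the top-T set does not change under the perturbation, which is the same piecewise-smoothness assumption made by max pooling layers.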
Step 7: after a preprocessed image is input, the convolutional feature F_i of a specific convolutional layer of the i-th picture is obtained, and the final expression of the SASO-VLAD (sparse adaptive second-order vector of locally aggregated descriptors) representation of F_i is:

ξ(F_i) = L2norm([L2norm(ξ_1(F_i) / M); L2norm(ξ_2(F_i))]),

where L2norm is the L2 normalization of a vector, and a_k, b_k, v_k, U_k, μ_k (k = 1, 2, …, K) are the trainable parameters in SASO-VLADNet.
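For illustration, the normalize-concatenate-normalize pipeline of the final representation can be sketched as follows (a minimal NumPy version; the branch contents and dimensions are invented):

```python
import numpy as np

def l2norm(x, eps=1e-12):
    """L2-normalize a vector (eps guards against division by zero)."""
    return x / (np.linalg.norm(x) + eps)

def saso_vlad(xi1, xi2, M=1):
    """Final SASO-VLAD representation: average-pool (divide by M) and
    L2-normalize the first-order branch, L2-normalize the second-order
    branch, then concatenate and L2-normalize the result."""
    return l2norm(np.concatenate([l2norm(xi1 / M), l2norm(xi2)]))

z = saso_vlad(np.array([3.0, 4.0]), np.array([1.0, 0.0, 0.0]))
assert np.isclose(np.linalg.norm(z), 1.0)
```

Normalizing each branch before concatenation keeps the first-order and second-order parts on a comparable scale.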
Specifically, the parameters a_k, b_k, v_k, U_k, μ_k (k = 1, 2, …, K) are learned in an end-to-end manner.
During parameter updating in SASO-VLADNet, the forward pass first computes the final loss of the deep network, and the gradients of the loss with respect to each parameter are then back-propagated to the input to update the entire SASO-VLADNet model.
Step 8: after the L = 4 SASO-VLADNet encodings (the encodings generated by the relu5_1, relu5_2, relu5_3, and pool5 convolutional features) are obtained, the four encodings are concatenated to obtain the final M-SASO-VLADNet encoding, as shown in Fig. 3. The M-SASO-VLADNet encoding passes through the final fully connected layer and loss layer to obtain the classification loss; the loss layer is the standard softmax loss, written as:

J = −(1/N) Σ_{i=1}^{N} Σ_{c=1}^{C} 1{y_i = c} log(e^{ρ_ic} / Σ_{c′=1}^{C} e^{ρ_ic′}),

where N is the number of training images, C is the number of classes, 1{·} is the indicator function (1 if the statement is true and 0 otherwise), y_i is the class label of the i-th image, and ρ_ic is the overall prediction score of the L = 4 SASO-VLADNets (the four SASO-VLADNet encodings generated by relu5_1, relu5_2, relu5_3, and pool5):

ρ_ic = Σ_{l=1}^{L} (G_c^{(l)})^T ξ(F_i^{(l)}) + B_c^{(l)},

where G_c^{(l)} and B_c^{(l)} are the weight and bias of the l-th (l = 1, 2, …, L) fully connected (FC) layer.
Specifically, ρ_ic can be further expressed as: ρ_ic = (G_c)^T [ξ(F_i^{(1)}); ξ(F_i^{(2)}); …; ξ(F_i^{(L)})] + (B_c)^T.
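The standard softmax loss over the prediction scores ρ_ic can be sketched as (a numerically stabilized NumPy version with toy scores):

```python
import numpy as np

def softmax_loss(rho, y):
    """Standard softmax (cross-entropy) loss.

    rho: (N, C) prediction scores rho_ic; y: (N,) integer class labels.
    """
    z = rho - rho.max(axis=1, keepdims=True)            # numerical stability
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(y)), y].mean()           # mean over images

rho = np.array([[2.0, 0.0], [0.0, 2.0]])
loss = softmax_loss(rho, np.array([0, 1]))
assert np.isclose(loss, np.log(1.0 + np.exp(-2.0)))
```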
The trained SASO-VLADNet and M-SASO-VLADNet are evaluated for image classification on an object image dataset (the Caltech256 dataset), fine-grained image datasets (the CUB200 dataset and the Stanford Cars dataset), and a texture image dataset. Compared with the NetVLAD model, SASO-VLADNet improves the image recognition rate by 2-4%, and the proposed multi-path network (M-SASO-VLADNet) improves the image recognition rate by about a further 1% over the proposed single-path network (SASO-VLADNet).
Step 9: the complete parameter update steps of the second-order VLAD sparse adaptive deep network include:
Step 9.1: obtain initialization parameters for each SASO-VLADNet layer;
Step 9.2: initialize the weights of the final fully connected layer from each SASO-VLADNet encoding and the final softmax classifier;
Step 9.3: using the above initialization parameters and end-to-end training, the gradient information of the softmax classifier is used to update the parameters of every layer in M-SASO-VLADNet until the classifier loss curve converges.

Claims (10)

1. An image classification method based on a second-order VLAD sparse adaptive deep network, characterized in that: an end-to-end-trained multi-path feature encoding network is used; nonlinear convolutional features are first extracted from the activation functions following multiple convolutional layers; the corresponding sparse adaptive second-order vector of locally aggregated descriptors (SASO-VLAD) encoding is then computed for each convolutional feature; finally, all SASO-VLAD encodings are aggregated to construct the final multi-path feature encoding network M-SASO-VLADNet, and a fully connected layer and a loss layer output the classification loss; the SASO-VLAD encoding uses sparse adaptive soft-assignment coding (SASAC) to obtain sparse weight coefficients, and the first-order and second-order VLAD encodings jointly represent the end-to-end sparse adaptive second-order VLAD model SASO-VLADNet.
2. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 1, characterized in that: in the new sparse adaptive soft-assignment coding (SASAC) method, the SASAC layer is a variant of a multidimensional Gaussian probability density function, and all of its parameters, including the dictionary and variance parameters, are learned adaptively in an end-to-end manner; the SASAC layer retains only the T largest probabilities and forces the other small probabilities to zero so as to obtain sparse weight coefficients.
3. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 1, characterized in that: the end-to-end SASO-VLAD constitutes the SASO-VLADNet layer, which is built as follows:
Step 3.1: a specific convolutional-layer CNN feature F_i passes through the SASAC layer and the dimensionality reduction layer, and their outputs are multiplied to obtain the first-order statistics ξ_1(F_i);
Step 3.2: ξ_1(F_i) passes through an average pooling layer and is then L2-normalized; ξ_1(F_i) also passes through the second-order layer to obtain the second-order statistics ξ_2(F_i), which is L2-normalized; the two normalized outputs are concatenated and L2-normalized to obtain the final output; the dimensionality reduction method is the affine subspace method.
4. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 2 or 3, characterized in that the expression of the SASAC layer is:

λ_ij(k) = exp(−‖a_k ⊙ f_ij + b_k‖₂² + v_k) / Σ_{k′ ∈ S_T(f_ij)} exp(−‖a_{k′} ⊙ f_ij + b_{k′}‖₂² + v_{k′}) for k ∈ S_T(f_ij), and λ_ij(k) = 0 otherwise,

where ‖·‖₂ denotes the L2 norm of a vector, ⊙ denotes element-wise multiplication, F_i represents the descriptor set of a specific convolutional-layer feature of the i-th image (this descriptor set contains M descriptors in total), f_ij ∈ R^{D×1} is the j-th descriptor of F_i, D denotes the vector dimension, and a_k ∈ R^{D×1}, b_k ∈ R^{D×1}, v_k ∈ R (k = 1, 2, …, K) are respectively the weight of f_ij, the bias of f_ij, and the normalization bias; these parameters are all trainable parameters of SASO-VLADNet; there are K groups of these parameters in total, k denotes the index of a specific group, and k′ denotes the indices of the groups of parameters that satisfy the condition of the set S_T(f_ij);
S_T(f_ij) is the set satisfying the following condition: Card(S_T(f_ij)) = T, and every score exp(−‖a_k ⊙ f_ij + b_k‖₂² + v_k) with k ∈ S_T(f_ij) is no smaller than any score with index in the complement of S_T(f_ij), where Card(S_T(f_ij)) is the number of elements of S_T(f_ij).
5. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 1 or 2, characterized in that the activation function is one of the sigmoid, tanh, and ReLU functions.
6. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 4, characterized in that the expression of the first-order statistics ξ_1(F_i) is:

ξ_1(F_i) = [Σ_{j=1}^{M} λ_ij(1)(U_1 f_ij + μ_1); Σ_{j=1}^{M} λ_ij(2)(U_2 f_ij + μ_2); …; Σ_{j=1}^{M} λ_ij(K)(U_K f_ij + μ_K)],

where F_i represents the descriptor set of a specific convolutional-layer feature of the i-th image (this descriptor set contains M descriptors in total), f_ij ∈ R^{D×1} is the j-th descriptor of F_i, D denotes the vector dimension, λ_ij(k) is the coding coefficient of the SASAC layer of claim 4, and U_k, μ_k are the dimensionality reduction matrices and biases in the first-order statistics; there are K groups of dimensionality reduction matrices and biases in total, k denotes the index of a specific group, and (U_k f_ij + μ_k) denotes the k-th affine subspace layer; the dimensionality reduction matrices and biases are all trainable parameters of SASO-VLADNet.
7. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 4, characterized in that the second-order statistics ξ_2(F_i) use a covariance matrix to obtain the interaction features between channels, and the expression of ξ_2(F_i) is:

ξ_2(F_i) = vec(Σ_{k=1}^{K} Σ_{j=1}^{M} λ_ij(k)(U_k f_ij + μ_k)(U_k f_ij + μ_k)^T),

where vec is the vectorization operation that converts a matrix into the corresponding column vector.
8. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 4, characterized in that: the forward pass of the SASO-VLADNet model first computes the final loss of the deep network, and the gradients of the loss with respect to each parameter are then back-propagated to the input to update the SASO-VLADNet layers; the output classification loss is the standard softmax loss.
9. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 1, characterized in that the multi-path feature encoding network trains multiple feature encoding networks simultaneously using convolutional features of multiple levels (low, middle, and high).
10. The image classification method based on a second-order VLAD sparse adaptive deep network according to claim 1, characterized in that the parameter update steps of the complete model include:
Step 1: obtain initialization parameters for each SASO-VLADNet layer;
Step 2: initialize the weights of the final fully connected layer from each SASO-VLADNet encoding and the final softmax classifier;
Step 3: using the above initialization parameters and end-to-end training, the gradient information of the softmax classifier is used to update the parameters of every layer in M-SASO-VLADNet until the classifier loss curve converges.
CN201811038736.2A 2018-09-06 2018-09-06 Image classification method based on second-order VLAD sparse adaptive depth network Active CN109255381B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811038736.2A CN109255381B (en) 2018-09-06 2018-09-06 Image classification method based on second-order VLAD sparse adaptive depth network


Publications (2)

Publication Number Publication Date
CN109255381A true CN109255381A (en) 2019-01-22
CN109255381B CN109255381B (en) 2022-03-29

Family

ID=65047079

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811038736.2A Active CN109255381B (en) 2018-09-06 2018-09-06 Image classification method based on second-order VLAD sparse adaptive depth network

Country Status (1)

Country Link
CN (1) CN109255381B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784420A (en) * 2019-01-29 2019-05-21 深圳市商汤科技有限公司 Image processing method and device, computer equipment and storage medium
CN109901207A (en) * 2019-03-15 2019-06-18 武汉大学 High-precision outdoor positioning method combining the BeiDou satellite system and features
CN110135460A (en) * 2019-04-16 2019-08-16 广东工业大学 Image information enhancement method based on a VLAD convolution module
CN110209859A (en) * 2019-05-10 2019-09-06 腾讯科技(深圳)有限公司 Place recognition method and apparatus, model training method and apparatus, and electronic device
CN110991480A (en) * 2019-10-31 2020-04-10 上海交通大学 Attention mechanism-based sparse coding method
CN111967528A (en) * 2020-08-27 2020-11-20 北京大学 Image recognition method based on sparse coding for deep-learning network architecture search
CN113139587A (en) * 2021-03-31 2021-07-20 杭州电子科技大学 Biquadratic pooling model for adaptive interactive structure learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103336795A (en) * 2013-06-09 2013-10-02 Huazhong University of Science and Technology Video indexing method based on multiple features
CN104408479A (en) * 2014-11-28 2015-03-11 University of Electronic Science and Technology of China Large-scale image classification method based on a deep VLAD (vector of locally aggregated descriptors)
CN108460764A (en) * 2018-03-31 2018-08-28 South China University of Technology Intelligent ultrasound image segmentation method based on auto-context and data augmentation

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103336795A (en) * 2013-06-09 2013-10-02 Huazhong University of Science and Technology Video indexing method based on multiple features
CN104408479A (en) * 2014-11-28 2015-03-11 University of Electronic Science and Technology of China Large-scale image classification method based on a deep VLAD (vector of locally aggregated descriptors)
CN108460764A (en) * 2018-03-31 2018-08-28 South China University of Technology Intelligent ultrasound image segmentation method based on auto-context and data augmentation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHEN ET AL.: "A novel localized and second order feature coding network for image recognition", Pattern Recognition *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784420A (en) * 2019-01-29 2019-05-21 Shenzhen SenseTime Technology Co., Ltd. Image processing method and apparatus, computer device, and storage medium
CN109901207A (en) * 2019-03-15 2019-06-18 Wuhan University High-precision outdoor positioning method based on the BeiDou satellite system and feature combination
CN110135460A (en) * 2019-04-16 2019-08-16 Guangdong University of Technology Image information enhancement method based on a VLAD convolution module
CN110209859A (en) * 2019-05-10 2019-09-06 Tencent Technology (Shenzhen) Co., Ltd. Method, apparatus, and electronic device for place recognition and model training
CN110209859B (en) * 2019-05-10 2022-12-27 Tencent Technology (Shenzhen) Co., Ltd. Method, apparatus, and electronic device for place recognition and model training
CN110991480A (en) * 2019-10-31 2020-04-10 Shanghai Jiao Tong University Attention-mechanism-based sparse coding method
CN111967528A (en) * 2020-08-27 2020-11-20 Peking University Image recognition method based on sparse coding for deep-learning network architecture search
CN111967528B (en) * 2020-08-27 2023-12-26 Peking University Image recognition method based on sparse coding for deep-learning network architecture search
CN113139587A (en) * 2021-03-31 2021-07-20 Hangzhou Dianzi University Biquadratic pooling model for adaptive interactive structure learning
CN113139587B (en) * 2021-03-31 2024-02-06 Hangzhou Dianzi University Biquadratic pooling model for adaptive interactive structure learning

Also Published As

Publication number Publication date
CN109255381B (en) 2022-03-29

Similar Documents

Publication Publication Date Title
CN109255381A (en) Image classification method based on a second-order VLAD sparse adaptive depth network
Morgado et al. Semantically consistent regularization for zero-shot recognition
CN112308158B (en) Multi-source domain adaptation model and method based on partial feature alignment
CN109299342B (en) Cross-modal retrieval method based on cycle generation type countermeasure network
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
Chen et al. Sca-cnn: Spatial and channel-wise attention in convolutional networks for image captioning
CN107526785B (en) Text classification method and device
Liao et al. Learning deep parsimonious representations
CN108121975B (en) Face recognition method combining original data and generated data
Bradley et al. Differential sparse coding
CN109063666A (en) Lightweight face recognition method and system based on depthwise separable convolution
CN111414461B (en) Intelligent question-answering method and system fusing knowledge base and user modeling
CN113326731B (en) Cross-domain pedestrian re-identification method based on momentum network guidance
US11748919B2 (en) Method of image reconstruction for cross-modal communication system and device thereof
CN110378208B (en) Behavior recognition method based on a deep residual network
CN111444367A (en) Image title generation method based on global and local attention mechanism
CN113239784A (en) Pedestrian re-identification system and method based on space sequence feature learning
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN112784929B (en) Small sample image classification method and device based on double-element group expansion
CN113688894B (en) Fine granularity image classification method integrating multiple granularity features
CN114896434B (en) Hash code generation method and device based on center similarity learning
CN109815496A (en) Generative text steganography method and device based on a capacity-adaptive shrinking mechanism
CN109033294A (en) Hybrid recommendation method incorporating content information
Li et al. Embedded stacked group sparse autoencoder ensemble with L1 regularization and manifold reduction
CN113627543A (en) Adversarial attack detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant