CN110163258A - Zero-shot learning method and system based on a semantic attribute attention reassignment mechanism - Google Patents
- Publication number: CN110163258A (application CN201910335801.6A)
- Authority: CN (China)
- Prior art keywords: space, semantic, image, attention, hidden layer
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/214 — Pattern recognition; Analysing; Design or setup of recognition systems or techniques; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/22 — Pattern recognition; Analysing; Matching criteria, e.g. proximity measures
- G06N3/045 — Computing arrangements based on biological models; Neural networks; Architecture; Combinations of networks
Abstract
The invention discloses a zero-shot learning method and system based on a semantic attribute attention reassignment mechanism. The zero-shot learning method comprises: (1) establishing a neural network model based on the semantic attribute attention reassignment mechanism; (2) redistributing the weights among semantic features using the attention over the semantic attribute space; (3) training the neural network model with a labeled image dataset; (4) computing the similarity between the weighted semantic features of an image and the semantic prototype of each unseen class, computing the similarity between the hidden-layer features and the hidden-layer feature prototype of each unseen class, and adding the two similarities to obtain the similarity between the test image and each unseen class; (5) ranking the classes by similarity and choosing the class with maximum similarity as the predicted class of the image. The present invention allows zero-shot learning to couple the semantic space and the hidden-layer space closely during training, so that joint classification over the two spaces is more robust.
Description
Technical field
The present invention relates to the field of zero-shot learning classification systems, and more particularly to a zero-shot learning method and system based on a semantic attribute attention reassignment mechanism.
Background art
In recent years, target classification, as an important branch of computer vision, has received wide attention from researchers in both industry and academia. Benefiting from the rapid development of deep learning, supervised target classification has made great progress. At the same time, however, the training paradigm under this supervised setting has some limitations. In supervised classification, each class requires enough labeled training samples. In addition, the learned classifier can only classify examples belonging to the classes covered by the training data, and lacks the ability to handle previously unseen classes. In practical applications, each class may not have enough training samples, and classes not covered during training may appear among the test samples. The goal of zero-shot learning is to classify examples belonging to classes not covered during training. It has become a fast-developing direction in machine learning, with wide applications in computer vision, natural language processing and ubiquitous computing.
Current mainstream zero-shot learning methods mainly adopt a two-stage, attribute-based derivation to predict image labels. The derivation proceeds as follows: given an input image, the model predicts each attribute of the image in the first stage, and the second stage then infers the class label by searching for the class possessing the most similar attribute set. For example, Christoph H. Lampert et al., in the article "Attribute-based classification for zero-shot visual object categorization" published in The IEEE Transactions on Pattern Analysis and Machine Intelligence in 2013, proposed the DAP model, which estimates the posterior probability of each attribute of an image by learning probabilistic attribute classifiers, and then infers the class label of the image by computing the posterior probabilities and the maximum a posteriori (MAP) estimate of the class. The article "Recovering the missing link: Predicting class-attribute associations for unsupervised zero-shot learning", included in The Conference on Computer Vision and Pattern Recognition 2016, proposed first learning a probabilistic classifier for each attribute and then classifying with a random forest; this classification method can handle some unreliable attributes. Such two-stage methods suffer from domain shift: for example, although the final task is to predict the label of the class of an image, the intermediate task of DAP is to learn classifiers related to image attributes.
Recent advances in zero-shot learning directly learn the mapping from the image feature space to the attribute semantic space. For example, the article "Label-embedding for image classification", published in The IEEE Transactions on Pattern Analysis and Machine Intelligence in 2016, proposed the ALE model, which learns a bilinear compatibility function between the image and attribute spaces using a ranking-based loss function. "Semantic autoencoder for zero-shot learning", included in The Conference on Computer Vision and Pattern Recognition 2017, proposed a semantic autoencoder that forces the projection of image features into a semantic space from which the image can be reconstructed. At The International Conference on Computer Vision 2017, an article entitled "Predicting visual exemplars of unseen classes for zero-shot learning" proposed projecting class semantic representations into the visual feature space and performing nearest-neighbor classification on these projections.
In addition to the common semantic attribute space, some of the latest work in recent years often performs joint class inference for zero-shot learning from multiple spaces. For example, at The International Conference on Computer Vision 2017, an article entitled "Learning discriminative latent attributes for zero-shot classification" proposed the LAD model, which uses dictionary learning to obtain a latent feature space that is discriminative yet retains semantic information. At The Conference on Computer Vision and Pattern Recognition 2018, an article entitled "Discriminative Learning of Latent Features for Zero-Shot Recognition" proposed a new latent feature space that jointly maximizes the between-class distance and minimizes the within-class distance, and performs joint inference over the semantic space and the latent feature space. At The European Conference on Computer Vision 2018, an article entitled "Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition" proposed the CDL model, which aligns class structures in the visual and semantic spaces simultaneously. However, these methods all inherently assume that every attribute is equally important in the classification process, ignoring that the attributes have different distributions, variances, information entropies and so on across classes; on some challenging images, this treatment easily causes misclassification.
Summary of the invention
The present invention provides a zero-shot learning method and system based on a semantic attribute attention reassignment mechanism. By providing an attention-based way to reassign weights to the attribute predictions of each image, the importance of each attribute is re-measured when classifying that image, thereby achieving better zero-shot learning.
Technical scheme is as follows:
A zero-shot learning method based on a semantic attribute attention reassignment mechanism, comprising:

(1) establishing a neural network model based on the semantic attribute attention reassignment mechanism, comprising a vision-semantic-space mapping branch, a vision-hidden-layer-space mapping branch and an attention branch; the neural network model combines the three branches so that, when an image is passed forward through the network, the semantic features of the image in the semantic attribute space, the hidden-layer features in the hidden-layer space and the attention over the semantic attribute space are respectively obtained;

(2) redistributing the weights among the semantic features using the attention over the semantic attribute space;

(3) training the neural network model with a labeled image dataset;

(4) inputting an image to be tested, computing the similarity between the weighted semantic features of the image and the semantic prototype of each unseen class, computing the similarity between the hidden-layer features and the hidden-layer feature prototype of each unseen class, and adding the two similarities to obtain the similarity between the test image and each unseen class;

(5) ranking the classes by similarity and choosing the class with the maximum similarity as the predicted class of the image.
The zero-shot learning method based on semantic attribute attention reassignment proposed by the present invention is an improved algorithm for joint inference over the semantic attribute space and the hidden-layer space. Compared with previous algorithms, the semantic space and the hidden-layer space are combined more closely in this method: 1) the hidden-layer space provides class-information guidance for the semantic attribute space, allowing the neural network to generate well-founded attention; 2) the semantic attribute space provides an initialization method for constructing the prototypes of unseen classes in the hidden-layer space, thereby reducing the drawbacks brought by domain shift. Meanwhile, the model jointly infers over the attention-reassigned semantic attribute space and the hidden-layer space, so that the stability of model prediction is greatly improved.
In step (1), the vision-semantic-space mapping branch and the vision-hidden-layer-space mapping branch use the VGG19 backbone network structure as a shared shallow network, and respectively use different fully connected layers for the feature mappings of the different spaces;

the attention branch applies single-layer convolutional neural networks with different parameters and convolution kernel size 3 to the feature maps of different layers of the VGG19 backbone network for feature extraction, and computes the attention over the semantic attribute space corresponding to the VGG19 feature maps of the different layers using feature fusion.
In step (1), the semantic features of the image in the semantic attribute space, the hidden-layer features in the hidden-layer space and the attention over the semantic attribute space are obtained as follows:

The deep visual feature θ_i of an input image x_i is extracted with a pretrained deep convolutional neural network, and fully connected neural networks map the deep visual feature of the image to the semantic space and the hidden-layer space respectively. The semantic space and hidden-layer space are computed as:

φ_i = FC_1(θ_i)
σ_i = FC_2(θ_i)

where φ_i denotes the vector representation of image i in the semantic space, σ_i denotes the vector representation of image i in the hidden-layer space, FC_1 denotes the mapping function from the visual space to the semantic space, and FC_2 denotes the mapping function from the visual space to the hidden-layer space;

The intermediate feature map m_{i,l} of image i at layer l of the deep convolutional neural network and the hidden-layer space vector σ_i are chosen, and the semantic attribute attention of image i at network depth l is computed as:

p_{i,l} = softmax(W_l h_{i,l} + b_l)

where W_l and b_l are the parameters of a single-layer fully connected network, and h_{i,l} is the feature-fusion representation of the visual features at depth l with the hidden-layer vector representation. The feature fusion is computed as:

h_{i,l} = Σ_c ( F_sq(ψ_{i,l}) ⊕ σ_i )

where F_sq is a matrix transformation function that converts a three-dimensional matrix of size C × H × W into a two-dimensional matrix of size C × HW, Σ_c denotes summation of the matrix by channel, ⊕ denotes adding σ_i to the matrix channel by channel, ψ_{i,l} is the result of passing the feature map m_{i,l} through a series of convolutions, and k is the number of channels after fusion, kept consistent with the lengths of the semantic vector representation and the hidden-layer vector representation. Finally the attentions over the different depths l ∈ l_B are summed, so that the attention of image i over the semantic attribute space is computed as:

p_i = Σ_{l ∈ l_B} p_{i,l}

where p_{i,l} is the semantic attribute attention of image i at network depth l.
In step (2), the weights among the semantic features are redistributed using the attention over the semantic attribute space, computed as:

φ̂_i = diag(p_i) φ_i

where diag(p_i) is a k × k diagonal matrix whose diagonal values are p_i.
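As an illustration of step (2), a minimal NumPy sketch (dimensions and values are illustrative, not taken from the invention) shows that multiplying by diag(p_i) is exactly an elementwise reweighting of the attribute predictions:

```python
import numpy as np

k = 5  # number of semantic attributes (illustrative)
rng = np.random.default_rng(0)

phi = rng.standard_normal(k)   # semantic representation of one image
p = np.full(k, 1.0 / k)        # attention weights over the attributes

# diag(p) @ phi, as in the formula; equivalent to elementwise weighting
phi_weighted = np.diag(p) @ phi
assert np.allclose(phi_weighted, p * phi)
print(phi_weighted)
```

An attribute with larger attention thus contributes more to the downstream similarity computation, while a down-weighted attribute is suppressed.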
The detailed process of step (3) is as follows:

(3-1) During data preparation, the original training dataset D is divided in advance into a set of triples {(x_a, x_p, x_n)}, where, for any triple, x_a and x_p are different images from the same class, and x_n is an image from a class different from that of x_a;

(3-2) During training, for each triple (x_a, x_p, x_n), the neural network model is trained with a mixed loss function L, computed as:

L = L_F + L_A

where L_F is the loss function defined on the hidden-layer space and L_A is the loss function defined on the semantic attribute space;

The hidden-layer space loss uses a triplet loss function to maximize the between-class distance while minimizing the within-class distance. The hidden-layer loss is computed as:

L_F = max(0, ||σ_a − σ_p||² − ||σ_a − σ_n||² + m)

where m is a margin. The loss of the semantic attribute space uses a cross-entropy-based loss function to maximize the classification probability in the semantic space. The semantic loss is computed as:

L_A = −log( exp(⟨φ̂_a, s_{y_a}⟩) / Σ_{y_i ∈ Y} exp(⟨φ̂_a, s_{y_i}⟩) )

where Y is the set of all training classes and s_{y_i} is the semantic attribute prototype of the known class y_i.
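The mixed loss L = L_F + L_A can be sketched as a toy NumPy computation. This assumes a standard squared-Euclidean triplet loss with margin m = 1.0 and a softmax cross-entropy over inner-product similarities to the class prototypes; the patent does not fix these hyperparameters, so they are illustrative assumptions:

```python
import numpy as np

def triplet_loss(sig_a, sig_p, sig_n, margin=1.0):
    """Hidden-layer loss L_F: pull same-class pairs together, push other classes away."""
    d_pos = np.sum((sig_a - sig_p) ** 2)
    d_neg = np.sum((sig_a - sig_n) ** 2)
    return max(0.0, d_pos - d_neg + margin)

def semantic_ce_loss(phi_w, prototypes, label):
    """Semantic loss L_A: cross-entropy over similarities to class prototypes."""
    logits = prototypes @ phi_w                   # one score per training class
    logits = logits - logits.max()                # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[label]

rng = np.random.default_rng(0)
sig_a, sig_p, sig_n = rng.standard_normal((3, 8))  # hidden-layer vectors of a triple
phi_w = rng.standard_normal(8)                     # weighted semantic features
prototypes = rng.standard_normal((4, 8))           # 4 known-class semantic prototypes

L = triplet_loss(sig_a, sig_p, sig_n) + semantic_ce_loss(phi_w, prototypes, label=2)
print(L)
```

Both terms are non-negative, so the mixed loss is bounded below by zero and decreases as same-class hidden vectors cluster and the correct class prototype dominates the softmax.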
The specific steps of step (4) are as follows:

(4-1) For an input image x_i, the trained model predicts its semantic vector representation φ̂_i, its hidden-layer vector representation σ_i and its semantic attribute attention p_i;

(4-2) For any class y_u ∈ Y_u, where Y_u denotes the classes not covered by training, the cosine similarities of image x_i with the class semantic prototype s_{y_u} in the semantic attribute space and with the class hidden-layer prototype σ_{y_u} in the hidden-layer space are computed separately. The cosine similarity of the semantic attribute space is computed as:

S_A(x_i, y_u) = ⟨φ̂_i, s_{y_u}⟩ / (||φ̂_i|| · ||s_{y_u}||)

The cosine similarity of the hidden-layer space is computed as:

S_F(x_i, y_u) = ⟨σ_i, σ_{y_u}⟩ / (||σ_i|| · ||σ_{y_u}||)

The cosine similarities of the two spaces are summed to obtain the similarity between image x_i and class y_u:

S(x_i, y_u) = S_A(x_i, y_u) + S_F(x_i, y_u)

The specific step of step (5) is as follows:

Using nearest-neighbor search, the class prediction ŷ_i of image x_i over the class set Y_u is computed as:

ŷ_i = argmax_{y_u ∈ Y_u} S(x_i, y_u)
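The joint similarity and nearest-neighbor prediction of steps (4)-(5) can be sketched as follows; the three "unseen classes" and their prototypes are toy values invented for illustration:

```python
import numpy as np

def cos(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict(phi_w, sigma, sem_protos, hid_protos):
    """Sum semantic-space and hidden-layer cosine similarities, pick the best class."""
    scores = [cos(phi_w, s) + cos(sigma, h) for s, h in zip(sem_protos, hid_protos)]
    return int(np.argmax(scores)), scores

# toy prototypes for 3 hypothetical unseen classes
sem_protos = np.eye(3)
hid_protos = np.eye(3)
phi_w = np.array([0.9, 0.1, 0.0])   # weighted semantic features of the test image
sigma = np.array([0.8, 0.0, 0.2])   # hidden-layer features of the test image

pred, scores = predict(phi_w, sigma, sem_protos, hid_protos)
print(pred)  # class 0 has the highest summed similarity
```

Because the two similarities are summed, a class must score well in both spaces to win, which is the joint-inference behavior the invention describes.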
The zero-shot learning algorithm based on semantic attribute attention reassignment proposed by the present invention has all the advantages of zero-shot learning, and can correctly distinguish some hard samples with semantic ambiguity, for example distinguishing a spotted pig from pigs and Dalmatians. In practice it is found that, compared with previous zero-shot learning algorithms, the attribute predictions of the proposed algorithm in the semantic space have a much lower variance, so that the final retrieval-based classification is influenced by many semantic attributes rather than dominated by one or a few particularly prominent attribute predictions. Built on joint inference and classification over the semantic space and the hidden-layer space, this algorithm makes the relationship between the two spaces even closer, and avoids the problem of an image being correctly classified in one feature space but misclassified in the other.
The present invention also provides a zero-shot learning system based on a semantic attribute attention reassignment mechanism, comprising a computer memory, a computer processor and a computer program stored in the computer memory and executable on the computer processor, the computer memory containing the following modules:

a visual feature module, which captures the deep visual features of the input image using a deep convolutional neural network;

a vision-semantic mapping module, which maps the visual features to the semantic attribute space using a fully connected neural network;

a vision-hidden-layer mapping module, which maps the visual features to the hidden-layer space using a fully connected neural network;

a semantic attention module, which generates the attribute attention of the semantic space using the shallow visual features of the image and the class information of the hidden-layer space;

a classification retrieval module, which classifies the image using its semantic attribute space representation, its hidden-layer space representation and the attention over the semantic space;

a classification generation module, which outputs the classification result after the model finishes classifying.
Compared with the prior art, the invention has the following advantages:

1. The semantic attribute attention reassignment algorithm proposed by the present invention uses an attention mechanism to create competition among the semantic attribute predictions, so that the classification result is determined by more semantic attributes rather than depending solely on a few particularly prominent ones, thereby avoiding misclassification on some hard samples with semantic ambiguity.

2. The present invention can avoid the domain shift problem when jointly inferring the class over the semantic space and the hidden-layer space, and can therefore avoid the problem in general zero-shot learning that classification results tend toward the classes covered by training.

3. Experimental results demonstrate that the performance of the model is better than other baseline algorithms, proving the superiority of the model.
Brief description of the drawings
Fig. 1 is an overall framework schematic diagram of a zero-shot learning method based on a semantic attribute attention reassignment mechanism according to an embodiment of the present invention;

Fig. 2 is an operation schematic diagram of the semantic attention module of the method in an embodiment of the present invention;

Fig. 3 is an overall structural schematic diagram of a zero-shot learning system based on a semantic attribute attention reassignment mechanism according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of the prediction distributions of the semantic space obtained with different attention mechanisms in an embodiment of the present invention.
Specific embodiment
The invention will be described in further detail with reference to the accompanying drawings and examples. It should be pointed out that the embodiments described below are intended to facilitate the understanding of the present invention and do not limit it in any way.
As shown in Fig. 1, the main model of the invention is divided into a visual feature module and three branch modules that respectively correspond to three outputs for the input image, and these three branches are put synchronously into the optimization process of the entire model. The specific steps are as follows:
(a) The visual feature module learns the deep visual feature θ_i of the input image x_i during zero-shot training. The basic steps are as follows:

(a-1) The network model parameters are initialized with the pretrained large-scale neural network ResNet101. The input image x_i is first center-cropped to a 224 × 224 image x′_i as the actual input to the network.

(a-2) The feature vector of the last non-classification layer of the neural network is taken as the deep visual feature θ_i of image x′_i; the length of the feature vector is denoted V.

(b) The vision-semantic mapping module provides the mapping from the deep visual space to the semantic space for the zero-shot learning process. The basic steps are as follows:

(b-1) Initialize the model parameters: the spatial mapping matrix W_1 ∈ R^{k×V} and bias b_1 ∈ R^k.

(b-2) Map the visual feature θ_i to the semantic space to obtain the semantic space representation φ_i, computed as:

φ_i = W_1 θ_i + b_1

(c) The vision-hidden-layer mapping module provides the mapping from the deep visual space to the hidden-layer space for the zero-shot learning process. The basic steps are as follows:

(c-1) Similar to step (b-1), initialize the model to obtain W_2 and b_2.

(c-2) Similar to step (b-2), map the visual feature, computed as:

σ_i = W_2 θ_i + b_2
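Steps (b) and (c) amount to two affine maps applied to the same visual feature. A minimal NumPy sketch with illustrative dimensions (V = 12, k = 6; the real V and k depend on the backbone and attribute count):

```python
import numpy as np

V, k = 12, 6                     # visual feature length and target length (illustrative)
rng = np.random.default_rng(0)

W1, b1 = rng.standard_normal((k, V)), np.zeros(k)   # vision -> semantic space
W2, b2 = rng.standard_normal((k, V)), np.zeros(k)   # vision -> hidden-layer space

theta = rng.standard_normal(V)   # deep visual feature of one image
phi = W1 @ theta + b1            # semantic-space representation
sigma = W2 @ theta + b2          # hidden-layer representation
print(phi.shape, sigma.shape)    # both are length-k vectors
```

In the invention both heads share the same backbone feature θ_i, so the two spaces stay aligned through the shared shallow network.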
(d) The semantic attention module uses partial class information from the hidden-layer space and the shallow visual information of the image to redistribute the weights of the semantic-space output in zero-shot learning, as shown in Fig. 2. The basic steps are as follows:

(d-1) The shallow visual feature map m_{i,l} of a certain layer l and the hidden-layer feature representation σ_i ∈ R^k are chosen as the input of the semantic attention module. The parameters of the convolutional neural network and the parameters W_3 and b_3 of the fully connected network FC are initialized.

(d-2) The shallow visual feature map m_{i,l} is passed through a series of convolution transforms to obtain the feature map ψ_{i,l}, whose spatial size is H′ × W′ and whose channel number is k, kept consistent with the length of the hidden-layer feature σ_i.

(d-3) The feature map ψ_{i,l} and the hidden-layer representation σ_i ∈ R^k of the image are added by channel, and the resulting matrix is converted channel-wise into a column vector, yielding the hidden variable h_{i,l}.

(d-4) The hidden variable is passed through the fully connected neural network FC to obtain the attention representation p_{i,l} at network depth l in the semantic space, which integrates the shallow visual features of the image with the hidden-layer feature σ_i.

(d-5) Four specific network depths l ∈ l_B are chosen, steps (d-1) to (d-4) are repeated for each, and the resulting attentions are accumulated to give the total semantic attribute attention of the image over the semantic attribute space:

p_i = Σ_{l ∈ l_B} p_{i,l}
The training steps of the zero-shot learning method based on semantic attribute attention reassignment are as follows:

1. Initialize the training dataset D = {(x_i, y_i)}, where x_i denotes an input image, y_i ∈ Y denotes the class label of the input image, Y denotes the set of classes covered by the training set, and for every class y_s ∈ Y, s_{y_s} is the prototype vector of that class in the semantic space. The dataset is divided into a set of triples {(x_a, x_p, x_n)}, where x_a and x_p are different images from the same class and x_n is an image from a different class.

2. Choose a triple (x_a, x_p, x_n) as the input of the network model, and obtain the vector representations of each image in the semantic space and the hidden-layer space as well as its semantic attribute attention.

3. Use the triplet-based loss function to maximize the between-class distance while minimizing the within-class distance. The hidden-layer loss is computed as:

L_F = max(0, ||σ_a − σ_p||² − ||σ_a − σ_n||² + m)

Use the cross-entropy-based loss function to maximize the classification probability in the semantic space. The semantic loss is computed as:

L_A = −log( exp(⟨φ̂_a, s_{y_a}⟩) / Σ_{y_s ∈ Y} exp(⟨φ̂_a, s_{y_s}⟩) )

4. Repeat steps 2-3 and train the parameters of each module using gradient descent.

5. Take the average hidden-layer space representation of the training images as the class hidden-layer prototype of each training-covered class, computed as:

σ_{y_s} = (1 / N_s) Σ_{i: y_i = y_s} σ_i

where N_s denotes the number of training samples of class y_s. The hidden-layer vector representations of the classes not covered by training are then computed with ridge regression from the semantic prototypes: a mapping W is fitted on the covered classes by minimizing ||H_s − S_s W||² + λ||W||², where S_s and H_s stack the semantic and hidden-layer prototypes of the covered classes, and the hidden-layer prototype of an uncovered class y_u is σ_{y_u} = s_{y_u} W.
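The prototype construction of step 5 can be sketched as follows. The ridge-regression closed form and the regularization strength λ = 1.0 are assumptions of this sketch, and all data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
n_seen, k = 40, 6                 # covered classes and representation length (illustrative)

S_seen = rng.standard_normal((n_seen, k))      # semantic prototypes of covered classes
H_seen = S_seen @ rng.standard_normal((k, k))  # their hidden-layer prototypes (toy)

# Ridge regression: W = (S^T S + lam*I)^-1 S^T H maps semantic -> hidden-layer space
lam = 1.0  # regularization strength (assumed hyperparameter)
W = np.linalg.solve(S_seen.T @ S_seen + lam * np.eye(k), S_seen.T @ H_seen)

s_unseen = rng.standard_normal((10, k))   # semantic prototypes of 10 uncovered classes
h_unseen = s_unseen @ W                   # estimated hidden-layer prototypes
print(h_unseen.shape)  # (10, 6)
```

This is how the semantic attribute space initializes the hidden-layer prototypes of uncovered classes, which the invention credits with reducing domain shift.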
The sample classification steps of the zero-shot learning method based on semantic attribute attention reassignment are as follows:

1. For an input image x_i, the trained model predicts its semantic vector representation φ̂_i, its hidden-layer vector representation σ_i and its semantic attribute attention p_i.

2. For any class y_u ∈ Y_u, where Y_u denotes the classes not covered by training, compute separately the cosine similarities of image x_i with the class semantic prototype s_{y_u} in the semantic space and with the class hidden-layer prototype σ_{y_u} in the hidden-layer space. The cosine similarity of the semantic space is computed as:

S_A(x_i, y_u) = ⟨φ̂_i, s_{y_u}⟩ / (||φ̂_i|| · ||s_{y_u}||)

The cosine similarity of the hidden-layer space is computed as:

S_F(x_i, y_u) = ⟨σ_i, σ_{y_u}⟩ / (||σ_i|| · ||σ_{y_u}||)

The cosine similarities of the two spaces are summed to obtain the similarity between image x_i and class y_u:

S(x_i, y_u) = S_A(x_i, y_u) + S_F(x_i, y_u)

3. Using nearest-neighbor search, compute the class prediction ŷ_i of image x_i over the class set Y_u:

ŷ_i = argmax_{y_u ∈ Y_u} S(x_i, y_u)
As shown in Fig. 3, a zero-shot classification system based on semantic attribute attention reassignment is divided into six major modules: a visual feature module, a vision-semantic mapping module, a vision-hidden-layer mapping module, a semantic attention module, a classification retrieval module, and a classification generation module.
The above method is applied in the following example to demonstrate the technical effect of the invention; the specific steps already described are not repeated in the embodiment.

This embodiment is compared with other current state-of-the-art zero-shot learning methods on three large public datasets: AwA2, CUB and SUN. AwA2 is a coarse-grained, medium-scale dataset of 37,322 images of 50 animal categories with 85 user-defined attributes. CUB is a fine-grained dataset consisting of 11,788 images of 200 different bird species, with 312 user-defined attributes. SUN is another fine-grained dataset, including 14,340 images of 717 different scenes, provided with 102 user-defined attributes. Each dataset is divided into two parts, a training set and a test set, with different splits for the different datasets. On the AwA2 dataset, 40 animal classes serve as the training set and 10 animal classes as the test set; similarly, 150 classes serve as the training set of CUB and 50 classes as its test set, while 645 classes serve as the training set of SUN and 72 classes as its test set. The evaluation metric of this embodiment is class-average recognition accuracy. Five current mainstream zero-shot recognition algorithms are compared in total, and the overall comparison results are shown in Table 1.
Table 1
As can be seen from Table 1, the zero-shot learning framework based on semantic attribute attention reassignment proposed by the present invention achieves the best results under the major evaluation metrics, fully illustrating the superiority of the inventive algorithm.
To further illustrate that the proposed algorithm really suppresses the relatively prominent prediction values in the semantic space, the present invention also compares, on the CUB dataset, the settings "without attention mechanism", "with a sigmoid-based attention mechanism" and "with the softmax-based attention mechanism used by the present invention". The experimental results are shown in Table 2.
Table 2
Method | Class-average accuracy (%) | Variance of semantic predictions (×10⁻³)
w/o Attention | 62.1 | 2.48
w/ Sigmoid Attention | 73.5 | 1.75
w/ Softmax Attention | 81.1 | 0.86
As can be seen from Table 2, the softmax-based attention mechanism reaches the best experimental results, and it is found that as the variance of the semantic-space prediction values decreases, the suppression of outlier prediction values becomes stronger and the model performs better, fully demonstrating the effectiveness of the attention mechanism in zero-shot learning.

In addition, the kernel density estimate of the semantic prediction values shown in Fig. 4 also reflects that the attention mechanism confines the prediction distribution of the semantic space to a particular range, fully demonstrating that the proposed algorithm achieves the suppression of abnormal semantic prediction values by redistributing the attribute weights of the semantic space.
The embodiments described above explain the technical solution and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit it; any modification, supplement or equivalent replacement made within the spirit of the present invention shall be included in the protection scope of the present invention.
Claims (8)
1. A zero-shot learning method based on a semantic attribute attention reassignment mechanism, characterized by comprising:

(1) establishing a neural network model based on the semantic attribute attention reassignment mechanism, the neural network model comprising a vision-semantic-attribute-space mapping branch, a vision-hidden-layer-space mapping branch and an attention branch, so that when an image is passed forward through the network, the semantic features of the image in the semantic attribute space, the hidden-layer features in the hidden-layer space and the attention over the semantic attribute space are respectively obtained;

(2) redistributing the weights among the semantic features using the attention over the semantic attribute space;

(3) training the neural network model with a labeled image dataset;

(4) inputting an image to be tested, computing the similarity between the weighted semantic features of the image and the semantic prototype of each unseen class, computing the similarity between the hidden-layer features and the hidden-layer feature prototype of each unseen class, and adding the two similarities to obtain the similarity between the test image and each unseen class;

(5) ranking the classes by similarity and choosing the class with the maximum similarity as the predicted class of the image.
2. The zero-shot learning method based on a semantic attribute attention reassignment mechanism according to claim 1, characterized in that, in step (1), the vision-semantic-space mapping branch and the vision-hidden-layer-space mapping branch use the VGG19 backbone network structure as a shared shallow network, and respectively use different fully connected layers for the feature mappings of the different spaces;

the attention branch applies single-layer convolutional neural networks with different parameters and convolution kernel size 3 to the feature maps of different layers of the VGG19 backbone network for feature extraction, and computes the attention over the semantic attribute space corresponding to the VGG19 feature maps of the different layers using feature fusion.
3. The zero sample learning method based on a semantic attribute attention redistribution mechanism according to claim 1, characterized in that, in step (1), the detailed process of obtaining the semantic features of the image in the semantic attribute space, the hidden layer features in the hidden layer space, and the attention in the semantic attribute space is as follows:
extracting the deep visual features θi of an input image xi using a pre-trained deep convolutional neural network, and mapping the deep visual features of the image to the semantic space and the hidden layer space respectively using fully connected neural networks, wherein the vector representation of image i in the semantic space is obtained through the mapping function FC1 from the visual space to the semantic space, and the vector representation σi of image i in the hidden layer space is obtained through the mapping function FC2 from the visual space to the hidden layer space;
choosing the intermediate feature map of layer l of the deep convolutional neural network for image i together with the hidden layer space vector σi, and computing the semantic attribute attention of image i at spatial depth l by applying a single-layer fully connected network with parameters Wl and bl to the fused representation of the hidden layer vector and the visual features at depth l; in the feature fusion, Fsq is the matrix transformation that reshapes a three-dimensional matrix of size C × H × W into a two-dimensional matrix of size C × HW, the matrix obtained from the feature map after a series of convolutions is summed over the channels, and k denotes the number of channels after feature fusion, which is consistent with the lengths of the semantic vector representation and the hidden layer vector representation;
finally, summing over the different layer indices l ∈ lB yields the attention of image i in the semantic attribute space, wherein pi,l is the semantic attribute attention of image i at spatial depth l.
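The per-layer attention and its summation over the layers l ∈ lB can be sketched as follows. This is an illustrative reconstruction, not the patented implementation: the softmax normalization and all names (`layer_attention`, `semantic_attention`, `fused`, `W`, `b`) are assumptions, since the claim does not reproduce the exact formulas.

```python
import math

def layer_attention(fused, W, b):
    # Single-layer fully connected network: logits = W @ fused + b,
    # normalized with a softmax (the normalization choice is an assumption).
    logits = [sum(w * f for w, f in zip(row, fused)) + bj
              for row, bj in zip(W, b)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def semantic_attention(fused_per_layer, params_per_layer):
    # p_i: sum over layers l in l_B of the per-layer attentions p_{i,l}.
    k = len(params_per_layer[0][1])
    p = [0.0] * k
    for fused, (W, b) in zip(fused_per_layer, params_per_layer):
        for j, a in enumerate(layer_attention(fused, W, b)):
            p[j] += a
    return p
```

Each per-layer attention vector has length k, matching the semantic attribute dimension, so the summed vector can later reweight the k semantic features directly.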
4. The zero sample learning method based on a semantic attribute attention redistribution mechanism according to claim 1, characterized in that, in step (2), the weights among the semantic features are redistributed using the attention in the semantic attribute space by multiplying the semantic feature vector by diag(pi), wherein diag(pi) is a k × k diagonal matrix whose diagonal entries are the values of pi.
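Multiplying by the diagonal matrix diag(pi) is simply an elementwise product, as in this minimal sketch (the names `reweight_semantic`, `sem_feat`, and `p` are hypothetical):

```python
def reweight_semantic(sem_feat, p):
    # diag(p) @ sem_feat: the j-th semantic attribute value is
    # scaled by the j-th attention weight, without materializing
    # the k x k diagonal matrix.
    return [pj * sj for pj, sj in zip(p, sem_feat)]
```

This elementwise form is equivalent to the matrix product and avoids building a k × k matrix for large attribute vocabularies.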
5. The zero sample learning method based on a semantic attribute attention redistribution mechanism according to claim 1, characterized in that the detailed process of step (3) is as follows:
(3-1) in the data preparation process, the original training dataset D is divided in advance into a set composed of multiple triples, wherein, for any triple, the anchor image and the positive image are different images from the same class, and the negative image is an image from a class different from that of the anchor image;
(3-2) in the training process, for each triple, the neural network model is trained with a mixed loss function L combining LF, the loss function defined in the hidden layer space, and LA, the loss function defined in the semantic attribute space;
the hidden layer space loss function uses a triplet loss function to simultaneously maximize the between-class distance and minimize the within-class distance;
the loss function of the semantic attribute space uses a cross-entropy-based loss function to maximize the classification probability in the semantic space,
wherein Y is the set of all training classes and each known class yi has a semantic attribute prototype.
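The mixed objective can be sketched as a standard hinge triplet loss in the hidden layer space plus a softmax cross-entropy over the semantic attribute prototypes. The margin value, the Euclidean distance, the dot-product scoring, and the unweighted sum L = LF + LA are all assumptions here, since the claim does not reproduce the exact formulas; every name in the sketch is hypothetical.

```python
import math

def l2(u, v):
    # Euclidean distance between two vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Hinge-style triplet loss in the hidden layer space: push the
    # positive closer than the negative by at least `margin`.
    return max(0.0, l2(anchor, positive) - l2(anchor, negative) + margin)

def semantic_ce_loss(sem_feat, prototypes, true_class):
    # Softmax cross-entropy over dot-product scores against the
    # semantic attribute prototypes of the training classes in Y.
    scores = {c: sum(a * b for a, b in zip(sem_feat, proto))
              for c, proto in prototypes.items()}
    m = max(scores.values())
    log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return log_z - scores[true_class]

def mixed_loss(anc_h, pos_h, neg_h, sem_feat, prototypes, true_class):
    # L = L_F + L_A (the unweighted sum is an assumption).
    return (triplet_loss(anc_h, pos_h, neg_h)
            + semantic_ce_loss(sem_feat, prototypes, true_class))
```

The triplet term vanishes once the anchor-positive distance beats the anchor-negative distance by the margin, while the cross-entropy term keeps pulling the mapped semantic features toward the prototype of their own class.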
6. The zero sample learning method based on a semantic attribute attention redistribution mechanism according to claim 1, characterized in that the specific steps of step (4) are as follows:
(4-1) for an input image xi, predicting with the trained model its semantic vector representation, its hidden layer vector representation σi, and its semantic attribute attention pi;
(4-2) for any class yu ∈ Yu, where Yu denotes the classes not covered by training, separately computing the cosine similarity of image xi with the class semantic prototype in the semantic attribute space and the cosine similarity with the class hidden layer prototype in the hidden layer space, and summing the cosine similarities of the two spaces to obtain the similarity between image xi and class yu.
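Step (4-2) amounts to adding the two cosine similarities for each candidate unseen class; a minimal sketch follows (function and parameter names are hypothetical):

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def combined_similarity(sem_feat, hid_feat, sem_proto, hid_proto):
    # Similarity between image x_i and one unseen class y_u: the sum of
    # the semantic-attribute-space and hidden-layer-space cosine similarities.
    return cosine(sem_feat, sem_proto) + cosine(hid_feat, hid_proto)
```

Because each cosine term lies in [-1, 1], the combined score lies in [-2, 2], and an image matching a class prototype in both spaces scores strictly higher than one matching in only one space.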
7. The zero sample learning method based on a semantic attribute attention redistribution mechanism according to claim 6, characterized in that the specific step of step (5) is:
using a nearest-neighbor search algorithm, computing the class prediction for image xi over the class set Yu as the class with the maximum similarity.
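With the combined similarities of step (4) in hand, the prediction of step (5) reduces to an argmax over the unseen classes; a one-line sketch over a hypothetical class-to-score mapping:

```python
def nearest_class(similarities):
    # similarities: {unseen_class: combined similarity score}.
    # Nearest-neighbor search: return the class with the maximum score.
    return max(similarities, key=similarities.get)
```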
8. A zero sample learning system based on a semantic attribute attention redistribution mechanism, comprising a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor, characterized in that the computer memory contains the following modules:
a visual feature module, which captures the deep visual features of an input image using a deep convolutional neural network;
a vision-semantic mapping module, which maps the visual features to the semantic attribute space using a fully connected neural network;
a vision-hidden layer mapping module, which maps the visual features to the hidden layer space using a fully connected neural network;
a semantic attention module, which generates the attribute attention of the semantic space using the shallow visual features of the image and the class information of the hidden layer space;
a comprehensive retrieval module, which performs image classification using the semantic attribute space representation of the image, the hidden layer space representation, and the attention of the semantic space;
a classification generation module, which outputs the classification result externally after the model finishes classification.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910335801.6A CN110163258B (en) | 2019-04-24 | 2019-04-24 | Zero sample learning method and system based on semantic attribute attention redistribution mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110163258A true CN110163258A (en) | 2019-08-23 |
CN110163258B CN110163258B (en) | 2021-04-09 |
Family
ID=67639900
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910335801.6A Active CN110163258B (en) | 2019-04-24 | 2019-04-24 | Zero sample learning method and system based on semantic attribute attention redistribution mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110163258B (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110866140A (en) * | 2019-11-26 | 2020-03-06 | 腾讯科技(深圳)有限公司 | Image feature extraction model training method, image searching method and computer equipment |
CN111222471A (en) * | 2020-01-09 | 2020-06-02 | 中国科学技术大学 | Zero sample training and related classification method based on self-supervision domain perception network |
CN111428733A (en) * | 2020-03-12 | 2020-07-17 | 山东大学 | Zero sample target detection method and system based on semantic feature space conversion |
CN111461025A (en) * | 2020-04-02 | 2020-07-28 | 同济大学 | Signal identification method for self-evolving zero-sample learning |
CN111738313A (en) * | 2020-06-08 | 2020-10-02 | 大连理工大学 | Zero sample learning algorithm based on multi-network cooperation |
CN112100380A (en) * | 2020-09-16 | 2020-12-18 | 浙江大学 | Generation type zero sample prediction method based on knowledge graph |
CN112257808A (en) * | 2020-11-02 | 2021-01-22 | 郑州大学 | Integrated collaborative training method and device for zero sample classification and terminal equipment |
CN112633382A (en) * | 2020-12-25 | 2021-04-09 | 浙江大学 | Mutual-neighbor-based few-sample image classification method and system |
CN112686318A (en) * | 2020-12-31 | 2021-04-20 | 广东石油化工学院 | Zero sample learning mechanism based on spherical embedding, spherical alignment and spherical calibration |
CN113077427A (en) * | 2021-03-29 | 2021-07-06 | 北京深睿博联科技有限责任公司 | Generation method and device of category prediction model |
CN113326892A (en) * | 2021-06-22 | 2021-08-31 | 浙江大学 | Relation network-based few-sample image classification method |
CN113343941A (en) * | 2021-07-20 | 2021-09-03 | 中国人民大学 | Zero sample action identification method and system based on mutual information similarity |
CN113435531A (en) * | 2021-07-07 | 2021-09-24 | 中国人民解放军国防科技大学 | Zero sample image classification method and system, electronic equipment and storage medium |
CN113627470A (en) * | 2021-07-01 | 2021-11-09 | 汕头大学 | Zero-learning-based unknown event classification method for optical fiber early warning system |
CN113642621A (en) * | 2021-08-03 | 2021-11-12 | 南京邮电大学 | Zero sample image classification method based on generation countermeasure network |
CN114627312A (en) * | 2022-05-17 | 2022-06-14 | 中国科学技术大学 | Zero sample image classification method, system, equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107679556A (en) * | 2017-09-18 | 2018-02-09 | 天津大学 | The zero sample image sorting technique based on variation autocoder |
CN108846413A (en) * | 2018-05-21 | 2018-11-20 | 复旦大学 | A kind of zero sample learning method based on global semantic congruence network |
CN109447115A (en) * | 2018-09-25 | 2019-03-08 | 天津大学 | Zero sample classification method of fine granularity based on multilayer semanteme supervised attention model |
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110866140A (en) * | 2019-11-26 | 2020-03-06 | 腾讯科技(深圳)有限公司 | Image feature extraction model training method, image searching method and computer equipment |
CN110866140B (en) * | 2019-11-26 | 2024-02-02 | 腾讯科技(深圳)有限公司 | Image feature extraction model training method, image searching method and computer equipment |
CN111222471A (en) * | 2020-01-09 | 2020-06-02 | 中国科学技术大学 | Zero sample training and related classification method based on self-supervision domain perception network |
CN111222471B (en) * | 2020-01-09 | 2022-07-15 | 中国科学技术大学 | Zero sample training and related classification method based on self-supervision domain perception network |
CN111428733A (en) * | 2020-03-12 | 2020-07-17 | 山东大学 | Zero sample target detection method and system based on semantic feature space conversion |
CN111428733B (en) * | 2020-03-12 | 2023-05-23 | 山东大学 | Zero sample target detection method and system based on semantic feature space conversion |
CN111461025A (en) * | 2020-04-02 | 2020-07-28 | 同济大学 | Signal identification method for self-evolving zero-sample learning |
CN111461025B (en) * | 2020-04-02 | 2022-07-05 | 同济大学 | Signal identification method for self-evolving zero-sample learning |
CN111738313A (en) * | 2020-06-08 | 2020-10-02 | 大连理工大学 | Zero sample learning algorithm based on multi-network cooperation |
CN112100380A (en) * | 2020-09-16 | 2020-12-18 | 浙江大学 | Generation type zero sample prediction method based on knowledge graph |
CN112100380B (en) * | 2020-09-16 | 2022-07-12 | 浙江大学 | Generation type zero sample prediction method based on knowledge graph |
CN112257808A (en) * | 2020-11-02 | 2021-01-22 | 郑州大学 | Integrated collaborative training method and device for zero sample classification and terminal equipment |
CN112257808B (en) * | 2020-11-02 | 2022-11-11 | 郑州大学 | Integrated collaborative training method and device for zero sample classification and terminal equipment |
CN112633382A (en) * | 2020-12-25 | 2021-04-09 | 浙江大学 | Mutual-neighbor-based few-sample image classification method and system |
CN112633382B (en) * | 2020-12-25 | 2024-02-13 | 浙江大学 | Method and system for classifying few sample images based on mutual neighbor |
CN112686318A (en) * | 2020-12-31 | 2021-04-20 | 广东石油化工学院 | Zero sample learning mechanism based on spherical embedding, spherical alignment and spherical calibration |
CN112686318B (en) * | 2020-12-31 | 2023-08-29 | 广东石油化工学院 | Zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration |
CN113077427A (en) * | 2021-03-29 | 2021-07-06 | 北京深睿博联科技有限责任公司 | Generation method and device of category prediction model |
CN113077427B (en) * | 2021-03-29 | 2023-04-25 | 北京深睿博联科技有限责任公司 | Method and device for generating class prediction model |
CN113326892A (en) * | 2021-06-22 | 2021-08-31 | 浙江大学 | Relation network-based few-sample image classification method |
CN113627470A (en) * | 2021-07-01 | 2021-11-09 | 汕头大学 | Zero-learning-based unknown event classification method for optical fiber early warning system |
CN113627470B (en) * | 2021-07-01 | 2023-09-05 | 汕头大学 | Zero-order learning-based unknown event classification method for optical fiber early warning system |
CN113435531A (en) * | 2021-07-07 | 2021-09-24 | 中国人民解放军国防科技大学 | Zero sample image classification method and system, electronic equipment and storage medium |
CN113435531B (en) * | 2021-07-07 | 2022-06-21 | 中国人民解放军国防科技大学 | Zero sample image classification method and system, electronic equipment and storage medium |
CN113343941A (en) * | 2021-07-20 | 2021-09-03 | 中国人民大学 | Zero sample action identification method and system based on mutual information similarity |
CN113343941B (en) * | 2021-07-20 | 2023-07-25 | 中国人民大学 | Zero sample action recognition method and system based on mutual information similarity |
CN113642621A (en) * | 2021-08-03 | 2021-11-12 | 南京邮电大学 | Zero sample image classification method based on generation countermeasure network |
CN114627312B (en) * | 2022-05-17 | 2022-09-06 | 中国科学技术大学 | Zero sample image classification method, system, equipment and storage medium |
CN114627312A (en) * | 2022-05-17 | 2022-06-14 | 中国科学技术大学 | Zero sample image classification method, system, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110163258B (en) | 2021-04-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110163258A (en) | A kind of zero sample learning method and system reassigning mechanism based on semantic attribute attention | |
He et al. | An end-to-end steel surface defect detection approach via fusing multiple hierarchical features | |
Acharya et al. | TallyQA: Answering complex counting questions | |
Das et al. | Automatic clustering using an improved differential evolution algorithm | |
Xu et al. | 3D attention-driven depth acquisition for object identification | |
CN107066559A (en) | A kind of method for searching three-dimension model based on deep learning | |
Bespalov et al. | Scale-space representation of 3d models and topological matching | |
Xiong et al. | ASK: Adaptively selecting key local features for RGB-D scene recognition | |
CN112949740B (en) | Small sample image classification method based on multilevel measurement | |
CN110084211B (en) | Action recognition method | |
CN104751463B (en) | A kind of threedimensional model optimal viewing angle choosing method based on sketch outline feature | |
Naqvi et al. | Feature quality-based dynamic feature selection for improving salient object detection | |
CN106250918A (en) | A kind of mixed Gauss model matching process based on the soil-shifting distance improved | |
CN110163130B (en) | Feature pre-alignment random forest classification system and method for gesture recognition | |
Xia et al. | Evaluation of saccadic scanpath prediction: Subjective assessment database and recurrent neural network based metric | |
Andreetto et al. | Unsupervised learning of categorical segments in image collections | |
CN108805280A (en) | A kind of method and apparatus of image retrieval | |
CN108596186B (en) | Three-dimensional model retrieval method | |
CN113032613B (en) | Three-dimensional model retrieval method based on interactive attention convolution neural network | |
Ngo et al. | Similarity Shape Based on Skeleton Graph Matching. | |
CN114821632A (en) | Method for re-identifying blocked pedestrians | |
Kusrini et al. | Automatic Mango Leaf and Trunk Detection as Supporting Tool of Mango Pest Identifier (MPI) | |
Sun et al. | Recursive templates segmentation and exemplars matching for human parsing | |
Hajihashemi et al. | Human activity recognition in videos based on a Two Levels K-means and Hierarchical Codebooks | |
Bouksim et al. | New approach for 3D Mesh Retrieval using artificial neural network and histogram of features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||