CN105740879A - Zero-sample image classification method based on multi-mode discriminant analysis - Google Patents

Zero-sample image classification method based on multi-mode discriminant analysis

Info

Publication number
CN105740879A
Authority
CN
China
Prior art keywords
classification
sigma
sample image
zero
semantic feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610026972.7A
Other languages
Chinese (zh)
Other versions
CN105740879B (en)
Inventor
冀中
谢于中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orpheus Group Co.,Ltd.
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201610026972.7A
Publication of CN105740879A
Application granted
Publication of CN105740879B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

A zero-sample image classification method based on multi-modal discriminant analysis comprises the steps of: constructing matrices from the visual features of the training data and the semantic features of the corresponding categories; solving for a mapping matrix; learning the weights α_i on a validation set; using the mapping matrix to map the visual features of the test data and the semantic features of the unseen categories to a common space; and classifying the test data. The invention finds a common space between the visual features of an image and semantic features of multiple modalities and, according to the inventors' experiments, achieves higher accuracy in zero-sample image classification than methods that use semantic features of a single modality. The method is simple and, beyond zero-sample image classification, can also be applied to other multi-modal classification and retrieval problems.

Description

Zero-sample image classification method based on multi-modal discriminant analysis
Technical field
The present invention relates to a zero-sample image classification method, and in particular to a zero-sample image classification method based on multi-modal discriminant analysis, which uses multi-modal discriminant analysis to establish a connection between the visual space of images and the semantic space of image categories and thereby realizes zero-sample image classification.
Background
For a traditional image classification system to accurately recognize a certain class of images, labeled training data for that class must be provided. However, labels for training data are often difficult to obtain. Zero-sample image classification is an effective way to address the problem of missing class labels; its purpose is to imitate the human ability to recognize a new category without having seen any actual visual sample of it. A zero-sample image classification system uses labeled training data, i.e. the seen categories, to establish a mapping between the visual space and the semantic space. According to this mapping, the visual features of the test data are then associated with the semantic features of the unseen categories, and the semantically closest category is selected as the label of the test data.
In zero-sample image classification, a test image and the names of the unseen categories must be connected through a semantic space, in which each category name is represented as a high-dimensional vector. In early work, this semantic space was usually attribute-based, so that each category name could be represented as an attribute vector. For example, Lampert et al. annotated images of 50 animal classes with 85 semantic attributes, such as the color and shape of the object, as a high-level semantic description.
In recent years, with the development of natural language processing, semantic spaces based on text vectors have gradually become popular. A commonly used method for extracting text vectors is Word2Vec, proposed by Mikolov et al. It is an unsupervised method that represents the words in a corpus as vectors such that the similarity between vectors closely reflects the semantic similarity between the words.
Once the semantic feature vectors of the seen and unseen categories have been obtained in a given semantic space, the semantic relatedness between categories can be computed from the distances between these vectors. An image, however, is represented by a visual feature vector in the visual space, and because of the semantic gap it cannot be directly related to feature vectors in the semantic space. Most existing methods use the visual features of images of the seen categories and the semantic features of their labels to learn a mapping function from the visual space to the semantic space. The visual feature of a test image is then mapped into the semantic space by this function to obtain a predicted semantic feature, and the category of the image is determined by finding the nearest semantic feature among the unseen categories.
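The conventional pipeline described in the preceding paragraph can be sketched as follows. This is a minimal illustration only: the mapping matrix `M`, the two-dimensional semantic vectors and the class names are hypothetical values chosen for the example, and the mapping is assumed to have been learned beforehand on the seen categories.

```python
import math


def map_to_semantic(M, x):
    """Apply a linear map M (one row per semantic dimension) to a visual feature x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def nearest_class(predicted, class_vectors):
    """Return the class whose semantic vector has the smallest Euclidean distance to `predicted`."""
    return min(class_vectors, key=lambda name: math.dist(predicted, class_vectors[name]))


# Hypothetical semantic vectors of two unseen categories.
unseen = {"zebra": [1.0, 0.0], "whale": [0.0, 1.0]}

# Hypothetical mapping from a 3-D visual space to the 2-D semantic space,
# assumed to have been learned on the seen categories.
M = [[0.5, 0.5, 0.0],
     [0.0, 0.0, 1.0]]

x_test = [0.9, 0.8, 0.1]                 # visual feature of a test image
predicted = map_to_semantic(M, x_test)   # predicted semantic feature
label = nearest_class(predicted, unseen)
print(label)
```

The sketch makes the limitation visible: the result depends entirely on how well a single semantic space separates the categories.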
However, a semantic space built from semantic features of a single modality often cannot adequately describe the category structure of a data set.
The common approach to zero-sample image classification is to map the visual features of an image into the semantic feature space of the category names and then classify. However, the original space formed by the semantic features of the category names often does not describe the category structure of the data set well. Two improvements are therefore possible: first, map both the visual features and the semantic features to a common space and relate them there; second, use semantic features of several modalities so that the category structure of the data set is described from multiple angles. Multi-modal discriminant analysis satisfies both requirements at once.
Summary of the invention
The technical problem to be solved by the present invention is to provide a zero-sample image classification method based on multi-modal discriminant analysis that can map the visual features of training images and the semantic features of image category names to a common space.
The technical solution adopted by the present invention is a zero-sample image classification method based on multi-modal discriminant analysis, comprising the following steps:
1) The visual feature $X_1$ of the training data and the semantic features $X_2, \ldots, X_c$ of the corresponding categories are used to build the matrices $S$ and $D$, where

$$S_{jr} = \begin{cases} \sum_{i=1}^{c} \left( \sum_{k=1}^{n_{ij}} x_{ijk} x_{ijk}^{T} - \frac{n_{ij} n_{ir}}{n_i} \mu_{ij}^{(x)} \mu_{ij}^{(x)T} \right), & j = r \\ -\sum_{i=1}^{c} \frac{n_{ij} n_{ir}}{n_i} \mu_{ij}^{(x)} \mu_{ir}^{(x)T}, & j \neq r \end{cases} \qquad (1)$$

$$D_{jr} = \left( \sum_{i=1}^{c} \frac{n_{ij} n_{ir}}{n_i} \mu_{ij}^{(x)} \mu_{ij}^{(x)T} \right) - \frac{1}{n} \left( \sum_{i=1}^{c} n_{ij} \mu_{ij}^{(x)} \right) \left( \sum_{i=1}^{c} n_{ij} \mu_{ij}^{(x)} \right)^{T} \qquad (2)$$

In the formulas, $x$ is a vector in the visual feature matrix or in a semantic feature matrix, $i$ is the category index, $j$ is the modality index, $k$ is the sample index, $c$ is the total number of categories, $n$ is the total number of samples, and $\mu_{ij}^{(x)}$ is the mean of the samples of category $i$ in modality $j$, expressed as $\mu_{ij}^{(x)} = \frac{1}{n_{ij}} \sum_{k=1}^{n_{ij}} x_{ijk}$;
2) Solve the following problem to obtain the mapping matrix $W$:

$$\max_{W_1, W_2, \ldots, W_v} \operatorname{Tr}\left( \frac{W^{T} D W}{W^{T} S W} \right) \qquad (3)$$

3) Learn the weights $\alpha_i$ in the following formula on the validation set:

$$k^{*} = \arg\max_{k} \left[ \sum_{i=2}^{c} \alpha_i \, \mathrm{sim}\left( W_1^{T} x_j, \; W_i^{T} y_i^{k} \right) \right], \quad k = 1, 2, \ldots, n \qquad (4)$$

In the formula, $x_j$ is the visual feature of the validation data, $y_i^{k}$ is the corresponding semantic feature of the $k$-th category in the $i$-th modality, and $\mathrm{sim}(a, b) = a^{T} b / (\|a\| \, \|b\|)$ measures the similarity (the cosine similarity) between two vectors;
4) Use the mapping matrix $W$ to map the visual feature of the test data and the semantic features $y_k$ of the unseen categories to the common space;
5) Classify the test data with the formula in step 3); the $k^{*}$ in the formula is the category of the test data.
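As a reading aid for steps 1) and 2), the block below writes out how the per-modality blocks are usually assembled and how an objective of the form (3) is commonly solved. Both the block layout and the ratio-trace relaxation are assumptions borrowed from the usual treatment of multi-view discriminant analysis; the text above states only the objective itself and does not say which solver is used.

```latex
% Stack the per-modality projections and scatter blocks (assumed layout):
W = \begin{bmatrix} W_1 \\ W_2 \\ \vdots \\ W_v \end{bmatrix}, \qquad
S = \begin{bmatrix} S_{11} & \cdots & S_{1v} \\ \vdots & \ddots & \vdots \\ S_{v1} & \cdots & S_{vv} \end{bmatrix}, \qquad
D = \begin{bmatrix} D_{11} & \cdots & D_{1v} \\ \vdots & \ddots & \vdots \\ D_{v1} & \cdots & D_{vv} \end{bmatrix}

% so that the quadratic forms expand over pairs of modalities:
W^{T} S W = \sum_{j=1}^{v} \sum_{r=1}^{v} W_j^{T} S_{jr} W_r, \qquad
W^{T} D W = \sum_{j=1}^{v} \sum_{r=1}^{v} W_j^{T} D_{jr} W_r

% A common relaxation (assumption) replaces the ratio in (3) by a ratio trace,
\max_{W} \operatorname{Tr}\left( (W^{T} S W)^{-1} \, W^{T} D W \right)

% whose solution, for invertible S, is given by the leading generalized eigenvectors:
D \, w_{\ell} = \lambda_{\ell} \, S \, w_{\ell}, \qquad \lambda_1 \ge \lambda_2 \ge \cdots
```

Under this reading, the columns of $W$ are the eigenvectors $w_{\ell}$ with the largest eigenvalues, and each $W_j$ is the slice of rows of $W$ belonging to modality $j$.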
The zero-sample image classification method based on multi-modal discriminant analysis of the present invention has the following advantages:
1. Conventional methods can only find a common space between the visual features of an image and the semantic features of a single modality, whereas the multi-modal discriminant analysis of the present invention finds a common space between the visual features of an image and the semantic features of multiple modalities.
2. The semantic features of multiple modalities describe the category names from different angles and therefore describe them better. Experiments show that, compared with methods that can only use semantic features of a single modality, the method of the present invention achieves higher accuracy in zero-sample image classification and is therefore an effective zero-sample image classification method.
3. The method of the present invention is simple and effective. Beyond zero-sample image classification, it can also be applied to other multi-modal classification and retrieval problems.
Detailed description
The zero-sample image classification method based on multi-modal discriminant analysis of the present invention is described in detail below with reference to an embodiment.
Zero-sample image classification is an image classification problem in machine learning. A classification problem consists of learning a classifier from a known training data set and then using that classifier to classify new input instances. Zero-sample image classification is also a classification problem, except that the categories of the new test data do not appear in the training data set. The present invention uses multi-modal discriminant analysis to establish a connection between the visual space of images and the semantic space of image categories, thereby realizing zero-sample image classification.
The method of the present invention aims to provide an effective zero-sample image classification method by means of multi-modal discriminant analysis. With this method, the visual features of the training images and the semantic features of the image category names are mapped to a common space, in which the distances between the mapped visual features and semantic features can be compared effectively, so that the zero-sample image classification problem is solved better. In this common space, the visual features of an image correspond well to the semantic features of its category. For a new test image, its visual feature is mapped to the common space and the semantic feature of the closest unseen category is found, which determines the category of the test image.
The zero-sample image classification method based on multi-modal discriminant analysis of the present invention comprises the following steps:
1) The visual feature $X_1$ of the training data and the semantic features $X_2, \ldots, X_c$ of the corresponding categories are used to build the matrices $S$ and $D$, where

$$S_{jr} = \begin{cases} \sum_{i=1}^{c} \left( \sum_{k=1}^{n_{ij}} x_{ijk} x_{ijk}^{T} - \frac{n_{ij} n_{ir}}{n_i} \mu_{ij}^{(x)} \mu_{ij}^{(x)T} \right), & j = r \\ -\sum_{i=1}^{c} \frac{n_{ij} n_{ir}}{n_i} \mu_{ij}^{(x)} \mu_{ir}^{(x)T}, & j \neq r \end{cases} \qquad (1)$$

$$D_{jr} = \left( \sum_{i=1}^{c} \frac{n_{ij} n_{ir}}{n_i} \mu_{ij}^{(x)} \mu_{ij}^{(x)T} \right) - \frac{1}{n} \left( \sum_{i=1}^{c} n_{ij} \mu_{ij}^{(x)} \right) \left( \sum_{i=1}^{c} n_{ij} \mu_{ij}^{(x)} \right)^{T} \qquad (2)$$

In the formulas, $x$ is a vector in the visual feature matrix or in a semantic feature matrix, $i$ is the category index, $j$ is the modality index, $k$ is the sample index, $c$ is the total number of categories, $n$ is the total number of samples, and $\mu_{ij}^{(x)}$ is the mean of the samples of category $i$ in modality $j$, expressed as $\mu_{ij}^{(x)} = \frac{1}{n_{ij}} \sum_{k=1}^{n_{ij}} x_{ijk}$;
2) Solve the following problem to obtain the mapping matrix $W$:

$$\max_{W_1, W_2, \ldots, W_v} \operatorname{Tr}\left( \frac{W^{T} D W}{W^{T} S W} \right) \qquad (3)$$

3) Learn the weights $\alpha_i$ in the following formula on the validation set:

$$k^{*} = \arg\max_{k} \left[ \sum_{i=2}^{c} \alpha_i \, \mathrm{sim}\left( W_1^{T} x_j, \; W_i^{T} y_i^{k} \right) \right], \quad k = 1, 2, \ldots, n \qquad (4)$$

In the formula, $x_j$ is the visual feature of the validation data, $y_i^{k}$ is the corresponding semantic feature of the $k$-th category in the $i$-th modality, and $\mathrm{sim}(a, b) = a^{T} b / (\|a\| \, \|b\|)$ measures the similarity (the cosine similarity) between two vectors;
4) Use the mapping matrix $W$ to map the visual feature of the test data and the semantic features $y_k$ of the unseen categories to the common space;
5) Classify the test data with the formula in step 3); the $k^{*}$ in the formula is the category of the test data.
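Steps 3) to 5) can be sketched in a few lines of Python. The sketch assumes the projection matrices are already available (identity matrices stand in for them here), indexes the weights over the semantic modalities only, and uses a plain grid search to pick the weights on the validation data. The grid search, the function names and all numeric values are illustrative assumptions; the text above does not specify how the weights are learned.

```python
import math
from itertools import product


def project(W, x):
    """Compute W^T x, with W given as a list of rows (input dim x common dim)."""
    return [sum(W[r][c] * x[r] for r in range(len(x))) for c in range(len(W[0]))]


def sim(a, b):
    """Cosine similarity a^T b / (||a|| ||b||)."""
    dot = sum(p * q for p, q in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def classify(x, W_vis, W_sem, class_feats, alphas):
    """Formula (4): pick the category k maximising sum_i alpha_i * sim(W_1^T x, W_i^T y_i^k).

    class_feats[k][m] is the semantic feature of category k in semantic modality m.
    """
    z = project(W_vis, x)

    def score(k):
        return sum(a * sim(z, project(W, y))
                   for a, W, y in zip(alphas, W_sem, class_feats[k]))

    return max(class_feats, key=score)


def learn_alphas(val_set, W_vis, W_sem, class_feats, steps=5):
    """Grid search (an assumed strategy) for the weights with the best validation accuracy."""
    grid = [s / (steps - 1) for s in range(steps)]
    best, best_acc = None, -1.0
    for alphas in product(grid, repeat=len(W_sem)):
        if sum(alphas) == 0:
            continue  # all-zero weights carry no information
        hits = sum(classify(x, W_vis, W_sem, class_feats, alphas) == y for x, y in val_set)
        acc = hits / len(val_set)
        if acc > best_acc:
            best, best_acc = alphas, acc
    return best, best_acc


# Hypothetical toy data: identity projections, two semantic modalities.
I2 = [[1.0, 0.0], [0.0, 1.0]]
W_vis, W_sem = I2, [I2, I2]
seen = {"cat": ([1.0, 0.0], [0.5, 0.5]), "dog": ([0.0, 1.0], [0.9, 0.1])}
val_set = [([0.9, 0.1], "cat"), ([0.2, 0.8], "dog")]
alphas, acc = learn_alphas(val_set, W_vis, W_sem, seen)

# Classify a test feature against hypothetical unseen categories.
unseen = {"zebra": ([1.0, 0.1], [1.0, 0.0]), "whale": ([0.1, 1.0], [0.0, 1.0])}
pred = classify([0.8, 0.3], W_vis, W_sem, unseen, alphas)
print(alphas, acc, pred)
```

In this toy setup the second semantic modality is deliberately misleading on the validation data, so the search settles on weights that favour the first modality, which is the role the weights $\alpha_i$ play in formula (4).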

Claims (1)

1. A zero-sample image classification method based on multi-modal discriminant analysis, characterized by comprising the following steps:
1) The visual feature $X_1$ of the training data and the semantic features $X_2, \ldots, X_c$ of the corresponding categories are used to build the matrices $S$ and $D$, where

$$S_{jr} = \begin{cases} \sum_{i=1}^{c} \left( \sum_{k=1}^{n_{ij}} x_{ijk} x_{ijk}^{T} - \frac{n_{ij} n_{ir}}{n_i} \mu_{ij}^{(x)} \mu_{ij}^{(x)T} \right), & j = r \\ -\sum_{i=1}^{c} \frac{n_{ij} n_{ir}}{n_i} \mu_{ij}^{(x)} \mu_{ir}^{(x)T}, & j \neq r \end{cases} \qquad (1)$$

$$D_{jr} = \left( \sum_{i=1}^{c} \frac{n_{ij} n_{ir}}{n_i} \mu_{ij}^{(x)} \mu_{ij}^{(x)T} \right) - \frac{1}{n} \left( \sum_{i=1}^{c} n_{ij} \mu_{ij}^{(x)} \right) \left( \sum_{i=1}^{c} n_{ij} \mu_{ij}^{(x)} \right)^{T} \qquad (2)$$

In the formulas, $x$ is a vector in the visual feature matrix or in a semantic feature matrix, $i$ is the category index, $j$ is the modality index, $k$ is the sample index, $c$ is the total number of categories, $n$ is the total number of samples, and $\mu_{ij}^{(x)}$ is the mean of the samples of category $i$ in modality $j$, expressed as $\mu_{ij}^{(x)} = \frac{1}{n_{ij}} \sum_{k=1}^{n_{ij}} x_{ijk}$;
2) Solve the following problem to obtain the mapping matrix $W$:

$$\max_{W_1, W_2, \ldots, W_v} \operatorname{Tr}\left( \frac{W^{T} D W}{W^{T} S W} \right) \qquad (3)$$

3) Learn the weights $\alpha_i$ in the following formula on the validation set:

$$k^{*} = \arg\max_{k} \left[ \sum_{i=2}^{c} \alpha_i \, \mathrm{sim}\left( W_1^{T} x_j, \; W_i^{T} y_i^{k} \right) \right], \quad k = 1, 2, \ldots, n \qquad (4)$$

In the formula, $x_j$ is the visual feature of the validation data, $y_i^{k}$ is the corresponding semantic feature of the $k$-th category in the $i$-th modality, and $\mathrm{sim}(a, b) = a^{T} b / (\|a\| \, \|b\|)$ measures the similarity (the cosine similarity) between two vectors;
4) Use the mapping matrix $W$ to map the visual feature of the test data and the semantic features $y_k$ of the unseen categories to the common space;
5) Classify the test data with the formula in step 3); the $k^{*}$ in the formula is the category of the test data.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610026972.7A CN105740879B (en) 2016-01-15 2016-01-15 The zero sample image classification method based on multi-modal discriminant analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610026972.7A CN105740879B (en) 2016-01-15 2016-01-15 The zero sample image classification method based on multi-modal discriminant analysis

Publications (2)

Publication Number Publication Date
CN105740879A (en) 2016-07-06
CN105740879B (en) 2019-05-21

Family

ID=56246320

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610026972.7A Active CN105740879B (en) 2016-01-15 2016-01-15 The zero sample image classification method based on multi-modal discriminant analysis

Country Status (1)

Country Link
CN (1) CN105740879B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104156433A (en) * 2014-08-11 2014-11-19 合肥工业大学 Image retrieval method based on semantic mapping space construction
CN104834757A (en) * 2015-06-05 2015-08-12 昆山国显光电有限公司 Image semantic retrieval method and system
CN105205096A (en) * 2015-08-18 2015-12-30 天津中科智能识别产业技术研究院有限公司 Text modal and image modal crossing type data retrieval method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Mark Palatucci et al.: "Zero-Shot Learning with Semantic Output Codes", NIPS'09 Proceedings of the 22nd International Conference on Neural Information Processing Systems *
Richard Socher et al.: "Zero-Shot Learning Through Cross-Modal Transfer", arXiv.org *
赵才荣 (Zhao Cairong): "Feature extraction based on graph embedding and visual attention", China Doctoral Dissertations Full-text Database *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250925A (en) * 2016-07-25 2016-12-21 天津大学 A kind of zero Sample video sorting technique based on the canonical correlation analysis improved
WO2018032354A1 (en) * 2016-08-16 2018-02-22 Nokia Technologies Oy Method and apparatus for zero-shot learning
CN106485272A (en) * 2016-09-30 2017-03-08 天津大学 The zero sample classification method being embedded based on the cross-module state of manifold constraint
CN110647897A (en) * 2018-06-26 2020-01-03 广东工业大学 Zero sample image classification and identification method based on multi-part attention mechanism
CN110647897B (en) * 2018-06-26 2023-04-18 广东工业大学 Zero sample image classification and identification method based on multi-part attention mechanism

Also Published As

Publication number Publication date
CN105740879B (en) 2019-05-21

Similar Documents

Publication Publication Date Title
CN105701514A (en) Multi-modal canonical correlation analysis method for zero sample classification
CN105718940A (en) Zero-sample image classification method based on multi-group factor analysis
US9639806B2 (en) System and method for predicting iconicity of an image
CN106294344B (en) Video retrieval method and device
CN106203483B (en) A kind of zero sample image classification method based on semantic related multi-modal mapping method
US20150178321A1 (en) Image-based 3d model search and retrieval
CN105205096A (en) Text modal and image modal crossing type data retrieval method
CN103810252B (en) Image retrieval method based on group sparse feature selection
CN106886601A (en) A kind of Cross-modality searching algorithm based on the study of subspace vehicle mixing
CN105389326B (en) Image labeling method based on weak matching probability typical relevancy models
Gupta et al. Vico: Word embeddings from visual co-occurrences
CN106250925B (en) A kind of zero Sample video classification method based on improved canonical correlation analysis
CN109492105B (en) Text emotion classification method based on multi-feature ensemble learning
CN102663447B (en) Cross-media searching method based on discrimination correlation analysis
CN106997379B (en) Method for merging similar texts based on click volumes of image texts
Niu et al. Knowledge-based topic model for unsupervised object discovery and localization
CN105740879A (en) Zero-sample image classification method based on multi-mode discriminant analysis
CN110046264A (en) A kind of automatic classification method towards mobile phone document
CN106227836B (en) Unsupervised joint visual concept learning system and unsupervised joint visual concept learning method based on images and characters
CN106485272A (en) The zero sample classification method being embedded based on the cross-module state of manifold constraint
CN113033438A (en) Data feature learning method for modal imperfect alignment
CN106021402A (en) Multi-modal multi-class Boosting frame construction method and device for cross-modal retrieval
CN103473308B (en) High-dimensional multimedia data classifying method based on maximum margin tensor study
KR20120047622A (en) System and method for managing digital contents
Yao [Retracted] Application of Higher Education Management in Colleges and Universities by Deep Learning

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210714

Address after: 401120 No.2, 13th floor, building 6, No.2 Huizhu Road, Yubei District, Chongqing

Patentee after: Orpheus Group Co.,Ltd.

Address before: 300072 Tianjin City, Nankai District Wei Jin Road No. 92

Patentee before: Tianjin University

TR01 Transfer of patent right