CN104572940A - Automatic image annotation method based on deep learning and canonical correlation analysis - Google Patents


Info

Publication number
CN104572940A
CN104572940A (Application CN201410843484.6A; granted as CN104572940B)
Authority
CN
China
Prior art keywords
image
vector
degree
dbm
boltzmann machine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410843484.6A
Other languages
Chinese (zh)
Other versions
CN104572940B (en)
Inventor
张立民
刘凯
邓向阳
孙永威
张建廷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Naval Aeronautical University
Original Assignee
Naval Aeronautical Engineering Institute of PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Naval Aeronautical Engineering Institute of PLA filed Critical Naval Aeronautical Engineering Institute of PLA
Priority to CN201410843484.6A priority Critical patent/CN104572940B/en
Publication of CN104572940A publication Critical patent/CN104572940A/en
Application granted granted Critical
Publication of CN104572940B publication Critical patent/CN104572940B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an automatic image annotation method based on deep learning and canonical correlation analysis. The method includes: using a deep Boltzmann machine to extract the high-level feature vectors of images and annotation words, with a multiple-Bernoulli distribution chosen to fit the annotation-word samples and a Gaussian distribution chosen to fit the image features; performing canonical correlation analysis on the high-level features of the images and the annotation words; calculating the Mahalanobis distance between images to be annotated and training-set images in the canonical variable space, and weighting by this distance to obtain high-level annotation-word features; and generating the image's annotation words through mean-field estimation. The deep Boltzmann machine comprises an I-DBM and a T-DBM, which extract the high-level feature vectors of the images and of the annotation words respectively. Each of the I-DBM and the T-DBM comprises, from bottom to top, a visible layer, a first hidden layer, and a second hidden layer. The method effectively addresses the "semantic gap" problem in image semantic annotation and increases annotation accuracy.

Description

An automatic image annotation method based on deep learning and canonical correlation analysis
Technical field
The present invention relates to automatic image annotation and retrieval technology, and in particular to an automatic image annotation method based on deep learning and canonical correlation analysis.
Background technology
As image data grows geometrically, effectively managing and retrieving it has become a research hotspot in informatization. Although content-based image retrieval (CBIR) technology has made significant progress, with multiple civilian prototypes, technologies, and retrieval products, its retrieval effectiveness remains unsatisfactory because the central problem, the "semantic gap", has not been fundamentally overcome. The best solution is to add to images text semantic information related to their content, i.e., image annotation. Since manual annotation suffers from strong subjectivity and low efficiency, automatic image annotation has gradually become a research hotspot in the image annotation field.
The first mature deep learning model was the deep belief network proposed by Hinton et al. in 2002, which achieves abstract representation of data through a multilayer feature-extraction mechanism. As powerful probabilistic generative models, deep learning models have since developed into various forms such as the deep Boltzmann machine and the deep autoencoder, and have been successfully applied to fields such as speech recognition, network situation awareness, and high-dimensional time-series modeling. In image processing, Google's Google Brain used deep neural networks to achieve great success in image recognition, partially simulating human brain function; in large-scale object recognition, a five-layer convolutional network based on a deep learning model achieved the highest accuracy in the 2012 ImageNet evaluation; in image annotation and classification, Srivastava et al. likewise achieved good results by building a multimodal deep Boltzmann machine. Ranked first among the ten breakthrough technologies of 2013, deep learning has shown powerful vitality and enormous potential in machine learning.
At present, deep learning models have achieved good results in generating annotation words for images. The multimodal deep Boltzmann machine solves the multimodal learning problem of images and text fairly well and has been applied to image retrieval and annotation. Experimental results show that this model outperforms other deep learning models, yet a gap remains compared with classical automatic image annotation algorithms, because its vocabulary model and top-level feature fusion mechanism are not suited to the automatic annotation task. To address these two problems, and drawing on the ideas of classical annotation algorithms, an automatic image annotation method based on the deep Boltzmann machine and canonical correlation analysis is proposed. It adopts a deep Boltzmann machine model that better handles image features and generates highly abstract semantic concepts, combined with canonical correlation analysis, to design an automatic annotation model. The method can effectively improve the management and retrieval efficiency of large-scale image collections and accelerate image-information processing, and thus has good application prospects and important practical and economic benefits.
Summary of the invention
To address the deficiencies of the prior art, the invention provides an automatic image annotation method based on deep learning and canonical correlation analysis that can overcome the "semantic gap" problem of image semantic annotation and achieve relatively accurate semantic tagging.
An automatic image annotation method based on deep learning and canonical correlation analysis, comprising:
(1) building a model training data set;
(2) extracting the low-level feature vectors of an image to be annotated and building the visual feature vector of the image;
(3) inputting the visual feature vector into the trained deep Boltzmann machine model I-DBM to obtain the image's high-level feature vector;
(4) projecting the image's high-level features into the established canonical variable space, searching for adjacent images in the model's annotated data set, and generating an annotation-word high-level feature vector;
(5) inputting the annotation-word high-level feature vector into the trained deep Boltzmann machine model T-DBM to obtain the corresponding annotation words.
The model training data set in step (1) is obtained as follows:
(S11) create an annotation dictionary containing several text annotation words;
(S12) according to the annotation dictionary, select annotated images of the corresponding categories as the model training data set.
The trained deep Boltzmann machine I-DBM in step (3) is obtained as follows:
(S31) extract the low-level feature vectors of every image in the training data set to form its visual feature vector, and determine each image's annotation-word feature vector from the annotation dictionary and its annotation words;
(S32) build the deep Boltzmann machine model I-DBM, which comprises, from bottom to top, a visible layer, a first hidden layer, and a second hidden layer; no two nodes within a layer are connected, while any two nodes in adjacent layers are bidirectionally connected;
(S33) train the model with the visual feature vectors of all images in the training data set to obtain the trained deep Boltzmann machine model.
The canonical variable space in step (4) is established as follows:
(S41) extract the I-DBM high-level feature vectors of all images in the training set;
(S42) extract the T-DBM high-level feature vectors of the annotation words of all images in the training set;
(S43) perform canonical correlation analysis between the I-DBM and T-DBM high-level feature vectors to obtain the projection matrices.
The trained deep Boltzmann machine T-DBM in steps (5) and (S42) is obtained as follows:
(S51) determine each image's annotation-word feature vector from the annotation dictionary and its annotation words;
(S52) build the deep Boltzmann machine model T-DBM, which comprises, from bottom to top, a visible layer, a first hidden layer, and a second hidden layer; no two nodes within a layer are connected, while any two nodes in adjacent layers are bidirectionally connected;
(S53) train the model with the annotation-word feature vectors of all images in the training data set to obtain the trained deep Boltzmann machine model.
The automatic image annotation method of the invention first extracts the low-level features of the image to be annotated and builds its visual feature vector from all of them. The visual feature vector is then fed directly into the visible layer of the deep Boltzmann machine model I-DBM, and the state of I-DBM's second hidden layer is taken as the high-level feature vector. This vector is projected into the canonical variable space, the top N nearest images by Mahalanobis distance are found, a new second-hidden-layer state for the deep Boltzmann machine T-DBM is generated by distance weighting, and finally T-DBM generates the new annotation-word vector as the image's annotation words.
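The projection-and-weighting step above can be sketched as follows. This is an illustrative simplification: the inverse-distance weights, the array shapes, and the random stand-in data are all assumptions, since the source specifies only Mahalanobis distance in the canonical space and a "weighted calculation".

```python
import numpy as np

rng = np.random.default_rng(0)

def weighted_text_feature(query_img_feat, train_img_feats, train_txt_feats,
                          proj, n_neighbors=5):
    """Project I-DBM features into the canonical space, find the N nearest
    training images by Mahalanobis distance there, and distance-weight their
    T-DBM hidden states into a new annotation-word feature vector.
    The inverse-distance weighting is an assumption, not from the source."""
    q = query_img_feat @ proj                  # query image in canonical space
    T = train_img_feats @ proj                 # training images in canonical space
    cov_inv = np.linalg.pinv(np.cov(T, rowvar=False))
    diff = T - q
    quad = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)
    d = np.sqrt(np.maximum(quad, 0.0))         # Mahalanobis distance per image
    nearest = np.argsort(d)[:n_neighbors]      # top-N nearest training images
    w = 1.0 / (d[nearest] + 1e-8)              # closer images weigh more
    w /= w.sum()
    return w @ train_txt_feats[nearest]        # weighted T-DBM hidden state

imgs = rng.standard_normal((100, 400))         # stand-in I-DBM hidden states
txts = rng.random((100, 200))                  # stand-in T-DBM hidden states
P = rng.standard_normal((400, 50))             # stand-in projection matrix
feat = weighted_text_feature(rng.standard_normal(400), imgs, txts, P)
# feat is a 200-D convex combination of the 5 nearest images' text features
```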
In a deep Boltzmann machine, high-level semantics are abstracted from low-level features; because low-level features are difficult to bridge to high-level semantics, a "semantic gap" arises. Since too many hidden layers would make training excessively slow in practice, the deep Boltzmann machine used in the invention contains two hidden layers (the first hidden layer and the second hidden layer). The two hidden layers improve the intermediate abstraction ability of the machine, cross the "semantic gap" in image semantic annotation, and improve annotation accuracy.
The text feature vector in step (S51) is a 0-1 vector (every element is either 0 or 1), determined for each image as follows:
(S51-1) initialize an all-zero vector in which each dimension corresponds to one annotation word;
(S51-2) according to the image's annotation words, set the elements of the corresponding dimensions to 1, yielding the image's annotation-word vector.
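The two steps above can be sketched as follows; the dictionary and annotation words are illustrative examples, not from the source.

```python
import numpy as np

def annotation_vector(dictionary, words):
    """Build the 0-1 annotation-word vector: one dimension per dictionary word."""
    vec = np.zeros(len(dictionary), dtype=np.float32)   # (S51-1) all-zero init
    index = {w: i for i, w in enumerate(dictionary)}
    for w in words:                                     # (S51-2) set matching dims to 1
        vec[index[w]] = 1.0
    return vec

dictionary = ["sky", "sea", "tree", "car"]              # hypothetical dictionary
v = annotation_vector(dictionary, ["sea", "sky"])
# v == [1., 1., 0., 0.]
```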
In the deep Boltzmann machine models built above, no two nodes within a layer are connected, while any two nodes in adjacent layers are bidirectionally connected.
The training process of the deep Boltzmann machine models in steps (S33) and (S53) is as follows:
(S53-1) take the visual feature vector or the annotation-word feature vector as the visible layer;
(S53-2) treat the visible layer and the first hidden layer as a restricted Boltzmann machine, take the visual feature vector as the input of the visible layer, and train this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the visible layer and the first hidden layer and the final state of the first hidden layer;
(S53-3) treat the first and second hidden layers as a restricted Boltzmann machine, take the final state of the first hidden layer as its input, and train this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the first and second hidden layers and the final state of the second hidden layer.
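The greedy layer-wise training of (S53-2) and (S53-3) can be sketched with a minimal binary-RBM CD-1 loop. This is a simplified illustration, not the patented procedure: biases, momentum, mini-batching, and the corrections used when pre-training a true DBM are omitted, the data is random, and the layer sizes (990 visible, 400 hidden) follow the I-DBM configuration given in the embodiment.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=10, lr=0.05):
    """One-step contrastive divergence (CD-1) for a binary RBM; biases omitted."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    for _ in range(epochs):
        v0 = data
        h0 = sigmoid(v0 @ W)                             # up-pass probabilities
        h_sample = (h0 > rng.random(h0.shape)).astype(float)
        v1 = sigmoid(h_sample @ W.T)                     # one reconstruction step
        h1 = sigmoid(v1 @ W)
        W += lr * (v0.T @ h0 - v1.T @ h1) / len(data)    # CD-1 gradient estimate
    return W, sigmoid(data @ W)                          # weights, final hidden state

# Greedy layer-wise training of the two hidden layers, as in (S53-2)/(S53-3):
X = (rng.random((32, 990)) < 0.5).astype(float)  # stand-in visual feature vectors
W1, H1 = train_rbm(X, 400)                       # visible <-> first hidden layer
W2, H2 = train_rbm(H1, 400)                      # first <-> second hidden layer
```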
The canonical correlation analysis in step (S43) proceeds as follows:
(S43-1) standardize the I-DBM and T-DBM high-level feature vectors and compute the covariance matrix;
(S43-2) compute the eigenvalues and eigenvectors of the covariance matrix, sort them, and check whether any eigenvalues are equal;
(S43-3) sort the eigenvalues in descending order and order the eigenvectors accordingly;
(S43-4) take the eigenvectors as the row vectors of a matrix to obtain the canonical correlation analysis result.
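Steps (S43-1) through (S43-4) correspond to textbook canonical correlation analysis. A minimal sketch follows, using the SVD of the whitened cross-covariance rather than the explicit eigen-decomposition the steps describe (the two formulations give the same canonical directions); the ridge term `reg` is an added numerical stabilizer, not part of the source.

```python
import numpy as np

def cca(X, Y, reg=1e-6):
    """Textbook CCA: center both views, whiten, then SVD of the whitened
    cross-covariance gives projection matrices and correlations, sorted
    in descending order as in (S43-3)."""
    X = X - X.mean(0)                                    # (S43-1) standardize
    Y = Y - Y.mean(0)
    Cxx = X.T @ X / len(X) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / len(Y) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / len(X)                               # cross-covariance
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx)).T        # whitening transforms
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy)).T
    U, s, Vt = np.linalg.svd(Wx.T @ Cxy @ Wy)            # s sorted descending
    return Wx @ U, Wy @ Vt.T, s                          # projections, correlations

rng = np.random.default_rng(0)
Z = rng.standard_normal((200, 3))                        # shared latent factors
X = Z @ rng.standard_normal((3, 5)) + 0.1 * rng.standard_normal((200, 5))
Y = Z @ rng.standard_normal((3, 4)) + 0.1 * rng.standard_normal((200, 4))
A, B, corr = cca(X, Y)
# the shared factors make the leading canonical correlations close to 1
```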
The number of I-DBM visible-layer nodes equals the dimension of the visual feature vector.
In both recognition and training, the visual feature vector is the input of the I-DBM visible layer, so each visible-layer node must correspond to one dimension of the visual feature vector; hence the number of I-DBM visible-layer nodes equals the dimension of the visual feature vector.
The number of T-DBM visible-layer nodes equals the number of words in the annotation dictionary.
In both recognition and training, the image's annotation-word vector is the input of the T-DBM visible layer, so each visible-layer node must correspond to one word in the annotation dictionary; hence the number of T-DBM visible-layer nodes equals the number of words in the annotation dictionary.
The numbers of nodes in the first and second hidden layers of I-DBM are set empirically, generally 400 to 500, and can be adjusted according to experimental results in practice.
The low-level image feature vector comprises a color layout descriptor, a color structure descriptor, a scalable color descriptor, an edge histogram descriptor, a GIST feature vector, and a SIFT-based bag-of-visual-words vector.
The SIFT-based bag-of-visual-words vector is obtained as follows:
(a) compute the SIFT feature vectors of all images in the model training data set;
(b) cluster all SIFT feature vectors to obtain 500 cluster centres;
(c) take each cluster centre as a visual word, and count the occurrences of each visual word among each image's SIFT feature vectors to form the SIFT-based bag-of-visual-words vector.
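Steps (a) through (c) can be sketched as follows; the descriptors are random stand-ins for real 128-D SIFT vectors, and k=8 replaces the 500 centres used in the method.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(data, k, iters=20):
    """Minimal k-means clustering (step (b)); any k-means implementation works."""
    centres = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((data[:, None] - centres) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = data[labels == j].mean(0)
    return centres

all_sift = rng.standard_normal((1000, 128))   # (a) stand-in SIFT descriptors
centres = kmeans(all_sift, 8)                 # (b) cluster centres = visual words

def bovw(image_sift, centres):
    """(c) histogram of nearest visual words over one image's descriptors."""
    labels = np.argmin(((image_sift[:, None] - centres) ** 2).sum(-1), axis=1)
    return np.bincount(labels, minlength=len(centres))

hist = bovw(rng.standard_normal((50, 128)), centres)
# hist has one bin per visual word; the bins sum to the descriptor count (50)
```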
Embodiment
The present invention is described in further detail below with reference to a specific example.
An automatic image annotation method based on deep learning and canonical correlation analysis, comprising:
(1) extracting the low-level feature vectors of the image to be annotated and building its visual feature vector;
In this embodiment, the low-level feature vectors comprise a color layout descriptor, a color structure descriptor, a scalable color descriptor, an edge histogram descriptor, a GIST feature vector, and a SIFT-based bag-of-visual-words vector.
The SIFT-based bag-of-visual-words vector is extracted as follows:
(a) compute the SIFT feature vectors of all images in the model training data set;
(b) cluster all SIFT feature vectors to obtain 500 cluster centres;
(c) take each cluster centre as a visual word, and count the occurrences of each visual word among an image's SIFT feature vectors to form the image's SIFT-based bag-of-visual-words vector; the dimension of this vector equals 500 (the number of cluster centres), and each element is the number of times the corresponding visual word occurs among all SIFT feature vectors of the image.
(2) inputting the visual feature vector of the image to be annotated into the trained deep Boltzmann machine model I-DBM to obtain the image's high-level feature vector;
The trained deep Boltzmann machine model used in step (2) of this example is obtained as follows:
(S21) extract the low-level feature vectors of every image in the training set to form its visual feature vector;
(S22) build the deep Boltzmann machine model I-DBM, which comprises, from bottom to top, a visible layer, a first hidden layer, and a second hidden layer; no two nodes within a layer are connected, while any two nodes in adjacent layers are bidirectionally connected;
(S23) train the model with the visual feature vectors of all images in the training data set to obtain the trained deep Boltzmann machine model.
(3) projecting the image's high-level features into the established canonical variable space, searching for adjacent images in the model's annotated data set, and generating an annotation-word high-level feature vector;
The canonical variable space used in step (3) of this example is obtained as follows:
(S31) extract the I-DBM high-level feature vectors of all images in the training set;
(S32) extract the T-DBM high-level feature vectors of the annotation words of all images in the training set;
(S33) perform canonical correlation analysis between the I-DBM and T-DBM high-level feature vectors to obtain the projection matrices.
(4) inputting the annotation-word high-level feature vector into the trained deep Boltzmann machine model T-DBM to obtain the corresponding annotation words.
The canonical variable space used in step (4) of this example is obtained as follows:
(S41) extract the I-DBM high-level feature vectors of all images in the training set;
(S42) extract the T-DBM high-level feature vectors of the annotation words of all images in the training set;
(S43) perform canonical correlation analysis between the I-DBM and T-DBM high-level feature vectors to obtain the projection matrices.
In this example, the canonical correlation analysis of step (S43) proceeds as follows:
(S43-1) standardize the I-DBM and T-DBM high-level feature vectors and compute the covariance matrix;
(S43-2) compute the eigenvalues and eigenvectors of the covariance matrix, sort them, and check whether any eigenvalues are equal;
(S43-3) sort the eigenvalues in descending order and order the eigenvectors accordingly;
(S43-4) take the eigenvectors as the row vectors of a matrix to obtain the canonical correlation analysis result.
The number of I-DBM visible-layer nodes equals the dimension of the visual feature vector, namely 990.
The number of T-DBM visible-layer nodes equals the number of words in the annotation dictionary, namely 260.
The first and second hidden layers of I-DBM each have 400 nodes.
The first and second hidden layers of T-DBM each have 200 nodes.
The trained deep Boltzmann machine models in steps (S23) and (S42) are obtained by the following training process:
(S2-1) take the visual feature vector or the annotation-word feature vector as the visible layer;
(S2-2) treat the visible layer and the first hidden layer as a restricted Boltzmann machine, take the visual feature vector as the input of the visible layer, and train this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the visible layer and the first hidden layer and the final state of the first hidden layer;
(S2-3) treat the first and second hidden layers as a restricted Boltzmann machine, take the final state of the first hidden layer as its input, and train this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the first and second hidden layers and the final state of the second hidden layer.
The above is only a specific embodiment of the present invention, but the protection scope of the invention is not limited thereto; any change or replacement readily conceivable to those skilled in the art within the technical scope disclosed by the invention shall be covered by the protection scope of the invention.

Claims (7)

1. An automatic image annotation method based on deep learning and canonical correlation analysis, characterized by comprising:
(1) building a model training data set;
(2) extracting the low-level feature vectors of an image to be annotated and building the visual feature vector of the image;
(3) inputting the visual feature vector into the trained deep Boltzmann machine model I-DBM to obtain the image's high-level feature vector;
(4) projecting the image's high-level features into the established canonical variable space, searching for adjacent images in the model's annotated data set, and generating an annotation-word high-level feature vector;
(5) inputting the annotation-word high-level feature vector into the trained deep Boltzmann machine model T-DBM to obtain the corresponding annotation words;
wherein the model training data set in step (1) is obtained as follows:
(S11) create an annotation dictionary containing several text annotation words;
(S12) according to the annotation dictionary, select annotated images of the corresponding categories as the model training data set;
the trained deep Boltzmann machine I-DBM in step (3) is obtained as follows:
(S31) extract the low-level feature vectors of every image in the training data set to form its visual feature vector, and determine each image's annotation-word feature vector from the annotation dictionary and its annotation words;
(S32) build the deep Boltzmann machine model I-DBM, which comprises, from bottom to top, a visible layer, a first hidden layer, and a second hidden layer; no two nodes within a layer are connected, while any two nodes in adjacent layers are bidirectionally connected;
(S33) train the model with the visual feature vectors of all images in the training data set to obtain the trained deep Boltzmann machine model;
the canonical variable space in step (4) is established as follows:
(S41) extract the I-DBM high-level feature vectors of all images in the training set;
(S42) extract the T-DBM high-level feature vectors of the annotation words of all images in the training set;
(S43) perform canonical correlation analysis between the I-DBM and T-DBM high-level feature vectors to obtain the projection matrices;
the trained deep Boltzmann machine T-DBM in steps (5) and (S42) is obtained as follows:
(S51) determine each image's annotation-word feature vector from the annotation dictionary and its annotation words;
(S52) build the deep Boltzmann machine model T-DBM, which comprises, from bottom to top, a visible layer, a first hidden layer, and a second hidden layer; no two nodes within a layer are connected, while any two nodes in adjacent layers are bidirectionally connected;
(S53) train the model with the annotation-word feature vectors of all images in the training data set to obtain the trained deep Boltzmann machine model.
2. The automatic image annotation method based on deep learning and canonical correlation analysis according to claim 1, characterized in that the training process of the deep Boltzmann machine models in steps (S33) and (S53) is as follows:
(S2-1) take the visual feature vector or the annotation-word feature vector as the visible layer;
(S2-2) treat the visible layer and the first hidden layer as a restricted Boltzmann machine, take the visual feature vector as the input of the visible layer, and train this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the visible layer and the first hidden layer and the final state of the first hidden layer;
(S2-3) treat the first and second hidden layers as a restricted Boltzmann machine, take the final state of the first hidden layer as its input, and train this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the first and second hidden layers and the final state of the second hidden layer.
3. The automatic image annotation method based on deep learning and canonical correlation analysis according to claim 2, characterized in that the number of nodes of the I-DBM visible layer equals the dimension of the visual feature vector.
4. The automatic image annotation method based on deep learning and canonical correlation analysis according to claim 3, characterized in that the number of nodes of the T-DBM visible layer equals the dimension of the text feature vector.
5. The automatic image annotation method based on deep learning and canonical correlation analysis according to claim 4, characterized in that the canonical correlation analysis process in step (S44) is as follows:
(S5-1) standardize the I-DBM and T-DBM high-level feature vectors and compute the covariance matrix;
(S5-2) compute the eigenvalues and eigenvectors of the covariance matrix, sort them, and check whether any eigenvalues are equal;
(S5-3) sort the eigenvalues in descending order and order the eigenvectors accordingly;
(S5-4) take the eigenvectors as the row vectors of a matrix to obtain the canonical correlation analysis result.
6. The automatic image annotation method based on deep learning and canonical correlation analysis according to any one of claims 1 to 5, characterized in that the low-level image feature vector comprises a color layout descriptor, a color structure descriptor, a scalable color descriptor, an edge histogram descriptor, a GIST feature vector, and a SIFT-based bag-of-visual-words vector.
7. The automatic image annotation method based on deep learning and canonical correlation analysis according to claim 6, characterized in that the SIFT-based bag-of-visual-words vector is obtained as follows:
(a) compute the SIFT feature vectors of all images in the model training data set;
(b) cluster all SIFT feature vectors to obtain 500 cluster centres;
(c) take each cluster centre as a visual word, and count the occurrences of each visual word among each image's SIFT feature vectors to form the SIFT-based bag-of-visual-words vector.
CN201410843484.6A 2014-12-30 2014-12-30 Automatic image annotation method based on deep learning and canonical correlation analysis Active CN104572940B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410843484.6A CN104572940B (en) 2014-12-30 2014-12-30 Automatic image annotation method based on deep learning and canonical correlation analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410843484.6A CN104572940B (en) 2014-12-30 2014-12-30 Automatic image annotation method based on deep learning and canonical correlation analysis

Publications (2)

Publication Number Publication Date
CN104572940A true CN104572940A (en) 2015-04-29
CN104572940B CN104572940B (en) 2017-11-21

Family

ID=53089002

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410843484.6A Active CN104572940B (en) 2014-12-30 2014-12-30 Automatic image annotation method based on deep learning and canonical correlation analysis

Country Status (1)

Country Link
CN (1) CN104572940B (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120155774A1 (en) * 2008-05-30 2012-06-21 Microsoft Corporation Statistical Approach to Large-scale Image Annotation
CN103823845A (en) * 2014-01-28 2014-05-28 浙江大学 Method for automatically annotating remote sensing images on basis of deep learning
CN104021224A (en) * 2014-06-25 2014-09-03 中国科学院自动化研究所 Image labeling method based on layer-by-layer label fusing deep network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
NITISH SRIVASTAVA ET AL.: "Multimodal Learning with Deep Boltzmann Machines", Journal of Machine Learning Research 15 (2014) *
LI JING: "Research on Image Annotation Based on Multiple Features", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105389326A (en) * 2015-09-16 2016-03-09 中国科学院计算技术研究所 Image annotation method based on weak matching probability canonical correlation model
CN105389326B (en) * 2015-09-16 2018-08-31 中国科学院计算技术研究所 Image labeling method based on weak matching probability typical relevancy models
CN105702250A (en) * 2016-01-06 2016-06-22 福建天晴数码有限公司 Voice recognition method and device
US9792534B2 (en) 2016-01-13 2017-10-17 Adobe Systems Incorporated Semantic natural language vector space
GB2547068B (en) * 2016-01-13 2019-06-19 Adobe Inc Semantic natural language vector space
GB2547068A (en) * 2016-01-13 2017-08-09 Adobe Systems Inc Semantic natural language vector space
CN107066464A (en) * 2016-01-13 2017-08-18 奥多比公司 Semantic Natural Language Vector Space
CN107066464B (en) * 2016-01-13 2022-12-27 奥多比公司 Semantic natural language vector space
US9811765B2 (en) 2016-01-13 2017-11-07 Adobe Systems Incorporated Image captioning with weak supervision
CN105741832A (en) * 2016-01-27 2016-07-06 广东外语外贸大学 Spoken language evaluation method based on deep learning and spoken language evaluation system
CN105808752A (en) * 2016-03-10 2016-07-27 大连理工大学 CCA and 2PKNN based automatic image annotation method
CN105808752B (en) * 2016-03-10 2018-04-10 大连理工大学 A kind of automatic image marking method based on CCA and 2PKNN
CN107292322A (en) * 2016-03-31 2017-10-24 华为技术有限公司 A kind of image classification method, deep learning model and computer system
CN106250915A (en) * 2016-07-22 2016-12-21 福州大学 A kind of automatic image marking method merging depth characteristic and semantic neighborhood
CN106250915B (en) * 2016-07-22 2019-08-09 福州大学 A kind of automatic image marking method of fusion depth characteristic and semantic neighborhood
CN108628926A (en) * 2017-03-20 2018-10-09 奥多比公司 Topic association and marking for fine and close image
CN107169051A (en) * 2017-04-26 2017-09-15 山东师范大学 Based on semantic related method for searching three-dimension model and system between body
CN107169051B (en) * 2017-04-26 2019-09-24 山东师范大学 Based on relevant method for searching three-dimension model semantic between ontology and system
CN107194437B (en) * 2017-06-22 2020-04-07 重庆大学 Image classification method based on Gist feature extraction and concept machine recurrent neural network
CN107194437A (en) * 2017-06-22 2017-09-22 重庆大学 Image classification method based on Gist feature extractions Yu conceptual machine recurrent neural network
CN107357927A (en) * 2017-07-26 2017-11-17 深圳爱拼信息科技有限公司 A kind of Document Modeling method
CN107357927B (en) * 2017-07-26 2020-06-12 深圳爱拼信息科技有限公司 Document modeling method
CN109833061A (en) * 2017-11-24 2019-06-04 无锡祥生医疗科技股份有限公司 The method of optimization ultrasonic image-forming system parameter based on deep learning
US11564661B2 (en) 2017-11-24 2023-01-31 Chison Medical Technologies Co., Ltd. Method for optimizing ultrasonic imaging system parameter based on deep learning
CN109493249A (en) * 2018-11-05 2019-03-19 北京邮电大学 A kind of analysis method of electricity consumption data on Multiple Time Scales
CN109493249B (en) * 2018-11-05 2021-11-12 北京邮电大学 Analysis method of electricity consumption data on multiple time scales
CN110298386A (en) * 2019-06-10 2019-10-01 成都积微物联集团股份有限公司 A kind of label automation definition method of image content-based
CN110298386B (en) * 2019-06-10 2023-07-28 成都积微物联集团股份有限公司 Label automatic definition method based on image content
WO2020248391A1 (en) * 2019-06-14 2020-12-17 平安科技(深圳)有限公司 Case brief classification method and apparatus, computer device, and storage medium

Also Published As

Publication number Publication date
CN104572940B (en) 2017-11-21

Similar Documents

Publication Publication Date Title
CN104572940A (en) Automatic image annotation method based on deep learning and canonical correlation analysis
Bansal et al. Zero-shot object detection
Liu et al. Sign language recognition with long short-term memory
Zhou et al. Application of deep learning in object detection
Alayrac et al. Unsupervised learning from narrated instruction videos
Li et al. Visual question answering with question representation update (qru)
CN106446526B (en) Electronic health record entity relation extraction method and device
CN104217225B (en) A kind of sensation target detection and mask method
CN109934261A (en) A kind of Knowledge driving parameter transformation model and its few sample learning method
CN109376242A (en) Text classification algorithm based on Recognition with Recurrent Neural Network variant and convolutional neural networks
CN105389326B (en) Image labeling method based on weak matching probability typical relevancy models
CN109299657B (en) Group behavior identification method and device based on semantic attention retention mechanism
CN106650694A (en) Human face recognition method taking convolutional neural network as feature extractor
CN106778796A (en) Human motion recognition method and system based on hybrid cooperative model training
Tung et al. Reward learning from narrated demonstrations
CN109271539A (en) A kind of image automatic annotation method and device based on deep learning
CN108765383A (en) Video presentation method based on depth migration study
CN105718532A (en) Cross-media sequencing method based on multi-depth network structure
CN106202030A (en) A kind of rapid serial mask method based on isomery labeled data and device
CN113688894B (en) Fine granularity image classification method integrating multiple granularity features
Liang et al. An expressive deep model for human action parsing from a single image
CN110263165A (en) A kind of user comment sentiment analysis method based on semi-supervised learning
CN109784288B (en) Pedestrian re-identification method based on discrimination perception fusion
Chen et al. Efficient maximum appearance search for large-scale object detection
Kindiroglu et al. Temporal accumulative features for sign language recognition

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200214

Address after: 264001 Research and Academic Department, 188 Erma Road, Zhifu District, Yantai City, Shandong Province

Patentee after: Naval Aviation University of PLA

Address before: 264001 Research and Academic Department, 188 Erma Road, Zhifu District, Yantai City, Shandong Province

Patentee before: Naval Aeronautical Engineering Institute PLA

TR01 Transfer of patent right