CN104572940A - Automatic image annotation method based on deep learning and canonical correlation analysis - Google Patents
- Publication number
- CN104572940A CN104572940A CN201410843484.6A CN201410843484A CN104572940A CN 104572940 A CN104572940 A CN 104572940A CN 201410843484 A CN201410843484 A CN 201410843484A CN 104572940 A CN104572940 A CN 104572940A
- Authority
- CN
- China
- Prior art keywords
- image
- vector
- depth
- dbm
- boltzmann machine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an automatic image annotation method based on deep learning and canonical correlation analysis. The method includes: using deep Boltzmann machines to extract the high-level feature vectors of images and annotation words, fitting the annotation-word samples with a multiple-Bernoulli distribution and the image features with a Gaussian distribution; performing canonical correlation analysis on the high-level features of the images and the annotation words; computing the Mahalanobis distance between the image to be annotated and the training-set images in the canonical variable space, and weighting by distance to obtain a high-level annotation-word feature; and generating the image's annotation words through mean-field estimation. The deep Boltzmann machines comprise an I-DBM and a T-DBM, used to extract the high-level feature vectors of the images and of the annotation words respectively. Each of the I-DBM and the T-DBM comprises, from bottom to top, a visible layer, a first hidden-unit layer and a second hidden-unit layer. The method effectively bridges the "semantic gap" in image semantic annotation and improves annotation accuracy.
Description
Technical field
The present invention relates to automatic image annotation and retrieval techniques, and in particular to an automatic image annotation method based on deep learning and canonical correlation analysis.
Background technology
As image data grows geometrically, effectively managing and retrieving these data has become a research hotspot in informatization. Although content-based image retrieval (CBIR) has made significant progress, with many prototypes, techniques and retrieval products available, its central problem, the "semantic gap", has not been fundamentally overcome, so retrieval effectiveness and usability remain unsatisfactory. The best way to overcome this is to attach text semantics related to the image content, i.e., image annotation. Since manual annotation is highly subjective and inefficient, automatic image annotation has gradually become a research hotspot in the image annotation field.
The first mature deep learning model was the deep belief network proposed by Hinton et al. in 2006, which achieves abstract representations of data through a multi-layer feature extraction mechanism. As powerful probabilistic generative models, deep learning models have since developed into various forms such as the deep Boltzmann machine and the deep autoencoder, and have been successfully applied to fields such as speech recognition, network situation awareness and high-dimensional time-series modelling. In image processing, Google Brain achieved great success using deep neural networks for image recognition, partially simulating human brain function; in large-scale object recognition, a five-layer convolutional network based on a deep learning model obtained the highest accuracy in the 2012 ImageNet evaluation; in image annotation and classification, Srivastava et al. likewise achieved good results by building a multimodal deep Boltzmann machine. Ranked first among the ten breakthrough technologies of 2013, deep learning has demonstrated strong vitality and great potential in machine learning.
At present, deep learning models have achieved good results in generating annotation words for images. The multimodal deep Boltzmann machine handles the multimodal learning problem of images and text well and has been applied to image retrieval and annotation. Experimental results show that it outperforms other deep learning models, but a gap remains compared with classical automatic image annotation algorithms, because its word model and its top-level feature fusion mechanism are not well suited to the annotation task. Addressing these two problems, and drawing on the ideas of classical annotation algorithms, the invention proposes an automatic image annotation method based on deep Boltzmann machines and canonical correlation analysis. By adopting deep Boltzmann machine models that better handle image features and generate more abstract semantic concepts, combined with canonical correlation analysis, the designed annotation model can effectively improve the management and retrieval efficiency of large-scale image collections and speed up image-information processing, giving it good application prospects as well as practical and economic value.
Summary of the invention
To address the deficiencies of the prior art, the invention provides an automatic image annotation method based on deep learning and canonical correlation analysis that can overcome the "semantic gap" problem of image semantic annotation and achieve relatively accurate semantic tagging.
An automatic image annotation method based on deep learning and canonical correlation analysis, comprising:
(1) building a model training data set;
(2) extracting the low-level feature vectors of the image to be annotated and assembling them into the image's visual feature vector;
(3) inputting the visual feature vector into the trained deep Boltzmann machine model I-DBM to obtain the image's high-level feature vector;
(4) projecting the image high-level feature into the established canonical variable space, finding the adjacent images of the model training data set, and generating an annotation-word high-level feature vector;
(5) inputting the annotation-word high-level feature vector into the trained deep Boltzmann machine model T-DBM to obtain the corresponding annotation words.
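Steps (2) through (5) can be sketched as follows. The callables and arrays (`i_dbm_encode`, `proj_img`, `t_dbm_decode`, the training-set matrices) are hypothetical stand-ins for the trained I-DBM, the learned CCA projection and the trained T-DBM; none of these names come from the patent, and a plain Euclidean distance stands in for the Mahalanobis distance used later in the description.

```python
import numpy as np

def annotate_image(visual_vec, i_dbm_encode, proj_img,
                   train_canon, train_tag_hidden, t_dbm_decode, n=5):
    """Sketch of steps (2)-(5); all model components are stand-ins."""
    h_img = i_dbm_encode(visual_vec)              # step (3): high-level image feature
    z = h_img @ proj_img                          # step (4): project into canonical space
    d = np.linalg.norm(train_canon - z, axis=1)   # Euclidean stand-in for Mahalanobis
    idx = np.argsort(d)[:n]                       # n nearest training images
    w = 1.0 / (d[idx] + 1e-8)                     # closer images get larger weight
    w = w / w.sum()
    h_tag = w @ train_tag_hidden[idx]             # distance-weighted tag-side hidden state
    return t_dbm_decode(h_tag)                    # step (5): decode annotation words
```

With real models, `i_dbm_encode` would return the I-DBM second-hidden-layer state and `t_dbm_decode` would run the T-DBM top-down to produce the annotation-word vector.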
The model training data set of step (1) is obtained by the following steps:
(S11) creating an annotation dictionary containing a number of text annotation words;
(S12) selecting annotated images of the corresponding classes according to the annotation dictionary as the model training data set.
The trained deep Boltzmann machine I-DBM of step (3) is obtained by the following steps:
(S31) extracting the low-level feature vector of every image in the training data set to form the image's visual feature vector, and determining each image's annotation-word feature vector from the annotation dictionary and the image's annotation words;
(S32) constructing the deep Boltzmann machine model I-DBM, which comprises, from bottom to top, a visible layer, a first hidden-unit layer and a second hidden-unit layer, with no connections between any two nodes within a layer and bidirectional connections between any two nodes in adjacent layers;
(S33) training the model with the visual feature vectors of all images in the model training data set to obtain the trained deep Boltzmann machine model.
The established canonical variable space of step (4) is obtained by the following steps:
(S41) extracting the I-DBM high-level feature vectors of all images in the training data set;
(S42) extracting the T-DBM high-level feature vectors of the annotation words of all images in the training set;
(S43) performing canonical correlation analysis on the I-DBM and T-DBM high-level feature vectors to obtain the projection matrices.
The trained deep Boltzmann machine T-DBM of steps (5) and (S42) is obtained by the following steps:
(S51) determining each image's annotation-word feature vector from the annotation dictionary and the image's annotation words;
(S52) constructing the deep Boltzmann machine model T-DBM, which comprises, from bottom to top, a visible layer, a first hidden-unit layer and a second hidden-unit layer, with no connections between any two nodes within a layer and bidirectional connections between any two nodes in adjacent layers;
(S53) training the model with the annotation-word feature vectors of all images in the model training data set to obtain the trained deep Boltzmann machine model.
The automatic image annotation method based on deep learning and canonical correlation analysis of the invention first extracts the low-level features of the image to be annotated and assembles them into the image's visual feature vector. This vector is fed directly into the visible layer of the deep Boltzmann machine model I-DBM, and the state of the I-DBM's second hidden-unit layer is taken as the high-level feature vector. The high-level feature is projected into the canonical variable space, where the top N training images with the smallest Mahalanobis distance are found; a new second-hidden-layer state for the deep Boltzmann machine T-DBM is generated by distance weighting, and finally the T-DBM generates the new annotation-word vector as the image's annotation words.
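The Mahalanobis-distance weighting described above can be sketched as follows. Estimating the covariance from the training-set canonical vectors themselves is an assumption, since the patent does not specify the covariance estimator.

```python
import numpy as np

def mahalanobis_neighbours(query, refs, n=5):
    """Return the indices and normalised distance weights of the n
    training images nearest to `query` in canonical-variable space."""
    cov = np.cov(refs, rowvar=False)
    vi = np.linalg.pinv(cov)                 # pseudo-inverse guards against a singular covariance
    diff = refs - query
    d = np.sqrt(np.einsum('ij,jk,ik->i', diff, vi, diff))  # per-row Mahalanobis distance
    idx = np.argsort(d)[:n]
    w = 1.0 / (d[idx] + 1e-8)                # closer images get larger weight
    return idx, w / w.sum()
```

The weighted T-DBM hidden state is then `weights @ train_tag_hidden[idx]`, as in the pipeline sketch earlier.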
In a deep Boltzmann machine, high-level semantics are obtained by abstracting low-level features; since low-level features are difficult to map directly to high-level semantics, a "semantic gap" arises. Because too many hidden-unit layers make training excessively slow in practice, the deep Boltzmann machine models used in the invention contain two hidden-unit layers (the first hidden-unit layer and the second hidden-unit layer). The two hidden-unit layers give the deep Boltzmann machine sufficient intermediate abstraction capability to cross the "semantic gap" in image semantic annotation and improve annotation accuracy.
The text feature vector in step (S51) is a 0-1 vector (every element of the vector is either 0 or 1), and the annotation-word feature vector of each image is determined by the following steps:
(S51-1) initialising an all-zero vector in which each dimension corresponds to one annotation word;
(S51-2) setting the element of the corresponding dimension to 1 for each of the image's annotation words, which yields the image's annotation-word vector.
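Steps (S51-1) and (S51-2) amount to the following; the example dictionary and words are illustrative only, not taken from the patent's data set.

```python
import numpy as np

def annotation_vector(image_words, dictionary):
    """Build the 0-1 annotation-word vector of one image: one dimension
    per dictionary word, set to 1 iff that word annotates the image."""
    vec = np.zeros(len(dictionary), dtype=int)        # (S51-1) all-zero initialisation
    index = {word: i for i, word in enumerate(dictionary)}
    for word in image_words:
        vec[index[word]] = 1                          # (S51-2) set matching dimensions
    return vec
```

For example, `annotation_vector(["sky", "sea"], ["sky", "tree", "sea"])` yields `[1, 0, 1]`.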
In each deep Boltzmann machine model built in these steps, no two nodes within a layer are connected, and every pair of nodes in adjacent layers is bidirectionally connected.
In the automatic image annotation method based on deep learning and canonical correlation analysis, the deep Boltzmann machine models of steps (S33) and (S53) are trained as follows:
(S53-1) taking the visual feature vector or the annotation-word feature vector as the visible layer;
(S53-2) treating the visible layer and the first hidden-unit layer as a restricted Boltzmann machine, taking the feature vector as the input of the visible layer, and training this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the visible layer and the first hidden-unit layer together with the final state of the first hidden-unit layer;
(S53-3) treating the first and second hidden-unit layers as a restricted Boltzmann machine, taking the final state of the first hidden-unit layer as its input, and training this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the first and second hidden-unit layers together with the final state of the second hidden-unit layer.
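The layer-wise pretraining above relies on the contrastive divergence algorithm; a minimal one-step (CD-1) sketch for a binary restricted Boltzmann machine is shown below. The learning rate, epoch count and bias handling are illustrative assumptions, not values from the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm_cd1(data, n_hidden, lr=0.05, epochs=50, seed=0):
    """One-step contrastive divergence for a binary RBM. Returns the
    learned parameters and the final hidden-layer state of the training
    data, which feeds the next RBM in the stack (as in (S53-3))."""
    rng = np.random.default_rng(seed)
    n_vis = data.shape[1]
    W = 0.01 * rng.standard_normal((n_vis, n_hidden))
    b_v, b_h = np.zeros(n_vis), np.zeros(n_hidden)
    for _ in range(epochs):
        p_h0 = sigmoid(data @ W + b_h)                       # positive phase
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)   # sample hidden units
        p_v1 = sigmoid(h0 @ W.T + b_v)                       # one-step reconstruction
        p_h1 = sigmoid(p_v1 @ W + b_h)                       # negative phase
        W += lr * (data.T @ p_h0 - p_v1.T @ p_h1) / len(data)
        b_v += lr * (data - p_v1).mean(axis=0)
        b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_v, b_h, sigmoid(data @ W + b_h)              # final hidden state
```

Stacking two such RBMs (visible to first hidden layer, then first to second hidden layer) reproduces the two-stage training order of (S53-2)/(S53-3).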
The canonical correlation analysis of step (S43) proceeds as follows:
(S43-1) standardising the I-DBM and T-DBM high-level feature vectors and computing the covariance matrices;
(S43-2) computing the eigenvalues and eigenvectors of the covariance matrix, and checking while sorting whether any eigenvalues are equal;
(S43-3) sorting the eigenvalues in descending order and sorting the eigenvectors in the same order;
(S43-4) taking the eigenvectors as the row vectors of a matrix to obtain the canonical correlation analysis result.
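A numerical sketch of (S43-1) through (S43-4), solving the standard CCA eigenproblem built from the (cross-)covariance matrices; the ridge term `eps` is an added numerical safeguard, not something stated in the patent.

```python
import numpy as np

def cca(X, Y, k=2, eps=1e-6):
    """Return k-column projection matrices (Wx, Wy) such that X @ Wx and
    Y @ Wy are maximally correlated pairs of canonical variables."""
    X = (X - X.mean(0)) / (X.std(0) + eps)      # (S43-1) standardise
    Y = (Y - Y.mean(0)) / (Y.std(0) + eps)
    n = len(X)
    Sxx = X.T @ X / n + eps * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + eps * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n
    M = np.linalg.solve(Sxx, Sxy) @ np.linalg.solve(Syy, Sxy.T)
    vals, vecs = np.linalg.eig(M)               # (S43-2) eigen-decomposition
    order = np.argsort(-vals.real)              # (S43-3) sort eigenvalues descending
    Wx = vecs[:, order[:k]].real                # (S43-4) assemble projection matrix
    Wy = np.linalg.solve(Syy, Sxy.T) @ Wx       # paired projection for the Y side
    return Wx, Wy
```

In the method, `X` would hold the I-DBM high-level features and `Y` the T-DBM high-level features of the training images.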
The number of nodes in the I-DBM visible layer equals the dimension of the visual feature vector.
During both recognition and training, the visual feature vector is the input of the I-DBM visible layer, so each visible-layer node must correspond to one dimension of the visual feature vector; hence the number of visible-layer nodes equals the dimension of the visual feature vector.
The number of nodes in the T-DBM visible layer equals the number of words in the annotation dictionary.
During both recognition and training, the image's annotation-word vector is the input of the T-DBM visible layer, so each visible-layer node must correspond to one word in the annotation dictionary; hence the number of visible-layer nodes equals the number of dictionary words.
The numbers of nodes in the first and second hidden-unit layers of the I-DBM are set empirically, typically 400 to 500, and can be tuned experimentally in practice.
The low-level image feature vector comprises a color layout descriptor, a color structure descriptor, a scalable color descriptor, an edge histogram descriptor, a GIST feature vector and a SIFT-based visual bag-of-words vector.
In the automatic image annotation method based on deep learning and canonical correlation analysis, the SIFT-based visual bag-of-words vector is obtained by the following steps:
(a) computing the SIFT feature vectors of all images in the model training data set;
(b) clustering all SIFT feature vectors to obtain 500 cluster centres;
(c) taking each cluster centre as a visual word, and counting the occurrences of each visual word among every image's SIFT feature vectors to form the SIFT-based visual bag-of-words vector.
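Steps (a) through (c) can be sketched with a plain k-means over pooled descriptors. Real SIFT descriptors are 128-dimensional and the patent uses 500 centres; the small stand-in sizes below are illustrative only, and random vectors stand in for actual SIFT output.

```python
import numpy as np

def build_vocabulary(all_descs, k, iters=20, seed=0):
    """Step (b): cluster the pooled SIFT descriptors with k-means;
    each of the k cluster centres becomes one visual word."""
    rng = np.random.default_rng(seed)
    centres = all_descs[rng.choice(len(all_descs), k, replace=False)].copy()
    for _ in range(iters):
        # assign every descriptor to its nearest centre, then recompute centres
        labels = np.argmin(((all_descs[:, None, :] - centres[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centres[j] = all_descs[labels == j].mean(0)
    return centres

def bow_vector(image_descs, centres):
    """Step (c): count, for one image, how often each visual word is the
    nearest centre among that image's descriptors."""
    labels = np.argmin(((image_descs[:, None, :] - centres[None]) ** 2).sum(-1), axis=1)
    return np.bincount(labels, minlength=len(centres))
```

The resulting bag-of-words vector has one dimension per cluster centre and its elements sum to the number of descriptors extracted from the image.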
Embodiment
The present invention is described in further detail below with reference to a concrete example.
An automatic image annotation method based on deep learning and canonical correlation analysis, comprising:
(1) extracting the low-level feature vectors of the image to be annotated and assembling them into the image's visual feature vector;
In this embodiment, the low-level feature vectors comprise a color layout descriptor, a color structure descriptor, a scalable color descriptor, an edge histogram descriptor, a GIST feature vector and a SIFT-based visual bag-of-words vector.
The SIFT-based visual bag-of-words vector is extracted by the following steps:
(a) computing the SIFT feature vectors of all images in the model training data set;
(b) clustering all SIFT feature vectors to obtain 500 cluster centres;
(c) taking each cluster centre as a visual word, and counting the occurrences of each visual word among every image's SIFT feature vectors to form the image's SIFT-based visual bag-of-words vector; the dimension of this vector equals 500 (the number of cluster centres), and each element is the number of times the corresponding visual word occurs among all SIFT feature vectors of the image.
(2) inputting the visual feature vector of the image to be annotated into the trained deep Boltzmann machine model I-DBM to obtain the image's high-level feature vector;
The trained deep Boltzmann machine model used in step (2) of this example is obtained by the following steps:
(S21) extracting the low-level feature vector of every image in the training data set to form the image's visual feature vector;
(S22) constructing the deep Boltzmann machine model I-DBM, which comprises, from bottom to top, a visible layer, a first hidden-unit layer and a second hidden-unit layer, with no connections between any two nodes within a layer and bidirectional connections between any two nodes in adjacent layers;
(S23) training the model with the visual feature vectors of all images in the model training data set to obtain the trained deep Boltzmann machine model.
(3) projecting the image high-level feature into the established canonical variable space, finding the adjacent images of the model training data set, and generating an annotation-word high-level feature vector;
The canonical variable space used in step (3) of this example is obtained by the following steps:
(S31) extracting the I-DBM high-level feature vectors of all images in the training data set;
(S32) extracting the T-DBM high-level feature vectors of the annotation words of all images in the training set;
(S33) performing canonical correlation analysis on the I-DBM and T-DBM high-level feature vectors to obtain the projection matrices.
(4) inputting the annotation-word high-level feature vector into the trained deep Boltzmann machine model T-DBM to obtain the corresponding annotation words.
The trained deep Boltzmann machine model T-DBM used in step (4) of this example is obtained by the following steps:
(S41) determining each image's annotation-word feature vector from the annotation dictionary and the image's annotation words;
(S42) constructing the deep Boltzmann machine model T-DBM, which comprises, from bottom to top, a visible layer, a first hidden-unit layer and a second hidden-unit layer, with no connections between any two nodes within a layer and bidirectional connections between any two nodes in adjacent layers;
(S43) training the model with the annotation-word feature vectors of all images in the model training data set to obtain the trained deep Boltzmann machine model.
In this example, the canonical correlation analysis of step (S33) proceeds as follows:
(S33-1) standardising the I-DBM and T-DBM high-level feature vectors and computing the covariance matrices;
(S33-2) computing the eigenvalues and eigenvectors of the covariance matrix, and checking while sorting whether any eigenvalues are equal;
(S33-3) sorting the eigenvalues in descending order and sorting the eigenvectors in the same order;
(S33-4) taking the eigenvectors as the row vectors of a matrix to obtain the canonical correlation analysis result.
The number of I-DBM visible-layer nodes equals the dimension of the visual feature vector, i.e. 990.
The number of T-DBM visible-layer nodes equals the number of words in the annotation dictionary, i.e. 260.
The first and second hidden-unit layers of the I-DBM each contain 400 nodes.
The first and second hidden-unit layers of the T-DBM each contain 200 nodes.
The trained deep Boltzmann machine models (I-DBM and T-DBM) are obtained by the following concrete training process:
(S2-1) taking the visual feature vector or the annotation-word feature vector as the visible layer;
(S2-2) treating the visible layer and the first hidden-unit layer as a restricted Boltzmann machine, taking the feature vector as the input of the visible layer, and training this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the visible layer and the first hidden-unit layer together with the final state of the first hidden-unit layer;
(S2-3) treating the first and second hidden-unit layers as a restricted Boltzmann machine, taking the final state of the first hidden-unit layer as its input, and training this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the first and second hidden-unit layers together with the final state of the second hidden-unit layer.
The above is only a specific embodiment of the present invention, but the scope of protection of the invention is not limited thereto; any change or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the invention shall be covered by the scope of protection of the invention.
Claims (7)
1. An automatic image annotation method based on deep learning and canonical correlation analysis, characterised by comprising:
(1) building a model training data set;
(2) extracting the low-level feature vectors of the image to be annotated and assembling them into the image's visual feature vector;
(3) inputting the visual feature vector into the trained deep Boltzmann machine model I-DBM to obtain the image's high-level feature vector;
(4) projecting the image high-level feature into the established canonical variable space, finding the adjacent images of the model training data set, and generating an annotation-word high-level feature vector;
(5) inputting the annotation-word high-level feature vector into the trained deep Boltzmann machine model T-DBM to obtain the corresponding annotation words;
wherein the model training data set of step (1) is obtained by the following steps:
(S11) creating an annotation dictionary containing a number of text annotation words;
(S12) selecting annotated images of the corresponding classes according to the annotation dictionary as the model training data set;
the trained deep Boltzmann machine I-DBM of step (3) is obtained by the following steps:
(S31) extracting the low-level feature vector of every image in the training data set to form the image's visual feature vector, and determining each image's annotation-word feature vector from the annotation dictionary and the image's annotation words;
(S32) constructing the deep Boltzmann machine model I-DBM, which comprises, from bottom to top, a visible layer, a first hidden-unit layer and a second hidden-unit layer, with no connections between any two nodes within a layer and bidirectional connections between any two nodes in adjacent layers;
(S33) training the model with the visual feature vectors of all images in the model training data set to obtain the trained deep Boltzmann machine model;
the established canonical variable space of step (4) is obtained by the following steps:
(S41) extracting the I-DBM high-level feature vectors of all images in the training data set;
(S42) extracting the T-DBM high-level feature vectors of the annotation words of all images in the training set;
(S43) performing canonical correlation analysis on the I-DBM and T-DBM high-level feature vectors to obtain the projection matrices;
and the trained deep Boltzmann machine T-DBM of steps (5) and (S42) is obtained by the following steps:
(S51) determining each image's annotation-word feature vector from the annotation dictionary and the image's annotation words;
(S52) constructing the deep Boltzmann machine model T-DBM, which comprises, from bottom to top, a visible layer, a first hidden-unit layer and a second hidden-unit layer, with no connections between any two nodes within a layer and bidirectional connections between any two nodes in adjacent layers;
(S53) training the model with the annotation-word feature vectors of all images in the model training data set to obtain the trained deep Boltzmann machine model.
2. The automatic image annotation method based on deep learning and canonical correlation analysis of claim 1, characterised in that the deep Boltzmann machine models of steps (S33) and (S53) are trained as follows:
(S2-1) taking the visual feature vector or the annotation-word feature vector as the visible layer;
(S2-2) treating the visible layer and the first hidden-unit layer as a restricted Boltzmann machine, taking the feature vector as the input of the visible layer, and training this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the visible layer and the first hidden-unit layer together with the final state of the first hidden-unit layer;
(S2-3) treating the first and second hidden-unit layers as a restricted Boltzmann machine, taking the final state of the first hidden-unit layer as its input, and training this restricted Boltzmann machine with the contrastive divergence algorithm to obtain the connection weights between the first and second hidden-unit layers together with the final state of the second hidden-unit layer.
3. The automatic image annotation method based on deep learning and canonical correlation analysis of claim 2, characterised in that the number of nodes of the I-DBM visible layer equals the dimension of the visual feature vector.
4. The automatic image annotation method based on deep learning and canonical correlation analysis of claim 3, characterised in that the number of nodes of the T-DBM visible layer equals the dimension of the text feature vector.
5. The automatic image annotation method based on deep learning and canonical correlation analysis of claim 4, characterised in that the canonical correlation analysis process of step (S43) is as follows:
(S5-1) standardising the I-DBM and T-DBM high-level feature vectors and computing the covariance matrices;
(S5-2) computing the eigenvalues and eigenvectors of the covariance matrix, and checking while sorting whether any eigenvalues are equal;
(S5-3) sorting the eigenvalues in descending order and sorting the eigenvectors in the same order;
(S5-4) taking the eigenvectors as the row vectors of a matrix to obtain the canonical correlation analysis result.
6. The automatic image annotation method based on deep learning and canonical correlation analysis of any one of claims 1 to 5, characterised in that the low-level image feature vector comprises a color layout descriptor, a color structure descriptor, a scalable color descriptor, an edge histogram descriptor, a GIST feature vector and a SIFT-based visual bag-of-words vector.
7. The automatic image annotation method based on deep learning and canonical correlation analysis of claim 6, characterised in that the SIFT-based visual bag-of-words vector is obtained by the following steps:
(a) computing the SIFT feature vectors of all images in the model training data set;
(b) clustering all SIFT feature vectors to obtain 500 cluster centres;
(c) taking each cluster centre as a visual word, and counting the occurrences of each visual word among every image's SIFT feature vectors to form the SIFT-based visual bag-of-words vector.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410843484.6A CN104572940B (en) | 2014-12-30 | 2014-12-30 | Automatic image annotation method based on deep learning and canonical correlation analysis
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410843484.6A CN104572940B (en) | 2014-12-30 | 2014-12-30 | Automatic image annotation method based on deep learning and canonical correlation analysis
Publications (2)
Publication Number | Publication Date |
---|---|
CN104572940A true CN104572940A (en) | 2015-04-29 |
CN104572940B CN104572940B (en) | 2017-11-21 |
Family
ID=53089002
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410843484.6A Active CN104572940B (en) | 2014-12-30 | 2014-12-30 | Automatic image annotation method based on deep learning and canonical correlation analysis
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104572940B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120155774A1 (en) * | 2008-05-30 | 2012-06-21 | Microsoft Corporation | Statistical Approach to Large-scale Image Annotation |
CN103823845A (en) * | 2014-01-28 | 2014-05-28 | 浙江大学 | Method for automatically annotating remote sensing images on basis of deep learning |
CN104021224A (en) * | 2014-06-25 | 2014-09-03 | 中国科学院自动化研究所 | Image labeling method based on layer-by-layer label fusing deep network |
Non-Patent Citations (2)
Title |
---|
NITISH SRIVASTAVA ET AL.: "Multimodal Learning with Deep Boltzmann Machines", Journal of Machine Learning Research 15 (2014) * |
LI Jing: "Research on Image Annotation Based on Multiple Features", China Master's Theses Full-text Database, Information Science and Technology Series * |
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105389326A (en) * | 2015-09-16 | 2016-03-09 | 中国科学院计算技术研究所 | Image annotation method based on weak matching probability canonical correlation model |
CN105389326B (en) * | 2015-09-16 | 2018-08-31 | 中国科学院计算技术研究所 | Image labeling method based on weak matching probability typical relevancy models |
CN105702250A (en) * | 2016-01-06 | 2016-06-22 | 福建天晴数码有限公司 | Voice recognition method and device |
US9792534B2 (en) | 2016-01-13 | 2017-10-17 | Adobe Systems Incorporated | Semantic natural language vector space |
GB2547068B (en) * | 2016-01-13 | 2019-06-19 | Adobe Inc | Semantic natural language vector space |
GB2547068A (en) * | 2016-01-13 | 2017-08-09 | Adobe Systems Inc | Semantic natural language vector space |
CN107066464A (en) * | 2016-01-13 | 2017-08-18 | 奥多比公司 | Semantic Natural Language Vector Space |
CN107066464B (en) * | 2016-01-13 | 2022-12-27 | 奥多比公司 | Semantic natural language vector space |
US9811765B2 (en) | 2016-01-13 | 2017-11-07 | Adobe Systems Incorporated | Image captioning with weak supervision |
CN105741832A (en) * | 2016-01-27 | 2016-07-06 | 广东外语外贸大学 | Spoken language evaluation method based on deep learning and spoken language evaluation system |
CN105808752A (en) * | 2016-03-10 | 2016-07-27 | 大连理工大学 | CCA and 2PKNN based automatic image annotation method |
CN105808752B (en) * | 2016-03-10 | 2018-04-10 | 大连理工大学 | A kind of automatic image marking method based on CCA and 2PKNN |
CN107292322A (en) * | 2016-03-31 | 2017-10-24 | 华为技术有限公司 | A kind of image classification method, deep learning model and computer system |
CN106250915A (en) * | 2016-07-22 | 2016-12-21 | 福州大学 | A kind of automatic image marking method merging depth characteristic and semantic neighborhood |
CN106250915B (en) * | 2016-07-22 | 2019-08-09 | 福州大学 | A kind of automatic image marking method of fusion depth characteristic and semantic neighborhood |
CN108628926A (en) * | 2017-03-20 | 2018-10-09 | 奥多比公司 | Topic association and marking for fine and close image |
CN107169051A (en) * | 2017-04-26 | 2017-09-15 | 山东师范大学 | Based on semantic related method for searching three-dimension model and system between body |
CN107169051B (en) * | 2017-04-26 | 2019-09-24 | 山东师范大学 | Based on relevant method for searching three-dimension model semantic between ontology and system |
CN107194437B (en) * | 2017-06-22 | 2020-04-07 | 重庆大学 | Image classification method based on Gist feature extraction and concept machine recurrent neural network |
CN107194437A (en) * | 2017-06-22 | 2017-09-22 | 重庆大学 | Image classification method based on Gist feature extractions Yu conceptual machine recurrent neural network |
CN107357927A (en) * | 2017-07-26 | 2017-11-17 | 深圳爱拼信息科技有限公司 | A kind of Document Modeling method |
CN107357927B (en) * | 2017-07-26 | 2020-06-12 | 深圳爱拼信息科技有限公司 | Document modeling method |
CN109833061A (en) * | 2017-11-24 | 2019-06-04 | 无锡祥生医疗科技股份有限公司 | The method of optimization ultrasonic image-forming system parameter based on deep learning |
US11564661B2 (en) | 2017-11-24 | 2023-01-31 | Chison Medical Technologies Co., Ltd. | Method for optimizing ultrasonic imaging system parameter based on deep learning |
CN109493249A (en) * | 2018-11-05 | 2019-03-19 | 北京邮电大学 | A kind of analysis method of electricity consumption data on Multiple Time Scales |
CN109493249B (en) * | 2018-11-05 | 2021-11-12 | 北京邮电大学 | Analysis method of electricity consumption data on multiple time scales |
CN110298386A (en) * | 2019-06-10 | 2019-10-01 | 成都积微物联集团股份有限公司 | A kind of label automation definition method of image content-based |
CN110298386B (en) * | 2019-06-10 | 2023-07-28 | 成都积微物联集团股份有限公司 | Label automatic definition method based on image content |
WO2020248391A1 (en) * | 2019-06-14 | 2020-12-17 | 平安科技(深圳)有限公司 | Case brief classification method and apparatus, computer device, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN104572940B (en) | 2017-11-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104572940A (en) | Automatic image annotation method based on deep learning and canonical correlation analysis | |
Bansal et al. | Zero-shot object detection | |
Liu et al. | Sign language recognition with long short-term memory | |
Zhou et al. | Application of deep learning in object detection | |
Alayrac et al. | Unsupervised learning from narrated instruction videos | |
Li et al. | Visual question answering with question representation update (qru) | |
CN106446526B (en) | Electronic health record entity relation extraction method and device | |
CN104217225B (en) | A kind of sensation target detection and mask method | |
CN109934261A (en) | A kind of Knowledge driving parameter transformation model and its few sample learning method | |
CN109376242A (en) | Text classification algorithm based on Recognition with Recurrent Neural Network variant and convolutional neural networks | |
CN105389326B (en) | Image labeling method based on weak matching probability typical relevancy models | |
CN109299657B (en) | Group behavior identification method and device based on semantic attention retention mechanism | |
CN106650694A (en) | Human face recognition method taking convolutional neural network as feature extractor | |
CN106778796A (en) | Human motion recognition method and system based on hybrid cooperative model training | |
Tung et al. | Reward learning from narrated demonstrations | |
CN109271539A (en) | A kind of image automatic annotation method and device based on deep learning | |
CN108765383A (en) | Video presentation method based on depth migration study | |
CN105718532A (en) | Cross-media sequencing method based on multi-depth network structure | |
CN106202030A (en) | A kind of rapid serial mask method based on isomery labeled data and device | |
CN113688894B (en) | Fine granularity image classification method integrating multiple granularity features | |
Liang et al. | An expressive deep model for human action parsing from a single image | |
CN110263165A (en) | A kind of user comment sentiment analysis method based on semi-supervised learning | |
CN109784288B (en) | Pedestrian re-identification method based on discrimination perception fusion | |
Chen et al. | Efficient maximum appearance search for large-scale object detection | |
Kindiroglu et al. | Temporal accumulative features for sign language recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 2020-02-14
Address after: Research and Academic Department, No. 188 Erma Road, Zhifu District, Yantai City, Shandong Province, 264001
Patentee after: Naval Aviation University of PLA
Address before: Research Department, No. 188 Erma Road, Zhifu District, Yantai City, 264001
Patentee before: Naval Aeronautical Engineering Institute, PLA