CN108509521B - Image retrieval method for automatically generating text index - Google Patents
Image retrieval method for automatically generating text index
- Publication number: CN108509521B
- Application number: CN201810198490.9A
- Authority: CN (China)
- Prior art keywords: word, image, images, text, dictionary
- Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Abstract
The invention discloses an image retrieval method that automatically generates text indexes, comprising the following steps: (1) train an automatic labeling model: extract image features through the CNN part of the model, feed the features together with the image's descriptors into the RNN part of the model, and back-propagate with a cross-entropy loss as the objective function; (2) generate a text index for each image: with the trained automatic labeling model and dictionary, generate for every unlabeled image a sequence of descriptors and a confidence for each word, normalize the confidences, and use the descriptors together with the normalized confidences as the image's text index to build the image retrieval index; (3) when a query keyword is not in the dictionary, look up a synonym lexicon to find the in-dictionary word closest in meaning to the keyword; (4) find the images matching the keyword or its synonym in the image retrieval index and return them in descending order of confidence.
Description
Technical Field
The invention belongs to the technical field of information retrieval, in particular to text-based image retrieval, and more particularly to an image retrieval method that automatically generates text indexes for images.
Background
With the explosive growth of image data on the Internet, how to screen the required data out of this mass of data has become an urgent problem, so image retrieval is receiving attention from more and more researchers.
Depending on how image content is described, mainstream image retrieval falls into two categories: Content-Based Image Retrieval (CBIR) and Text-Based Image Retrieval (TBIR). Text-based image retrieval describes the content of each image with text labels, forming keywords that describe the image's content, such as the objects and scenes it contains. At query time, the user provides query keywords according to his or her interest, the retrieval system finds the images labeled with those keywords, and finally returns the retrieved results to the user.
Text-based image retrieval is intuitive, its results are easy to interpret, and its precision is relatively high. Its drawbacks, however, are also significant. First, the labeling process requires manual intervention; as the number of images on the Internet grows rapidly, completing text labels for all of them clearly consumes enormous manpower and financial resources. Second, manual labels usually record only the articles appearing in an image, i.e. some nouns denoting those articles, ignoring information such as their number, actions, and states; nor do the labels distinguish among words, so there is no way to tell which word covers more of the image's information. Finally, the method supports only exact retrieval: a user's query keyword must appear in the labels for a result to be returned. Yet the same meaning can usually be expressed by many different terms, and label data cannot cover them all, so content in the database that meets the user's need may not be retrievable.
Disclosure of Invention
The invention aims to solve the problems of current text-based image retrieval: the low efficiency caused by manual labeling, labeling results that cannot cover all the content of an image, and the inability to retrieve with words that do not appear in the labels. To this end it provides an image retrieval method that automatically generates text indexes.
The purpose of the invention can be achieved by adopting the following technical scheme:
an image retrieval method for automatically generating text indexes comprises the following steps:
S1. Learn the automatic labeling model M; the process is as follows:
S101. Acquire a labeled training data set and an unlabeled image data set, where the training data set comprises training images and the text descriptions corresponding to them, and the image data set comprises only images, with no corresponding text descriptions;
S102. Segment all text descriptions of the training data set to construct a dictionary D;
S103. Extract the features of each image in the training data set with a CNN, the features being one-dimensional vectors;
S104. For an image i in the training data set, segment its corresponding text description into L words w_i1, w_i2, …, w_iL, and use the feature f_i of image i extracted by the CNN as the initial input of the hidden unit of the RNN; then input the words w_i1, w_i2, …, w_iL one per step of the recurrent-network loop, passing the output of each step through a softmax layer to obtain a probability for every word in the dictionary. Denote the word input at step t as w_it and the output probability distribution as P_it, so the probability that this step outputs w_it is P_it(w_it); by maximum likelihood estimation, the probability of equation (1) is to be maximized:

P_i = ∏_{t=1}^{L} P_it(w_it)    (1)
S105. For all N images in the training data set, the probability of equation (2) must be maximized; taking it as the objective function, back-propagation updates the model's parameters to obtain the automatic labeling model M, which consists of the CNN and the RNN:

∏_{i=1}^{N} ∏_{t=1}^{L_i} P_it(w_it)    (2)
S2. Generate text indexes for all images with the automatic labeling model M;
For any image i in the image data set, first extract the image feature f_i with the CNN part of the automatic labeling model M and use it as the initial input of the RNN part of M, then generate words in turn, where generating the word w'_it depends on the already generated w'_i1, w'_i2, …, w'_i(t-1); at each step the word with the largest output probability is selected as the generated word, and that probability value is taken as the word's confidence in the image, denoted z;
Word generation stops when the ending word is generated or the number of generated words reaches a preset threshold; for any image i described above, a descriptor sequence w'_i1, w'_i2, …, w'_il can thus be generated, together with the confidences z_i1, z_i2, …, z_il of the descriptors in the image, and the confidences are normalized by equation (3):

z'_it = z_it / ∑_{k=1}^{l} z_ik    (3)
The above w'_i1, w'_i2, …, w'_il and z'_i1, z'_i2, …, z'_il together constitute the text index of image i;
S3. Construct the image retrieval index from the text index of each image: for any word w_u in dictionary D, find all images i_1, i_2, …, i_o described by that word and the confidences z'_u1, z'_u2, …, z'_uo of the word in those images, and sort the images by confidence from high to low; in this way a candidate image set sorted by confidence is generated for every word in dictionary D;
S4. Build a synonym query lexicon: obtain text data that needs no labeling from a network text data set, train it with the word2vec algorithm, and construct a lexicon DB in which every word w ∈ DB has a corresponding word vector, so that the semantic similarity of any two words in DB can be computed. When a query keyword w_u does not appear in dictionary D, use the lexicon DB to find the word w_v that is closest in meaning to w_u and appears in dictionary D, and retrieve the relevant images through w_v;
S5. Receive query keywords and perform image retrieval. By steps S3 and S4, a candidate image set sorted by confidence can be generated for any word in lexicon DB. When there are multiple query keywords, merge the candidate sets generated for each keyword: for a candidate image i that occurs more than once, sum all of its confidences across the different keywords as its final confidence, and remove the redundant copies so that i appears only once. Sort the candidates by the summed confidence from high to low and return the top several as the result.
Further, in step S2, when the automatic labeling model M generates descriptors for an image, it simultaneously generates a confidence for each descriptor, indicating how accurately the descriptor describes the image; by sorting on confidence, images more relevant to the keywords are retrieved accurately.
Further, in steps S4 and S5, when the query keyword does not appear in dictionary D, a word vector is constructed for each word by building the lexicon DB; for any two word vectors v_e, v_u, the similarity of the two words is computed by equation (4), the word that appears in dictionary D and is closest in meaning to the query keyword is found, and the corresponding images are retrieved:

sim(v_e, v_u) = (v_e · v_u) / (|v_e| × |v_u|)    (4)

where v_e · v_u denotes the inner product of the two vectors and |v_e| × |v_u| the product of their lengths; the larger the value, the closer the two meanings.
Further, the CNN adopts ResNet, and the RNN adopts LSTM.
Compared with the prior art, the invention has the following advantages and effects:
1. By learning the automatic labeling model, the method can automatically generate several descriptors for an image; when facing massive image data, this effectively reduces manual intervention and greatly reduces the required manpower and financial resources.
2. Besides nouns denoting objects, the descriptors generated for an image can include quantifiers describing the number of objects, adjectives describing their states, and verbs describing their actions, thereby covering the image content more comprehensively; compared with the traditional manual practice of labeling only nouns, images can be retrieved more accurately.
3. The method can handle query keywords that never appear in the texts of the image training set: the synonym query lexicon is trained by an unsupervised method to find words close in meaning to the query keyword, avoiding the low match rate of exact matching.
Drawings
FIG. 1 is a diagram of an automatic tagging model of the present invention;
FIG. 2 is a schematic diagram of text indexing of images generated by an automatic annotation model;
FIG. 3 is a schematic diagram of a process for finding a synonym for a query keyword from a synonym library;
fig. 4 is a flowchart of image retrieval by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
The image retrieval method that automatically generates text indexes is mainly applied to retrieving Internet images, for example in search engines such as Google, Bing, and Baidu. The following are the steps of an application of the invention:
step S1, learning the automatic labeling model M, which comprises the following steps:
Step S101: acquire a labeled training data set and an unlabeled image data set from network image data. First select a specific language, here Chinese as an example. The labeled training data set is the AI Challenger Chinese image-caption data set, which contains 300,000 labeled images, each with 5 descriptive sentences. The unlabeled image data set is the pictures provided by ImageNet, 14,197,122 images in all; more pictures can also be captured by means such as crawlers.
Step S102: segment the 1,500,000 captions in the training data set with the jieba word segmenter, count the distinct words that appear in them, and construct dictionary D; the constructed dictionary has 8,233 entries.
Step S103: extract features of the training images with a convolutional neural network (hereinafter CNN), here specifically ResNet; each image yields a feature vector of dimension 2048.
Step S104: for an image i in the training data set, segment its corresponding text description into L words w_i1, w_i2, …, w_iL. The feature vector of image i extracted by the CNN serves as the initial input of the hidden unit of a recurrent neural network (hereinafter RNN), here specifically an LSTM; the number of input steps is determined by L.
Before input, marker words must also be added at the head and tail of the L segmented words; the head marker is denoted w_s and the tail marker w_e, and these two tokens are the same for all samples in the training data set. The words w_s, w_i1, w_i2, …, w_iL, w_e are therefore input in turn, one per step of the RNN loop. The result output at each step passes through the softmax layer to give a probability distribution, in this example a vector of length 8,233 corresponding to the 8,233 words in the dictionary, each dimension holding the probability of the corresponding word. Denote the word input at step t as w_it and the output distribution as P_it; the probability that this step outputs w_it is P_it(w_it). To make the words output for image i fit the words input to it as closely as possible, maximum likelihood estimation requires maximizing the probability of equation (1):

P_i = ∏_{t=1}^{L} P_it(w_it)    (1)
Step S105: step S104 fits the output of one image to its input; to fit the output words of every image in the whole training set to the inputs as closely as possible, the probability of equation (2) must be maximized:

∏_{i=1}^{N} ∏_{t=1}^{L_i} P_it(w_it)    (2)

In practice, a negative sign is prepended to the logarithm of equation (2) and the result is used as the model's loss function, turning the objective into loss minimization; the model's parameters are then updated by back-propagation. During training, the data are divided into a training set and a validation set, and convergence is judged by observing the model's performance on the validation set; once the model converges, the automatic labeling model M is obtained. Its structure is shown in Fig. 1.
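The per-caption loss described above can be sketched as follows. This is a minimal illustration of the negative log-likelihood, not the patent's implementation; the function name and the toy probabilities are made up.

```python
import math

def caption_nll(step_probs, target_ids):
    """Negative log-likelihood of one caption: -sum_t log P_it(w_it).
    Minimizing this maximizes the product of equation (1)."""
    return -sum(math.log(p[t]) for p, t in zip(step_probs, target_ids))

# Toy example: a 4-word dictionary and a 2-word caption.
probs = [[0.7, 0.1, 0.1, 0.1],   # softmax output at step 1
         [0.2, 0.6, 0.1, 0.1]]   # softmax output at step 2
loss = caption_nll(probs, [0, 1])  # ground-truth words: ids 0, then 1
```

Summing this loss over all captions gives the whole-training-set objective of equation (2).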
Step S2, generating text indexes for all images through the automatic annotation model M described in step S1.
For any image i, first extract the image feature f_i with the CNN part of the automatic labeling model M; f_i together with the head marker word w_s of step S104 forms the initial input of the RNN part of M, which generates the first word w'_i1 describing the image. The first word w'_i1 then serves as the RNN input to generate the second descriptor w'_i2, and so on: generating the t-th word w'_it depends on the previously generated w'_i1, w'_i2, …, w'_i(t-1).
When generating each word, the word with the largest probability in the RNN output is selected, and that probability is taken as the word's confidence in the image, denoted z. Generation stops when the tail marker word w_e of step S104 is produced or when the number of generated words reaches a preset threshold; the last word generated is denoted w'_il. For any image i this yields a descriptor sequence w'_i1, w'_i2, …, w'_il and the confidences z_i1, z_i2, …, z_il of those words in the image; the confidences are normalized by equation (3) to obtain z'_i1, z'_i2, …, z'_il:

z'_it = z_it / ∑_{k=1}^{l} z_ik    (3)
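The greedy decoding loop with confidence normalization (equation (3)) can be sketched as follows; `rnn_step` is a stand-in for the trained RNN, and all names are illustrative, not from the patent.

```python
def greedy_decode(rnn_step, state, start_id, end_id, max_len=20):
    """Pick the argmax word at each step; its probability is the confidence z.
    Stops at the tail marker w_e or after max_len words, then normalizes z."""
    words, confs, prev = [], [], start_id
    for _ in range(max_len):
        probs, state = rnn_step(prev, state)       # softmax over the dictionary
        wid = max(range(len(probs)), key=probs.__getitem__)
        if wid == end_id:                          # tail marker generated: stop
            break
        words.append(wid)
        confs.append(probs[wid])
        prev = wid
    total = sum(confs)                             # equation (3): z' = z / sum(z)
    return words, [z / total for z in confs]
```

The normalized confidences sum to 1 per image, so they are comparable across images when the retrieval index is built.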
The w'_i1, w'_i2, …, w'_il and z'_i1, z'_i2, …, z'_il produced above together form the text index of the image; Fig. 2 shows an example of a text index generated for an image by the automatic labeling model.
Step S3: build an index of all images from the text indexes generated in step S2. Specifically, for any of the 8,233 dictionary words, denoted w_u, find all images i_1, i_2, …, i_o in which the word appears and the confidences z'_u1, z'_u2, …, z'_uo of the word in those images, and sort the images by confidence from high to low. In this way a candidate set sorted by confidence can be generated for any of the 8,233 words.
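The index of step S3 is essentially an inverted index from words to confidence-sorted image lists; a minimal sketch (plain dictionary of lists, names illustrative):

```python
from collections import defaultdict

def build_index(text_indexes):
    """text_indexes: {image_id: [(word, normalized_confidence), ...]}.
    Returns {word: [(image_id, confidence), ...]}, each list sorted by
    confidence in descending order, as required by step S3."""
    index = defaultdict(list)
    for image_id, pairs in text_indexes.items():
        for word, conf in pairs:
            index[word].append((image_id, conf))
    for word in index:
        index[word].sort(key=lambda p: p[1], reverse=True)
    return dict(index)
```

A single-keyword query then reduces to one dictionary lookup, already sorted for return.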
Step S4: build a synonym lexicon to handle the case where a query keyword does not appear among the 8,233 dictionary words. Specifically, a large amount of text data that needs no labeling is obtained from a network text data set; here the corpus of the Chinese Wikipedia is used. It comprises the title and body of each entry, 984,451 entries in all, and after preprocessing steps such as removing punctuation, traditional-to-simplified conversion, and word segmentation, 984,451 texts are obtained. These, together with the 1,500,000 texts of the training set, are trained with the word2vec algorithm; all the words form a lexicon DB of size 408,787, and a word vector can then be generated for every word in DB. For any two word vectors v_e, v_u, their similarity can be computed by equation (4):

sim(v_e, v_u) = (v_e · v_u) / (|v_e| × |v_u|)    (4)

where v_e · v_u denotes the inner product of the two vectors and |v_e| × |v_u| the product of their lengths; the larger the value, the closer the two meanings.
When a query keyword w_u does not appear in dictionary D, the lexicon DB is used to find the word w_v that is closest in meaning to w_u and appears in dictionary D, and the related images are retrieved through w_v; Fig. 3 shows an example of finding a synonym for a query keyword from the synonym query lexicon.
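The cosine similarity of equation (4) and the fallback lookup of step S4 can be sketched as follows. The word vectors here are toy lists, not real word2vec output; function and variable names are illustrative.

```python
import math

def cosine(ve, vu):
    """Equation (4): inner product divided by the product of vector lengths."""
    dot = sum(a * b for a, b in zip(ve, vu))
    return dot / (math.sqrt(sum(a * a for a in ve)) *
                  math.sqrt(sum(b * b for b in vu)))

def nearest_in_dictionary(query, vectors, dictionary):
    """Among lexicon words that also appear in dictionary D, return the one
    whose vector has the largest cosine similarity with the query keyword."""
    return max((w for w in vectors if w in dictionary and w != query),
               key=lambda w: cosine(vectors[query], vectors[w]))
```

In practice a trained word2vec model would supply `vectors`, but the selection logic is the same.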
Step S5: receive the query keywords and perform image retrieval. By steps S3 and S4, a candidate image set sorted by confidence can be generated for any word in lexicon DB. When there are multiple query keywords w_1, w_2, …, w_n, each keyword retrieves a candidate set ⟨i_1, z_1⟩, ⟨i_2, z_2⟩, …, ⟨i_o, z_o⟩; for a candidate image i that occurs more than once, all of its confidences z are summed as its final confidence, and the redundant copies are removed so that i appears only once. The candidates are sorted by the summed confidence from high to low, and the top several are returned as the result.
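The merging of per-keyword candidate sets in step S5 can be sketched as follows (names illustrative):

```python
def merge_candidates(candidate_sets):
    """candidate_sets: one [(image_id, confidence), ...] list per query keyword.
    An image retrieved by several keywords gets its confidences summed; the
    merged list is returned sorted by final confidence, descending."""
    final = {}
    for candidates in candidate_sets:
        for image_id, conf in candidates:
            final[image_id] = final.get(image_id, 0.0) + conf
    return sorted(final.items(), key=lambda p: p[1], reverse=True)
```

Summing across keywords naturally promotes images that match several query terms at once.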
Fig. 4 shows the overall flow of image retrieval by the invention, integrating the steps above: first obtain a text data set and train the synonym query lexicon; obtain an image data set, train the automatic labeling model, and generate the dictionary; generate text indexes for the images with the automatic labeling model, and build the image retrieval index from the text indexes. When the user's query keyword is not in the dictionary of the image data set, the synonym lexicon is queried to find the in-dictionary word closest in meaning to the query keyword, which replaces it in the image retrieval.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (4)
1. An image retrieval method for automatically generating a text index is characterized by comprising the following steps:
S1. Learn the automatic labeling model M; the process is as follows:
S101. Acquire a labeled training data set and an unlabeled image data set, where the training data set comprises training images and the text descriptions corresponding to them, and the image data set comprises only images, with no corresponding text descriptions;
S102. Segment all text descriptions of the training data set to construct a dictionary D;
S103. Extract the features of each image in the training data set with a CNN, the features being one-dimensional vectors;
S104. For an image i in the training data set, segment its corresponding text description into L words w_i1, w_i2, …, w_iL, and use the feature f_i of image i extracted by the CNN as the initial input of the hidden unit of the RNN; then input the words w_i1, w_i2, …, w_iL one per step of the recurrent-network loop, passing the output of each step through a softmax layer to obtain a probability for every word in the dictionary. Denote the word input at step t as w_it and the output probability distribution as P_it, so the probability that this step outputs w_it is P_it(w_it); by maximum likelihood estimation, the probability of equation (1) is to be maximized:

P_i = ∏_{t=1}^{L} P_it(w_it)    (1)
S105. For all N images in the training data set, the probability of equation (2) must be maximized; taking it as the objective function, back-propagation updates the model's parameters to obtain the automatic labeling model M, which consists of the CNN and the RNN:

∏_{i=1}^{N} ∏_{t=1}^{L_i} P_it(w_it)    (2)
S2. Generate text indexes for all images with the automatic labeling model M;
For any image i in the image data set, first extract the image feature f_i with the CNN part of the automatic labeling model M and use it as the initial input of the RNN part of M, then generate words in turn, where generating the word w'_it depends on the already generated w'_i1, w'_i2, …, w'_i(t-1); at each step the word with the largest output probability is selected as the generated word, and that probability value is taken as the word's confidence in the image, denoted z;
Word generation stops when the ending word is generated or the number of generated words reaches a preset threshold; for any image i described above, a descriptor sequence w'_i1, w'_i2, …, w'_il can thus be generated, together with the confidences z_i1, z_i2, …, z_il of the descriptors in the image, and the confidences are normalized by equation (3):

z'_it = z_it / ∑_{k=1}^{l} z_ik    (3)
The above w'_i1, w'_i2, …, w'_il and z'_i1, z'_i2, …, z'_il together constitute the text index of image i;
S3. Construct the image retrieval index from the text index of each image: for any word w_u in dictionary D, find all images i_1, i_2, …, i_o described by that word and the confidences z'_u1, z'_u2, …, z'_uo of the word in those images, and sort the images by confidence from high to low; in this way a candidate image set sorted by confidence is generated for every word in dictionary D;
S4. Build a synonym query lexicon: obtain text data that needs no labeling from a network text data set, train it with the word2vec algorithm, and construct a lexicon DB in which every word w ∈ DB has a corresponding word vector, so that the semantic similarity of any two words in DB can be computed. When a query keyword w_u does not appear in dictionary D, use the lexicon DB to find the word w_v that is closest in meaning to w_u and appears in dictionary D, and retrieve the relevant images through w_v;
S5. Receive query keywords and perform image retrieval. By steps S3 and S4, a candidate image set sorted by confidence can be generated for any word in lexicon DB. When there are multiple query keywords, merge the candidate sets generated for each keyword: for a candidate image i that occurs more than once, sum all of its confidences across the different keywords as its final confidence, and remove the redundant copies so that i appears only once. Sort the candidates by the summed confidence from high to low and return the top several as the result.
2. The image retrieval method of claim 1, wherein in step S2, when the automatic labeling model M generates descriptors for an image, it simultaneously generates a confidence for each descriptor, indicating how accurately the descriptor describes the image; and by sorting on confidence, images more relevant to the keywords are accurately retrieved.
3. The image retrieval method of claim 1, wherein in steps S4 and S5, when the query keyword does not appear in dictionary D, a word vector is constructed for each word by building the lexicon DB; for any two word vectors v_e, v_u, the similarity of the two words is computed by equation (4), the word that appears in dictionary D and is closest in meaning to the query keyword is found, and the corresponding images are retrieved:

sim(v_e, v_u) = (v_e · v_u) / (|v_e| × |v_u|)    (4)

where v_e · v_u denotes the inner product of the two vectors and |v_e| × |v_u| the product of their lengths; the larger the value, the closer the two meanings.
4. The image retrieval method of claim 1, wherein the CNN is ResNet and the RNN is LSTM.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810198490.9A (CN108509521B) | 2018-03-12 | 2018-03-12 | Image retrieval method for automatically generating text index |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810198490.9A (CN108509521B) | 2018-03-12 | 2018-03-12 | Image retrieval method for automatically generating text index |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108509521A CN108509521A (en) | 2018-09-07 |
CN108509521B true CN108509521B (en) | 2020-02-18 |
Family
ID=63376458
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810198490.9A Expired - Fee Related CN108509521B (en) | 2018-03-12 | 2018-03-12 | Image retrieval method for automatically generating text index |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108509521B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109635135A (en) * | 2018-11-30 | 2019-04-16 | Oppo广东移动通信有限公司 | Image index generation method, device, terminal and storage medium |
CN110188775B (en) * | 2019-05-28 | 2020-06-26 | 创意信息技术股份有限公司 | Image content description automatic generation method based on joint neural network model |
CN111243729B (en) * | 2020-01-07 | 2022-03-08 | 同济大学 | Automatic generation method of lung X-ray chest radiography examination report |
CN112349150B (en) * | 2020-11-19 | 2022-05-20 | 飞友科技有限公司 | Video acquisition method and system for airport flight guarantee time node |
CN112381038B (en) * | 2020-11-26 | 2024-04-19 | 中国船舶工业系统工程研究院 | Text recognition method, system and medium based on image |
CN112148831B (en) * | 2020-11-26 | 2021-03-19 | 广州华多网络科技有限公司 | Image-text mixed retrieval method and device, storage medium and computer equipment |
CN113204666B (en) * | 2021-05-26 | 2022-04-05 | 杭州联汇科技股份有限公司 | Method for searching matched pictures based on characters |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101901249A (en) * | 2009-05-26 | 2010-12-01 | 复旦大学 | Text-based query expansion and sort method in image retrieval |
CN101582080B (en) * | 2009-06-22 | 2011-05-04 | 浙江大学 | Web image clustering method based on image and text relevant mining |
CN102360431A (en) * | 2011-10-08 | 2012-02-22 | 大连海事大学 | Method for automatically describing image |
US8687886B2 (en) * | 2011-12-29 | 2014-04-01 | Konica Minolta Laboratory U.S.A., Inc. | Method and apparatus for document image indexing and retrieval using multi-level document image structure and local features |
- 2018-03-12: Application CN201810198490.9A filed in CN; granted as CN108509521B (status: not active, Expired - Fee Related)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20200218 ||