CN103853797A

CN103853797A - Image retrieval method and system based on n-gram image indexing structure

Info

Publication number: CN103853797A
Application number: CN201210523756.5A
Authority: CN
Inventors: 陆平; 董振江; 罗圣美; 刘丽霞; 陈清财; 刘胜宇
Original assignee: ZTE Corp; Shenzhen Graduate School Harbin Institute of Technology
Current assignee: ZTE Corp; Shenzhen Graduate School Harbin Institute of Technology
Priority date: 2012-12-07
Filing date: 2012-12-07
Publication date: 2014-06-11
Anticipated expiration: 2032-12-07
Also published as: CN103853797B

Abstract

The invention discloses an image retrieval method and system based on an n-gram image index structure, and relates to the technical field of image retrieval. The method comprises: when a retrieval operation of a user is received, determining the form of the retrieval content input by the user; if the retrieval content is in text form, performing intrinsic text vectorization based on the n-gram image index on the input text, and performing image retrieval under the index of the text label according to the vectorization result; if the retrieval content is in image form, performing automatic semantic annotation of the input image based on the n-gram image index structure, extracting image n-grams, and performing image retrieval, using the TF-IDF (term frequency-inverse document frequency) feature vector of the extracted image n-grams, in the index under the semantically annotated text label; and finally sorting the retrieved images by similarity and outputting them. The invention further discloses an image retrieval system based on the n-gram image index structure. The technical scheme improves retrieval efficiency and retrieval effect.

Description

An image retrieval method and system based on an n-gram image index structure

Technical field

The present invention relates to image retrieval methods and systems, and in particular to an image retrieval method and system based on an n-gram image index, mainly used in the field of image search.

Background technology

At present, image retrieval falls into two main approaches: text-based image retrieval (TBIR) and content-based image retrieval (CBIR). In a traditional text-based image retrieval system, images are usually annotated manually, and the user retrieves the desired images by keyword. The obvious drawback of this approach is that the images must be annotated by hand, which is impractical in today's information explosion. To overcome this drawback, content-based image retrieval emerged in the 1980s, with Chang doing pioneering work in this area in 1984. Content-based image retrieval indexes images by extracting their original low-level visual features (such as color, texture and shape features) and ultimately searches for images by means of those low-level features. Well-known content-based image retrieval tools include QBIC, Photobook, Virage, VisualSEEK, Netra and SIMPLIcity.

Most current conventional image retrieval systems either extract high-dimensional low-level feature vectors from the images in the image data set and build an index over these vectors, or, for images that carry annotations, build an index over the images by text label. The user queries the index system by submitting text or an example image. However, the retrieval effect and efficiency of such systems are unsatisfactory. The main reasons are that retrieval by low-level features inherently suffers from the "semantic gap" problem, and that the retrieval efficiency of an index built over high-dimensional low-level features becomes very low as the number of indexed images grows sharply. As a result, the number of images indexed by current image search engines is limited, and the results returned to users are unsatisfactory. Moreover, most current image retrieval systems make no use of the spatial feature information contained in the images. The main current approach to the "semantic gap" problem is automatic image annotation, yet most current image search engines have not successfully applied automatic annotation technology in their retrieval systems.

Text retrieval, by contrast, is quite mature, and its index construction and retrieval techniques are well established. Techniques from text retrieval can therefore be borrowed to improve the performance of current image retrieval systems.

Summary of the invention

The technical problem to be solved by the present invention is to provide an image retrieval method and system based on an n-gram image index structure, so as to improve image retrieval efficiency and effect.

To solve the above technical problem, the invention discloses an image retrieval method based on an n-gram image index structure, comprising:

when a retrieval operation of a user is received, determining the form of the retrieval content input by the user;

when the retrieval content input by the user is in text form, performing intrinsic text vectorization based on the n-gram image index on the text input by the user, performing image retrieval under the index of the text label using the vectorization result, and sorting the retrieved images by similarity and outputting them;

when the retrieval content input by the user is in image form, performing automatic semantic annotation based on the n-gram image index structure on the image input by the user, extracting image n-grams based on the n-gram model, performing image retrieval, using the term frequency-inverse document frequency (TF-IDF) feature vector of the extracted image n-grams, in the index under the semantically annotated text label, and sorting the retrieved images by similarity and outputting them.

Preferably, the above method further comprises:

before the user performs the retrieval operation, building an index based on image n-grams, the constructed index comprising an index structure that uses image n-grams as the index and image annotations and image details as the indexed objects, and an index structure that uses image annotations as the index and image n-grams and image details as the indexed objects.

Preferably, in the above method, the process of building the index based on image n-grams is as follows:

preprocessing an image data set with text annotations, and extracting "image lemmas" from the preprocessed image data set;

building, from the extracted "image lemmas", an image dictionary that contains the corresponding image n-grams;

cutting the images in the image data set with text annotations according to the constructed image dictionary, extracting the corresponding image n-grams, and establishing the image index based on the n-gram model.
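By way of non-limiting illustration, the extraction of image n-grams from a cut image may be sketched as follows. The sketch assumes that each image block has already been replaced by the number of its "image lemma", and that adjacency is read from left to right along each row of blocks; the source does not fix the adjacency direction, so this choice, like the function name `extract_ngrams`, is an assumption.

```python
from typing import List, Sequence, Tuple

Item = Tuple[int, ...]  # a sequence of "image lemma" numbers, e.g. (1,) or (1, 2)


def extract_ngrams(lemma_grid: Sequence[Sequence[int]], n: int) -> List[Item]:
    """Collect every lemma sequence of length 1..n along each row of blocks.

    Assumption: neighbours are taken left to right within a row only.
    """
    items: List[Item] = []
    for row in lemma_grid:
        for length in range(1, n + 1):
            # Slide a window of the current length across the row.
            for start in range(len(row) - length + 1):
                items.append(tuple(row[start:start + length]))
    return items
```

Each returned tuple corresponds to one occurrence of an "image dictionary" item in the image, so the list can be counted directly when weights are computed.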

Preferably, in the above method, performing intrinsic text vectorization based on the n-gram image index on the text input by the user means:

retrieving in the n-gram image index structure according to the text content input by the user, and vectorizing the text content according to the probability weights of the retrieved image n-grams.

Preferably, in the above method, performing image retrieval under the index of the text label using the vectorization result, and sorting the retrieved images by similarity and outputting them, means:

after the text input by the user has been vectorized, computing, from the resulting vector, the similarity of the images under the index of the text label, and sorting and outputting the retrieved images according to the computed similarity.

The invention also discloses an image retrieval system based on an n-gram image index structure, comprising a judging unit, a first indexing unit and a second indexing unit, wherein:

the judging unit, upon receiving a retrieval operation of a user, determines the form of the retrieval content input by the user, sends the text input by the user to the first indexing unit when the retrieval content is in text form, and sends the image input by the user to the second indexing unit when the retrieval content is in image form;

the first indexing unit performs intrinsic text vectorization based on the n-gram image index on the text input by the user, performs image retrieval under the index of the text label using the vectorization result, and sorts the retrieved images by similarity and outputs them;

the second indexing unit performs automatic semantic annotation based on the n-gram image index structure on the image input by the user, extracts image n-grams based on the n-gram model, performs image retrieval, using the term frequency-inverse document frequency (TF-IDF) feature vector of the extracted image n-grams, in the index under the semantically annotated text label, and sorts the retrieved images by similarity and outputs them.

Preferably, the above system further comprises an n-gram image index construction unit, which builds an index based on image n-grams, the index comprising an index structure that uses image n-grams as the index and image annotations and image details as the indexed objects, and an index structure that uses image annotations as the index and image n-grams and image details as the indexed objects.

Preferably, in the above system, the n-gram image index construction unit is divided into:

an "image dictionary" construction component, which preprocesses an image data set with text annotations, extracts "image lemmas" from the preprocessed image data set, and builds, from the extracted "image lemmas", an image dictionary that contains the corresponding image n-grams; and

an index construction component, which, according to the image dictionary built by the "image dictionary" construction component, cuts the images in the image data set with text annotations, extracts the corresponding image n-grams, and establishes the image index based on the n-gram model.

Preferably, in the above system, the first indexing unit performing intrinsic text vectorization based on the n-gram image index on the text input by the user means:

Preferably, in the above system, the first indexing unit performing image retrieval under the index of the text label using the vectorization result, and sorting the retrieved images by similarity and outputting them, means:

The technical scheme of the present application effectively combines text-based and content-based image retrieval, and effectively improves retrieval efficiency and effect.

Brief description of the drawings

Fig. 1 is a schematic flow chart of image retrieval based on the n-gram image index structure in the present embodiment;

Fig. 2 is a flow chart of extracting "image lemmas" in the present embodiment;

Fig. 3 is an example of image cutting and n-gram extraction in the present embodiment;

Fig. 4 is an example of an image index structure that uses image n-grams as the index and semantic labels and images as the indexed content;

Fig. 5 is an example of an image index structure that uses image semantic labels as the index and image n-grams and images as the indexed content;

Fig. 6 is a schematic flow chart of automatic semantic image annotation based on the n-gram image index structure.

Embodiment

To make the objects, technical solutions and advantages of the present invention clearer, the technical solution is described in further detail below with reference to the accompanying drawings. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily.

Embodiment 1

The present embodiment provides an image retrieval method based on an n-gram image index structure. The method supports two retrieval modes: image retrieval by text and image retrieval by image. The principle of the method is shown in Fig. 1, and it comprises the following steps 100 to 400:

Step 100: when a retrieval operation of a user is received, determine the form of the retrieval content input by the user; if it is in text form, go to step 200(a); if it is in image form, go to step 200(b).

Step 200(a): perform intrinsic text vectorization based on the n-gram image index on the text input by the user, then go to step 300(a).

Specifically, this step retrieves in the n-gram image index structure according to the text content input by the user, and vectorizes the text content according to the probability weights of the image n-grams obtained by the retrieval.

Step 200(b): perform automatic semantic annotation based on the n-gram image index structure on the image input by the user, extract image n-grams based on the n-gram model, then go to step 300(b).

This step first extracts image n-grams from the image input by the user, then extracts the feature vector of the image, and then performs semantic annotation of the image based on the n-gram image index structure.

Step 300(a): perform image retrieval in the index under the text label using the vectorization result of the text input by the user, compute the similarity of the retrieved images, then go to step 400.

In this step, after the text input by the user has been vectorized, the similarity of the images under the corresponding text index is computed from the resulting vector.

Step 300(b): perform image retrieval, using the TF-IDF (Term Frequency-Inverse Document Frequency) feature vector of the extracted image n-grams, in the index under the semantically annotated text label, and sort the retrieved images by similarity and output them.

In this step, after the image input by the user has been automatically annotated, the feature vector extracted from the image is used to compute similarity against the images under the text index of the semantic annotation.

Step 400: after the similarity computation, sort the retrieved images by similarity and return the list of retrieved images to the user in that order.
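By way of non-limiting illustration, the branching of steps 100 to 400 may be sketched as follows. The sketch assumes that a text query arrives as a `str` and an image query as raw `bytes`, and that the two branches are supplied as callables; the names `retrieve`, `search_by_text` and `search_by_image` are hypothetical and do not appear in the source.

```python
from typing import Callable, List, Tuple, Union

Query = Union[str, bytes]          # assumption: text as str, image as raw bytes
Ranked = List[Tuple[str, float]]   # (image identifier, similarity)


def retrieve(query: Query,
             search_by_text: Callable[[str], Ranked],
             search_by_image: Callable[[bytes], Ranked]) -> Ranked:
    """Pick a branch by the form of the query, then sort by similarity."""
    if isinstance(query, str):
        # Step 100 -> steps 200(a) and 300(a): text branch.
        hits = search_by_text(query)
    elif isinstance(query, (bytes, bytearray)):
        # Step 100 -> steps 200(b) and 300(b): image branch.
        hits = search_by_image(bytes(query))
    else:
        raise TypeError("query must be text or image data")
    # Step 400: highest similarity first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Passing the two branches in as callables keeps the dispatch independent of how vectorization, annotation and similarity are actually implemented.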

It should also be noted that there are some preferred schemes based on the above method: before the user performs the retrieval operation, an index based on image n-grams is also built, the constructed index comprising an index structure that uses image n-grams as the index and image annotations and image details as the indexed objects, and an index structure that uses image annotations as the index and image n-grams and image details as the indexed objects.

Specifically, the process of building the index based on image n-grams is as follows:

The above image retrieval process based on the n-gram image index structure is described in detail below, taking as an example the preferred scheme that includes the operation of building the index based on image n-grams.

First step: learn image lemmas from a randomly chosen image data set, then construct the "image dictionary" from the learned "image lemmas".

The process of learning "image lemmas", shown in Fig. 2, comprises the following steps:

First, the chosen images are cut in a text-like manner; the cutting scheme can be designed according to the application requirements. One example of such a cutting method provided in this embodiment is to divide the image evenly into small image blocks of size m*n (see Fig. 3). Each block can be regarded as a "word" in text processing, and each image can correspondingly be regarded as an "article". The method of cutting the image is not limited to this.
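By way of non-limiting illustration, the even division into blocks may be sketched as follows. The sketch assumes a grayscale image held as a list of pixel rows, reads m*n as m rows by n columns, and drops any pixels left over at the right and bottom edges; none of these choices is fixed by the source, and the name `cut_into_blocks` is hypothetical.

```python
from typing import List, Sequence

Block = List[List[int]]  # assumption: one grayscale value per pixel


def cut_into_blocks(image: Sequence[Sequence[int]], m: int, n: int) -> List[List[Block]]:
    """Split a 2-D pixel grid into a grid of blocks, each m rows by n columns.

    Assumption: leftover rows and columns that do not fill a block are dropped.
    """
    rows, cols = len(image), len(image[0])
    grid: List[List[Block]] = []
    for top in range(0, rows - rows % m, m):
        grid_row: List[Block] = []
        for left in range(0, cols - cols % n, n):
            # Copy the m x n window starting at (top, left).
            block = [list(row[left:left + n]) for row in image[top:top + m]]
            grid_row.append(block)
        grid.append(grid_row)
    return grid
```

The result keeps the two-dimensional arrangement of the blocks, which is what later allows neighbouring blocks to be combined into n-grams.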

Secondly, the low-level image features of the equally sized image blocks are extracted, including but not limited to image color features and image texture features. The multiple low-level features are then fused, giving one feature vector that reflects the multiple low-level features of the image block.

Then, a clustering method is applied to the feature vectors of the image blocks, and the representative data point of each cluster is finally chosen as an "image lemma". Each "image lemma" obtained is given a corresponding number (see Fig. 3). In one embodiment adopted by the invention (see Fig. 2), k-means clustering is performed on the feature vectors of all image blocks with the number of clusters determined in advance, and the "image lemmas" are obtained as the centroids of the k-means clustering result.
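By way of non-limiting illustration, learning "image lemmas" by k-means may be sketched as follows. This is a plain Lloyd-style k-means written with the standard library only; the initialization by random sampling, the fixed iteration count and the names `learn_image_lemmas` and `squared_distance` are assumptions made for the sketch, not details given in the source.

```python
import random
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def learn_image_lemmas(features: List[Sequence[float]], k: int,
                       iterations: int = 20, seed: int = 0) -> List[List[float]]:
    """Return k centroids of the block feature vectors (the "image lemmas").

    Assumption: centroids start as k distinct sampled feature vectors.
    """
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(features, k)]
    for _ in range(iterations):
        clusters: List[List[Sequence[float]]] = [[] for _ in range(k)]
        for vector in features:
            # Assign the vector to its nearest centroid.
            nearest = min(range(k), key=lambda i: squared_distance(vector, centroids[i]))
            clusters[nearest].append(vector)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(column) / len(members) for column in zip(*members)]
    return centroids
```

The position of a centroid in the returned list can serve as the number given to the corresponding "image lemma".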

Finally, once the "image lemmas" have been learned, the "image dictionary" is constructed. To further represent the spatial features of the image, n-gram items are added to the "image dictionary": for any "image lemma", it and its n-1 adjacent "image lemmas" form an "image lemma" sequence, and every such sequence is added to the "image dictionary" as one item; all "image lemma" sequences of length less than n are added as well, forming the "image dictionary". For example, if the extracted "image lemmas" are 1, 2 and 3 and n is 2, the items of the resulting "image dictionary" are: (1), (2), (3), (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3). In an embodiment in which K "image lemmas" are extracted and n is 2, the "image dictionary" contains K*K+K items.
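By way of non-limiting illustration, the enumeration of "image dictionary" items may be sketched as follows. The sketch lists every sequence of lemma numbers of length 1 to n, which reproduces the twelve items of the example with lemmas 1, 2, 3 and n = 2; the function name `build_image_dictionary` is hypothetical.

```python
from itertools import product
from typing import Iterable, List, Tuple

Item = Tuple[int, ...]


def build_image_dictionary(lemma_ids: Iterable[int], n: int) -> List[Item]:
    """List every lemma sequence of length 1..n as one dictionary item."""
    ids = list(lemma_ids)
    items: List[Item] = []
    for length in range(1, n + 1):
        # product(ids, repeat=length) yields all ordered sequences of that length.
        items.extend(product(ids, repeat=length))
    return items
```

For K lemmas and n = 2 this yields K unigrams and K*K bigrams, matching the count K*K+K stated above.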

Second step: determine the form of the retrieval content input by the user.

The form of the retrieval content is determined in this step in order to decide which operation the system should take next and to apply suitable preprocessing to the input. If the user inputs text, the text needs word segmentation and stop-word removal; if the user inputs an image, the image needs the corresponding format conversion and size normalization.

Third step: when the user input is determined to be in text form, perform intrinsic text vectorization based on the n-gram image index structure; when the user input is determined to be in image form, perform automatic semantic image annotation based on the n-gram image index structure.

Fourth step: retrieve among the images of the text label index, using either the vectorization result of the text input by the user or the image feature vector based on the n-gram model.

In this step, if the user input is in text form, intrinsic text vectorization based on the n-gram image index is performed on the text. Specifically, a lookup is first made in the index structure of Fig. 5, and the Nweight value of each corresponding n-gram is taken as the weight of the matching vector component; for text containing several segmented words, the component values obtained for each word are added, giving the intrinsic vector representation of the text input by the user.
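By way of non-limiting illustration, the intrinsic text vectorization may be sketched as follows. The sketch assumes that the label-keyed index of Fig. 5 is available as a mapping from a text label to the Nweight of each "image dictionary" item under that label; this mapping layout and the name `vectorize_text` are assumptions, and words with no matching label are simply skipped.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple

Item = Tuple[int, ...]


def vectorize_text(words: Iterable[str],
                   label_index: Mapping[str, Mapping[Item, float]]) -> Dict[Item, float]:
    """Sum, per dictionary item, the Nweight values found for each query word."""
    vector: Dict[Item, float] = defaultdict(float)
    for word in words:
        # Assumption: a word absent from the index contributes nothing.
        for item, nweight in label_index.get(word, {}).items():
            vector[item] += nweight
    return dict(vector)
```

For example, with an index holding `{"sun": {(1,): 0.5}, "sea": {(1,): 0.25}}`, the query words `["sun", "sea"]` give the component 0.75 for item `(1,)`.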

If the user input is an image, automatic semantic annotation is performed on the input image data as shown in Fig. 6, and the TF-IDF feature vector of the image is extracted. The TF-IDF computation used in this embodiment is as follows:

tf_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}

In the formula, n_{i,j} is the number of occurrences of "image dictionary" item t_i in image d_j;

\sum_{k} n_{k,j} is the total number of occurrences of all items in image d_j.

idf_{i} = \log \frac{|D|}{1 + |\{j : t_{i} \in d_{j}\}|}

In the formula, |D| is the total number of images in the image library;

|\{j : t_{i} \in d_{j}\}| is the number of images that contain the "image dictionary" item t_i (that is, the number of images for which n_{i,j} \neq 0).
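By way of non-limiting illustration, the TF-IDF computation above may be sketched as follows. The sketch assumes that each image is given as the list of "image dictionary" items occurring in it and that the logarithm is the natural logarithm, which the source does not specify; the name `tf_idf` is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def tf_idf(images: Sequence[Sequence[Hashable]]) -> List[Dict[Hashable, float]]:
    """Return one {item: tf * idf} mapping per image.

    tf  = n_ij / sum_k n_kj
    idf = log(|D| / (1 + number of images containing the item))
    """
    total_images = len(images)
    doc_freq: Counter = Counter()
    for items in images:
        doc_freq.update(set(items))  # count each item once per image
    vectors: List[Dict[Hashable, float]] = []
    for items in images:
        counts = Counter(items)
        size = sum(counts.values())
        vectors.append({
            item: (count / size) * math.log(total_images / (1 + doc_freq[item]))
            for item, count in counts.items()
        })
    return vectors
```

Because of the 1 added in the denominator, an item that occurs in every image receives a negative idf under this formula.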

Fifth step: finally, sort the retrieval results by similarity and output them.

In this step, if the user input is in text form, then after the text has been represented as a vector, similarity is computed between the resulting vector and the images under the text label index corresponding to the user input;

if the user input is in image form, then after the image has been automatically annotated, similarity is computed among the images under the index of the annotated labels, and the similarity weights are returned.

Finally, all retrieved images are sorted by their similarity weights, and the image list is returned to the user in that order.
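By way of non-limiting illustration, the similarity computation and sorting may be sketched as follows. The source does not name the similarity measure; cosine similarity over sparse item-weight vectors is used here purely as an assumption, and the names `cosine_similarity` and `rank_images` are hypothetical.

```python
import math
from typing import Dict, Hashable, List, Mapping, Tuple

Vector = Mapping[Hashable, float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is all zero."""
    dot = sum(weight * b.get(item, 0.0) for item, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_images(query: Vector, candidates: Mapping[str, Vector]) -> List[Tuple[str, float]]:
    """Score every candidate image against the query and sort, best first."""
    scored = [(image_id, cosine_similarity(query, vector))
              for image_id, vector in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Any other similarity measure could be substituted for `cosine_similarity` without changing the sorting step.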

Embodiment 2

The present embodiment introduces an image retrieval system based on an n-gram image index structure. The system comprises at least a judging unit, a first indexing unit and a second indexing unit.

The judging unit, upon receiving a retrieval operation of a user, determines the form of the retrieval content input by the user, sends the text input by the user to the first indexing unit when the retrieval content is in text form, and sends the image input by the user to the second indexing unit when the retrieval content is in image form.

The first indexing unit performs intrinsic text vectorization based on the n-gram image index on the text input by the user, performs image retrieval under the index of the text label using the vectorization result, and sorts the retrieved images by similarity and outputs them.

When performing intrinsic text vectorization based on the n-gram image index on the text input by the user, the first indexing unit retrieves in the n-gram image index structure according to the text content input by the user, and vectorizes the text content according to the probability weights of the retrieved image n-grams.

When performing image retrieval under the index of the text label using the vectorization result and sorting and outputting the retrieved images by similarity, the first indexing unit, after vectorizing the text input by the user, computes from the resulting vector the similarity of the images under the index of the text label, and sorts and outputs the retrieved images according to the computed similarity.

The second indexing unit performs automatic semantic annotation based on the n-gram image index structure on the image input by the user, extracts image n-grams based on the n-gram model, performs image retrieval, using the TF-IDF feature vector of the extracted image n-grams, in the index under the semantically annotated text label, and sorts the retrieved images by similarity and outputs them.

In some preferred schemes, an n-gram image index construction unit is added to the above system. This unit builds an index based on image n-grams; the index comprises an index structure that uses image n-grams as the index and image annotations and image details as the indexed objects, and an index structure that uses image annotations as the index and image n-grams and image details as the indexed objects.

Specifically, the n-gram image index construction unit can in turn be divided into an "image dictionary" construction component and an index construction component.

The main function of the "image dictionary" construction component is to learn, from an image data set, an image dictionary containing image n-grams. Specifically, this component preprocesses an image data set with text annotations, extracts "image lemmas" from the preprocessed image data set, and builds, from the extracted "image lemmas", an image dictionary that contains the corresponding image n-grams.

Building the "image dictionary" first requires learning image lemmas from a randomly chosen image data set; the "image dictionary" is then constructed from the learned "image lemmas".

The steps of learning "image lemmas" are shown in Fig. 2 and are described as follows:

First step: the chosen images are cut in a text-like manner; the cutting scheme can be designed according to the application requirements. One example of such a cutting method provided in this embodiment is to divide the image evenly into small image blocks of size m*n (see Fig. 3). Each block can be regarded as a "word" in text processing, and each image can correspondingly be regarded as an "article". The method of cutting the image is not limited to this.

Second step: the low-level image features of the equally sized image blocks are extracted, including but not limited to image color features and image texture features. The multiple low-level features are then fused, giving one feature vector that reflects the multiple low-level features of the image block.

Third step: a clustering method is applied to the feature vectors of the image blocks, and the representative data point of each cluster is finally chosen as an "image lemma". Each "image lemma" obtained is given a corresponding number (see Fig. 3). In one embodiment adopted by the invention (see Fig. 2), k-means clustering is performed on the feature vectors of all image blocks with the number of clusters determined in advance, and the "image lemmas" are obtained as the centroids of the k-means clustering result.

Once the "image lemmas" have been learned, the "image dictionary" is constructed. To further represent the spatial features of the image, n-gram items are added to the "image dictionary": for any "image lemma", it and its n-1 adjacent "image lemmas" form an "image lemma" sequence, and every such sequence is added to the "image dictionary" as one item; all "image lemma" sequences of length less than n are added as well, forming the "image dictionary". For example, if the extracted "image lemmas" are 1, 2 and 3 and n is 2, the items of the resulting "image dictionary" are: (1), (2), (3), (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3). In an embodiment in which K "image lemmas" are extracted and n is 2, the "image dictionary" contains K*K+K items.

The main function of the index construction component is to build an n-gram-based image index over the image data set according to the "image dictionary". Specifically, according to the image dictionary built by the "image dictionary" construction component, it cuts the images in the image data set with text annotations, extracts the corresponding image n-grams, and establishes the image index based on the n-gram model. The n-gram-based image index comprises two kinds of index structure: one uses image n-grams as the index and image annotations and image details as the indexed objects; the other uses image annotations as the index and image n-grams and image details as the indexed objects.
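By way of non-limiting illustration, the two kinds of index structure may be sketched as nested mappings. The sketch assumes that each image is described by its set of text labels and its list of n-gram items, and it stores only image identifiers where the source speaks of image details; the weights attached to the sub-index nodes are left out. The name `build_indexes` and the data layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Mapping, Set, Tuple

Index = Dict[Hashable, Dict[Hashable, Set[str]]]


def build_indexes(images: Mapping[str, Tuple[Iterable[str], Iterable[Hashable]]]
                  ) -> Tuple[Index, Index]:
    """Build an n-gram-keyed index and a label-keyed index in one pass.

    ngram_index[item][label] and label_index[label][item] both hold the
    identifiers of images that contain the item and carry the label.
    """
    ngram_index: Index = defaultdict(lambda: defaultdict(set))
    label_index: Index = defaultdict(lambda: defaultdict(set))
    for image_id, (labels, items) in images.items():
        label_set = set(labels)
        for item in set(items):
            for label in label_set:
                ngram_index[item][label].add(image_id)
                label_index[label][item].add(image_id)
    return ngram_index, label_index
```

In this layout the outer key plays the role of the master index node and the inner key that of a sub-index node; the images indexed directly under a master node are the union of the sets beneath it.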

The image retrieval process is further described below, taking as an example the construction of an n-gram-based image index that includes the above two index structures.

1. With image n-grams as the index and image annotations and image details as the indexed objects, as shown in Fig. 4. In the figure, Mnode is a master index node; a master index node holds an item of the "image dictionary", which may be a unigram or a bigram. (1,1) is an image bigram. The content indexed by a master index node has two parts: (1) the details of all images that contain the "image dictionary" item of the master index node; taking Mnode as an example, the images indexed under it are all images that contain the "image dictionary" item (1,1); (2) sub-index nodes (Cnodel), each holding a text annotation label (sun) and its corresponding weight (Lweight_sun). Taking Cnodel as an example, the sub-index node holds the text label sun that occurs in the image data and the corresponding computed weight Lweight_sun. Lweight_sun reflects the relationship between the "image dictionary" item in the master index node and the text label in the sub-index node. The computation adopted by the invention is as follows:

Lweight_{sun} = p(sun \mid (1,1)) = \frac{p(sun, (1,1))}{p((1,1))} = \frac{p((1,1) \mid sun) \cdot p(sun)}{p((1,1))}

Wherein:

p((1,1) \mid sun) = \frac{p(sun, (1,1))}{p(sun)} \approx \frac{N((1,1) \mid sun)}{N(\text{n-gram} \mid sun)}

p(sun) = \frac{N_{img}(sun)}{N_{img}(All)}

p((1,1)) = \frac{N((1,1))}{N(\text{n-gram})}

In the formulas, N((1,1) \mid sun) is the number of (1,1) items contained in all images with the label sun;

N(\text{n-gram} \mid sun) is the number of all n-grams contained in the indexed images with the label sun;

N_{img}(sun) is the number of all images with the label sun;

N_{img}(All) is the number of all images in the data set;

N((1,1)) is the number of all (1,1) items in the image data set;

N(\text{n-gram}) is the number of all n-grams in the image data set.

Each sub-index node indexes the details of all images that both contain the "image dictionary" item of the master index node (Mnode) and carry the text label of the sub-index node. Taking Cnodel as an example, the images indexed under it contain the "image dictionary" item (1,1) and also carry the label sun.

2. With image semantic labels as the index and image n-grams and images as the indexed content, as shown in Fig. 5. In the figure, Mnode is a master index node; a master index node holds a text label from the image data set. As shown in Fig. 4, sun is a text label in the image data set. The content indexed by a master index node has two parts: (1) the details of all images in the data set that carry this text label; taking Mnode as an example, the content indexed under it is the details of all images that carry the text label sun; (2) sub-index nodes (Cnodel), each holding a corresponding "image dictionary" item ((1)) and its corresponding weight (Nweight_(1)). Taking Cnodel as an example, the sub-index node holds the "image dictionary" item (1) and the corresponding computed weight Nweight_(1). Nweight_(1) reflects the latent relationship between the text label in the master index node and the "image dictionary" item in the sub-index node. The specific computation given by the invention is as follows:

Nweight_{(1)} = p((1) \mid sun) = \frac{p(sun, (1))}{p(sun)} \approx \frac{N((1) \mid sun)}{N(\text{n-gram} \mid sun)}

In the formula, N((1) \mid sun) is the number of (1) items contained in the images with the label sun;

N(\text{n-gram} \mid sun) is the number of all n-grams contained in the images with the label sun.

Each sub-index node indexes the details of all images that both carry the text label of the master index node (Mnode) and contain the "image dictionary" item of the sub-index node. Taking Cnodel as an example, the images indexed under it carry the text label sun and also contain the "image dictionary" item (1).
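By way of non-limiting illustration, the two weights Lweight and Nweight may be sketched as follows. The sketch assumes that each image is a pair of its text labels and the list of n-gram occurrences extracted from it, and that the N(...) quantities are counted over occurrences rather than over images; this reading, the data layout and the function names `nweight` and `lweight` are assumptions.

```python
from typing import List, Sequence, Set, Tuple

Item = Tuple[int, ...]               # an "image dictionary" item, e.g. (1,) or (1, 1)
Image = Tuple[Set[str], List[Item]]  # (text labels, n-gram occurrences), hypothetical layout


def nweight(item: Item, label: str, images: Sequence[Image]) -> float:
    """Nweight = N(item | label) / N(n-gram | label), counted over occurrences."""
    n_item = n_all = 0
    for labels, items in images:
        if label in labels:
            n_item += items.count(item)
            n_all += len(items)
    return n_item / n_all if n_all else 0.0


def lweight(item: Item, label: str, images: Sequence[Image]) -> float:
    """Lweight = p(item | label) * p(label) / p(item)."""
    if not images:
        return 0.0
    total_item = sum(items.count(item) for _, items in images)
    total_all = sum(len(items) for _, items in images)
    if total_item == 0:
        return 0.0  # the item never occurs, so no weight is defined
    p_item = total_item / total_all
    p_label = sum(1 for labels, _ in images if label in labels) / len(images)
    return nweight(item, label, images) * p_label / p_item
```

Under this reading p(label) is counted over images while the other two probabilities are counted over n-gram occurrences, so the resulting Lweight is not guaranteed to lie between 0 and 1.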

Those of ordinary skill in the art will understand that all or part of the steps of the above method may be carried out by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk or an optical disc. Alternatively, all or part of the steps of the above embodiments may be implemented with one or more integrated circuits. Accordingly, each module/unit in the above embodiments may be implemented in the form of hardware or in the form of a software functional module. The present application is not limited to any particular combination of hardware and software.

The above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. An image retrieval method based on an n-gram image index structure, characterized in that the method comprises:

2. The method of claim 1, characterized in that the method further comprises:

3. The method of claim 2, characterized in that the process of building the index based on image n-grams is as follows:

4. The method of any one of claims 1 to 3, characterized in that performing intrinsic text vectorization based on the n-gram image index on the text input by the user means:

5. The method of claim 4, characterized in that performing image retrieval under the index of the text label using the vectorization result, and sorting the retrieved images by similarity and outputting them, means:

6. An image retrieval system based on an n-gram image index structure, characterized in that the system comprises a judging unit, a first indexing unit and a second indexing unit, wherein:

7. The system of claim 6, characterized in that the system further comprises:

an n-gram image index construction unit, which builds an index based on image n-grams, the index comprising an index structure that uses image n-grams as the index and image annotations and image details as the indexed objects, and an index structure that uses image annotations as the index and image n-grams and image details as the indexed objects.

8. The system of claim 7, characterized in that the n-gram image index construction unit is divided into:

9. The system of any one of claims 6 to 8, characterized in that the first indexing unit performing intrinsic text vectorization based on the n-gram image index on the text input by the user means:

10. The system of claim 9, characterized in that the first indexing unit performing image retrieval under the index of the text label using the vectorization result, and sorting the retrieved images by similarity and outputting them, means: