CN101901249A

CN101901249A - Text-based query expansion and sort method in image retrieval

Info

Publication number: CN101901249A
Application number: CN2010101847252A
Authority: CN
Inventors: 张玥杰; 金城; 薛向阳; 岑磊; 彭琳
Original assignee: Fudan University
Current assignee: Fudan University
Priority date: 2009-05-26
Filing date: 2010-05-12
Publication date: 2010-12-01

Abstract

The invention belongs to the field of multimedia information retrieval and relates to a method for thesaurus-based query expansion and result ranking in image retrieval. The method comprises a WordNet-based English word semantic similarity measurement algorithm, a HowNet-based Chinese word semantic similarity measurement algorithm, an expansion-rule-based algorithm for selecting and optimizing query expansion words, and an algorithm for scoring and optimizing retrieval results. The method improves an image search engine by means of text processing methods and semantic network dictionaries: queries are expanded through semantic expansion and user interaction, and retrieval results are ranked by an improved similarity measure. Compared with conventional methods, the method offers high accuracy, high completeness, and low time and space cost. It is significant for efficient image retrieval that uses high-level image semantics over large-scale image data sets, and has broad application value in cross-language and cross-media retrieval.

Description

A text-based query expansion and ranking method for image retrieval

Technical field

The invention belongs to the field of multimedia information retrieval and relates to a retrieval method for one specific medium, the image. Specifically, it relates to a method for thesaurus-based query expansion and result ranking in image retrieval. The method can be used together with content-based image retrieval methods to improve the quality of image search and the user's search experience.

Background art

In recent years, with the development of the Internet and the informatization of society, the volume of digital images has grown rapidly, and massive amounts of image data are produced every day. How to find and access images quickly and accurately, and how to make effective use of them, has become a pressing problem; this is the subject of image retrieval technology. The earliest approach was text-based image retrieval (Text-based Image Retrieval, TBIR). Since the 1990s, research on and applications of content-based image retrieval (Content-based Image Retrieval, CBIR) have advanced significantly, followed by semantics-based, feedback-based, and knowledge-based image retrieval [1]. As an early technique, TBIR depends heavily on image annotations, which is its main limitation; but as a comparatively mature technique, text-based retrieval remains notably fast and reliable. TBIR is therefore still worth studying: borrowing features from other methods, or combining it with them, can yield good results.

Image data itself falls into two classes. One class comes with related textual descriptions: a news image, for example, is usually accompanied by a brief text description when the reporter sends it back. The other class has no textual description. For image libraries with textual descriptions, retrieval systems generally extract keywords from the descriptions and use them as the index. Because years of research on text retrieval have produced many results, such systems often support retrieval based on high-level semantics better than systems that rely purely on low-level image features. However, a textual description is a short account written by the image's author from the author's own understanding and preferences; it differs from annotations made specifically for retrieval, so the content of the two necessarily differs. Keywords extracted from textual descriptions therefore differ from keywords annotated manually for retrieval purposes, which not only lowers precision but may also return many irrelevant results and leave the user at a loss [2]. In addition, in image retrieval applications there are deviations between the user's real information need and the query the user submits, and between the submitted query and the query as understood by the system. These deviations cause mismatches between the retrieved images and the user's query, and even the information the user hopes to obtain. Optimizing user queries through query expansion, so that they express the user's need more accurately and objectively and help the user obtain the required information quickly and accurately, has become an important research topic in information retrieval, and in image retrieval in particular [3,4].

Existing query expansion methods fall roughly into three classes: thesaurus-based, global-analysis-based, and local-analysis-based [5,6,7]. Thesaurus-based methods generally rely on a semantic knowledge dictionary [8,9,10] and expand the query with words that are semantically related to the query words, typically selected according to hypernym-hyponym and synonym relations between words [11,12,13,14,15,16]. This approach depends on a complete semantic system and is independent of the collection to be retrieved [17,18,19]. The basic idea of global analysis is to perform correlation analysis on the words or phrases in the entire collection and to add those most strongly correlated with the query words to the initial query to form a new query [20]. Although this approach can uncover relations between words to the greatest extent, updating it after the collection changes is costly, and as the collection grows its time and space cost may become infeasible. Local analysis uses a two-stage query: a first retrieval is run with the user's initial query, the top N retrieved objects are analyzed to find the more important words, these are combined with the initial query to form a new query, and the new query is used for a second retrieval [20]. This approach is prone to "query drift": when the first retrieval result is poor, words unrelated to the query topic may be added to the initial query, which can seriously reduce precision, even below the level obtained without query expansion.

On the other hand, presenting the desired retrieval results to users and helping them locate the resources they need quickly has always been a goal of image retrieval. When users enter a query, they hope to retrieve the results they most want promptly and to find those results at the top of the result list [21]. In particular, when a large number of results is returned, users' browsing habits show that they care essentially only about the first few results; they are unable or unwilling to go through the later results one by one, and the probability that those are read is almost zero [22]. The ranking of retrieval results therefore directly affects whether users can obtain the resources they need easily, and also determines their satisfaction with the retrieval system [23]. This is especially true of an image search engine, which organizes large numbers of image resources of all kinds and serves as a resource discovery tool for one particular medium. Users of such engines search with a strong sense of purpose and care most about finding the needed resource in the results as early as possible, which places higher demands on ranking in image search engines. The key factor determining the ranking is the engine's ranking strategy, which is one of the most central parts of an image search engine and is key to its success.

Existing general-purpose search engine ranking algorithms can be divided in principle into five kinds: word frequency and position weighting, the Direct Hit algorithm, Alexa's website ranking, Google's ranking algorithm, and similarity algorithms [24,25,26]. Word frequency and position weighting was the main idea behind early search engine ranking; it is the most mature technique and remains the core ranking technique of many search engines. Its advantages are that it is simple, easy to implement, and well suited to structured document data. Direct Hit is a ranking algorithm that emphasizes information quality and user behavior feedback; it satisfies the principle of "user warrant" to some extent while also taking information quality into account. However, because user behavior is fairly arbitrary, the accuracy of the ranking is hard to guarantee. Alexa focuses on publishing worldwide website rankings and mainly considers overall rank and rank within categories. Google's ranking algorithm is the decisive factor behind its excellent search results; it uses PageRank, a method for precisely rating the importance of web pages. The similarity between a query and a retrieved record is another important basis for ranking; the method most commonly used at present treats both the query string and the document as vectors, and must take the lengths of the query and of the retrieved result into account.

The above analysis shows that, in an image search engine queried by user-entered keywords, the keywords are the only query information available to the engine. As long as the image library and the textual descriptions of the images do not change, a given keyword sequence always yields the same retrieval result. Mining as much information as possible from the keyword sequence to assist the query therefore helps the engine understand the user's intent better. Query expansion is one such way of extending the available information. If query expansion can reduce the uncertainty of the user's intent by means of explicit or implicit information, it can bring better retrieval results to some extent. In addition, an image search engine often returns a great many results, and users usually have the patience to look only at the first few dozen. In other words, it is quite important to place the images closest to the user's search intent nearer the front of the returned results. For example, among 100 returned results, a list containing 50 correct results is not necessarily more useful than one containing only 20, so precision (Precision) and recall (Recall) are both important. Practice shows that no user makes use of every search result; users pick only the useful ones, and this is precisely where ranking helps.
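The point about 50 versus 20 correct results can be made concrete with a small calculation. The following is a minimal sketch, not part of the patent: the two result lists are invented for illustration, and `precision_at_k` and `recall_at_k` are hypothetical helper names.

```python
def precision_at_k(ranked_relevance, k):
    """Fraction of the first k results that are relevant (1 = relevant, 0 = not)."""
    top = ranked_relevance[:k]
    return sum(top) / len(top) if top else 0.0


def recall_at_k(ranked_relevance, k, total_relevant):
    """Fraction of all relevant images that appear among the first k results."""
    return sum(ranked_relevance[:k]) / total_relevant if total_relevant else 0.0


# Two hypothetical result lists of 100 images each.
# List A contains 50 relevant images, all in the second half of the list.
list_a = [0] * 50 + [1] * 50
# List B contains only 20 relevant images, all ranked first.
list_b = [1] * 20 + [0] * 80

# A user who looks at the first 20 results only:
print(precision_at_k(list_a, 20))  # 0.0
print(precision_at_k(list_b, 20))  # 1.0
```

Over the whole list, A has the higher precision (0.5 against 0.2), yet a user who stops after the first 20 images sees nothing useful in A. This is why the ranking of the returned results matters as much as their number.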

At the same time, the above analysis also shows that existing query expansion and result ranking algorithms generally originate in text retrieval and target large-scale text information processing [27,28]. Although text-based image retrieval derives from the now mature text retrieval technology, some of those techniques are not applicable and can affect image retrieval negatively. General query expansion or ranking models cannot all be effective for image retrieval; existing research on query expansion and document ranking models for text retrieval still needs to be strengthened and deepened in this respect.

References related to the present invention:

[1] Alan F. Smeaton and Ian Quigley. "Experiments on Using Semantic Distances between Words in Image Caption Retrieval". In Research and Development in Information Retrieval, pp. 174-180, 1996.

[2] Yang Linpeng, Ji Donghong, Tang Li and Niu Zhenyu. "Chinese Information Retrieval based on Terms and Relevant Terms". In ACM Transactions on Asian Language Information Processing, Vol. 4(3): 357-374, September, 2005.

[3] Xiqing Lin and Ximing Chen. New Methods for Query Expansion and Query Re-weighting for Document Retrieval. Master Thesis, Department of Information Engineering, National Science and Technology University, Taiwan, 2005.

[4] YiXuan Hong. Ontological Inference for User Intention Extraction, Query Expansion and Concept-based Retrieval. Master Thesis, Department of Information Engineering, National Dong-hua University, Taiwan, 2004.

[5] C.Ch. Latiri, S. Ben Yahin, J.P. ChevaVet and A. Jaouaa. "Query Expansion using Fuzzy Association Rules between Terms". In Proceedings of the 4th JIM International Conference on Knowledge Discovery and Discrete Mathematics, Mets, France, 2003.

[6] Hsi-Ching Lin, Li-Hui Wang and Shyi-Ming Chen. "Query Expansion for Document Retrieval based on Fuzzy Rules and User Relevance Feedback Techniques". In Expert Systems with Applications, Vol. 31(2): 397-405, August 2006.

[7] Hang Cui, Ji-Rong Wen, Jian-Yun Nie and Wei-Ying Ma. "Query Expansion by Mining User Logs". In IEEE Transactions on Knowledge and Data Engineering, Vol. 15(4): 829-839, July/August 2003.

[8] Christiane Fellbaum (ed.). WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, 1998.

[9] George A. Miller and Florentina Hristea. "WordNet Nouns: Classes and Instances". In Computational Linguistics, Vol. 2(1): 1-3, 2006.

[10] Dong Zhendong, Dong Qiang and Hao Changling. "Theoretical Findings of HowNet". In Journal of Chinese Information Processing, Vol. 21(4): 3-9, 2007.

[11] Zhiguo Gong, Chan Wa Cheang, and Leong Hou U. "Web Query Expansion by WordNet". In Lecture Notes in Computer Science Volume 4080/2006, pp. 379-388, 2005.

[12] Ming-Hung Hsu, Ming-Feng Tsai and Hsin-Hsi Chen. "Query Expansion with ConceptNet and WordNet: An Intrinsic Comparison". In Proceedings of AIRS 2006, pp. 1-13, 2006.

[13] Alexander Budanitsky and Graeme Hirst. "Evaluating WordNet-based Measures of Lexical Semantic Relatedness". In Computational Linguistics, Vol. 32(1): 13-47, 2006.

[14] Liu Qun and Li Sujian. "Word Semantic Similarity Computation Based on HowNet". In Computational Linguistics and Chinese Language Processing, Vol. 7(2): 59-76, 2002.

[15] Li Feng and Li Fang. "Chinese Word Semantic Similarity Computation Based on HowNet". In Journal of Chinese Information Processing, Vol. 21(3): 99-105, 2007.

[16] Jiang Min, Xiao Shibin, Wang Hongwei and Shi Shuicai. "An Improved Word Semantic Similarity Computation Based on HowNet". In Journal of Chinese Information Processing, Vol. 22(5): 84-89, 2008.

[17] Diana Inkpen and Graeme Hirst. "Building and Using a Lexical Knowledge Base of Near-Synonym Differences". In Computational Linguistics, Vol. 32(2): 223-262, 2006.

[18] Ted Pedersen, Satanjeev Banerjee and Siddharth Patwardhan. "Maximizing Semantic Relatedness to Perform Word Sense Disambiguation". In University of Minnesota Supercomputing Institute Research Report UMSI 2005/25, March, 2005.

[19] Alexander Budanitsky and Graeme Hirst. "Semantic Distance in WordNet: An Experimental, Application-Oriented Evaluation of Five Measures". In Proceedings of the Workshop on WordNet and Other Lexical Resources, The Second Meeting of the North American Chapter of the Association for Computational Linguistics, Pittsburgh, PA, pp. 29-34, 2001.

[20] Yuen-Hsien Tseng, Da-Wei Juang and Shiu-Han Chen. "Global and Local Expansion Term Expansion for Text Retrieval". In Proceedings of the Fourth NTCIR Workshop on Evaluation of Information Retrieval, Automatic Text Summarization and Question Answering, Tokyo, Japan, June 2-4, 2004.

[21] H. Vernon Leighton, Jaideep Srivastava. Precision among World Wide Web Search Services (Search Engines): AltaVista, Excite, Hotbot, Infoseek, Lycos. September, 2006. http://www.winona.msus.edu/library/webind2/webind2.htm.

[22] Claudio Carpineto, Giovanni Romano and Vittorio Giannini. "Improving Retrieval Feedback with Multiple Term-Ranking". In ACM Transactions on Information Systems, Vol. 20(3): 259-290, July, 2002.

[23] Kemafor Anyanwu, Angela Maduko and Amit Sheth. "SemRank: Ranking Complex Relationship Search Results on the Semantic Web". In Proceedings of WWW 2005, pp. 117-127, Chiba, Japan, May 10-14, 2005.

[24] Shengli Wu and Fabio Crestani. "Methods for Ranking Information Retrieval Systems without Relevance Judgements". In Proceedings of the 2003 ACM Symposium on Applied Computing, pp. 811-816, Melbourne, Florida, USA, 2003.

[25] Boanerges Aleman-Meza, Chris Halaschek, I. Budak Arpinar and Amit Sheth. "Context-Aware Semantic Association Ranking". In Technical Report 03-010, LSDIS Lab, Computer Science, University of Georgia, August, 2003.

[26] Taher H. Haveliwala. "Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search". In IEEE Transactions on Knowledge and Data Engineering, Vol. 15(4): 784-796, July/August, 2003.

[27] Jinan Fiaidhi, Sabah Mohammed, Jihad Jaam and Ahmad Hasnah. "A Standard Framework for Personalization via Ontology-based Query Expansion". In Pakistan Journal of Information and Technology, Vol. 2(2): 96-103, 2003.

[28] Chris Buckley and Ellen M. Voorhees. "Retrieval Evaluation with Incomplete Information". In Proceedings of SIGIR 2004, pp. 25-32, Sheffield, UK, 2004.

Summary of the invention

The object of the invention is to overcome the deficiencies of the prior art and to propose a method for effective query expansion and retrieval result ranking in text-based image retrieval.

Drawing on text retrieval technology, the present invention establishes a "dedicated query expansion and retrieval result ranking model" suited to the characteristics of image retrieval. Following a "from general to specific" design philosophy, it discloses a technical framework comprising four main algorithms. Starting from a "general query expansion and retrieval result ranking model", it improves the image search engine with text processing methods and semantic network dictionaries: queries are expanded through semantic expansion and user interaction, using the semantic network dictionaries and the expansion rules established here, and retrieval results are ranked by an improved similarity measure, with particular attention to the construction and optimization of the scoring algorithm. The result is a "dedicated query expansion and retrieval result ranking model" applicable to text-based image retrieval.

The query expansion and retrieval result ranking method for image retrieval proposed by the present invention improves an image search engine with text processing methods and semantic network dictionaries. It comprises the following aspects:

(1) Pre-processing and pre-analysis (Pre-Processing and Pre-Analysis): for the initial query, pre-processing performs word segmentation and punctuation tagging; pre-analysis then takes the pre-processed query and performs stop word tagging, part-of-speech analysis, and keyword extraction.

(2) Word semantic similarity measurement (Word Lexical Semantic Similarity Measurement): for English words, the semantic distance is computed from path and depth in the semantic network; for Chinese words, the computation combines the similarity of the primary sememes, the similarity of the semantic expressions, and the similarity of the primary sememe frameworks, and also incorporates a maximum-matching rule and sememe depth information.

(3) Query expansion fusing expansion rules (Query Expansion based on Fusion of Expansion Rules): the keyword sequence derived from the initial query is expanded semantically on the basis of semantic network identifiers, fused with the specific expansion rules established here.

(4) Scoring-based retrieval result ranking (Retrieval Results Ranking based on Scoring): the results returned by the search engine are the objects processed; the "closeness" between the query keyword sequence and each image description is assessed with the word semantic similarity measure to obtain a score, the score is optimized by an improved scoring algorithm, and the final score serves as the basis on which the search engine ranks the returned images.

Compared with the prior art, the method of the present invention, which obtains the final retrieval results from the image search engine with the expanded query, has three major advantages: high accuracy, strong completeness, and low time and space cost. High accuracy: in query expansion, very few expansion words are derived from words that do not share the query's meaning, which ensures that the expansion words have much in common with the initial query keywords; in result ranking, poorly related or "wrong" images returned by the search engine are moved as far back in the result list as possible, which ensures that within the same result set the "better" results are ranked earlier and are easier for the user to see. Strong completeness: in query expansion, the expansion words added to the initial keyword sequence are very complete and are highly relevant to, and consistent with, the image descriptions in the image set; in result ranking, the search behavior of the search engine is not interfered with in any way, so the returned result set is unaffected by the ranking. Low time and space cost: setting aside secondary factors such as network transfer speed and server processing speed, query expansion and result ranking have good time and space efficiency; in a real environment transmission time is far greater than the actual computation time, so users do not perceive the added delay.

The main contributions of the present invention are: (1) a WordNet-based English word semantic similarity measurement algorithm; (2) a HowNet-based Chinese word semantic similarity measurement algorithm; (3) an expansion-rule-based algorithm for selecting and optimizing query expansion words; and (4) an algorithm for scoring and optimizing retrieval results. These four core algorithms are used to design a technical framework for query expansion and retrieval result ranking in image retrieval.

These advantages meet the demands of efficient image retrieval over large-scale image data sets that takes high-level image semantics into account; cross-language and cross-media retrieval are the main applications.

Brief description of the drawings

Fig. 1 is a flow diagram of the framework of the method. Labels in the figure: (1) the pre-processing and pre-analysis module; (2) the word semantic similarity measurement module; (3) the module for query expansion fusing expansion rules; (4) the scoring-based retrieval result ranking module.

Fig. 2 demonstrates the steps of the above framework with a concrete example, showing the intermediate output of each algorithm module and the final retrieval results to give an intuitive understanding. In the figure, labels (1) and (2) are the original English query and Chinese query entered by the user; labels (3) and (4) are the expansion word sets corresponding to the original English and Chinese queries, obtained with the English thesaurus WordNet and the Chinese thesaurus HowNet by the "expansion-rule-based algorithm for selecting and optimizing query expansion words"; labels (5) and (6) are the retrieval results obtained from the image search engine with the expansion word sets of the original English and Chinese queries; labels (7) and (8) are the final retrieval results obtained from the initial result lists for the original English and Chinese queries by the "algorithm for scoring and optimizing retrieval results", which incorporates the "WordNet-based English word semantic similarity measurement algorithm" and the "HowNet-based Chinese word semantic similarity measurement algorithm".

Detailed description of the embodiments

The framework by which the present invention performs query expansion and retrieval result ranking in image retrieval, and the four core algorithms that make up this framework, are described in detail below with reference to the drawings:

Embodiment 1

1. Algorithm framework

Fig. 1 is the flow diagram of the framework; labels 1-4 denote the four main functional modules described above.

The framework is divided into four main modules: pre-processing and pre-analysis (Pre-Processing and Pre-Analysis), word semantic similarity measurement (Concept Semantic Relativity Measurement), query expansion fusing expansion rules (Query Expansion based on Fusion of Expansion Rules), and scoring-based retrieval result ranking (Retrieval Results Ranking based on Scoring). Three of the four core algorithms, namely the WordNet-based English word semantic similarity measurement algorithm, the HowNet-based Chinese word semantic similarity measurement algorithm, and the algorithm for scoring and optimizing retrieval results, are used in the scoring-based retrieval result ranking module; the expansion-rule-based algorithm for selecting and optimizing query expansion words is used in the module for query expansion fusing expansion rules. The first two modules of the framework also make use of some relatively mature existing techniques.

(1) Pre-processing and pre-analysis (Pre-Processing and Pre-Analysis): for the initial query, the main task of pre-processing is word segmentation and punctuation tagging. A maximum-probability segmentation strategy is used for Chinese queries, and for English queries the case of each word's initial letter is additionally normalized. Pre-analysis then performs three tasks on the pre-processed query. First, the stop words in the query are tagged. Second, part-of-speech analysis is performed on each word to determine its correct part of speech; inflected words in English queries must also be restored to their base form. Third, query terms are extracted as keywords according to the part-of-speech analysis.
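The pre-processing and pre-analysis steps for an English query can be sketched as follows. This is a minimal illustration under stated assumptions rather than the patent's implementation: the stop word list and the `LEMMAS` table are tiny hypothetical stand-ins for real resources, the function names are invented, and part-of-speech analysis is omitted because it needs a trained tagger.

```python
import re

# Hypothetical stop word list; a real system would use a much larger one.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "with", "and"}

# Hypothetical table standing in for a real morphological analyser.
LEMMAS = {"dogs": "dog", "running": "run", "beaches": "beach"}


def preprocess(query):
    """Segment the query, normalise case, and tag each token as WORD or PUNCT."""
    tokens = re.findall(r"\w+|[^\w\s]", query)
    return [(t.lower(), "WORD" if t.isalnum() else "PUNCT") for t in tokens]


def preanalyze(tagged_tokens):
    """Drop punctuation and stop words, restore base forms, return the keywords."""
    keywords = []
    for token, kind in tagged_tokens:
        if kind != "WORD" or token in STOP_WORDS:
            continue
        keywords.append(LEMMAS.get(token, token))
    return keywords


tagged = preprocess("Dogs running on the beach.")
print(preanalyze(tagged))  # ['dog', 'run', 'beach']
```

A Chinese query would need a word segmenter in place of the regular expression, since Chinese text has no spaces between words.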

(2) Word semantic similarity measurement (Word Lexical Semantic Similarity Measurement): word semantic similarity reflects how closely words cluster together and can be measured by the degree to which two words are interchangeable. Semantic distance can be regarded as the inverse of semantic similarity: the greater the semantic distance between two words, the lower their similarity, and the smaller the distance, the higher their similarity. For English words, the measurement is based on the organization of the WordNet thesaurus and mainly considers the synonym sets corresponding to the concepts: the pair of synonym sets with the greatest similarity (or smallest semantic distance) is taken as the final result for the two words, while the particular time and space cost requirements of an image search engine are also taken into account. For Chinese words, the measurement is based on the organization of the HowNet thesaurus; following the structure of its knowledge description language, the similarity is computed in three parts. The first part considers the similarity of the primary sememes alone. The second part considers the similarity of the whole semantic expression: the sememes are divided into levels according to the hierarchical character of the knowledge description language, and the similarity of each level is then computed by maximum matching. The third part considers the similarity of the primary sememe frameworks. In addition, sememe depth information is included when sememe similarity is computed, so that sememes carrying different amounts of information are treated differently.
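The rule of taking the most similar pair of synonym sets as the similarity of two English words can be sketched on a toy hierarchy. This is only an illustration: the taxonomy and sense table are hypothetical, and the depth-based formula used for a pair of senses is the well-known Wu-Palmer measure, chosen here as a stand-in because the text says only that path and depth are used.

```python
# Hypothetical hypernym tree (child -> parent); a stand-in for WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "artifact": "entity",
    "car": "artifact",
}

# Hypothetical word -> senses table; "jaguar" is ambiguous (animal or car).
SENSES = {"dog": ["dog"], "cat": ["cat"], "car": ["car"], "jaguar": ["cat", "car"]}


def path_to_root(node):
    """Return the chain of nodes from `node` up to the root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(node))


def sense_similarity(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b))
    # The first shared node on a's path is the deepest common ancestor.
    lcs = next(n for n in path_to_root(a) if n in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def word_similarity(w1, w2):
    """Similarity of two words: the maximum over all pairs of their senses."""
    pairs = ((a, b) for a in SENSES.get(w1, []) for b in SENSES.get(w2, []))
    return max((sense_similarity(a, b) for a, b in pairs), default=0.0)


print(word_similarity("car", "jaguar"))  # 1.0
```

For "dog" and "jaguar" the animal sense wins (2 * 2 / 6, about 0.67) over the car sense (2 * 1 / 6, about 0.33), which shows how taking the maximum selects the reading under which the two words are closest.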

(3) Query expansion fusing expansion rules (Query Expansion based on Fusion of Expansion Rules): for a given query keyword, an expansion rule, that is, a semantic expansion rule for words, determines which other words serve as its expansion words. Once this rule is fixed, semantic expansion can be understood as expanding each word of the keyword sequence separately on the basis of semantic network identifiers and then merging the results. Because user input is arbitrary, the order of the words in the keyword sequence is ignored and all words are treated alike. In addition, the keyword set obtained after query expansion is optimized with regard both to the image descriptions in the image data set and to the size of the expansion word set.
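The expand-then-merge procedure described above can be sketched as follows. The sketch makes several assumptions that are not in the patent text: the `RELATED` table is a hypothetical miniature semantic network, the rule "synonyms and hypernyms only" is an invented example of an expansion rule, and the optimization step is reduced to keeping words that occur in the image descriptions and capping the number of added words.

```python
# Hypothetical semantic network: word -> {relation: [related words]}.
RELATED = {
    "dog": {"synonym": ["hound"], "hypernym": ["canine", "animal"], "hyponym": ["puppy"]},
    "beach": {"synonym": ["seashore"], "hypernym": ["shore"]},
}

# Hypothetical expansion rule: only these relations may produce expansion words.
ALLOWED_RELATIONS = ("synonym", "hypernym")


def expand_keyword(word):
    """Expansion words for one keyword under the expansion rule."""
    relations = RELATED.get(word, {})
    return {w for rel in ALLOWED_RELATIONS for w in relations.get(rel, [])}


def expand_query(keywords, caption_vocabulary, max_terms):
    """Expand every keyword separately, merge, then prune the merged set."""
    original = set(keywords)  # word order in the query is ignored
    candidates = set()
    for word in original:
        candidates |= expand_keyword(word)
    candidates -= original
    # Pruning: keep only words that occur in the image descriptions and cap
    # the number of added words (alphabetical order is an arbitrary choice).
    kept = sorted(candidates & caption_vocabulary)[:max_terms]
    return original | set(kept)


vocabulary = {"hound", "animal", "shore", "puppy", "sea"}
print(sorted(expand_query(["dog", "beach"], vocabulary, 2)))
# ['animal', 'beach', 'dog', 'hound']
```

A real system would presumably rank the candidates by semantic similarity to the query before applying the cap, rather than alphabetically.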

(4) Scoring-based retrieval result ranking (Retrieval Results Ranking based on Scoring): for the results returned by the search engine, each image is scored according to the query keyword sequence and the image description, and this score serves as the basis for ranking the returned images. What is scored is in fact the image description; the image itself is not interpreted in any way, so this is a scoring scheme that trusts and relies entirely on the image description. Supported by the computation of word semantic closeness, the score measures how close the query keyword sequence is to the keyword sequence of the image description. Each keyword of the image description is given a weight in the computation, so as to emphasize the objects that are likely to be prominent in the image. In addition, a multi-layer caching mechanism is set up by caching similarity results in caches of suitable size, which improves the scoring strategy, reduces computational complexity, saves a great deal of time, and raises processing speed.
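A minimal sketch of scoring and ranking is given below. It rests on assumptions that go beyond the text: the `SIMILAR` table is a hypothetical stand-in for the word similarity measure, the score is taken as the average, over query keywords, of the best weighted match in the description, and a single `functools.lru_cache` stands in for the multi-layer cache.

```python
from functools import lru_cache

# Hypothetical similarity values standing in for the WordNet/HowNet measures.
SIMILAR = {frozenset(("dog", "puppy")): 0.8, frozenset(("beach", "shore")): 0.9}


@lru_cache(maxsize=4096)  # cache repeated word-pair computations
def word_similarity(w1, w2):
    if w1 == w2:
        return 1.0
    return SIMILAR.get(frozenset((w1, w2)), 0.0)


def score_image(query_keywords, caption_keywords):
    """Average, over query keywords, of the best weighted match in the caption.

    caption_keywords is a list of (word, weight) pairs; the weight marks how
    prominent the word's object is assumed to be in the image.
    """
    if not query_keywords:
        return 0.0
    total = 0.0
    for q in query_keywords:
        total += max((w * word_similarity(q, c) for c, w in caption_keywords), default=0.0)
    return total / len(query_keywords)


def rank(query_keywords, results):
    """Sort the engine's results by descending score; the result set is unchanged."""
    return sorted(results, key=lambda r: score_image(query_keywords, r["keywords"]), reverse=True)


results = [
    {"id": "a", "keywords": [("car", 1.0)]},
    {"id": "b", "keywords": [("puppy", 1.0), ("shore", 0.5)]},
    {"id": "c", "keywords": [("dog", 1.0), ("beach", 1.0)]},
]
print([r["id"] for r in rank(("dog", "beach"), results)])  # ['c', 'b', 'a']
```

The ranking only reorders what the search engine returned, which matches the statement that the engine's own search behavior is left untouched.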

2. WordNet-based English word semantic similarity measurement algorithm

The WordNet-based English word semantic similarity measurement algorithm is motivated by the following idea: when ranking image search results online, the system should be able to compute dynamically the semantic similarity between the English query keyword sequence and the keyword sequence of each retrieval result, while keeping the time and space cost at a suitable order of magnitude and without entirely losing the statistical information of a "general English word semantic similarity measurement model". That is, the aim is to revise the original word semantic similarity measurement model using synonym sets obtained, after suitable processing, from a specific English semantic network knowledge source. However, the general model suffers from data sparseness and from restrictions on the parts of speech it can handle, so it cannot combine synonym sets with the semantic computation model to measure word semantic similarity accurately during online processing. The present invention therefore provides a new algorithm that revises the original general model using the synonym sets obtained after special expansion processing. The new algorithm satisfies the following three conditions simultaneously:

(1) Avoid the data sparseness problem: the number of words making up a word's semantic definition is often insufficient, which causes data sparseness during semantic computation. To address this, the original general English word semantic similarity measurement model is revised using the word sets obtained after the special expansion process.

(2) No restriction on part of speech: the algorithm must not be applicable only to similarity between nouns; it should be able to measure semantic similarity between words of different parts of speech.

(3) Low time and space complexity: algorithms of high time and space complexity must not be used during processing. For example, probability-based methods have to read and process large amounts of statistical data, so algorithm improvement and optimization must be considered in order to meet the time and space cost requirement.

The present invention designs a new algorithm that meets the above three conditions by the following steps.

Regarding English word semantic similarity measurement, the classical Lesk algorithm treats a word's semantic definition as an unordered bag of words and measures similarity by the word overlap between definitions. Lesk's premise is that semantically close words are defined with similar words; however, the number of words in a definition is often insufficient, which causes data sparseness. To address this problem, the present invention turns to expansion algorithms.

Lesk expansion algorithms overcome the data sparseness of the classical Lesk algorithm to some extent by expanding the semantic definition of a word. Ekedahl and Golub adapted the Lesk algorithm using WordNet: by looking up the two nearest hypernyms of a concept, they expand the word semantic definition used to compute the overlap count. Pedersen et al. adopt another expansion method, which considers all semantic definitions directly linked to a word in the WordNet structure, including hypernyms and hyponyms, and gives phrases greater weight. The authors state that, under the same conditions, this algorithm performs significantly better than the traditional Lesk algorithm. The information most commonly used by Lesk expansion algorithms is hypernym information, i.e. the parent node in WordNet, which is a further abstraction of the word's meaning.

Existing Lesk expansion algorithms mainly use the semantic information directly linked to a word in the WordNet hierarchy, especially hypernyms, to expand the word semantic definition, which overcomes data sparseness to some extent. Although these methods make effective use of the direct information in the WordNet structure, they neglect some very useful indirect information. The invention therefore establishes a Lesk expansion algorithm based on coordinate terms (abbreviated Lesk-C) that expands word semantic definitions further. The coordinate terms of a word are the sibling nodes, in the WordNet hierarchy, of the synset to which the word belongs (for example, the coordinate terms of "basketball" include "football" and "volleyball"). Clearly, a synset and its coordinate terms necessarily share a common parent node.

The Lesk-C algorithm expands the semantic definition of a word sense by bringing in the definitions of all (or some) of its coordinate terms. It rests on the hypothesis that a concept and its coordinate terms play the same role in determining context. Under this hypothesis, take "basketball", "football" and "volleyball" as a group of coordinate terms: expanding the definition of "volleyball" with the definitions of "basketball" and "football" clearly increases the chance that words overlap. Coordinate terms are sibling nodes in WordNet; although they are neither abstractions of the original sense nor directly linked to it, under the hypothesis above they are as meaningful as hypernyms.
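The idea of expanding a definition with its coordinate terms can be sketched as follows. This is a minimal illustration, not the patented implementation: the glosses, the parent table and the stop-word list are invented for the example and are not taken from WordNet, and the helper names (`gloss_words`, `coordinate_terms`, `lesk_overlap`) are hypothetical.

```python
# Toy glosses and hierarchy; all entries are illustrative, not WordNet data.
GLOSSES = {
    "volleyball": "game in which two teams hit an inflated ball over a high net",
    "basketball": "game played on a court by two teams who throw a ball through a hoop",
    "football": "game in which two teams kick or carry a ball toward a goal",
    "tennis": "game played with rackets by players who hit a ball over a net on a court",
}
PARENT = {
    "volleyball": "ball_game",
    "basketball": "ball_game",
    "football": "ball_game",
    "tennis": "racket_game",
}
STOPWORDS = {"a", "an", "the", "in", "on", "by", "or", "of", "who",
             "which", "with", "over", "through", "toward"}


def gloss_words(term):
    """Definition of a term as an unordered bag (set) of content words."""
    return {w for w in GLOSSES[term].split() if w not in STOPWORDS}


def coordinate_terms(term):
    """Siblings of a term: other terms that share its parent node."""
    return [t for t, p in PARENT.items() if p == PARENT[term] and t != term]


def expanded_gloss(term):
    """Lesk-C style expansion: own gloss plus the glosses of coordinate terms."""
    words = set(gloss_words(term))
    for sibling in coordinate_terms(term):
        words |= gloss_words(sibling)
    return words


def lesk_overlap(a, b, expand=False):
    """Number of words shared by the two (optionally expanded) definitions."""
    ga = expanded_gloss(a) if expand else gloss_words(a)
    gb = expanded_gloss(b) if expand else gloss_words(b)
    return len(ga & gb)
```

With these toy glosses, `lesk_overlap("volleyball", "tennis")` is 4 and `lesk_overlap("volleyball", "tennis", expand=True)` is 6, because "played" and "court" enter the expanded definition of "volleyball" through "basketball".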

For the complete word semantic definitions obtained after Lesk-C expansion, each word belongs to several synsets, so measuring the semantic relatedness (or semantic distance) between two concepts amounts to a computation over two groups of synsets. In general, the pair of synsets with the greatest similarity (or the smallest semantic distance) is taken to represent the two groups in the final result.

The notational conventions and symbols used in the following description are defined below:

(1) The path distance between two synsets S₁ and S₂ on the WordNet semantic network is the number of edges on the path from S₁ to S₂, denoted Len(S₁, S₂).

(2) When only edges of the hypernym/hyponym type are considered, the WordNet semantic network reduces to a forest. After a virtual root node is added, the forest becomes a single tree. The lowest common parent node (lowest super-ordinate) of two synsets S₁ and S₂ in this hypernym/hyponym tree is denoted Lcs(S₁, S₂), and the depth of a node in the tree is denoted Depth().

(3) The concept semantic similarity Sim(C₁, C₂) and the semantic distance Dist(C₁, C₂) are related by:

Sim(C_{1}, C_{2}) + Dist(C_{1}, C_{2}) = 1 \qquad (1)

In measuring concept semantic similarity, the WordNet hierarchy is treated as a graph and path information is used to compute relatedness. The most direct idea is that the closer two nodes are, the more related they are. In other words, the closer the common hypernym of the concepts represented by two nodes lies to them, the greater their similarity. The similarity formula used here is:

Sim(C_{1}, C_{2}) = Sim(S_{1}, S_{2}) = \frac{2 \times Depth(Lcs(S_{1}, S_{2}))}{Depth(S_{1}) + Depth(S_{2})} \qquad (2)

where Depth() is the depth of concept C or synset S in the WordNet hierarchy, and Lcs(S₁, S₂) is the deepest of all common hypernyms of concepts C₁ and C₂ (or of synsets S₁ and S₂).

Combined with formula (1), this formula can be rearranged into the following expression for the semantic distance:

Dist(C_{1}, C_{2}) = \frac{Len(C_{1}, Lcs(C_{1}, C_{2})) + Len(C_{2}, Lcs(C_{1}, C_{2}))}{Len(C_{1}, Lcs(C_{1}, C_{2})) + Len(C_{2}, Lcs(C_{1}, C_{2})) + 2 \times Depth(Lcs(C_{1}, C_{2}))} \qquad (3)
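The step from formula (2) to formula (3) can be made explicit. The following is a sketch under one assumption: in the hypernym/hyponym tree, the depth of a node equals the depth of the lowest common parent plus the path length from the node to that parent. The shorthand symbols D, L_1 and L_2 are introduced here only for readability and do not appear in the original text.

```latex
% Shorthand (introduced for this sketch only):
%   D   = Depth(Lcs(C_1, C_2))
%   L_i = Len(C_i, Lcs(C_1, C_2)),  i = 1, 2
% Assumption: Depth(C_i) = D + L_i in the hypernym/hyponym tree.
\begin{aligned}
Sim(C_{1}, C_{2}) &= \frac{2D}{Depth(C_{1}) + Depth(C_{2})}
                   = \frac{2D}{L_{1} + L_{2} + 2D} \\
Dist(C_{1}, C_{2}) &= 1 - Sim(C_{1}, C_{2})
                    = \frac{L_{1} + L_{2}}{L_{1} + L_{2} + 2D}
\end{aligned}
```

The last expression is formula (3) written with the shorthand.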

English words are often polysemous, so word semantic similarity should be computed between concepts (word senses, or semantic definitions); the semantic similarity of two isolated words is the maximum similarity over all pairs of their concepts.

Sim(W_{1}, W_{2}) = \max_{i = 1, \ldots, n; \; j = 1, \ldots, m} Sim(C_{1i}, C_{2j}) \qquad (4)

where W₁ denotes word 1, which has n concepts; W₂ denotes word 2, which has m concepts; C₁ᵢ is the i-th concept of W₁ and C₂ⱼ is the j-th concept of W₂.

The steps of the above algorithm are described below in pseudocode:

(1) Obtain the input: two words W₁ and W₂.

(2) Select two concepts C₁ᵢ and C₂ⱼ.

(3) Look up the WordNet semantic network file to obtain the two synsets S₁ᵢ and S₂ⱼ that represent C₁ᵢ and C₂ⱼ respectively.

(4) According to formulas (1) to (3), compute the semantic distance Dist(C₁ᵢ, C₂ⱼ) from S₁ᵢ and S₂ⱼ.

(5) Repeat steps (2) to (4) to obtain the similarity (semantic distance) for every pair of concepts of the two words. According to formula (4), select the maximum as the final word similarity value.

When computing Dist(C₁ᵢ, C₂ⱼ), only hypernym/hyponym relations are used.
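The five steps above can be sketched on a toy hierarchy. This is a hedged illustration only: the `PARENT` tree and the `SENSES` table are hypothetical stand-ins for the WordNet semantic network file, and all function names are invented for the example.

```python
# Hypothetical is-a tree (child -> parent); "ROOT" is the added virtual root.
PARENT = {
    "entity": "ROOT",
    "animal": "entity",
    "person": "entity",
    "feline": "animal",
    "canine": "animal",
    "tiger_animal": "feline",
    "lion": "feline",
    "dog": "canine",
    "tiger_person": "person",
}
# Hypothetical sense inventory: word -> synsets representing its concepts.
SENSES = {"tiger": ["tiger_animal", "tiger_person"], "lion": ["lion"], "dog": ["dog"]}


def path_to_root(node):
    """Nodes from `node` up to the virtual root, inclusive."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def depth(node):
    """Depth in the tree; the virtual root has depth 0."""
    return len(path_to_root(node)) - 1


def lcs(s1, s2):
    """Lowest common parent: first ancestor of s2 that is also an ancestor of s1."""
    ancestors = set(path_to_root(s1))
    for node in path_to_root(s2):
        if node in ancestors:
            return node
    raise ValueError("nodes are not in the same tree")


def dist(s1, s2):
    """Semantic distance of formula (3), using hypernym/hyponym edges only."""
    common = lcs(s1, s2)
    len1 = depth(s1) - depth(common)
    len2 = depth(s2) - depth(common)
    denominator = len1 + len2 + 2 * depth(common)
    return (len1 + len2) / denominator if denominator else 0.0


def word_similarity(w1, w2):
    """Formula (4): maximum of Sim = 1 - Dist over all concept pairs."""
    return max(1.0 - dist(s1, s2) for s1 in SENSES[w1] for s2 in SENSES[w2])
```

On this toy tree, `word_similarity("tiger", "lion")` is 0.75 (the animal sense wins over the person sense, which scores 2/7), and `word_similarity("tiger", "dog")` is 0.5.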

3. Chinese word semantic similarity metric algorithm based on HowNet

The Chinese word semantic similarity metric algorithm based on HowNet is motivated by the following consideration: while ranking image search results online, the system should be able to compute dynamically the semantic similarity between the Chinese query keyword sequence and the keyword sequences of the retrieval results, at an acceptable time and space cost, while taking full account of the many difficulties and complexities of the Chinese language. That is, it is desirable to use the multi-level descriptions of Chinese word concepts in a specific semantic network knowledge source, extract rich semantic information from them, and establish a measurement mechanism that agrees better with human subjective judgment. However, the "general Chinese word semantic similarity measurement model" cannot fully capture the intrinsic associations between word concepts and suffers from domain imbalance and data sparseness, so online processing tends to compute the similarity of the word concepts themselves and pays little attention to their different sememes. The invention therefore provides a new algorithm that revises the general Chinese word semantic similarity measurement model using word concept semantic descriptions that are correct, unbiased and complete. The new algorithm satisfies the following conditions simultaneously:

(1) Avoid the data sparseness problem: the number of words making up a concept's semantic definition is often insufficient, which causes data sparseness during semantic computation. To address this, the original general Chinese word semantic similarity measurement model is revised using the multi-level description of word concept semantics together with additional auxiliary information.

(2) High discriminative power: the algorithm should make effective use of the knowledge structure of the Chinese semantic network so that different word pairs are distinguished at different similarity levels.

The design of a new algorithm that meets the above conditions is described below.

To cope with the complexity of word concept semantic descriptions and to ensure their consistency and accuracy, HowNet adopts a knowledge description specification, the Knowledge Database Mark-up Language (KDML), which has the following four important components.

(1) Sememes: the words used in KDML are called sememes, for example "exercise|exercises" and "sport|physical culture", and are organized according to the KDML syntax rules. Sememes are unambiguous; they are "the most basic units of meaning, not readily divisible further", extracted from Chinese characters (including single-morpheme words), and are the smallest units of description.

(2) Primary sememe: the first sememe in a semantic expression is also called the primary (main-class) sememe; in the preceding example, "exercise|exercises" is the primary sememe. The primary sememe must indicate the most basic meaning of the concept and can be regarded as having the strongest descriptive power for the concept.

(3) Semantic expression: "DEF={...}" is the core of the whole record. It is the definition and description of the concept and is called the semantic expression. To cope with the complexity of concept descriptions and ensure their consistency and accuracy, it is standardized with KDML.

(4) Primary sememe framework: briefly, most sememes are themselves given a semantic expression definition, just as words are, as in the following example. For the sememe "thing|all things on earth", the primary sememe framework is "{entity|entity: {ExistAppear|deposits cash: existent={～}}}", whose description strictly follows the KDML syntax.

In the word concept semantic descriptions built with KDML, sememes at different bracket levels differ in their descriptive power for the word's semantic definition: the further out the brackets in which a sememe sits, the stronger its descriptive power for the concept. Conversely, a sememe in inner brackets is a specific explanation of the sememe one level up; it describes the concept only indirectly, and its descriptive power is relatively weak. When measuring word semantic similarity it is therefore necessary to treat them differently.

As the foundation of word similarity measurement, sememe similarity is computed from the sememe hierarchy (the hypernym/hyponym relation). Based on the tree-shaped hierarchy, the path length between nodes is considered together with the depth of the nodes, giving the following sememe similarity formula.

Sim(S_{1}, S_{2}) = \frac{\alpha \times \min(Depth(S_{1}), Depth(S_{2}))}{\alpha \times \min(Depth(S_{1}), Depth(S_{2})) + Dist(S_{1}, S_{2})} \qquad (1)

where S₁ and S₂ denote two sememes; Dist(S₁, S₂) is the path length between sememes S₁ and S₂; α is an adjustable parameter; Depth(S₁) and Depth(S₂) are the hierarchy depths of S₁ and S₂; and min(Depth(S₁), Depth(S₂)) is the smaller of the two depths. Sememes carry different amounts of semantic information: nodes lower in the hierarchy carry richer semantic information, while nodes higher up are more abstract, so sememes at different levels should be treated differently.
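The sememe similarity formula can be illustrated on a small tree. This is only a sketch: the `SEMEME_PARENT` table is a made-up fragment rather than the HowNet sememe hierarchy, the root is given depth 1 so that the formula is defined for every pair (an assumption of this example), and the default `alpha=1.0` is an arbitrary placeholder for the adjustable parameter α.

```python
# Hypothetical sememe hierarchy (child -> parent); "entity" is the root.
SEMEME_PARENT = {
    "thing": "entity",
    "animate": "thing",
    "artifact": "thing",
    "human": "animate",
    "beast": "animate",
}


def ancestors(sememe):
    """Path from a sememe up to the root, inclusive."""
    path = [sememe]
    while sememe in SEMEME_PARENT:
        sememe = SEMEME_PARENT[sememe]
        path.append(sememe)
    return path


def depth(sememe):
    """Hierarchy depth; the root has depth 1 (assumption of this sketch)."""
    return len(ancestors(sememe))


def path_length(s1, s2):
    """Number of edges between two sememes via their lowest common ancestor."""
    up1 = ancestors(s1)
    for steps2, node in enumerate(ancestors(s2)):
        if node in up1:
            return up1.index(node) + steps2
    raise ValueError("sememes are not in the same tree")


def sememe_similarity(s1, s2, alpha=1.0):
    """Formula (1): alpha*min_depth / (alpha*min_depth + path length)."""
    base = alpha * min(depth(s1), depth(s2))
    return base / (base + path_length(s1, s2))
```

With `alpha=1.0`, two siblings deep in the tree such as "human" and "beast" score 4 / (4 + 2) ≈ 0.667, while "human" and "artifact", whose common ancestor lies higher, score 3 / (3 + 3) = 0.5.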

Chinese words are often polysemous, so word semantic similarity should be computed between word concepts; the semantic similarity of two isolated words (words not placed in a particular context) is the maximum similarity over all pairs of their concepts.

Sim(W_{1}, W_{2}) = \max_{i = 1, \ldots, n; \; j = 1, \ldots, m} Sim(C_{1i}, C_{2j}) \qquad (2)

where word W₁ has n concepts and word W₂ has m concepts; C₁ᵢ is the i-th concept of W₁ and C₂ⱼ is the j-th concept of W₂. Following the structural characteristics of KDML, concept semantic similarity is computed in three parts:

Sim(C_{1}, C_{2}) = w_{1} \times P_{1} + w_{2} \times P_{2} + w_{3} \times P_{3} \qquad (3)

where P₁ is the similarity between the primary sememes of the two concepts; P₂ is the similarity between the whole semantic expressions; P₃ is the similarity between the primary sememe frameworks of the two DEFs; and w₁, w₂ and w₃ are the weights of the three parts, which must satisfy w₁ + w₂ + w₃ = 1, w₂ > w₁ and w₂ > w₃.

P₁ is computed by formula (1). As explained above, the primary sememe has the most direct semantic descriptive power for a concept, so treating it as a separate part is well justified.

For P₂: a semantic expression is a complete unit with its own syntax rules, so it is necessary to treat it as a whole and compute its semantic similarity with reference to the KDML rules. Because the whole semantic expression has to be considered, this is the most complex part of the measurement and carries the largest weight. The computation has two stages: sememes are first divided into layers according to the hierarchical nature of the KDML description, and similarity is then computed within each layer by maximum matching. First, the semantic similarity of every sememe pair is computed and the pair with the largest value is selected; if several pairs have the same similarity, any one of them may be chosen. Next, the pair with the largest similarity among the remaining sememes is selected, and so on. When the two concepts have different numbers of sememes in the same layer, some sememes are paired with an empty element; in that case a uniform small value r (a preset parameter) is used. Finally, the similarities of the selected pairs are summed and averaged to give P₂.
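The two-stage computation of P₂ can be sketched as a greedy matching routine. This is an illustration under stated assumptions, not the patented implementation: each concept is represented as a list of layers (outer brackets first), the pairwise similarity function `sim` is supplied by the caller, `r=0.2` is an arbitrary placeholder for the preset parameter r, and averaging over the pairs of all layers together is one possible reading of the text. The names `match_layer`, `expression_similarity` and `toy_sim` are hypothetical.

```python
from itertools import zip_longest


def match_layer(layer1, layer2, sim, r=0.2):
    """Greedy maximum matching of two sememe lists; returns the selected scores."""
    left, right = list(layer1), list(layer2)
    scores = []
    while left and right:
        # Select the remaining pair with the highest similarity (ties: first found).
        a, b = max(((x, y) for x in left for y in right),
                   key=lambda pair: sim(pair[0], pair[1]))
        scores.append(sim(a, b))
        left.remove(a)
        right.remove(b)
    # Sememes left without a partner are paired with an empty element and score r.
    scores.extend([r] * (len(left) + len(right)))
    return scores


def expression_similarity(layers1, layers2, sim, r=0.2):
    """Average of the selected pair scores over all layers (used for P2 and P3)."""
    scores = []
    for l1, l2 in zip_longest(layers1, layers2, fillvalue=()):
        scores.extend(match_layer(l1, l2, sim, r))
    # Returning 0.0 for two empty expressions is an assumption of this sketch.
    return sum(scores) / len(scores) if scores else 0.0


def toy_sim(a, b):
    """Illustrative pairwise similarity: identical sememes score 1.0."""
    table = {frozenset(("human", "beast")): 0.6}
    return 1.0 if a == b else table.get(frozenset((a, b)), 0.0)
```

For example, `expression_similarity([["human", "beast"]], [["human"]], toy_sim)` first selects the pair ("human", "human") with score 1.0, then pairs "beast" with an empty element for r = 0.2, and returns the average 0.6.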

P₃ is computed in the same way as P₂. Measuring similarity over the primary sememe framework is in effect another way of computing the similarity of the primary sememes, and it again emphasizes the primary sememe's direct descriptive power for the concept.

Finally, from the three partial similarities, the semantic similarity of each pair of concepts is computed by formula (3), and the maximum is then taken by formula (2) as the semantic similarity between the words.

Some special cases deserve attention. When a word is fully explained by a single sememe, that sememe carries a large amount of information and lies low in the sememe tree. In this case, adding the sememe's depth information strengthens the descriptive power of the single sememe and brings the word semantic similarity closer to the expected value. In addition, special sememes enclosed in quotation marks, also called concrete words, contain rich and specific semantic information and have a direct, decisive influence on the nature of the concept they describe. They should therefore be distinguished from ordinary sememes, and an adjustment parameter is assigned to the semantic similarity between concrete words.

In the above semantic similarity measurement model, sememes are divided by level over the whole semantic expression and matched by maximum matching, while the primary sememe's direct descriptive power for the concept is considered separately. This mechanism makes more effective use of HowNet's knowledge structure and yields more discriminative results. Moreover, because sememe depth information is taken into account where appropriate, the results are more accurate, and the effect is more pronounced when the semantic expression contains few sememes.

The steps of this algorithm are described below in pseudocode:

(1) Obtain the input: two words W₁ and W₂.

(2) Select two concepts C₁ᵢ and C₂ⱼ.

(3) Look up the HowNet semantic network file to obtain the relevant information for concepts C₁ᵢ and C₂ⱼ, such as the primary sememe, the semantic expression and the primary sememe framework.

(4) Using the sememe similarity formula (1), obtain the similarity P₁ between the primary sememes of the two concepts.

(5) Using the two-stage procedure, compute the similarity P₂ between the semantic expressions of the two concepts and the similarity P₃ between their primary sememe frameworks.

(6) Combine the three partial similarities according to formula (3) to obtain the similarity value between the two concepts.

(7) Repeat steps (2) to (6) to obtain the similarity value for every pair of concepts of the two words. According to formula (2), select the maximum as the final word similarity value.
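Steps (6) and (7) can be sketched as follows. The default weights (0.25, 0.5, 0.25) are an arbitrary choice that merely satisfies the stated constraints; the text does not give concrete values. The callback `parts`, which returns (P₁, P₂, P₃) for a pair of concepts, is a hypothetical interface introduced for this example.

```python
def concept_similarity(p1, p2, p3, weights=(0.25, 0.5, 0.25)):
    """Formula (3): weighted sum of the three partial similarities."""
    w1, w2, w3 = weights
    # Constraints from the text: weights sum to 1, and w2 is the largest.
    if abs(w1 + w2 + w3 - 1.0) > 1e-9 or not (w2 > w1 and w2 > w3):
        raise ValueError("weights must sum to 1 with w2 > w1 and w2 > w3")
    return w1 * p1 + w2 * p2 + w3 * p3


def word_similarity(concepts1, concepts2, parts, weights=(0.25, 0.5, 0.25)):
    """Formula (2): maximum concept similarity over all concept pairs.

    `parts(c1, c2)` is assumed to return the tuple (P1, P2, P3).
    """
    return max(concept_similarity(*parts(c1, c2), weights=weights)
               for c1 in concepts1 for c2 in concepts2)
```

For instance, `concept_similarity(1.0, 0.6, 0.8)` evaluates to 0.25 × 1.0 + 0.5 × 0.6 + 0.25 × 0.8 = 0.75 with the placeholder weights.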

4. Query expansion word selection and optimization algorithm based on expansion rules

For query semantic expansion based on a thesaurus semantic network, there are two approaches to draw on. The first automatically adds terms derived from the search results of the original query to the original query keyword sequence; this approach generally requires manual involvement and machine learning accumulated at some scale, otherwise it introduces many irrelevant words and the expansion results are poor. The other approach gives the choice to the user: only the expansion results are offered, and the user decides whether to use them and which ones to use. Although the second approach requires active user participation and makes the search engine somewhat more complex to use, the expansion words obtained this way are in effect user input and are therefore more valuable.

Automatic query expansion is a relatively mature technique in text retrieval, e.g. [3,4,7]. Such algorithms mostly merge relevant information from the retrieved documents, yet many retrieved documents are irrelevant to the query. Research on semi-automatic query expansion with user interaction is also fairly mature, e.g. [5,6,20]. Such algorithms usually present the user with all related-word information that can be extracted from the retrieved documents, so the user faces too wide a range of choices and can easily choose badly or introduce unnecessary noise. Moreover, both approaches were built for text retrieval and around the characteristics of text, and are not entirely suitable for text-based image retrieval. In view of this, a new algorithm is proposed that combines the two classical query expansion techniques and is suited to image retrieval; it is easier to implement and use, lightweight, and more direct in effect.

Once the query expansion mode is determined, implementing query expansion first requires constructing expansion rules suited to image retrieval. That is, for a given query keyword term, the rule determines by which concrete criteria another term qualifies as a suitable expansion word; this is the semantic expansion rule for a single term. With the expansion rule fixed, query semantic expansion amounts to expanding each term of the query keyword sequence separately and then merging the results. Given the arbitrary order in which users type, the terms in the keyword sequence are not distinguished by position and are treated equally.

To meet as far as possible the user's wish to make the search intent clearer by selecting expansion terms, the invention establishes the query expansion rules with the following two situations in mind.

(1) The user cannot find a good word to describe abstractly the images being sought, while image annotators tend to use direct, common words, which makes the choice of query words harder. Terms that share common properties with the query keyword should therefore be offered, in light of the annotation information of the images and the needs of retrieval. For example, suppose a user wants images of big cats. Entering "big_cat" gives poor results: most images are annotated with specific terms such as "tiger" or "lion", so few results are returned. The best remedy is to enter the names of big cats such as "tiger" and "lion" in the search box, but this is clearly tedious for the user. If the user instead enters only "tiger" and the system expands it with the names of other big cats, the user only needs to select expansion terms into the search box, without having to think of and type the other names.

(2) When a query keyword term entered by the user has several meanings (a common situation), adding selected expansion terms to the keyword sequence gives the search engine a basis for disambiguation.

For example, if the keyword "bank" appears together with "water" or "coast", the image search engine has grounds to avoid returning images about a "bank" in the financial sense, or to avoid scoring them too highly.

Based on the above expansion rules, query semantic expansion searches the thesaurus semantic network (WordNet for English queries, HowNet for Chinese queries). For English, the related terms that stand in a part relation (Part), sibling relation (Sibling) or child relation (Child) to the original query keyword term are returned as expansion terms; for Chinese, the original query is expanded directly by DEF matching. The child relation used for English expansion includes only direct children, i.e. direct descendants in the semantic hierarchy.

Beyond these expansion rules, the invention also takes into account that expansion terms are ultimately added to the keyword sequence, and that the terms of the keyword sequence are matched against the annotation information of the image library during search. A term that never occurs in the image annotations is useless for the search results, so producing it in the expansion module is pointless. A thesaurus usually contains tens of thousands of terms, even more than a hundred thousand, any of which might be selected as an expansion term by the above rules, whereas image annotations generally contain only common words, a much smaller set. Therefore, the final step of rule-based query semantic expansion is to filter the expansion word set with the annotation words: expansion terms that do not occur in the annotation word set are discarded.

A further problem to be solved in WordNet-based English query expansion is the following. Because the terms in the keyword sequence are independent of one another, the final expansion result is the union of the expansion word sets of the individual terms. Likewise, the expansion word set of each term is the union of the expansion word sets of every synset (Synset) that contains the term. If one Synset of a term lies in a relatively "dense" region of the semantic network with complex semantic relations, the expansion word set derived from it can be far larger than those of the other Synsets. Since all Synsets of a term have equal standing, such a Synset may introduce many expansion terms that act as noise. For example, when the keyword "tiger" is expanded, a large number of expansion terms unexpectedly relate to people. This comes from a Synset of "tiger" whose meaning is "a fierce or audacious person"; this personified sense is not the common meaning of "tiger", and without word sense disambiguation no concrete rule can identify such Synsets. Therefore, the size of each Synset's expansion word set is limited during semantic expansion (for example, at most 15 expansion terms per Synset), which prevents an uncommon Synset from producing a large number of unsuitable expansion terms.

The steps of this algorithm are described below in pseudocode:

(1) Obtain the input: the original query keyword sequence.

(2) Select one of its keyword terms.

(3) If it is an English keyword term, look up the WordNet semantic network file to obtain its synsets (Synset). If it is a Chinese keyword term, look up the HowNet semantic network file to obtain its semantic definitions (DEF).

(4) Based on the expansion rules: for each Synset of an English keyword term, find the corresponding set of related words as the expansion word set according to the part relation (Part), sibling relation (Sibling) and child relation (Child) in the semantic network hierarchy; for each DEF of a Chinese keyword term, expand by direct matching.

(5) Based on the post-expansion processing strategy, filter the expansion word set against the annotation set of the image library to obtain the final, optimized expansion word set.

(6) Repeat steps (2) to (5), merge the expansion word sets obtained for each keyword term of the original query, and use the result as the expanded query expression corresponding to the original query.
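The English branch of the steps above can be sketched as follows. The tables `SYNSETS`, `RELATED` and `ANNOTATION_VOCAB` are hypothetical miniature stand-ins for the WordNet semantic network file and the annotation set of the image library; applying the per-Synset cap before the annotation filter follows the order of steps (4) and (5), which is an interpretation made for this example.

```python
# Hypothetical miniature thesaurus: term -> synsets, synset -> related terms.
SYNSETS = {"tiger": ["tiger.animal", "tiger.person"]}
RELATED = {
    "tiger.animal": {"part": [], "sibling": ["lion", "leopard", "jaguar"],
                     "child": ["bengal_tiger"]},
    "tiger.person": {"part": [], "sibling": ["aggressor", "bully", "brute"],
                     "child": []},
}
# Hypothetical set of words that occur in the image annotations.
ANNOTATION_VOCAB = {"tiger", "lion", "leopard", "bengal_tiger", "water"}


def expand_term(term, max_per_synset=15):
    """Expansion terms for one keyword: Part, Sibling and direct Child relations."""
    expanded = []
    for synset in SYNSETS.get(term, []):
        rel = RELATED[synset]
        candidates = rel["part"] + rel["sibling"] + rel["child"]
        # Limit the size of each Synset's expansion word set.
        candidates = candidates[:max_per_synset]
        # Keep only terms that occur in the image annotation vocabulary.
        expanded.extend(c for c in candidates if c in ANNOTATION_VOCAB)
    return expanded


def expand_query(keywords, max_per_synset=15):
    """Union of the original keywords and the expansion sets of every term."""
    result = list(keywords)
    for term in keywords:
        for candidate in expand_term(term, max_per_synset):
            if candidate not in result:
                result.append(candidate)
    return result
```

On this toy data, `expand_query(["tiger"])` yields `["tiger", "lion", "leopard", "bengal_tiger"]`: "jaguar" and the person-related terms are dropped because they do not occur in the annotation vocabulary. In the interactive mode described in the text, these candidates would be offered to the user for selection rather than added automatically.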

5. Scoring and optimization algorithm for retrieval results

The basic unit ranked by an image search engine is the image, and the basis for ranking is the image features. In content-based image retrieval, low-level image features serve as the features of an image; in text-based image retrieval, the annotation information of the image serves as its features. In the latter case, with the user's query keywords as the ranking criterion, ranking means placing images whose annotation word sequences are closer to the query keyword sequence nearer the top of the result list. A scoring and optimization algorithm is therefore needed to determine which of the retrieved images is "better" with respect to the user's query keyword sequence. However, no absolute standard of "good" exists; even users who enter the same query may evaluate the same results very differently. The scoring and optimization algorithm therefore simply defines a scoring rule and ranks images by their scores, with adjustable parameters, in the hope of achieving a "better" result.

Ranking is a relatively mature technique in the text retrieval field of the prior art, e.g. [24,25,26]. Such algorithms mostly consider direct matching between query keywords and retrieved documents, so documents that are genuinely relevant but do not contain the user's query keywords may not be returned. The invention establishes a ranking strategy suited to text-based image retrieval and provides a retrieval result scoring and optimization algorithm that does not interfere with the search behavior of the image search engine and does not change the set of retrieval results returned. Its main effect is to place the "better" results within the same result set nearer the front so that the user can find them more easily.

The result for retrieval of image search engine is often a lot, and the user often only can watch some results of front patiently.In other words, how will more the be close to the users result for retrieval of search intention is put to return results more quite important on the position of front.Therefore, design is sorted to the result for retrieval that returns based on the scoring algorithm of phrase semantic similarity, markup information (promptly marking word sequence) according to searching keyword sequence and image is marked, thus with the return score of each width of cloth image as sort by.In fact, the scoring of this algorithm to as if the mark collection of image, and image itself is not had any understanding, this is a pure marking scheme of trusting and relying on image labeling information.The aforementioned phrase semantic measuring similarity of discussing can access the calculating support of " semantic close " exactly for the sort algorithm here.

Because users enter query keywords in no particular order, every term in the query keyword sequence is treated equally. For the annotation word sequence of an image, however, terms that appear earlier are assumed to be more reliable. This assumption rests on the observation that annotators tend to enter the most prominent object in the image first. Admittedly, different annotators may judge the same image differently, and an image need not have a single most prominent object. For most images, though, the focal object is quite evident. With this in mind, the scoring algorithm attaches a weight to the computed result for each annotation word, in order to emphasize the likely "prominent" objects in the image. The ranking score of an image is therefore computed with the following formula:

Score = \frac{\sum_{i=1}^{n} \sum_{j=1}^{m} w(j, m) \, Sim(k_i, t_j)}{\sum_{j=1}^{m} w(j, m)} \qquad (1)

Here, k_i is the i-th keyword of the query keyword sequence; t_j is the j-th annotation word of the image annotation word sequence; Sim(k_i, t_j) computes the semantic similarity between the two terms k_i and t_j; w(j, m) is the associated weight, with w(j, m) = (m + 1 - j)^2, used to emphasize the position of an annotation term within the annotation sequence; and n and m are the numbers of terms in the query keyword sequence and in the image annotation word sequence, respectively. Since the weight of the first annotation word in the image annotation word sequence is m^2, its share of the total weight is:

\frac{m^{2}}{\sum_{j=1}^{m} w(j, m)} = \frac{m^{2}}{\sum_{j=1}^{m} j^{2}} = \frac{6 m^{2}}{m (m + 1)(2m + 1)} = \frac{6m}{(m + 1)(2m + 1)} \qquad (2)

This ratio is a decreasing function of m: as the image annotation word sequence grows longer, the share of the weight carried by the leading word decreases monotonically (for large m it is approximately 3/m). In other words, if an image contains too many objects, no single object can be especially prominent.
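Formula (1) and the weight share in formula (2) can be sketched in code as follows. This is only a minimal illustration under stated assumptions: the function names are hypothetical, and `exact_match_sim` is a toy stand-in for the WordNet-based and HowNet-based similarity measures, which are not reproduced here.

```python
from typing import Callable, Sequence


def position_weight(j: int, m: int) -> int:
    """w(j, m) = (m + 1 - j)^2 for the j-th (1-based) of m annotation words."""
    return (m + 1 - j) ** 2


def ranking_score(
    keywords: Sequence[str],
    annotations: Sequence[str],
    sim: Callable[[str, str], float],
) -> float:
    """Formula (1): position-weighted sum of similarities over total weight."""
    m = len(annotations)
    if m == 0 or not keywords:
        return 0.0  # assumption: an image without annotations scores zero
    weights = [position_weight(j, m) for j in range(1, m + 1)]
    numerator = sum(
        w * sim(k, t) for k in keywords for w, t in zip(weights, annotations)
    )
    return numerator / sum(weights)


def first_word_share(m: int) -> float:
    """Formula (2): share of the total weight held by the first annotation word."""
    return 6 * m / ((m + 1) * (2 * m + 1))


def exact_match_sim(a: str, b: str) -> float:
    """Hypothetical stand-in similarity: 1.0 for identical terms, else 0.0."""
    return 1.0 if a == b else 0.0


if __name__ == "__main__":
    # The same annotation word counts for more when it appears earlier.
    print(ranking_score(["dog"], ["dog", "grass"], exact_match_sim))
    print(ranking_score(["dog"], ["grass", "dog"], exact_match_sim))
    # The share of the leading word shrinks as the annotation sequence grows.
    print([round(first_word_share(m), 3) for m in (1, 2, 5, 10)])
```

With a real similarity measure in place of `exact_match_sim`, the same function would reward images whose early annotation words are semantically close to the query, not only those that match it exactly.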

One point to note is that score computation involves a great deal of repeated calculation. The query keywords take part in every word semantic similarity computation, the annotation sequence of every image contains at least one query keyword (otherwise the image would not have been returned as a retrieval result), and the result images share many identical annotation terms. The number of semantic computations that are actually necessary is therefore much smaller than the number of times the semantic computation is invoked. Setting up a similarity result cache of suitable size to record semantic computation results helps greatly in raising processing speed. In addition, when too many results force a paged display, image search engines tend to run the search again for each page that is visited. If a cache is used to hold the scoring results of a number of images, a large amount of computation is avoided when the user switches between pages of the image search results. From the similarity (semantic distance) result cache and the result document cache down to the cache of the underlying synset reading module, optimizing the original scoring algorithm with a multi-level caching mechanism can save a great deal of processing time.
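Two of the cache layers mentioned above can be sketched as follows. This is an illustrative sketch, not the implementation of the invention: `make_cached_similarity` and `ScoreCache` are hypothetical names, the cache size is arbitrary, and the similarity measure is assumed to be symmetric so that a pair can share one cache entry regardless of argument order.

```python
from functools import lru_cache
from typing import Callable, Dict, Sequence, Tuple


def make_cached_similarity(
    sim: Callable[[str, str], float], maxsize: int = 100_000
) -> Callable[[str, str], float]:
    """Wrap a similarity function with a bounded result cache."""

    @lru_cache(maxsize=maxsize)
    def _cached(a: str, b: str) -> float:
        return sim(a, b)

    def cached_sim(a: str, b: str) -> float:
        # Assumption: sim is symmetric, so (a, b) and (b, a) share one entry.
        return _cached(a, b) if a <= b else _cached(b, a)

    return cached_sim


class ScoreCache:
    """Keeps image scores per query so that paging does not trigger rescoring."""

    def __init__(self) -> None:
        self._scores: Dict[Tuple[Tuple[str, ...], str], float] = {}

    def get_or_compute(
        self, query: Sequence[str], image_id: str, compute: Callable[[], float]
    ) -> float:
        key = (tuple(query), image_id)
        if key not in self._scores:
            self._scores[key] = compute()  # computed once, reused on later pages
        return self._scores[key]
```

A bounded cache such as `lru_cache` evicts the least recently used pairs once it is full, which is one possible reading of "a cache of suitable size"; other eviction policies would serve equally well.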

The scoring formula yields a score integrated exactly from the results of the semantic similarity function, but in actual computation an approximation does the job equally well. After all, only a good ordering is needed; the specific score values are unimportant. In score computation, the annotation word sequence of every image contains annotation words that match the query keywords, and it likewise contains annotation words whose semantic similarity to the query keywords is very small (that is, whose semantic distance is very large). Each annotation word in the sequence has a corresponding position weight. An annotation word that ranks late and has very low semantic similarity influences the final retrieval list perhaps one-twentieth or one-fiftieth as much as an annotation word that ranks early and matches a query keyword, which is clearly unimportant. For computations that have so little effect on the final retrieval list, a single uniform result is therefore used in place of the full step-by-step calculation, which greatly reduces the computational complexity.

The steps of this algorithm are described below in pseudocode:

(1) Obtain the input: the original query keyword sequence.

(2) Use the query expansion function to obtain the expansion word set of the original query keyword sequence.

(3) Compute the semantic similarity for each pair consisting of a query keyword term and one of its expansion terms, and store all results in the cache.

(4) Obtain the retrieval results corresponding to the original query from the image search engine.

(5) Compute a score for every image in the retrieval results. If the relevant similarity can be obtained from the cache, use the stored result; otherwise, treat the pair as semantically distant (that is, as having a very large semantic distance) and use a uniform constant result, without performing any further semantic computation.

(6) Re-rank the images according to the score of every image in the retrieval results, and return the final retrieval list.

In the scoring process of step (5), all semantic computation has already been carried out as preprocessing, so no semantic computation is needed during the actual scoring. The total number of semantic computations is thus reduced by an order of magnitude compared with the scoring algorithm before optimization, which improves processing efficiency.
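The steps above can be sketched as follows. This is a minimal sketch under several assumptions that are not part of the original text: `expand` and `sim` are hypothetical callables standing in for the query expansion function and the semantic similarity measure, the retrieval results of step (4) are passed in as `(image_id, annotations)` pairs, each keyword is treated as belonging to its own expansion set, and the uniform constant for uncached pairs is taken to be 0.0.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

FAR_SIMILARITY = 0.0  # assumed uniform constant for pairs not found in the cache


def precompute_similarities(
    keywords: Sequence[str],
    expand: Callable[[str], Iterable[str]],
    sim: Callable[[str, str], float],
) -> Dict[Tuple[str, str], float]:
    """Steps (2)-(3): cache the similarity of each keyword to its expansion terms."""
    cache: Dict[Tuple[str, str], float] = {}
    for k in keywords:
        cache[(k, k)] = sim(k, k)  # assumption: a keyword is in its own expansion set
        for term in expand(k):
            cache[(k, term)] = sim(k, term)
    return cache


def score_image(
    keywords: Sequence[str],
    annotations: Sequence[str],
    cache: Dict[Tuple[str, str], float],
    default: float = FAR_SIMILARITY,
) -> float:
    """Step (5): formula (1) using cached values only, with a constant on a miss."""
    m = len(annotations)
    if m == 0:
        return 0.0
    numerator, total_weight = 0.0, 0
    for j, t in enumerate(annotations, start=1):
        w = (m + 1 - j) ** 2  # w(j, m)
        total_weight += w
        for k in keywords:
            numerator += w * cache.get((k, t), default)
    return numerator / total_weight


def rerank(
    keywords: Sequence[str],
    results: Sequence[Tuple[str, Sequence[str]]],
    expand: Callable[[str], Iterable[str]],
    sim: Callable[[str, str], float],
) -> List[str]:
    """Steps (1)-(6): return the image ids ordered by descending score."""
    cache = precompute_similarities(keywords, expand, sim)
    scored = [(score_image(keywords, tags, cache), image_id) for image_id, tags in results]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep engine order
    return [image_id for _, image_id in scored]
```

Because the sort is stable, images with equal scores keep the order in which the search engine returned them, so the re-ranking changes only the order of the result set and never its contents.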

Embodiment 2: Application example

Figure 2 demonstrates the concrete steps of the above algorithm framework through a specific example. By showing the intermediate output of each algorithm module and the final retrieval result of the framework, it gives an intuitive understanding.

Labels (1) and (2) are the original English query and the original Chinese query entered by the user, respectively. Labels (3) and (4) are the expansion word sets corresponding to the original English query and the original Chinese query, obtained with the English thesaurus WordNet and the Chinese thesaurus HowNet, respectively, by applying the "expansion rule-based query expansion word selection and optimization algorithm". Labels (5) and (6) are the corresponding retrieval results obtained from the image search engine on the basis of the expansion word sets of the original English query and the original Chinese query, respectively. Labels (7) and (8) are the final retrieval results obtained from the initial retrieval lists of the original English query and the original Chinese query, respectively, by applying the "retrieval result scoring and optimization algorithm", which fuses the "WordNet-based English word semantic similarity metric algorithm" and the "HowNet-based Chinese word semantic similarity metric algorithm".

Claims

1. A text-based query expansion and ranking method in image retrieval, characterized by comprising the following steps:

(1) Preprocessing and pre-analysis

For the initial query, word segmentation and punctuation tagging of the query are completed by preprocessing; then, on the basis of the preprocessed initial query, stop-word tagging, part-of-speech analysis and keyword extraction are completed by pre-analysis;

(2) Word semantic similarity measurement

For the English word semantic similarity measure, the semantic distance is computed on the basis of network path and depth; for the Chinese word semantic similarity measure, the computation jointly considers the primary sememe similarity, the semantic expression similarity and the primary sememe framework similarity, and also incorporates a maximum matching rule and sememe depth information;

(3) Query expansion fusing expansion rules

On the basis of the semantic network, and fusing the specific expansion rules that have been established, semantic expansion is performed on the keyword sequence derived from the initial query;

(4) Score-based ranking of retrieval results

The retrieval results returned by the search engine are taken as the objects to be processed; the "closeness" between the query keyword sequence and the image description is assessed on the basis of the word semantic similarity measure to obtain a score, which is optimized by the scoring algorithm; the final score serves as the basis on which the search engine ranks the returned images.

2. The method according to claim 1, characterized in that, in the prototype of said English word semantic similarity metric algorithm, a Lesk extension algorithm based on coordinate terms is established to further extend the semantic definition of a word, wherein a coordinate term is defined as a sibling node, in the WordNet hierarchy, of the synset to which a given word belongs; that is, the synset and its corresponding coordinate term share a common parent node.

3. The method according to claim 1, characterized in that, in the prototype of said Chinese word semantic similarity metric algorithm, the sememes are divided by level on the basis of the whole semantic expression, a maximum matching method is adopted, and the direct descriptive power of the primary sememe for the concept is considered separately; in addition, sememe depth information is taken into account in the measurement, and the concept semantic similarity is computed as the following three parts:

Sim(C_1, C_2) = w_1 P_1 + w_2 P_2 + w_3 P_3

4. The method according to claim 1, characterized in that the algorithm steps of said query expansion fusing expansion rules are described by the following pseudocode:

(1) Obtain the input: the original query keyword sequence;

(2) Select one of its keyword terms;

(3) If it is an English keyword term, look up the WordNet semantic network file and obtain its synsets (Synset);

If it is a Chinese keyword term, look up the HowNet semantic network file and obtain its semantic definitions (DEF);

(4) Based on the expansion rules, for each Synset of an English keyword term, find the corresponding set of near-synonyms as the expansion word set according to the part relation, sibling relation and child relation in the semantic network hierarchy; for each DEF of a Chinese keyword term, expand by direct matching;

(5) Based on the post-expansion processing strategy, filter the expansion word set according to the annotation set information of the image library, to obtain the final optimized expansion word set.

5. The method according to claim 1, characterized in that, in the prototype of said score-based retrieval result ranking algorithm, a weight is attached to the computed result for each annotation word in the scoring algorithm, in order to emphasize the likely "prominent" objects in the image; the ranking score of an image is computed with the following formula:

Score = \frac{\sum_{i=1}^{n} \sum_{j=1}^{m} w(j, m) \, Sim(k_i, t_j)}{\sum_{j=1}^{m} w(j, m)}

wherein k_i is the i-th keyword of the query keyword sequence; t_j is the j-th annotation word of the image annotation word sequence; Sim(k_i, t_j) computes the semantic similarity between the two terms k_i and t_j; w(j, m) is the associated weight, with w(j, m) = (m + 1 - j)^2, used to emphasize the position of an annotation term within the annotation sequence; n and m are the numbers of terms in the query keyword sequence and in the image annotation word sequence, respectively; the weight of the first annotation word in the image annotation word sequence is m^2, so its share of the total weight is:

\frac{m^{2}}{\sum_{j=1}^{m} w(j, m)} = \frac{m^{2}}{\sum_{j=1}^{m} j^{2}} = \frac{6 m^{2}}{m (m + 1)(2m + 1)} = \frac{6m}{(m + 1)(2m + 1)}

This ratio is a decreasing function of m: as the image annotation word sequence grows longer, the share of the weight carried by the leading word decreases monotonically.