CN103324678B - Information retrieval method and device - Google Patents

Publication number
CN103324678B
Authority
CN
China
Prior art keywords
expression formula
semantics expression
semantics
statement
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310200430.3A
Other languages
Chinese (zh)
Other versions
CN103324678A (en)
Inventor
俞声
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Application filed by Individual
Priority to CN201310200430.3A
Publication of CN103324678A
Application granted
Publication of CN103324678B
Legal status: Expired - Fee Related

Abstract

The present invention provides an information retrieval method and device that address a problem in the prior art: when a document library is searched for relevant documents by keyword, the search results obtained may not match the user's needs. The method comprises: obtaining a query statement of a user, the query statement comprising one or more words or phrases; performing semantic recognition on the query statement to obtain a corresponding first semantic group expression; matching the first semantic group expression against each second semantic group expression in a pre-built index table to determine the second semantic group expression that matches it; obtaining from the index table the relevant document identifiers of the matching second semantic group expression; obtaining from the document library the documents corresponding to the relevant document identifiers; and returning those documents to the user.

Description

Information retrieval method and device
Technical field
The present invention relates to the field of information technology, and in particular to an information retrieval method and device.
Background
In the prior art, text retrieval mainly comprises: obtaining keywords, searching the document library for relevant documents according to the keywords, ranking the relevant documents, and returning them to the user.
However, keywords are often insufficient to reflect the meaning the user wants to express, so the search results may not match the user's needs.
Summary of the invention
The present invention provides an information retrieval method and device to solve the prior-art problem that, when a document library is searched for relevant documents by keyword, the search results obtained do not match the user's needs.
A first aspect of the present invention provides an information retrieval method, comprising:
obtaining a query statement of a user, the query statement comprising one or more words or phrases;
performing semantic recognition on the query statement to obtain a first semantic group expression corresponding to the query statement, the first semantic group expression representing one or more semantic units and their semantic attributes and, where it represents multiple semantic units, also representing the modification relationships among those semantic units;
matching the first semantic group expression against each second semantic group expression in a pre-built index table to determine the second semantic group expression that matches the first semantic group expression, the second semantic group expression representing one or more semantic units and their semantic attributes and, where it represents multiple semantic units, also representing the modification relationships among those semantic units;
obtaining, from the index table, the relevant document identifiers of the matching second semantic group expression;
obtaining, from a document library, the documents corresponding to the relevant document identifiers;
returning the documents corresponding to the relevant document identifiers to the user.
Another aspect of the present invention provides an information retrieval device, comprising:
an acquisition module, configured to obtain a query statement of a user, the query statement comprising one or more words or phrases;
a semantic recognition module, configured to perform semantic recognition on the query statement to obtain a first semantic group expression corresponding to the query statement, the first semantic group expression representing one or more semantic units and their semantic attributes and, where it represents multiple semantic units, also representing the modification relationships among those semantic units;
a matching module, configured to match the first semantic group expression against each second semantic group expression in a pre-built index table to determine the second semantic group expression that matches the first semantic group expression, the second semantic group expression representing one or more semantic units and their semantic attributes and, where it represents multiple semantic units, also representing the modification relationships among those semantic units;
the matching module being further configured to obtain, from the index table, the relevant document identifiers of the matching second semantic group expression, and to obtain, from a document library, the documents corresponding to the relevant document identifiers;
a sending module, configured to return the documents corresponding to the relevant document identifiers to the user.
The present invention performs semantic recognition on the user's query statement to obtain the corresponding first semantic group expression, uses the first semantic group expression to find the matching second semantic group expression and its relevant document identifiers in the index table, and retrieves the corresponding documents from the document library by those identifiers, thereby improving the relevance of the search results to the user's needs.
Brief description of the drawings
Fig. 1 is a flowchart of one embodiment of the information retrieval method provided by the present invention;
Fig. 2 is a flowchart of another embodiment of the information retrieval method provided by the present invention;
Fig. 3 is a structural diagram of one embodiment of the information retrieval device provided by the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flowchart of one embodiment of the information retrieval method provided by the present invention. As shown in Fig. 1, the method comprises:
101. The information retrieval device obtains a query statement of a user; the query statement comprises one or more words or phrases.
The query statement may be a sentence composed of one or more words or phrases. In medicine, for example, the words and phrases may be "embolism", "acute", "upper lobe of left lung", or "lung's aorta".
102. Semantic recognition is performed on the query statement to obtain the first semantic group expression corresponding to the query statement. The first semantic group expression represents one or more semantic units and their semantic attributes; where it represents multiple semantic units, it also represents the modification relationships among them.
First, the semantic units in the query statement must be obtained. A semantic unit is a unit of meaning, embodied in the text by the content words that the technical field in question is concerned with. A content word is a word in the query statement that carries concrete meaning, such as a noun, verb, adjective, numeral, classifier, pronoun, or adverb. For example, when the user's query statement is "lung's aorta has acute embolism", the content words may be "lung's aorta", "acute", and "embolism". As another example, when the user's query statement is "patient lung does not have embolism", the content words may be "patient", "lung", and "embolism".
Hereinafter the content word itself is used as the identifier of the semantic unit. Alternatively, the identifier of a semantic unit may be the code assigned to the content word in a dictionary, a pattern corresponding to the content word, or any other identifier that uniquely identifies the meaning of the content word. In medicine, the dictionary may be the Unified Medical Language System (UMLS), the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), the National Drug File - Reference Terminology (NDF-RT), and so on. For example, the code of "embolism" is C0013922 in UMLS, 414086009 in SNOMED CT, and N0000001067 in NDF-RT. Two words with similar meanings, such as "embolism" and "thrombus", may be assigned the same code.
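The mapping from content words to dictionary codes described above can be sketched as a small lookup. This is a minimal illustration, not the patent's implementation: the table `UNIT_CODES` and the function `unit_id` are hypothetical names, the only code taken from the text is the UMLS code for "embolism", and giving "thrombus" the same code reflects the application-level choice mentioned above rather than an actual dictionary entry.

```python
# Hypothetical code table. Only the "embolism" code comes from the text;
# mapping "thrombus" to the same code is the optional synonym merging
# described above, not a statement about the real dictionary.
UNIT_CODES = {
    "embolism": "C0013922",
    "thrombus": "C0013922",
}


def unit_id(content_word, codes=UNIT_CODES):
    """Return the identifier of a semantic unit.

    Falls back to the (lower-cased) content word itself when the word has
    no code, which matches using the content word as the identifier.
    """
    word = content_word.strip().lower()
    return codes.get(word, word)


print(unit_id("embolism"))
print(unit_id("Thrombus"))
print(unit_id("acute"))
```

With such a lookup, statements that use either word produce the same semantic unit identifier and therefore the same expressions.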
Second, the attribute of every semantic unit in the query statement, and the modification relationships among the semantic units, are determined. The attribute may be affirmative or negative. For example, in the query statement "lung's aorta has acute embolism", the attributes of "lung's aorta", "acute", and "embolism" are all affirmative. Here "lung's aorta" and "acute" are modifying semantic units, and "embolism" is the modified semantic unit.
As another example, in the query statement "patient lung does not have embolism", the attributes of "patient" and "lung" are affirmative and the attribute of "embolism" is negative. "patient" and "lung" are modifying semantic units, and "embolism" is the modified semantic unit.
Further, some content words may themselves serve as attributes. For instance, "acute" in the query statement "lung's aorta has acute embolism" may serve as an attribute of the content word "embolism". Pairs such as "patient/family member", "new onset/recurrence", and "current/past" may also serve as attributes, and suitable attributes can be chosen for the specific application. A content word may also carry several of these attributes at once.
Third, the first semantic group expression corresponding to the query statement is generated from the attributes of all semantic units in the query statement and the modification relationships among them. For example, the first semantic group expression corresponding to the query statement "lung's aorta has acute embolism" is "embolism: Y | lung's aorta: Y | acute: Y". Only the affirmative and negative attributes are used here: Y denotes affirmative and N denotes negative. The semantic unit identifier and its attribute may be separated by ":", and semantic units may be separated by "|". The modifying semantic units need not appear in any particular order within the expression, or they may be sorted by the spelling or code of their identifiers.
As another example, the first semantic group expression corresponding to the query statement "patient lung does not have embolism" is "embolism: N | patient: Y | lung: Y".
Note that the separator between a semantic unit identifier and its attribute, and the separator between semantic units, may also be symbols such as ";", "/", "*", "&", or "#", provided the two separators can be told apart. The modified semantic unit may be placed at the start of the expression, at the end, or at some other fixed position.
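The expression format described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `SemanticUnit` and `build_expression` are hypothetical, the modified unit is placed first, and the separators default to ":" and "|" without the surrounding spaces used in the prose examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticUnit:
    label: str             # content word (or dictionary code) identifying the unit
    affirmed: bool = True  # True -> "Y" (affirmative), False -> "N" (negative)


def build_expression(head, modifiers, attr_sep=":", unit_sep="|",
                     sort_modifiers=False):
    """Serialize the modified unit and its modifiers into one expression string."""
    def fmt(unit):
        return f"{unit.label}{attr_sep}{'Y' if unit.affirmed else 'N'}"

    parts = [fmt(m) for m in modifiers]
    if sort_modifiers:
        # Optional canonical order, so equal expressions compare equal as strings
        parts.sort()
    return unit_sep.join([fmt(head)] + parts)


query1 = build_expression(
    SemanticUnit("embolism"),
    [SemanticUnit("lung's aorta"), SemanticUnit("acute")],
)
query2 = build_expression(
    SemanticUnit("embolism", affirmed=False),
    [SemanticUnit("patient"), SemanticUnit("lung")],
)
print(query1)  # embolism:Y|lung's aorta:Y|acute:Y
print(query2)  # embolism:N|patient:Y|lung:Y
```

Passing `sort_modifiers=True` corresponds to the option of ordering modifiers by spelling, which makes string comparison against index entries independent of word order in the statement.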
When the query statement contains nested modification, the modification relationships in the first semantic group expression may be flattened. For example, in the query statement "the sub-section artery of upper lobe of left lung has embolism", "upper lobe of left lung" modifies "sub-section artery" and "sub-section artery" modifies "embolism", which is nested modification. If brackets are used to represent nested modifiers, the first semantic group expression of the query statement is "embolism: Y (sub-section artery: Y (upper lobe of left lung: Y))". If the expression is flattened, the first semantic group expression is "embolism: Y | sub-section artery: Y | upper lobe of left lung: Y".
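Flattening a nested modification structure can be sketched as a short recursive walk. The tuple layout `(label, attribute, modifiers)` and the function name `flatten` are assumptions made for this illustration; the text does not prescribe a data structure.

```python
def flatten(unit):
    """Return [(label, attribute), ...]: the unit itself, then every nested modifier."""
    label, attribute, modifiers = unit
    flat = [(label, attribute)]
    for modifier in modifiers:
        flat.extend(flatten(modifier))  # depth-first, so nesting order is preserved
    return flat


# "the sub-section artery of upper lobe of left lung has embolism"
nested = (
    "embolism", "Y", [
        ("sub-section artery", "Y", [
            ("upper lobe of left lung", "Y", []),
        ]),
    ],
)

tiled = " | ".join(f"{label}: {attr}" for label, attr in flatten(nested))
print(tiled)  # embolism: Y | sub-section artery: Y | upper lobe of left lung: Y
```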
In addition, the modifying semantic units in the first semantic group expression corresponding to the query statement may be collected into a set of modifying semantic units. For each proper subset of that set, all the modifying semantic units in the subset are combined with the modified semantic unit of the first semantic group expression to generate a further semantic group expression, called a derived semantic group expression. For example, the modifying semantic units in the first semantic group expression "embolism: Y | lung's aorta: Y | acute: Y" are "lung's aorta" and "acute", and the derived semantic group expressions generated are "embolism: Y", "embolism: Y | lung's aorta: Y", and "embolism: Y | acute: Y".
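Enumerating the derived semantic group expressions amounts to enumerating the proper subsets of the modifier set. The following is a minimal sketch of that enumeration; the function name `derived_expressions` and the string-based representation are assumptions for illustration only.

```python
from itertools import combinations


def derived_expressions(head, modifiers, unit_sep=" | "):
    """Combine the modified unit with every proper subset of its modifiers."""
    results = []
    # Sizes 0 .. n-1: the full modifier set is excluded because it is not a
    # proper subset (it would reproduce the original expression).
    for size in range(len(modifiers)):
        for subset in combinations(modifiers, size):
            results.append(unit_sep.join([head, *subset]))
    return results


derived = derived_expressions("embolism: Y", ["lung's aorta: Y", "acute: Y"])
for expression in derived:
    print(expression)
# embolism: Y
# embolism: Y | lung's aorta: Y
# embolism: Y | acute: Y
```

An expression with n modifiers yields 2^n - 1 derived expressions, so the number of index entries per statement grows quickly with the number of modifiers.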
103. The information retrieval device matches the first semantic group expression against each second semantic group expression in a pre-built index table and determines the second semantic group expression that matches the first semantic group expression. The second semantic group expression represents one or more semantic units and their semantic attributes; where it represents multiple semantic units, it also represents the modification relationships among them.
In addition, the information retrieval device may also match the derived semantic group expressions of the first semantic group expression against the second semantic group expressions in the index table, and determine the semantic group expressions that match the derived semantic group expressions.
104. The information retrieval device obtains, from the index table, the relevant document identifiers of the matching second semantic group expression.
The index table may contain second semantic group expressions and the relevant document identifiers corresponding to each of them. A relevant document is a document in the document library that matches the second semantic group expression.
For example, the document library contains 6 documents; the identifier and content of each document are shown in Table 1.
Table 1: Example documents
In the prior art, when the keywords of the documents in the document library are "embolism", "acute", "chronic", "lung's aorta", "upper lobe of left lung", "lobe of left lung", and "sub-section artery", the index table built from the keywords is as shown in Table 2.
Table 2: Keyword-based index table
Keyword                   Document identifiers
embolism                  1, 2, 3, 4, 5, 6
acute                     2, 3, 4, 6
chronic                   1, 3, 4, 6
lung's aorta              3, 4
upper lobe of left lung   1, 2, 3
lobe of left lung         4
sub-section artery        1, 2, 3, 4
The keywords corresponding to the user's query statement "lung's aorta has acute embolism" are "lung's aorta", "acute", and "embolism". Searching the document library with these keywords returns the documents that contain all three keywords, namely documents 3 and 4. However, document 4, "patient lung aorta has chronic embolism, and the sub-section artery of lobe of left lung has acute embolism", does not agree with the user's query statement "lung's aorta has acute embolism".
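The false positive described above can be reproduced with a small sketch of a conventional keyword index. The data is taken from Table 2; the names `keyword_index` and `keyword_search`, and the use of set intersection for "contains all keywords", are assumptions of this illustration.

```python
# Keyword -> document identifiers, as listed in Table 2
keyword_index = {
    "embolism": {1, 2, 3, 4, 5, 6},
    "acute": {2, 3, 4, 6},
    "chronic": {1, 3, 4, 6},
    "lung's aorta": {3, 4},
    "upper lobe of left lung": {1, 2, 3},
    "lobe of left lung": {4},
    "sub-section artery": {1, 2, 3, 4},
}


def keyword_search(index, keywords):
    """Return the documents that contain every keyword (AND semantics)."""
    posting_sets = [index.get(keyword, set()) for keyword in keywords]
    if not posting_sets:
        return []
    return sorted(set.intersection(*posting_sets))


hits = keyword_search(keyword_index, ["lung's aorta", "acute", "embolism"])
print(hits)  # [3, 4]
```

Document 4 is returned because "acute" and "lung's aorta" both occur in it, even though "acute" there describes a different embolism; the keyword index has no record of which word modifies which.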
By contrast, the index table that the information retrieval device of the present invention builds in advance from the second semantic group expressions is shown in Table 3.
Table 3: Index table based on semantic group expressions
Semantic group expression                                                       Document identifiers
embolism: Y                                                                     1, 2, 3, 4, 6
embolism: N                                                                     5
embolism: Y | acute: Y                                                          2, 3, 4
embolism: N | acute: Y                                                          6
embolism: Y | chronic: Y                                                        1, 3, 4, 6
embolism: Y | sub-section artery: Y                                             1, 2, 3, 4
embolism: Y | upper lobe of left lung: Y                                        1, 2, 3
embolism: Y | lobe of left lung: Y                                              4
embolism: Y | sub-section artery: Y | upper lobe of left lung: Y                1, 2, 3
embolism: Y | sub-section artery: Y | lobe of left lung: Y                      4
embolism: Y | acute: Y | sub-section artery: Y                                  2, 4
embolism: Y | chronic: Y | sub-section artery: Y                                1, 3
embolism: Y | acute: Y | upper lobe of left lung: Y                             2
embolism: Y | acute: Y | lobe of left lung: Y                                   4
embolism: Y | chronic: Y | upper lobe of left lung: Y                           1, 3
embolism: Y | acute: Y | sub-section artery: Y | upper lobe of left lung: Y     2
embolism: Y | acute: Y | sub-section artery: Y | lobe of left lung: Y           4
embolism: Y | chronic: Y | sub-section artery: Y | upper lobe of left lung: Y   1, 3
embolism: Y | lung's aorta: Y                                                   3, 4
embolism: Y | lung's aorta: Y | acute: Y                                        3
embolism: Y | lung's aorta: Y | chronic: Y                                      4
The first semantic group expression of the query statement is "embolism: Y | lung's aorta: Y | acute: Y". The relevant document identifier corresponding to the matching second semantic group expression in the index table is "3", so the search result agrees with the user's query statement.
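A lookup against the semantic index can be sketched as follows, using a few rows of Table 3. The text does not define the matching rule in detail, so this sketch assumes exact matching after putting the modifiers into a canonical (sorted) order; the names `semantic_index`, `canonical`, and `lookup` are hypothetical.

```python
# A few rows of Table 3: semantic group expression -> document identifiers
semantic_index = {
    "embolism: Y": [1, 2, 3, 4, 6],
    "embolism: Y | acute: Y": [2, 3, 4],
    "embolism: Y | lung's aorta: Y": [3, 4],
    "embolism: Y | lung's aorta: Y | acute: Y": [3],
    "embolism: Y | lung's aorta: Y | chronic: Y": [4],
}


def canonical(expression):
    """Keep the modified unit first and sort the modifiers, so order does not matter."""
    head, *modifiers = [part.strip() for part in expression.split("|")]
    return " | ".join([head] + sorted(modifiers))


def lookup(index, expression):
    """Return the document identifiers of the index entry matching the expression."""
    normalized = {canonical(key): doc_ids for key, doc_ids in index.items()}
    return normalized.get(canonical(expression), [])


print(lookup(semantic_index, "embolism: Y | lung's aorta: Y | acute: Y"))  # [3]
print(lookup(semantic_index, "embolism: Y | acute: Y | lung's aorta: Y"))  # [3]
```

A real index would store keys in canonical form once at indexing time instead of normalizing them on every query; the sketch normalizes on the fly only to stay short.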
In addition, the information retrieval device may also obtain from the index table the relevant document identifiers corresponding to the derived semantic group expressions of the first semantic group expression, in order to widen the search. For example, the derived semantic group expressions of the first semantic group expression are "embolism: Y", "embolism: Y | lung's aorta: Y", and "embolism: Y | acute: Y". In the index table, the relevant document identifiers of these derived semantic group expressions are "1, 2, 3, 4, 6", "3, 4", and "2, 3, 4" respectively.
105. The documents corresponding to the relevant document identifiers are obtained from the document library.
A document identifier is a number, storage address, or similar value that uniquely identifies a document in the document library. In this embodiment the document number is used as the document identifier. For example, when the relevant document identifier of the second semantic group expression is "3", the corresponding document content is "patient lung aorta has acute embolism, and the sub-section artery of upper lobe of left lung has chronic embolism."
Further, before the documents corresponding to the relevant document identifiers are obtained from the document library, the method may also comprise ranking the relevant document identifiers by relevance.
For example, the document identifier corresponding to the first semantic group expression "embolism: Y | lung's aorta: Y | acute: Y" is "3". The relevant document identifiers in the index table for its derived semantic group expressions "embolism: Y", "embolism: Y | lung's aorta: Y", and "embolism: Y | acute: Y" are "1, 2, 3, 4, 6", "3, 4", and "2, 3, 4" respectively. After the documents corresponding to the relevant document identifiers of these semantic group expressions are ranked by relevance, the resulting document order is "3, 4, 2, 1, 6". Once ranking is complete, the information retrieval device may return the ranked documents to the user.
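The text does not say how relevance is scored. One simple scheme that happens to reproduce the order "3, 4, 2, 1, 6" is to count how many of the matched expressions (the original plus its derived expressions) list each document, breaking ties by document number. The sketch below illustrates that assumed scheme only; `rank_by_match_count` is a hypothetical name.

```python
from collections import Counter


def rank_by_match_count(hit_lists):
    """Rank documents by the number of matched expressions that list them."""
    counts = Counter()
    for hits in hit_lists:
        counts.update(set(hits))  # each expression counts at most once per document
    # Higher count first; ties broken by ascending document identifier
    return sorted(counts, key=lambda doc_id: (-counts[doc_id], doc_id))


hit_lists = [
    [3],              # embolism: Y | lung's aorta: Y | acute: Y
    [1, 2, 3, 4, 6],  # embolism: Y
    [3, 4],           # embolism: Y | lung's aorta: Y
    [2, 3, 4],        # embolism: Y | acute: Y
]
ranking = rank_by_match_count(hit_lists)
print(ranking)  # [3, 4, 2, 1, 6]
```

Document 3 is listed by all four expressions, document 4 by three, document 2 by two, and documents 1 and 6 by one each. Other scoring functions, for example weighting expressions with more modifiers more heavily, would be equally compatible with the text.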
106. The documents corresponding to the relevant document identifiers are returned to the user.
In this embodiment, semantic recognition is performed on the user's query statement to obtain the corresponding first semantic group expression; the first semantic group expression is used to find the matching second semantic group expression and its relevant document identifiers in the index table; and the corresponding documents are retrieved from the document library by those identifiers, thereby improving the relevance of the search results to the user's needs.
Fig. 2 is a flowchart of another embodiment of the information retrieval method provided by the present invention. As shown in Fig. 2, building on the embodiment shown in Fig. 1, before matching the first semantic group expression against each second semantic group expression in the index table, the information retrieval device also annotates and indexes each document, comprising:
107. The information retrieval device performs semantic recognition on each statement of the documents in the document library.
108. The information retrieval device determines the attribute of every semantic unit in each statement and the modification relationships among the semantic units; the attribute is affirmative or negative.
The process by which the information retrieval device determines the attributes of the semantic units in each statement and the modification relationships among them is similar to the process in step 102 for the query statement; refer to the description of step 102.
109. The information retrieval device generates one second semantic group expression for each statement. The second semantic group expression contains the identifiers of all semantic units in the statement, their attribute identifiers, and the identifiers of the modification relationships among the semantic units.
Further, after a second semantic group expression has been generated for each statement of a document, if the modified semantic unit has the affirmative attribute, the information retrieval device may collect the modifying semantic units of the second semantic group expression of each statement into a set of modifying semantic units. For each proper subset of that set, the modifying semantic units in the subset are combined with the modified semantic unit of the second semantic group expression of the statement to generate a derived semantic group expression.
110. The information retrieval device updates the index table according to the second semantic group expressions and the derived semantic group expressions.
That is, if the index table already contains the current second semantic group expression generated from a statement of the current document, the identifier of the current document is added to the document identifiers corresponding to that expression. If the index table does not contain the current second semantic group expression, the expression is added to the index table, and its corresponding document identifier is the identifier of the current document. In addition, whenever a new document is added to the document library, steps 107 to 110 are performed for it.
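The update rule of step 110 can be sketched with a plain dictionary. This is a minimal illustration: `update_index` is a hypothetical name, the index is assumed to map expression strings to lists of document identifiers, and the expressions passed in are assumed to already include any derived expressions.

```python
def update_index(index, doc_id, expressions):
    """Add doc_id under every expression, creating entries that do not exist yet."""
    for expression in expressions:
        doc_ids = index.setdefault(expression, [])  # new entry if expression is unseen
        if doc_id not in doc_ids:                   # avoid duplicates when statements repeat
            doc_ids.append(doc_id)
    return index


index = {"embolism: Y": [1, 2]}
update_index(index, 3, [
    "embolism: Y",
    "embolism: Y | lung's aorta: Y",
    "embolism: Y | lung's aorta: Y | acute: Y",
])
print(index["embolism: Y"])                               # [1, 2, 3]
print(index["embolism: Y | lung's aorta: Y | acute: Y"])  # [3]
```

Calling the same function when a new document arrives covers the incremental case: existing entries gain one identifier and unseen expressions become new entries.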
In this embodiment of the present invention, semantic recognition is performed on each statement of every document in the document library to generate second semantic group expressions, and an index table mapping second semantic group expressions to document identifiers is built. After a query statement is received from the user, semantic recognition is performed on it to obtain the corresponding first semantic group expression; the second semantic group expression matching the first semantic group expression and its relevant document identifiers are then obtained from the index table; and the corresponding documents are retrieved from the document library by those identifiers, improving the relevance of the search results to the user's needs.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program and data may be stored in a computer-readable storage medium, and the data may exist in various forms such as files, databases, or in-memory data structures. When executed, the program performs the steps of the above method embodiments. The storage medium includes media such as ROM, RAM, magnetic disks, and optical discs.
Fig. 3 is a structural diagram of one embodiment of the information retrieval device provided by the present invention. As shown in Fig. 3, the device comprises:
an acquisition module 31, configured to obtain a query statement of a user, the query statement comprising one or more words or phrases;
a semantic recognition module 32, configured to perform semantic recognition on the query statement to obtain the first semantic group expression corresponding to the query statement, the first semantic group expression representing one or more semantic units and their semantic attributes and, where it represents multiple semantic units, also representing the modification relationships among them;
a matching module 33, configured to match the first semantic group expression against each second semantic group expression in a pre-built index table and determine the second semantic group expression that matches the first semantic group expression, the second semantic group expression representing one or more semantic units and their semantic attributes and, where it represents multiple semantic units, also representing the modification relationships among them;
the matching module 33 being further configured to obtain, from the index table, the relevant document identifiers of the matching second semantic group expression, and to obtain, from the document library, the documents corresponding to the relevant document identifiers;
a sending module 34, configured to return the documents corresponding to the relevant document identifiers to the user.
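How the four modules cooperate can be sketched as a single class. This is an illustration under loud assumptions: `RetrievalDevice` and its method names are hypothetical, semantic recognition is supplied by the caller as a function because the text does not specify a recognizer, and matching is reduced to an exact dictionary lookup.

```python
class RetrievalDevice:
    """Toy wiring of the acquisition, recognition, matching, and sending modules."""

    def __init__(self, recognize, index, library):
        self.recognize = recognize  # module 32: query statement -> expression string
        self.index = index          # expression string -> list of document identifiers
        self.library = library      # document identifier -> document content

    def handle(self, query_statement):
        # Module 31: the query statement arrives as the method argument
        expression = self.recognize(query_statement)
        # Module 33: match against the index, then fetch documents by identifier
        doc_ids = self.index.get(expression, [])
        documents = [self.library[d] for d in doc_ids if d in self.library]
        # Module 34: return the documents to the caller (the user)
        return documents


def fake_recognizer(statement):
    # Stand-in for real semantic recognition; always yields the same expression
    return "embolism: Y | lung's aorta: Y | acute: Y"


device = RetrievalDevice(
    recognize=fake_recognizer,
    index={"embolism: Y | lung's aorta: Y | acute: Y": [3]},
    library={3: "patient lung aorta has acute embolism, ..."},
)
print(device.handle("lung's aorta has acute embolism"))
```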
Further, semantics recognition module 32 specifically may be used for determining inquiring about the attribute of all semantic units in statement, and modification between semantic unit and by modified relationship, and attribute comprises attribute or negative attribute certainly; Generating the first set of semantics expression formula for inquiry statement, the first set of semantics expression formula comprises semantic unit marks all in inquiry statement and attribute-bit thereof, and the modified relationship mark between semantic unit.
Further, matching module 33 is by the first set of semantics expression formula, mate with each the 2nd set of semantics expression formula in concordance list prepared in advance, before determining the 2nd set of semantics expression formula mated mutually with the first set of semantics expression formula, semantics recognition module 32 also for, each statement of document in document library is carried out semantics recognition; Determine the attribute of all semantic units in each statement, and the modified relationship between semantic unit, attribute comprises attribute or negative attribute certainly; Generating a 2nd set of semantics expression formula for each statement, the 2nd set of semantics expression formula comprises semantic unit marks all in this statement and attribute-bit thereof, and the modified relationship mark between semantic unit.
Again further, after semantics recognition module 32 generates the 2nd set of semantics expression formula for each statement, information indexing device can also comprise: more new module.
After semantics recognition module 32 generates a 2nd set of semantics expression formula for each statement, semantics recognition module also for, for each statement, when being affirmative attribute by the semantic unit of modification, semantic for the modification in the 2nd corresponding for this statement set of semantics expression formula unit is generated modification semanteme metaset and closes; Adopt modify semantic metaset close in all included by each proper subclass modify the 2nd corresponding with this statement respectively set of semantics expression formulas of semantic unit generated set of semantics expression formula by the semantic unit of modification, obtain deriving from set of semantics expression formula;
More new module, for according to the 2nd set of semantics expression formula and derivation set of semantics expression formula, upgrading concordance list.
In addition, information indexing device can also comprise order module, and order module is used for, before matching module 33 obtains the document corresponding to relevant document identification, relevant document identification is carried out relevance ranking.
In the embodiment of the present invention, by the inquiry statement of user is carried out semantics recognition, obtain the first set of semantics expression formula that inquiry statement is corresponding, according to the first set of semantics expression formula, obtain in concordance list the 2nd set of semantics expression formula and relevant document identification mated to the first set of semantics expression formula, in document library, corresponding document is obtained, it is to increase the dependency of Search Results and customer need according to relevant document identification.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. An information retrieval method, characterized in that it comprises:
obtaining a query statement from a user, the query statement comprising one or more words or phrases;
performing semantic recognition on the query statement to obtain a first semantic group expression corresponding to the query statement, wherein the first semantic group expression represents one or more semantic units and their semantic attributes and, where it represents multiple semantic units, also represents the modification relations among the multiple semantic units;
matching the first semantic group expression against each second semantic group expression in a pre-built index table to determine the second semantic group expression that matches the first semantic group expression, wherein the second semantic group expression represents one or more semantic units and their semantic attributes and, where it represents multiple semantic units, also represents the modification relations among the multiple semantic units;
obtaining, from the index table, the relevant document identifiers of the matching second semantic group expression;
obtaining, from a document library, the documents corresponding to the relevant document identifiers;
returning the documents corresponding to the relevant document identifiers to the user;
wherein performing semantic recognition on the query statement to obtain the first semantic group expression corresponding to the query statement comprises:
determining the attribute of every semantic unit in the query statement and the modification relations between semantic units, the attribute being either an affirmative attribute or a negative attribute;
generating the first semantic group expression for the query statement, the first semantic group expression comprising the identifiers of all semantic units in the query statement, their attribute identifiers, and the identifiers of the modification relations between semantic units.
2. The method according to claim 1, characterized in that, before matching the first semantic group expression against each second semantic group expression in the pre-built index table to determine the second semantic group expression that matches the first semantic group expression, the method further comprises:
performing semantic recognition on each statement of the documents in the document library;
determining the attribute of every semantic unit in each statement and the modification relations between semantic units, the attribute being either an affirmative attribute or a negative attribute;
generating one second semantic group expression for each statement, the second semantic group expression comprising the identifiers of all semantic units in the statement, their attribute identifiers, and the identifiers of the modification relations between semantic units.
3. The method according to claim 2, characterized in that, after generating one second semantic group expression for each statement, the method further comprises:
for each statement, when the modified semantic unit has the affirmative attribute, collecting the modifying semantic units in the second semantic group expression corresponding to the statement into a modifier set;
for each proper subset of the modifier set, combining all modifying semantic units contained in that subset with the modified semantic unit of the second semantic group expression corresponding to the statement to generate a semantic group expression, thereby obtaining derived semantic group expressions;
updating the index table according to the second semantic group expressions and the derived semantic group expressions.
4. The method according to claim 1, characterized in that, before obtaining from the document library the documents corresponding to the relevant document identifiers, the method further comprises:
ranking the relevant document identifiers by relevance.
5. An information retrieval device, characterized in that it comprises:
an acquisition module, configured to obtain a query statement from a user, the query statement comprising one or more words or phrases;
a semantic recognition module, configured to perform semantic recognition on the query statement to obtain a first semantic group expression corresponding to the query statement, wherein the first semantic group expression represents one or more semantic units and their semantic attributes and, where it represents multiple semantic units, also represents the modification relations among the multiple semantic units;
a matching module, configured to match the first semantic group expression against each second semantic group expression in a pre-built index table to determine the second semantic group expression that matches the first semantic group expression, wherein the second semantic group expression represents one or more semantic units and their semantic attributes and, where it represents multiple semantic units, also represents the modification relations among the multiple semantic units;
the matching module being further configured to obtain, from the index table, the relevant document identifiers of the matching second semantic group expression, and to obtain, from a document library, the documents corresponding to the relevant document identifiers;
a sending module, configured to return the documents corresponding to the relevant document identifiers to the user;
wherein the semantic recognition module is specifically configured to determine the attribute of every semantic unit in the query statement and the modification relations between semantic units, the attribute being either an affirmative attribute or a negative attribute;
and to generate the first semantic group expression for the query statement, the first semantic group expression comprising the identifiers of all semantic units in the query statement, their attribute identifiers, and the identifiers of the modification relations between semantic units.
6. The device according to claim 5, characterized in that, before the matching module matches the first semantic group expression against each second semantic group expression in the pre-built index table to determine the second semantic group expression that matches the first semantic group expression, the semantic recognition module is further configured to: perform semantic recognition on each statement of the documents in the document library; determine the attribute of every semantic unit in each statement and the modification relations between semantic units, the attribute being either an affirmative attribute or a negative attribute; and generate one second semantic group expression for each statement, the second semantic group expression comprising the identifiers of all semantic units in the statement, their attribute identifiers, and the identifiers of the modification relations between semantic units.
7. The device according to claim 6, characterized in that it further comprises an update module;
after generating one second semantic group expression for each statement, the semantic recognition module is further configured to: for each statement, when the modified semantic unit has the affirmative attribute, collect the modifying semantic units in the second semantic group expression corresponding to the statement into a modifier set; and, for each proper subset of the modifier set, combine all modifying semantic units contained in that subset with the modified semantic unit of the second semantic group expression corresponding to the statement to generate a semantic group expression, thereby obtaining derived semantic group expressions;
the update module is configured to update the index table according to the second semantic group expressions and the derived semantic group expressions.
8. The device according to claim 5, characterized in that it further comprises a sorting module;
the sorting module is configured to rank the relevant document identifiers by relevance before the matching module obtains from the document library the documents corresponding to the relevant document identifiers.
CN201310200430.3A 2013-05-27 2013-05-27 Information retrieval method and device Expired - Fee Related CN103324678B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310200430.3A CN103324678B (en) 2013-05-27 2013-05-27 Information retrieval method and device

Publications (2)

Publication Number Publication Date
CN103324678A CN103324678A (en) 2013-09-25
CN103324678B true CN103324678B (en) 2016-06-01

Family

ID=49193421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310200430.3A Expired - Fee Related CN103324678B (en) 2013-05-27 2013-05-27 Information retrieval method and device

Country Status (1)

Country Link
CN (1) CN103324678B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106503265A (en) * 2016-11-30 2017-03-15 北京赛迈特锐医疗科技有限公司 Structured search system and its searching method based on weights

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073729A (en) * 2011-01-14 2011-05-25 百度在线网络技术(北京)有限公司 Relationship knowledge sharing platform and implementation method thereof
CN102737039A (en) * 2011-04-07 2012-10-17 北京百度网讯科技有限公司 Index building method, searching method and searching result sorting method and corresponding device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110295688A1 (en) * 2010-05-28 2011-12-01 Microsoft Corporation Defining user intent

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160601
