CN101414310A - Method and apparatus for searching natural language - Google Patents

Publication number
CN101414310A
Authority
CN
China
Prior art keywords
framework
semantic
verb
search
storehouse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2008102243411A
Other languages
Chinese (zh)
Inventor
李茹
刘开瑛
由丽萍
王文晶
高俊杰
王瑞波
吕国英
谷波
李双红
钟立军
彭洪宝
陈雪艳
郭海旭
宋小香
邢欣
刘海静
郭韦昱
孙占虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanxi University
Original Assignee
Shanxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanxi University
Priority to CNA2008102243411A
Publication of CN101414310A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a natural language search method and device. The method builds a Chinese framework knowledge base (CFN) and a professional-domain ontology knowledge base; uses the CFN to automatically annotate the Chinese framework semantic roles of a search sentence entered in natural language; extracts from the search sentence, according to the annotation, a triple carrying semantic information; and finally takes the triple as the query input and uses the ontology base to generate the answer. The invention can recognize search sentences that users enter in natural language, and answer extraction does not require matching against a large answer base.

Description

A method and apparatus for natural language search
Technical field
The present invention relates to the technical field of natural language search, and in particular to a natural language search method and device.
Background art
Search techniques commonly used in the prior art are mainly based on keyword matching or subject classification. Because they lack semantic information, knowledge understanding, and reasoning capability, the returned results contain a large amount of irrelevant information and suffer from low quality, missing information, and insufficient precision. The main cause is a deficiency of the Internet in information representation and retrieval: it does not provide enough machine-readable information, which limits the computer's ability to analyze content automatically during retrieval.
The approach adopted in the prior art is to first perform semantic analysis on the user's search input, together with part-of-speech tagging, to find the keywords that are meaningful to the search engine, and then to search the index files of the business data.
For example, for the search input "How do I get from Baotou to Wutai Mountain?", the sentence can be split according to a Chinese dictionary, that is, segmented into "Baotou", "how to get to", and "Wutai Mountain". These semantic subjects are also the keywords that the natural language search needs. Because the searched content has been indexed in advance by the entries in the dictionary, the answers returned may include information about Baotou and information about Wutai Mountain, as well as the information the user actually wants, namely how to get from Baotou to Wutai Mountain. Thus, because the prior art does not accurately understand the semantics of the user's input, it cannot return the information the user needs promptly and accurately.
Summary of the invention
The invention provides a natural language search method and device to solve the problem that prior-art natural language search merely returns a large number of related web pages to the user.
A natural language search method comprises:
A. Constructing a Chinese framework knowledge base (CFN) and a professional-domain ontology knowledge base, wherein the Chinese framework knowledge base stores a plurality of lemmas that share the same semantics, the frameworks, and the framework elements that constitute each framework, and wherein a framework is used to express the shared semantics;
B. For a search sentence entered by the user, matching at least one verb in the search sentence against the lemmas in the Chinese framework knowledge base to find the framework to which the verb belongs, and annotating the search sentence according to the framework elements contained in that framework;
C. Selecting one of the verbs as the semantic predicate and, according to the annotation, extracting from the search sentence the semantic predicate and its subject and/or object to generate a triple;
D. Taking the triple as the query input and using the professional-domain ontology knowledge base to generate a candidate answer set.
Wherein, the content of the Chinese framework knowledge base is described in a Semantic Web markup language.
The Chinese framework knowledge base comprises a framework base, a sentence base, and a lemma base:
The framework base takes the framework as its unit and stores the definition of each framework, the framework elements that constitute it, and the relations between frameworks;
The sentence base records sentences carrying framework semantic annotation; each such sentence is annotated with the framework semantic information and syntactic information of the framework and framework elements provided by the framework base;
The lemma base stores the lemmas covered by each framework.
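The three sub-bases described above can be pictured as a small data model. The following is a minimal sketch only, not the patent's storage format (which is a Semantic Web markup language): the class names `Framework` and `CFN`, the methods `add_lemma` and `frameworks_of`, and the element names `speaker` and `message` are all hypothetical and chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Framework:
    """One entry of the framework base (illustrative fields only)."""
    name: str
    definition: str
    core_elements: list[str]
    non_core_elements: list[str] = field(default_factory=list)
    related: dict[str, str] = field(default_factory=dict)  # other framework -> relation type


@dataclass
class CFN:
    """Toy container for the framework base, lemma base, and sentence base."""
    frameworks: dict[str, Framework] = field(default_factory=dict)
    lemmas: dict[str, set[str]] = field(default_factory=dict)  # lemma -> framework names
    sentences: dict[tuple[str, str], list[str]] = field(default_factory=dict)  # (framework, lemma) -> annotated sentences

    def add_lemma(self, lemma: str, framework: str) -> None:
        # The lemma base depends on the framework base: the framework must exist first.
        if framework not in self.frameworks:
            raise KeyError(f"unknown framework: {framework}")
        self.lemmas.setdefault(lemma, set()).add(framework)

    def frameworks_of(self, lemma: str) -> list[Framework]:
        # One lemma may belong to several frameworks (one-to-many).
        return [self.frameworks[n] for n in sorted(self.lemmas.get(lemma, set()))]


cfn = CFN()
cfn.frameworks["Statement"] = Framework(
    name="Statement",
    definition="A speaker conveys information to a listener by means of language.",
    core_elements=["speaker", "message"],  # assumed names, for illustration
)
cfn.add_lemma("announce", "Statement")
```

Keying the sentence base by (framework, lemma) mirrors the statement that annotated sentences are stored by framework and by lemma.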
Wherein, constructing the professional-domain ontology knowledge base specifically comprises:
Building the ontology model of the domain with reference to the classification system standards relevant to the professional domain;
Using an ontology editing tool to represent the concepts, relations, and instances of each knowledge entry in the ontology base in a Semantic Web markup language, and storing them in a computer-readable document format.
After step B, the method further comprises:
When the search sentence contains multiple verbs, comparing each verb with the entry relations in the ontology base to obtain a semantic index for the verb, and selecting one verb as the semantic predicate of the sentence according to the semantic index, wherein the semantic index measures the importance of the verb.
Wherein, step D comprises:
Extracting the triple carrying semantic information from the search sentence according to the annotation;
Generating a query statement from the triple and searching the ontology base for content that matches the triple;
If the search succeeds, generating the candidate answer set; if the search fails, creating a reasoner with the corresponding query rules to perform reasoning, generating the corresponding data model and querying it, and generating the candidate answer set once the query succeeds.
After the candidate answer set is generated, the method further comprises:
Ranking the answers in the candidate answer set and returning the ranked answers to the user.
Further, when the search sentence entered by the user is a question, after the triple is generated the method comprises:
Analyzing the question to extract its interrogative word and query target word, thereby obtaining the query information of the question;
Taking the query information and the triple as the query input and using the professional-domain ontology base to generate the candidate answer set.
According to the above method, the invention also provides a natural language search device, comprising:
A storage module, configured to store the Chinese framework knowledge base (CFN) and the professional-domain ontology knowledge base, wherein the Chinese framework knowledge base stores a plurality of lemmas that share the same semantics, the frameworks, and the framework elements that constitute each framework, and wherein a framework is used to express the shared semantics;
An analysis module, configured to, when the user enters a search sentence, match at least one verb in the search sentence against the lemmas in the Chinese framework knowledge base to find the framework to which the verb belongs, and annotate the search sentence according to the framework elements contained in that framework;
A semantic predicate module, configured to select one of the verbs as the semantic predicate and, according to the annotation, extract from the search sentence the semantic predicate and its subject and/or object to generate a triple;
An answer generation module, configured to take the triple as the query input and use the professional-domain ontology knowledge base to generate a candidate answer set.
Wherein, the storage module is also configured to describe the content of the Chinese framework knowledge base in a Semantic Web markup language.
Further, the analysis module comprises:
A framework determination unit, configured to, when the user enters a search sentence, match the verbs in the search sentence against the lemmas in the Chinese framework knowledge base to find the framework to which each verb belongs;
An annotation unit, configured to annotate the search sentence according to the framework elements contained in the framework.
The semantic predicate module comprises:
A selection unit, configured to select one verb from the verbs of the search sentence as the semantic predicate;
An extraction unit, configured to extract from the search sentence, according to the annotation, the semantic predicate and its subject and/or object to generate a triple.
The answer generation module comprises:
A query unit, configured to take the triple as the query input and use the professional-domain ontology knowledge base to generate the candidate answer set;
A reasoning unit, configured to, when the search by the query unit fails, create a reasoner with the corresponding query rules to perform reasoning, generate the corresponding data model, and query it to generate the candidate answer set;
A ranking unit, configured to rank the answers in the candidate answer set and return the answers to the user in ranked order.
Further, the selection unit is also configured to, when the search sentence contains multiple verbs, compare each verb with the entry relations in the ontology base to obtain a semantic index for the verb, and select one verb as the semantic predicate of the sentence according to the semantic index, wherein the semantic index measures the importance of the verb.
The device also comprises:
A question module, configured to, when the search sentence entered by the user is a question, analyze the question to extract its interrogative word and query target word, thereby obtaining the query information of the question;
The answer generation module is then also configured to take the query information and the triple as the query input and use the professional-domain ontology base to generate the candidate answer set.
The invention uses the CFN to automatically annotate the natural language search sentence entered by the user, extracts a triple carrying semantic information, and takes the triple as the query input to search the ontology base for the answer. Because semantic analysis and annotation are carried out before the ontology base is searched, the exact answer can be found quickly and efficiently.
Description of drawings
Fig. 1 is a flowchart of a natural language search method according to the embodiment of the invention;
Fig. 2 shows the connections between the sub-bases of the Chinese framework knowledge base in the embodiment of the invention;
Fig. 2A shows the network formed by the frameworks of the Chinese framework knowledge base in the embodiment of the invention;
Fig. 3 is a flowchart of extracting a triple from a search sentence in the embodiment of the invention;
Fig. 3A is a flowchart of annotating the semantic roles of a search sentence with the Chinese framework knowledge base in the embodiment of the invention;
Fig. 4 is a flowchart of extracting the answer with the ontology base in the embodiment of the invention;
Fig. 4A shows the working principle of the reasoner;
Fig. 5 is a flowchart of a query method for simple search sentences in the embodiment of the invention;
Fig. 6 is a flowchart of applying the method of the invention to the tourism domain;
Fig. 6A is a relation model diagram of the six classes (concepts): scenic spots, lodging, vehicles, entertainment, dining, and shopping;
Fig. 7 is a flowchart of extracting a triple from a question in the embodiment of the invention;
Fig. 8 is a structural diagram of a natural language search device according to the embodiment of the invention;
Fig. 9 is a structural diagram of the analysis module of the natural language search device according to the embodiment of the invention;
Fig. 10 is a structural diagram of the semantic predicate module of the natural language search device according to the embodiment of the invention;
Fig. 11 is a structural diagram of the answer generation module of the natural language search device according to the embodiment of the invention.
Embodiment
In the embodiment of the invention, a Chinese framework knowledge base (CFN) and a professional-domain ontology knowledge base are built. The Chinese framework knowledge base is then used to annotate the search sentence entered for natural language search, a triple carrying semantic information is extracted from the search sentence according to the annotation, and finally the triple is taken as the query input and the ontology base is used to generate the answer.
The specific embodiments of the invention are described in detail below with reference to the accompanying drawings. As shown in Fig. 1, a natural language search method of the embodiment comprises the following steps:
Step 101: build the Chinese framework knowledge base (Chinese FrameNet, CFN).
The embodiment first builds a Chinese framework knowledge base whose objects of description are a limited set of words, and represents the resources of this semantic knowledge base in Semantic Web markup languages: Extensible Markup Language (XML), Resource Description Framework (RDF), and Web Ontology Language (OWL).
(1) The Chinese framework knowledge base consists mainly of a framework base, a sentence base, and a lemma base, as follows:
A. The lemma base mainly stores lemmas. The lemmas of one entry are a class of words that share the same semantics, and that shared semantics is a framework.
For example, the lemmas that express the "statement" semantics are shown in Table 1:
Speech v, Confess v, Assert n, Declare v, Announce v, Declaration n, Assert v, Claim v, Statement n, Statement v
Sing one's own praises v, Warning v, Comment v, Comment n, Note v, Complaint v, Honest v, Hysteria is said v, Pass on v, Leak v
Call out v, Explain orally v, Describe v, Express v, Emphasize v, Mention v, Advertise v, Expression v, Propose v, Statement v
Reaffirm v, Narration v, Say v, Talk about v, Report v, Report n, Disclose v, Say v, Speak v, Comment v
State v, Talk v, Talk v, Speech v, State v, Report v, Report n, Say frankly v, Tell v, State outright v
Scatter v, Mention v, Mention v, Disclose v, Declaration n, Read out v, Declare v, Advocate v, Explain and publicise v, Make noise v
Comment v, Comment v, Comment n, Divulge v, Refer to v, Try to justify oneself v, Accusation v, Smooth dew v, State frankly v, Speech n
Report v, Appraise through discussion v
Table 1
B. The framework base takes the framework as its unit. It gives an explicit definition of each framework and of its framework elements (also called semantic roles), and describes the conceptual relations between the framework and other frameworks.
The framework base mainly stores: (1) the definition of the framework; (2) the framework elements (the components of a framework play different roles, called semantic roles or framework elements, and are divided into core and non-core framework elements); (3) the relations between frameworks.
The content of the "statement" framework is as follows: the framework definition, the lemmas shown in Table 1, and the core framework elements (core semantic roles) and non-core framework elements (non-core semantic roles) shown in Table 2 and Table 3.
Definition of the "statement" framework: this framework expresses the act of a speaker conveying information to a listener by means of language.
Figure A200810224341D00121
Table 2
Table 3
C. The sentence base records sentences carrying framework semantic annotation. Sentences are annotated by following the annotated example sentences of each framework in the framework base, and are stored by framework and by lemma.
The CFN provides, for each sense of each lemma, sentences with framework semantic annotation. These sentences come from a real natural language corpus rather than being invented by linguists or dictionary editors. Sentences are chosen so as to show, as far as possible, every syntactic-semantic combination the lemma can take. The CFN data therefore provide rich material for generalizing the syntactic-semantic combinatorial properties of words, and training data for research on automatic semantic annotation.
An example sentence of the "statement" framework:
As a retaliatory measure, the British side also announced that 4 diplomats of the Russian embassy were personae non gratae.
<spkr-np-subj British/jn side/n> as/v retaliation/v measure/n also/d <tgt announce/v> <msg-dj-obj 4/m [measure word]/q Russia/nsy embassy/n of/u diplomat/n be/v not/d receive/v welcome/v of/u person/n>
(2) Relations among the elements of the Chinese framework knowledge base:
As shown in Fig. 2, the relationship among the lemma base, the sentence base, and the framework base in the embodiment is as follows. The lemma base depends on the framework base: a given word belongs to a given framework (although one-to-many cases exist, in which one lemma belongs to several frameworks). Because the same lemma has different semantic collocation patterns and syntactic realizations under different frameworks, the sentence base in turn depends on both the lemma base and the framework base.
Frameworks are also related to one another in several ways, forming the knowledge network shown in Fig. 2A. The relations between frameworks include inheritance, whole-part, whole-domain/sub-domain, reference, causal, and successor relations. A framework covers multiple lemmas, all annotated with the framework element set of that framework. Conversely, a polysemous word corresponds to several lemmas that belong to different frameworks and are described with different framework elements. With this information, an application system can distinguish the different meanings of the same word form in different contexts.
Step 102: build the professional-domain ontology knowledge base, specifically:
First, determine the domain and scope of the ontology with reference to classification system standards, and list the important terms of the ontology. These terms roughly indicate everything the modeling process involves, the properties these things have, and the relations between those properties. Then define the classes and the class hierarchy, the properties, the relations between properties, and the restrictions on properties, finally obtaining the ontology model.
Using an ontology editing tool (common ones include Ontolingua, OntoEdit, Ontosaurus, and Protégé), the concepts, relations, and instances (that is, triples) of each knowledge entry in the ontology model are expressed in a Web markup language and stored as a computer-readable document.
Strictly defined inverse relations (inverseOf), transitive relations (TransitiveProperty), functional relations (FunctionalProperty), symmetric relations (SymmetricProperty), and inverse functional relations (InverseFunctionalProperty), as well as restrictions on properties, are established between the classes of the ontology.
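To make three of these property characteristics concrete, the sketch below derives the triples that inverse, symmetric, and transitive declarations imply. It is an illustrative stand-in for what an OWL reasoner does, not the patent's implementation; the function `expand`, the relation names (`part_of`, `has_part`, `near`), and the sample facts are hypothetical. Functional and inverse functional properties are omitted because they constrain identity rather than add triples of this kind.

```python
def expand(facts, inverse_of=None, symmetric=(), transitive=()):
    """Return facts plus every triple implied by the declared characteristics."""
    inverse_of = dict(inverse_of or {})
    # An inverse declaration works in both directions.
    inverse_of.update({v: k for k, v in list(inverse_of.items())})
    closed = set(facts)
    changed = True
    while changed:
        new = set()
        for s, p, o in closed:
            if p in inverse_of:       # (s p o) implies (o inverse(p) s)
                new.add((o, inverse_of[p], s))
            if p in symmetric:        # (s p o) implies (o p s)
                new.add((o, p, s))
            if p in transitive:       # (s p o), (o p x) implies (s p x)
                new.update((s, p, o2) for s2, p2, o2 in closed if p2 == p and s2 == o)
        changed = not new <= closed   # stop once nothing new is derived
        closed |= new
    return closed


facts = {
    ("Taiyuan", "part_of", "Shanxi"),
    ("Shanxi", "part_of", "China"),
    ("Hotel A", "near", "Wutai Mountain"),
}
closure = expand(
    facts,
    inverse_of={"part_of": "has_part"},
    symmetric={"near"},
    transitive={"part_of"},
)
```

Declaring such characteristics once on a relation lets a query for "what is near Wutai Mountain" succeed even when only the hotel-side fact was entered.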
Step 103: extract the triple from the search sentence.
After the search sentence entered by the user is received, it is first preprocessed, that is, segmented into words and tagged with the part of speech of every word. All verbs in the search sentence are then extracted, and each verb is matched against the lemmas in the Chinese framework knowledge base to find the framework it belongs to; the search sentence is annotated according to the framework elements of that framework. Finally, one of the verbs is selected as the semantic predicate of the search sentence, and the subject and object of the semantic predicate are extracted to generate a triple carrying semantic information. The subject and object are the nouns immediately before and after the semantic predicate in the sentence. The triple expresses the semantic information of the search sentence and the connections between the framework elements.
The triple may lack a subject or an object; that is, a triple consists of the semantic predicate plus its subject and/or object.
Further, if the search sentence contains no verb, the semantic predicate is the word that can represent the search intent of the sentence. If the search sentence is a question that contains no verb, the semantic predicate is the interrogative word, and the subject and object are the nouns adjacent to the interrogative word.
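The extraction rule just described (predicate plus the nearest noun on each side, with a fallback to the interrogative word) can be sketched as follows. This is a minimal illustration under stated assumptions: the input is already segmented and tagged, the tags `n`, `v`, and `wh` are placeholders for whatever tag set the tagger uses, and taking the first verb stands in for the semantic predicate selection described elsewhere in this document. The names `extract_triple` and `nearest_noun` are hypothetical.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]                            # (word, part-of-speech tag)
Triple = Tuple[Optional[str], str, Optional[str]]  # (subject, predicate, object)


def nearest_noun(tokens: List[Token], start: int, step: int) -> Optional[str]:
    """Walk left (step=-1) or right (step=+1) from start to the closest noun."""
    i = start + step
    while 0 <= i < len(tokens):
        if tokens[i][1] == "n":
            return tokens[i][0]
        i += step
    return None


def extract_triple(tokens: List[Token], predicate_index: Optional[int] = None) -> Optional[Triple]:
    if predicate_index is None:
        verbs = [i for i, (_, pos) in enumerate(tokens) if pos == "v"]
        interrogatives = [i for i, (_, pos) in enumerate(tokens) if pos == "wh"]
        candidates = verbs or interrogatives  # no verb: fall back to the interrogative word
        if not candidates:
            return None
        predicate_index = candidates[0]       # assumption: first candidate
    subject = nearest_noun(tokens, predicate_index, -1)
    obj = nearest_noun(tokens, predicate_index, +1)
    return (subject, tokens[predicate_index][0], obj)  # subject or object may be None


with_verb = extract_triple([("Wutai Mountain", "n"), ("has", "v"), ("hotel", "n")])
without_verb = extract_triple([("Wutai Mountain", "n"), ("weather", "n"), ("how", "wh")])
```

A missing slot is kept as `None` so that a later query step can treat it as the unknown to be answered.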
As shown in Fig. 3, triple extraction is described further below, taking a verb triple as an example. It comprises the following steps:
S301. Annotate the semantic roles of the search sentence according to the Chinese framework knowledge base. As shown in Fig. 3A, this comprises:
S3A01. Preprocess the search sentence entered by the user and extract all verbs in it.
S3A02. Match each verb against the lemmas in the Chinese framework knowledge base to obtain the framework to which the verb belongs.
S3A03. Annotate the search sentence according to the framework elements contained in that framework. The annotation has three layers:
The first layer annotates framework elements. Framework elements are the participants in a framework and are divided into core and non-core framework elements. Core framework elements are the conceptually essential components of a framework; their types and number differ from framework to framework and give each framework its individual character. Non-core framework elements do not show a framework's individual character; they express peripheral semantic components such as time, space, environmental conditions, reason, and purpose. The second layer annotates phrase types. The third layer annotates syntactic functions.
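The annotated example sentence of the "statement" framework earlier in this document uses labels such as `spkr-np-subj` and `msg-dj-obj`. Assuming these encode the three layers in order (framework element, phrase type, syntactic function), which the source does not state explicitly, a label can be split as in the sketch below. The names `LayeredLabel` and `parse_label` are hypothetical.

```python
from typing import NamedTuple


class LayeredLabel(NamedTuple):
    framework_element: str   # layer 1, e.g. "spkr"
    phrase_type: str         # layer 2, e.g. "np"
    syntactic_function: str  # layer 3, e.g. "subj"


def parse_label(label: str) -> LayeredLabel:
    """Split an 'element-phrase-function' label into its three layers."""
    parts = label.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'element-phrase-function', got {label!r}")
    return LayeredLabel(*parts)


speaker = parse_label("spkr-np-subj")
message = parse_label("msg-dj-obj")
```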
S302. Select the semantic predicate of the search sentence. The semantic predicate is the word that best expresses the primary search purpose of the search sentence.
1. When the search sentence entered by the user contains only one verb, that verb is the semantic predicate of the search sentence.
2. When the search sentence entered by the user contains multiple verbs, each verb is compared with the entry relations in the ontology base to obtain a semantic index for each verb; the semantic index measures the importance of a candidate semantic predicate. (Entry relations comprise the concepts in the model, the relations between concepts, and the instances of concepts. For example, the vehicles concept contains an automobile class, a relation exists between the automobile class and the scenic spot class, and both scenic spots and automobiles have their own instances.) One verb is then selected as the semantic predicate of the search sentence according to the semantic index.
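The source does not say how the semantic index is computed. One simple possibility, shown here purely as an assumption, is to count how many ontology relations a verb maps to and pick the verb with the highest count. The mapping `verb_to_relation`, the relation names, and the functions `semantic_index` and `select_predicate` are all hypothetical.

```python
from typing import Dict, List, Tuple

Relation = Tuple[str, str, str]  # (concept, relation name, concept)


def semantic_index(verb: str, relations: List[Relation], verb_to_relation: Dict[str, str]) -> int:
    """Assumed scoring: number of ontology relations that the verb corresponds to."""
    relation_name = verb_to_relation.get(verb)
    if relation_name is None:
        return 0  # the verb has no counterpart in the ontology
    return sum(1 for _, name, _ in relations if name == relation_name)


def select_predicate(verbs: List[str], relations: List[Relation], verb_to_relation: Dict[str, str]) -> str:
    if not verbs:
        raise ValueError("no verbs to choose from")
    # max() keeps the earliest verb when several share the highest index.
    return max(verbs, key=lambda v: semantic_index(v, relations, verb_to_relation))


relations = [
    ("Vehicle", "goes_to", "ScenicSpot"),
    ("Lodging", "near", "ScenicSpot"),
    ("Dining", "near", "ScenicSpot"),
]
verb_to_relation = {"drive": "goes_to", "stay": "near"}
chosen = select_predicate(["want", "drive", "stay"], relations, verb_to_relation)
```

Under this scoring a verb with no ontology counterpart, such as "want", never outranks a verb that names a domain relation.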
S303. Extract the triple.
According to the annotation, the subject and object of the semantic predicate are extracted to generate a triple that expresses the semantic information of the search sentence. The triple may lack the subject or the object of the semantic predicate.
Because users' input contains a large amount of colloquial language, the semantics of the search sentence are interpreted with the help of a dictionary that maps colloquial words to the corresponding vocabulary of the restricted domain.
Step 104: extract the answer using the ontology base. As shown in Fig. 4, this comprises:
S401. Generate a query statement from the triple and search the ontology base for information that matches the triple.
S402. If the search succeeds, generate the candidate answer set and go to S405; if the search fails, go to S403.
S403. Create a reasoner using the corresponding query rules (the custom rules stored in the reasoner and all the relations of the ontology base held by the reasoner), perform reasoning, generate the corresponding data model, and query again.
S404. If the query succeeds, generate the corresponding answer set and go to S405; if the query fails again, go to S406.
S405. Rank the answers in the answer set and return the ranked answers to the user.
S406. Inform the user that the requested content cannot be found.
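The control flow of S401 to S406 can be sketched with an in-memory set of triples and a naive forward-chaining loop standing in for the reasoner. This is an illustration only: the function names, the sample rule `located_in_via_near`, and the facts are hypothetical, and alphabetical sorting is just a placeholder for the ranking step, which the source does not specify.

```python
def match(facts, pattern):
    """Return facts that agree with the pattern; None in the pattern is a wildcard."""
    return sorted(f for f in facts if all(p is None or p == v for p, v in zip(pattern, f)))


def infer(facts, rules):
    """Apply every rule repeatedly until no new fact appears."""
    closed = set(facts)
    while True:
        new = set().union(*(rule(closed) for rule in rules)) - closed
        if not new:
            return closed
        closed |= new


def answer(facts, pattern, rules):
    hits = match(facts, pattern)                # S401: direct lookup
    if hits:                                    # S402: success
        return hits                             # S405: sorted placeholder ranking
    hits = match(infer(facts, rules), pattern)  # S403: reason, then query again
    return hits or None                         # S404 success, or S406 not found


def located_in_via_near(facts):
    """Hypothetical custom rule: something near a place is located where that place is."""
    return {
        (a, "located_in", c)
        for a, r1, b in facts if r1 == "near"
        for b2, r2, c in facts if r2 == "located_in" and b2 == b
    }


facts = {
    ("Hotel A", "near", "Wutai Mountain"),
    ("Wutai Mountain", "located_in", "Shanxi"),
}
found = answer(facts, ("Hotel A", "located_in", None), [located_in_via_near])
missing = answer(facts, ("Hotel B", None, None), [located_in_via_near])
```

Running inference only after the direct lookup fails keeps the common case cheap, which matches the order of S402 and S403.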
In the embodiment, the Jena toolkit, which can parse and query RDF models, is used to match the extracted triple against the information in the ontology base. The Jena ontology parser can parse RDF, supports RDQL queries, and can parse OWL.
Jena provides rule-based reasoners (such as the RDFS Reasoner and the OWL Reasoner); users can also define their own inference rules as needed, or register and use a third-party reasoning engine. As shown in Fig. 4A, the reasoner works as follows. The reasoner registration mechanism creates a reasoner from the basic RDF description (the information resources) and the ontology. The reasoner then generates a model object that includes the inference mechanism (InferenceGraph, InfGraph). In Jena a graph (Graph) is also called a model (Model) and is exposed through the Model interface. The Model API and the Ontology API can then be used to operate on and process this model, achieving information retrieval at the semantic level.
As shown in Fig. 5, the embodiment also provides a query method for simple search sentences, comprising:
Step 501: build the professional-domain ontology knowledge base.
Build a professional-domain ontology for the restricted domain. With reference to the "Chinese classification scheme vocabulary" and the relevant standards of the professional domain, and using all the available information about the domain, such as its basic terms and the relations between its components, build the ontology model of the domain. Then encode the ontology model in OWL: using Protégé, the ontology editing tool from Stanford University in the United States, express the concepts, relations, and instances of each entry in the ontology base in OWL and RDF, and finally store them as an OWL document.
Strictly defined inverse relations (inverseOf), transitive relations (TransitiveProperty), functional relations (FunctionalProperty), symmetric relations (SymmetricProperty), and inverse functional relations (InverseFunctionalProperty), as well as restrictions on properties, are established between the classes of the ontology.
Step 502: first preprocess the search sentence and extract the triple from it; then use the triple to generate a SPARQL query statement and search the ontology base for information that matches the triple. If the search succeeds, go to step 504; if it fails, go to step 503.
Step 503: create a reasoner using the corresponding query rules, perform reasoning, generate the corresponding data model, and query again. If the query succeeds, go to step 504; if the query fails again, report that the requested content cannot be found.
Step 504: rank the candidate answers and return the ranked answers to the user.
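Step 502 turns the triple into a SPARQL query statement. A minimal sketch of that conversion is given below, assuming that a missing slot of the triple becomes a query variable and that every name lives in a single namespace. The namespace `http://example.org/tourism#`, the local names, and the function `triple_to_sparql` are hypothetical; the source does not show its actual query template.

```python
from urllib.parse import quote


def triple_to_sparql(subject, predicate, obj, namespace="http://example.org/tourism#"):
    """Build a one-pattern SPARQL query; None slots become variables."""

    def term(value, var):
        if value is None:
            return f"?{var}"
        # Percent-encode so that spaces and similar characters yield a valid IRI.
        return f"<{namespace}{quote(value, safe='')}>"

    s, p, o = term(subject, "s"), term(predicate, "p"), term(obj, "o")
    variables = [t for t in (s, p, o) if t.startswith("?")]
    if not variables:
        # A fully specified triple is a yes/no check.
        return f"ASK {{ {s} {p} {o} . }}"
    return f"SELECT {' '.join(variables)} WHERE {{ {s} {p} {o} . }}"


query = triple_to_sparql("WutaiMountain", "hasHotel", None)
check = triple_to_sparql("WutaiMountain", "hasHotel", "HotelA")
```

The resulting string would then be handed to whatever SPARQL engine holds the ontology base.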
As shown in Figure 6, the relevant question sentence below in conjunction with the tour field inquiring user is proposed is described further the embodiment of the invention.Because user's major part in the time of the inquiry related content all is the form input with question sentence, so done the processing of optimizing at the inquiry question sentence especially in the present embodiment, concrete steps comprise:
Step 601, structure Chinese framework knowledge base (CFN).
Step 602, structure tour field ontologies storehouse.
Towards the travel information in somewhere, choose distinctive tourist attractions, all set up corpus at each sight spot, make up the ontology library of tour field.On the basis of sight spot corpus, promptly swim according to tourism six key elements, purchase, joy, food, live, OK, document has been carried out the extraction of term, and with reference to " Chinese classification scheme vocabulary " and " tourist service basis term " (gb/t 16766-1997), " tourism planning general rule " (gb/t 18971-2003), each subject of tourist industry is affiliated classification in the Chinese Library classification, " tourist industry standards system table ", " travel agency's domestic travel quality of service requirement " (lb/t004-1997), " guide service quality " (gb/15971-1995), CNS net (www.chinagb.org), tourism planning general rule (gb/t 18971-2003), tourist resources Classification Count and evaluation (gb/t 18972-2003), tourist service basis term (gb/t 16766-1997) etc. has carried out the Primary Construction of tourism ontology model.Fig. 6 A is sight spot, lodging, the vehicles, amusement, food and drink and the relation model figure between 6 classes (notion) of doing shopping.
The system encodes the ontology model in OWL Lite, using Protégé, the ontology editing tool from Stanford University. Strictly defined relations are established between the classes of the ontology: inverse relations (inverseOf), transitive relations (TransitiveProperty), functional relations (FunctionalProperty), symmetric relations (SymmetricProperty), and inverse-functional relations (InverseFunctionalProperty), together with restrictions on properties.
With Protégé, the concepts, relations, and instances related to the database are expressed in OWL and RDF and stored as an OWL document.
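The effect of these property characteristics can be illustrated with a small, library-free sketch. The property names (`locatedIn`, `near`, `hasHotel`, `hotelOf`) and the facts are hypothetical and are not taken from the ontology of this embodiment; a real system would declare the characteristics in OWL and leave the inference to a reasoner.

```python
# Minimal forward-chaining sketch of three OWL property characteristics.
# All property names and facts below are illustrative assumptions.

TRANSITIVE = {"locatedIn"}
SYMMETRIC = {"near"}
INVERSE_OF = {"hasHotel": "hotelOf", "hotelOf": "hasHotel"}


def closure(facts):
    """Return the facts plus everything entailed by the declared characteristics."""
    inferred = set(facts)
    changed = True
    while changed:
        new = set()
        for s, p, o in inferred:
            if p in SYMMETRIC:
                new.add((o, p, s))                  # near(a, b) entails near(b, a)
            if p in INVERSE_OF:
                new.add((o, INVERSE_OF[p], s))      # hasHotel(a, b) entails hotelOf(b, a)
            if p in TRANSITIVE:
                for s2, p2, o2 in inferred:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))         # chain two transitive facts
        changed = not new <= inferred
        inferred |= new
    return inferred


facts = {
    ("Xiantong Temple", "locatedIn", "Wutai Mountain"),
    ("Wutai Mountain", "locatedIn", "Xinzhou"),
    ("Wutai Mountain", "hasHotel", "Hotel A"),
    ("Hotel A", "near", "Xiantong Temple"),
}
kb = closure(facts)
```

The loop repeats until no new fact appears, so chains of any length over a transitive property are covered.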
Step 603: classify the question type of the query statement entered by the user.
Questions can be classified in different ways from different perspectives. This system adopts a multi-perspective classification: building on the TREC (Text Retrieval Conference) classification, it uses ontology concepts to classify questions.
According to statistics over the question corpus, the questions currently raised by tourism-domain users fall into the following three classes:
(1) Simple questions about a subject or object in the ontology, including wh-questions and yes/no questions that ask about a person, time, number, or entity.
For example: How is the weather at Wutai Mountain? Is there a hotel near Wutai Mountain?
(2) Questions about a method, which belong to the description class.
For example: How do I drive from Beijing to Wutai Mountain?
(3) Questions of the reason or definition class.
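The three classes can be approximated by a simple rule-based classifier. The sketch below is only an illustration: the cue phrases are assumptions chosen for English renderings of the example questions, whereas a real system would rely on Chinese interrogative words and on the statistics of its own question corpus.

```python
# Rule-based sketch of the three question classes described above.
# The cue phrases are illustrative assumptions, not rules from the embodiment.

REASON_DEFINITION_CUES = ("why", "what is meant by")
METHOD_CUES = ("how do i", "how to", "how can i")


def classify_question(question: str) -> str:
    q = question.lower().strip()
    if any(q.startswith(cue) for cue in REASON_DEFINITION_CUES):
        return "reason_or_definition"   # class (3)
    if any(cue in q for cue in METHOD_CUES):
        return "method_description"     # class (2)
    return "simple_entity"              # class (1): person, time, number, entity, yes/no
```

For instance, `classify_question("How do I drive from Beijing to Wutai Mountain?")` falls into the method class, while the weather and hotel questions fall into the first class.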
Step 604: use the Chinese frame knowledge base to extract triples carrying semantic information from the query statement entered by the user. As shown in Figure 7, the specific steps are:
S701: perform semantic role labeling on the query statement using the Chinese frame knowledge base.
Annotation has three layers. The first layer is the frame elements, which are divided into core and non-core frame elements. Core frame elements are the conceptually obligatory components of a frame; their types and number differ from frame to frame and reflect the individual character of the frame. Non-core frame elements do not reflect the individual character of a frame; they express peripheral semantic components such as time, space, environmental conditions, reason, and purpose. The second layer is the phrase type annotation, and the third layer is the syntactic function annotation. Table 4 gives the frame description of the "arrival" frame.
Table 4
Example sentence: "Driving from Taiyuan to Wutai Mountain, which route is the shortest?"
After CFN annotation, it becomes:
<mot-vp-va drive> <src-pp-adva from Taiyuan> <tgt=reach> <goal-sp-obj Wutai Mountain> which route is the shortest?
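An annotated sentence of this kind can be split back into its three layers mechanically. The following sketch assumes the tags are written with ordinary angle brackets, in the form `<element-phrasetype-function text>` for frame elements and `<tgt=word>` for the target word; the names `Span` and `parse_annotation` are hypothetical helpers, not part of CFN.

```python
import re
from typing import NamedTuple, Optional


class Span(NamedTuple):
    element: str                 # first layer: frame element label, or "tgt" for the target word
    phrase_type: Optional[str]   # second layer, e.g. "pp"
    function: Optional[str]      # third layer, e.g. "adva"
    text: str


# Either <fe-pt-gf text> (frame element) or <tgt=word> (target word).
TAG = re.compile(r"<(?:(\w+)-(\w+)-(\w+)\s+([^<>]+)|tgt=([^<>]+))>")


def parse_annotation(annotated: str) -> list:
    """Extract the annotated spans in sentence order; untagged text is ignored."""
    spans = []
    for fe, pt, gf, text, target in TAG.findall(annotated):
        if target:
            spans.append(Span("tgt", None, None, target.strip()))
        else:
            spans.append(Span(fe, pt, gf, text.strip()))
    return spans


example = ("<mot-vp-va drive> <src-pp-adva from Taiyuan> <tgt=reach> "
           "<goal-sp-obj Wutai Mountain> which route is the shortest?")
spans = parse_annotation(example)
```

The untagged remainder of the sentence (the interrogative part) is left for the question analysis step.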
S702: question analysis.
The interrogative word and the query target word are obtained, because together they determine the search purpose of the querying user.
S703: triple extraction.
First, the semantic predicate is obtained from the verbs of the question, and the obtained semantic predicate is compared with the entry relations in the ontology library. A semantic index measures the importance of each candidate semantic predicate; after rule-based scoring, the subject and object of the semantic predicate are extracted.
Example sentence: Driving from Taiyuan to Wutai Mountain, which route is the shortest?
First, after preprocessing, the frame elements <mot-vp-va drive>, <src-pp-adva from Taiyuan>, <tgt=reach>, and <goal-sp-obj Wutai Mountain> are extracted directly from the CFN annotation. Question-type recognition then determines that the question belongs to the second major class of the TREC-based classification, namely the method class within description, and the analysis also finds that the component to be compared is the value of the route property. The instances of the self-driving subclass of the automobile class that satisfy <?self-driving, origin, Taiyuan> and <?self-driving, destination, Wutai Mountain> are retrieved, and the route property values of all these instances are then compared.
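The matching and comparison just described can be sketched as follows. The instance data, the property names (`origin`, `destination`, `length_km`) and the distances are invented for illustration; in the embodiment these values would come from the tourism ontology library.

```python
# Sketch: turn frame elements into triple patterns and pick the shortest route.
# Instance data, property names and distances are hypothetical.

def build_patterns(frame_elements: dict) -> list:
    """Map frame elements to (variable, property, value) patterns."""
    patterns = []
    if "src" in frame_elements:
        patterns.append(("?route", "origin", frame_elements["src"]))
    if "goal" in frame_elements:
        patterns.append(("?route", "destination", frame_elements["goal"]))
    return patterns


def matching_instances(instances: dict, patterns: list) -> list:
    """Names of the instances whose properties satisfy every pattern."""
    return [
        name for name, props in instances.items()
        if all(props.get(prop) == value for _, prop, value in patterns)
    ]


# Hypothetical instances of the self-driving subclass.
routes = {
    "route_a": {"origin": "Taiyuan", "destination": "Wutai Mountain", "length_km": 240},
    "route_b": {"origin": "Taiyuan", "destination": "Wutai Mountain", "length_km": 230},
    "route_c": {"origin": "Beijing", "destination": "Wutai Mountain", "length_km": 400},
}

patterns = build_patterns({"src": "Taiyuan", "goal": "Wutai Mountain"})
candidates = matching_instances(routes, patterns)
# "Shortest" is read here as the smallest value of the compared property.
shortest = min(candidates, key=lambda name: routes[name]["length_km"])
```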
For example, the embodiment of the invention uses frames such as "arrival", "passing through", "setting out", "displacement", and "existence" to analyze questions about traffic routes or vehicles, and at the same time uses the lemmas in each frame to expand the verbs with synonyms.
The first layer of CFN annotation allows the vehicle, origin, and destination to be identified quickly. Table 5 shows some annotated example questions from the tourism transport domain.
Table 5 (reproduced in the original publication as image A200810224341D00211)
Step 605: answer extraction.
The triple and the search purpose of the querying user are taken as the query input; a SPARQL query is generated and, together with the Jena inference engine, used to look up the answer in the tourism-domain ontology knowledge base. The specific query flow is:
When the user enters the query "How do I get from Baotou to Wutai Mountain?", the system extracts the origin, verb, and destination through the steps above, giving <Baotou, go to, Wutai Mountain>, and question analysis determines that the user's search purpose is the route, i.e. how to get there. A SPARQL query statement is generated from the triple and the search-purpose information, and the ontology library is searched for information matching the queried content.
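As a rough illustration of this step, the sketch below assembles a SPARQL query string for a route question with Baotou as the origin and Wutai Mountain as the destination. The namespace `http://example.org/tourism#` and the property names `origin`, `destination`, and `vehicle` are assumptions; the actual vocabulary of the tourism ontology is not given in the text.

```python
# Sketch: build a SPARQL query string from the extracted triple and the search purpose.
# The namespace and property names are hypothetical.

PREFIX = "PREFIX t: <http://example.org/tourism#>"


def literal(value: str) -> str:
    """Quote a string as a SPARQL literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_sparql(origin: str, destination: str, purpose: str) -> str:
    if purpose != "route":
        raise ValueError(f"unsupported search purpose: {purpose}")
    return "\n".join([
        PREFIX,
        "SELECT ?route ?vehicle WHERE {",
        f"  ?route t:origin {literal(origin)} .",
        f"  ?route t:destination {literal(destination)} .",
        "  ?route t:vehicle ?vehicle .",
        "}",
    ])


query = build_sparql("Baotou", "Wutai Mountain", "route")
```

The resulting string could then be handed to any SPARQL engine that has the ontology loaded; only the route purpose is handled here to keep the sketch short.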
If the lookup succeeds, the candidate answer set is generated directly. If it fails, the corresponding query rules are generated, an inference engine is created and reasoning is performed, the corresponding data model is generated, and the query is run again; if this lookup succeeds, the corresponding candidate answer set is generated and its answers are ranked. Finally, the ranked result is returned to the user. If the query still fails after the query rules are generated, an empty answer is returned to the querying user.
The answers returned for the example are:
1. Train 1674/1675: Baotou to Xinzhou; train 2462/2463: Baotou to Xinzhou; coach: Xinzhou to Wutai Mountain.
2. Flight MU5690: Baotou Airport to Taiyuan Wusu Airport; airport coach: Taiyuan to Wutai Mountain.
3. Coach: Baotou to Taiyuan; coach: Taiyuan to Wutai Mountain.
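The control flow of the lookup (direct query, rule-based fallback, ranking, empty answer) can be sketched without any RDF library. The place names follow the example itinerary from Baotou to Wutai Mountain; the travel times, the single transfer rule, and the ranking by total time are assumptions made only for this illustration.

```python
# Sketch of the control flow: direct lookup, rule-based fallback, then ranking.
# Travel times (hours) are invented; the embodiment does not state its ranking criterion.

DIRECT = {
    ("Baotou", "Xinzhou"): 9.0,
    ("Xinzhou", "Wutai Mountain"): 2.5,
    ("Baotou", "Taiyuan"): 1.5,
    ("Taiyuan", "Wutai Mountain"): 4.0,
}


def direct_lookup(origin, destination):
    """First attempt: a stored direct connection."""
    if (origin, destination) in DIRECT:
        return [([origin, destination], DIRECT[(origin, destination)])]
    return []


def infer_with_rule(origin, destination):
    """Fallback rule: join two direct legs through one transfer point."""
    answers = []
    for (start, mid), hours in DIRECT.items():
        if start == origin and (mid, destination) in DIRECT:
            answers.append(([origin, mid, destination], hours + DIRECT[(mid, destination)]))
    return answers


def answer(origin, destination):
    candidates = direct_lookup(origin, destination) or infer_with_rule(origin, destination)
    # Rank by total travel time; an empty list stands for the empty answer.
    return sorted(candidates, key=lambda c: c[1])


result = answer("Baotou", "Wutai Mountain")
```

Here the direct lookup finds nothing, so the transfer rule produces two candidates, which are then ranked before being returned.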
As shown in Figure 8, the embodiment of the invention also provides a natural language search device comprising a storage module 801, an analysis module 802, a question module 803, a semantic predicate module 804, and an answer generation module 805:
The storage module 801 is used to build the Chinese frame knowledge base CFN and the domain ontology knowledge base. The Chinese frame knowledge base stores multiple lemmas having the same semantics, frames, and the frame elements constituting the frames, where a frame is used to express that shared semantics; all content in the Chinese frame knowledge base is described in Semantic Web markup languages.
The analysis module 802 is used, when the querying user enters a search statement, to match at least one verb in the search statement against the lemmas in the Chinese frame knowledge base, find the frame to which the verb belongs, and annotate the search statement according to the frame elements contained in that frame.
As shown in Figure 9, the analysis module comprises a frame determination unit and an annotation unit:
The frame determination unit 901 is used, when the querying user enters a search statement, to match the verbs in the search statement against the lemmas in the Chinese frame knowledge base and find the frame to which each verb belongs.
The annotation unit 902 is used to annotate the search statement according to the frame elements contained in the frame.
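The division of work between the two units can be pictured with a toy lexicon. The sketch below is a hand-made stand-in for CFN: the lemma sets and the core and non-core frame element lists are assumptions, and only the frame names echo those mentioned in the description.

```python
# Sketch: frame determination (verb -> frame) and the frame elements offered for annotation.
# The lexicon contents are hypothetical and far smaller than a real frame knowledge base.

FRAMES = {
    "arrival": {
        "lemmas": {"arrive", "reach", "get to"},
        "core": ["theme", "goal"],
        "non_core": ["source", "path", "time", "mode_of_transport"],
    },
    "setting out": {
        "lemmas": {"depart", "leave", "set out"},
        "core": ["theme", "source"],
        "non_core": ["goal", "time"],
    },
}


def find_frame(verb: str):
    """Frame determination: return the name of the frame whose lemmas contain the verb."""
    for name, frame in FRAMES.items():
        if verb in frame["lemmas"]:
            return name
    return None


def elements_to_annotate(verb: str) -> list:
    """Annotation: list the frame elements to look for, core elements first."""
    name = find_frame(verb)
    if name is None:
        return []
    return FRAMES[name]["core"] + FRAMES[name]["non_core"]
```

Because several lemmas map to the same frame, looking up the frame also provides the synonym expansion of the verb mentioned earlier.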
The question module 803 is used, when the search statement entered by the user is a question, to perform question analysis, extract the interrogative word and the query target word of the question, and obtain the query information of the question.
The semantic predicate module 804 is used to select one of the verbs as the semantic predicate and, according to the annotation, extract the semantic predicate and its subject and/or object from the search statement to generate a triple.
As shown in Figure 10, the semantic predicate module comprises a selection unit 1001 and an extraction unit 1002, where:
The selection unit 1001 works as follows. When the search statement entered by the user contains only one verb, that verb is the semantic predicate of the search statement. When the search statement contains multiple verbs, each verb is compared with the entry relations (i.e., properties) in the ontology library to obtain its semantic index, which measures the importance of a candidate semantic predicate; one verb is then selected as the semantic predicate of the search statement according to the semantic index.
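The text does not say how the semantic index is computed. Purely as an assumption, the sketch below scores each verb by its best string similarity to a list of hypothetical ontology property labels and selects the highest-scoring verb.

```python
from difflib import SequenceMatcher

# Sketch: choose the semantic predicate among several verbs.
# String similarity as the semantic index is an assumption of this example,
# and the property labels are hypothetical.

ONTOLOGY_PROPERTIES = ["arrive", "depart", "stay", "eat", "buy"]


def semantic_index(verb: str) -> float:
    """Best similarity between the verb and any ontology property label (0.0 to 1.0)."""
    return max(SequenceMatcher(None, verb, label).ratio() for label in ONTOLOGY_PROPERTIES)


def select_predicate(verbs: list) -> str:
    """A single verb is the predicate; otherwise take the verb with the highest index."""
    if len(verbs) == 1:
        return verbs[0]
    return max(verbs, key=semantic_index)
```

A production system would more plausibly compare meanings than spellings, for example through the frame lemmas, but the selection logic stays the same.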
The extraction unit 1002 is used to extract, according to the annotation, the semantic predicate and its subject and/or object from the search statement and generate a triple.
The answer generation module 805 is used to extract triples carrying semantic information from the search statement according to the annotation, each triple comprising a verb and the subject and/or object of that verb, to take the triple as the query input, and to generate the candidate answer set using the domain ontology knowledge base. When the search statement is a question, the answer generation module is also used to take the query information and the triple as the query input and generate the candidate answer set using the domain ontology library.
As shown in Figure 11, the answer generation module comprises a query unit 1101, a reasoning unit 1102, and a ranking unit 1103:
The query unit 1101 is used to take the triple as the query input and generate the candidate answer set using the domain ontology knowledge base.
The reasoning unit 1102 is used, when the lookup by the query unit fails, to create an inference engine from the corresponding query rules, perform reasoning, generate the corresponding data model, query it, and generate the candidate answer set.
The ranking unit 1103 is used to rank the answers in the candidate answer set and return the answers to the user in that order.
Because all content in the Chinese frame knowledge base is described in Semantic Web languages, it is a semantic dictionary that computer applications can read and understand, and it provides a basic resource for sharing semantic knowledge on the Semantic Web and for intelligent, personalized Web services.
Moreover, the sentence library in the Chinese frame knowledge base records the correspondence between semantic roles and phrase types and syntactic functions, replacing intuition-based descriptions of role selectional restrictions; the result is more specific, more accurate, and of greater practical value than manual description.
The method of the invention is not limited to the embodiments described in the detailed description; other embodiments derived by those skilled in the art from the technical solution of the invention likewise fall within the scope of technical innovation of the invention.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is also intended to include them.

Claims (17)

1. A natural language search method, characterized by comprising:
A. building a Chinese frame knowledge base CFN and a domain ontology knowledge base, the Chinese frame knowledge base storing multiple lemmas having the same semantics, frames, and the frame elements constituting the frames, wherein a frame is used to express said same semantics;
B. for a search statement entered by a querying user, matching at least one verb in the search statement against the lemmas in the Chinese frame knowledge base, finding the frame to which the verb belongs, and annotating the search statement according to the frame elements contained in that frame;
C. selecting one of the verbs as the semantic predicate and, according to the annotation, extracting the semantic predicate and its subject and/or object from the search statement to generate a triple;
D. taking the triple as the query input and generating a candidate answer set using the domain ontology knowledge base.
2, the method for claim 1 is characterized in that, the content in the described Chinese framework knowledge base is described by the Semantic Web SGML.
3, method as claimed in claim 2 is characterized in that, described Semantic Web SGML comprises expandable mark language XML, resource description framework RDF, body SGML OWL.
4, the method for claim 1 is characterized in that, described Chinese language knowledge framework storehouse comprises framework storehouse, sentence storehouse and lemma storehouse:
Described framework storehouse is to be unit with the framework, preserves the definition of framework, the framework element that constitutes framework and the relation between framework and the framework;
Described sentence storehouse record has the sentence of framework semantic tagger information, and the described sentence that has framework semantic tagger information is the framework that provided according to the framework storehouse and the framework semantic information and the syntactic information of framework element mark sentence;
The involved lemma of each framework is preserved in described lemma storehouse.
5, the method for claim 1 is characterized in that, makes up professional domain ontologies storehouse, comprising:
Make up the ontology model in this field with reference to the taxonomic hierarchies standard relevant with professional domain;
By the body edit tool relation of the notion of each knowledge entry in the ontology library, each knowledge entry and example are represented with the Semantic Web SGML, and be stored as computer-readable document format.
6, the method for claim 1 is characterized in that, after the described step B, further comprises:
When in the search statement a plurality of verb being arranged, entry relationship in each verb and the ontology library compared obtain the semantic index of described verb, and according to the semantic predicate of described semantic index selection verb as described statement, described semantic index is used to weigh the importance of verb.
7, the method for claim 1 is characterized in that, described step D comprises:
From described search statement, extract tlv triple according to described mark with semantic information;
According to described tlv triple generated query statement, in ontology library, search related content with this tlv triple coupling;
If search success then generate the candidate answers collection; If search failure, then utilize corresponding rule searching to create inference machine and carry out reasoning, and generate corresponding data model and inquire about, generate corresponding candidate answers collection after the successful inquiring.
8. The method of claim 1 or 7, characterized in that, after the candidate answer set is generated, the method further comprises:
ranking the answers in the candidate answer set and returning the ranked answers to the querying user.
9, the method for claim 1 is characterized in that, when the search statement of user's input is question sentence, after generating tlv triple, further comprises:
Carry out the question sentence analysis, extract the interrogative and the query purpose speech of described question sentence, obtain the inquiry message of this question sentence;
Described inquiry message and tlv triple are imported as inquiry, utilized described professional domain ontology library to generate the candidate answers collection.
10. A natural language search device, characterized by comprising:
a storage module, used to store a Chinese frame knowledge base CFN and a domain ontology knowledge base, the Chinese frame knowledge base storing multiple lemmas having the same semantics, frames, and the frame elements constituting the frames, wherein a frame is used to express said same semantics;
an analysis module, used, when a querying user enters a search statement, to match at least one verb in the search statement against the lemmas in the Chinese frame knowledge base, find the frame to which the verb belongs, and annotate the search statement according to the frame elements contained in that frame;
a semantic predicate module, used to select one of the verbs as the semantic predicate and, according to the annotation, extract the semantic predicate and its subject and/or object from the search statement to generate a triple;
an answer generation module, used to take the triple as the query input and generate a candidate answer set using the domain ontology knowledge base.
11. The device of claim 10, characterized in that the storage module is also used to describe the content of the Chinese frame knowledge base in a Semantic Web markup language.
12. The device of claim 10, characterized in that the analysis module comprises:
a frame determination unit, used, when the querying user enters a search statement, to match the verbs in the search statement against the lemmas in the Chinese frame knowledge base and find the frame to which each verb belongs;
an annotation unit, used to annotate the search statement according to the frame elements contained in the frame.
13. The device of claim 10, characterized in that the semantic predicate module comprises:
a selection unit, used to select one verb from the verbs of the search statement as the semantic predicate;
an extraction unit, used to extract, according to the annotation, the semantic predicate and its subject and/or object from the search statement and generate a triple.
14. The device of claim 10, characterized in that the answer generation module comprises:
a query unit, used to take the triple as the query input and generate the candidate answer set using the domain ontology knowledge base;
a reasoning unit, used, when the lookup by the query unit fails, to create an inference engine from the corresponding query rules, perform reasoning, generate the corresponding data model, query it, and generate the candidate answer set.
15. The device of claim 14, characterized in that the answer generation module further comprises:
a ranking unit, used to rank the answers in the candidate answer set and return the answers to the user in that order.
16. The device of claim 13, characterized in that the selection unit is also used, when the search statement contains multiple verbs, to compare each verb with the entry relations in the ontology library to obtain a semantic index for the verb and to select one verb as the semantic predicate of the statement according to the semantic index, the semantic index being used to measure the importance of a verb.
17. The device of claim 10, characterized in that the device further comprises:
a question module, used, when the search statement entered by the user is a question, to perform question analysis, extract the interrogative word and the query target word of the question, and obtain the query information of the question;
the answer generation module then also being used to take the query information and the triple as the query input and generate the candidate answer set using the domain ontology library.
CNA2008102243411A 2008-10-17 2008-10-17 Method and apparatus for searching natural language Pending CN101414310A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2008102243411A CN101414310A (en) 2008-10-17 2008-10-17 Method and apparatus for searching natural language

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA2008102243411A CN101414310A (en) 2008-10-17 2008-10-17 Method and apparatus for searching natural language

Publications (1)

Publication Number Publication Date
CN101414310A true CN101414310A (en) 2009-04-22

Family

ID=40594844

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2008102243411A Pending CN101414310A (en) 2008-10-17 2008-10-17 Method and apparatus for searching natural language

Country Status (1)

Country Link
CN (1) CN101414310A (en)



Spector Architecting knowledge middleware

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20090422