CN104850554A - Searching method and system - Google Patents

Info

Publication number
CN104850554A
Authority
CN
China
Prior art keywords
word
semantic
entity
string
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410051875.4A
Other languages
Chinese (zh)
Other versions
CN104850554B (en)
Inventor
张友书
张坤
张阔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201410051875.4A
Publication of CN104850554A
Application granted
Publication of CN104850554B
Legal status: Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a searching method and system. The method comprises: when a query word string is received, performing semantic analysis on the query word string to obtain the corresponding semantic expression; performing matching analysis in combination with the semantic expression to determine the semantic label of each word in the current query word string; rewriting the query word string according to the semantic labels; and searching with the rewritten query word string to obtain matching network information. Because the semantic label assigned to each word fits the current context, the rewritten query word string better matches the user's intent, which raises the success rate of information matching and improves search quality and efficiency.

Description

A searching method and system
Technical field
The application relates to the field of computer technology, and in particular to a searching method and a search system.
Background
Query rewriting means rewriting the original query words entered by the user during search engine query processing, so that better search results are returned. In the prior art, query rewriting mainly corrects user input errors. For example, when a user enters "walking conclusion", "zoujielun" or "zhoujielun", the search engine has difficulty finding the correct web pages. With error correction, "zoujielun" is analyzed by an error-correction model; among the analysis results, those whose text matches "Zhou Jielun" account for a large proportion, so the query is modified to "Zhou Jielun", which reflects what the user originally intended. The search engine can then return web pages that match the user's intent without user intervention, improving the user experience.
Existing web search technology is mainly keyword-based. When a user enters a query to search for information, the search engine performs Chinese word segmentation on the query, converting it into several keywords, and looks them up in the inverted index of web pages. The pages that hit the keywords are then sorted by a ranking algorithm on aspects such as relevance, timeliness, and user intent, and links to the pages are returned to the user in that order.
This keyword-based technique, that is, the string-matching retrieval pipeline "query word -> keyword -> lookup", simply segments the query. It easily loses part of the information and departs from the user's intent, so the keywords may fail to produce useful results.
For example, as shown in Fig. 1, when a search engine retrieves the query "whose son Xie Tingfeng is", segmentation yields the keywords "Xie Tingfeng", "who", and "son", and these three keywords are used for retrieval. Because "Lucas" appears on the network far more often than "Xie Xian", most of the pages returned by pure text matching describe "the son of Xie Tingfeng", that is, pages about Lucas. Search results obtained by matching alone therefore tend to have a low matching success rate and struggle to meet the user's needs.
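The keyword-only retrieval described in this background can be sketched as follows. This is a minimal illustration, not the search engine's actual implementation: the page ids, the pre-segmented page contents, and the hit-count ranking are all assumptions made for the example.

```python
from collections import defaultdict

def build_inverted_index(pages):
    """Map each keyword to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, keywords in pages.items():
        for keyword in keywords:
            index[keyword].add(page_id)
    return index

def keyword_search(index, query_keywords):
    """Rank pages by the number of query keywords they hit (a stand-in for real ranking)."""
    hits = defaultdict(int)
    for keyword in query_keywords:
        for page_id in index.get(keyword, ()):
            hits[page_id] += 1
    # Most hits first; ties broken by page id so the order is deterministic
    return sorted(hits, key=lambda page_id: (-hits[page_id], page_id))

# Hypothetical pages, already segmented into keywords
pages = {
    "page_a": ["Xie Tingfeng", "son", "Lucas"],
    "page_b": ["Xie Tingfeng", "father", "Xie Xian"],
    "page_c": ["weather", "Beijing"],
}
index = build_inverted_index(pages)
results = keyword_search(index, ["Xie Tingfeng", "who", "son"])
```

In this toy index the page about the son ranks above the page about the father, because plain string matching rewards the literal keyword "son" even though the user is asking about the father.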
Summary of the invention
The technical problem to be solved by the application is to provide a searching method and system that address the prior-art problem that, when searching for the answer to a question, the search results have a low matching success rate and fail to meet the user's needs.
To solve this problem, the application discloses a searching method, comprising:
when a query word string is received, performing semantic analysis on the query word string to obtain the semantic expression corresponding to the query word string;
performing matching analysis in combination with the semantic expression to determine the semantic label of each word in the current query word string;
rewriting the query word string according to the semantic labels; and
searching with the rewritten query word string to obtain matching network information.
Preferably, the step of performing semantic analysis on the query word string when it is received, to obtain the corresponding semantic expression, comprises:
looking up the entity words corresponding to the query word string in the entity word list preset in the knowledge base; and
looking up the attribute words corresponding to the query word string in the attribute word list preset in the knowledge base.
Preferably, the step of determining the semantic label of each word in the current query word string comprises:
extracting the preset semantic label of the attribute word;
marking the entity word with one or more original semantic labels; and
judging, for each original semantic label marked on the entity word, whether the labeled entity word has a predefined association with the attribute word carrying its semantic label; if so, determining the original semantic label that has the predefined association to be the semantic label of the current entity word.
Preferably, the step of rewriting the query word string according to the semantic labels comprises:
using the semantic label to look up a preset identified entity word;
replacing the entity word with the preset identified entity word;
and/or,
replacing the attribute word with a preset identity attribute word;
and/or,
judging whether the query word string conforms to the syntactic rule of a reverse expression; if so, obtaining the corresponding preset expression stored on the server that conforms to the syntactic rule of a forward expression, the preset expression having a usage frequency; and
when the usage frequency of the preset expression is higher than a preset threshold, rewriting the query word string according to the syntactic rule of the forward expression.
Preferably, the identified entity word is the entity word that has the same semantic label as the entity word and the highest usage frequency;
the identity attribute word is the attribute word that describes the same class of entity words as the attribute word and has the highest usage frequency.
Preferably, the step of judging whether the query word string conforms to the syntactic rule of a reverse expression comprises:
performing syntactic analysis on the query word string to obtain a subject, a modifier, and the dependency relation between them, the dependency relation including one in which the subject depends on the modifier; and
when the subject is the entity word, the modifier is the attribute word, and the dependency relation is that the subject depends on the modifier, determining that the query word string conforms to the syntactic rule of a reverse expression.
The application also discloses a search system, comprising:
a part-of-speech parsing module, configured to perform semantic analysis on a query word string when it is received, to obtain the semantic expression corresponding to the query word string;
a semantic label determination module, configured to perform matching analysis in combination with the semantic expression and determine the semantic label of each word in the current query word string;
a rewriting module, configured to rewrite the query word string according to the semantic labels; and
a query module, configured to search with the rewritten query word string to obtain matching network information.
Preferably, the part-of-speech parsing module comprises:
an entity word lookup submodule, configured to look up the entity words corresponding to the query word string in the entity word list preset in the knowledge base; and
an attribute word lookup submodule, configured to look up the attribute words corresponding to the query word string in the attribute word list preset in the knowledge base.
Preferably, the semantic label determination module comprises:
an extraction submodule, configured to extract the preset semantic label of the attribute word;
a marking submodule, configured to mark the entity word with one or more original semantic labels;
an association judging submodule, configured to judge, for each original semantic label marked on the entity word, whether the labeled entity word has a predefined association with the attribute word carrying its semantic label, and if so, to call the determination submodule; and
a determination submodule, configured to determine the original semantic label that has the predefined association to be the semantic label of the current entity word.
Preferably, the rewriting module comprises:
an identified entity word lookup submodule, configured to use the semantic label to look up a preset identified entity word;
an identified entity word replacement submodule, configured to replace the entity word with the preset identified entity word;
and/or,
an identity attribute word replacement submodule, configured to replace the attribute word with a preset identity attribute word;
and/or,
a reverse expression judging submodule, configured to judge whether the query word string conforms to the syntactic rule of a reverse expression, and if so, to call the preset expression obtaining submodule;
a preset expression obtaining submodule, configured to obtain the corresponding preset expression stored on the server that conforms to the syntactic rule of a forward expression, the preset expression having a usage frequency; and
a forward expression rewriting submodule, configured to rewrite the query word string according to the syntactic rule of the forward expression when the usage frequency of the preset expression is higher than a preset threshold.
Preferably, the identified entity word is the entity word that has the same semantic label as the entity word and the highest usage frequency;
the identity attribute word is the attribute word that describes the same class of entity words as the attribute word and has the highest usage frequency.
Preferably, the reverse expression judging submodule comprises:
a syntactic analysis submodule, configured to perform syntactic analysis on the query word string to obtain a subject, a modifier, and the dependency relation between them, the dependency relation including one in which the subject depends on the modifier; and
a decision submodule, configured to determine that the query word string conforms to the syntactic rule of a reverse expression when the subject is the entity word, the modifier is the attribute word, and the dependency relation is that the subject depends on the modifier.
Compared with the prior art, the application has the following advantages:
The application performs semantic analysis on the query word string to obtain a semantic expression, determines the semantic label of each word in the semantic expression that fits the current context, and rewrites the query word string based on these semantic labels. The rewritten query better matches the user's intent, which raises the success rate of information matching during search and improves search quality and efficiency.
The application rewrites entity words and attribute words as search-engine-friendly identified entity words and identity attribute words, and rewrites uncommon reverse-expression query word strings as commonly used forward-expression query word strings. This improves the coverage of the information the search engine retrieves and further raises the success rate of information matching.
Brief description of the drawings
Fig. 1 is an example of search results in the prior art;
Fig. 2 is a flow chart of the steps of a searching method embodiment of the application;
Fig. 3 is an example of a forward-expression rewrite of the application;
Fig. 4 is an example of search results of the application;
Fig. 5 is a structural block diagram of a search system embodiment of the application.
Detailed description
To make the above objects, features, and advantages of the application clearer, the application is described in further detail below with reference to the drawings and specific embodiments.
In knowledge engineering, a knowledge base is a structured, easy-to-operate, easy-to-use, and comprehensively organized cluster of knowledge. It is a set of interrelated knowledge pieces that are stored, organized, managed, and used in computer memory with one or more knowledge representation schemes, to meet the needs of problem solving in one or more domains. These knowledge pieces include domain-related theoretical knowledge, factual data, and heuristic knowledge obtained from expert experience, such as the definitions, theorems, and algorithms relevant to a domain, as well as common-sense knowledge.
One of the core ideas of the application is to rewrite the query word string, based on the knowledge base, into a form that conforms to grammatical conventions, so as to obtain search results that more fully match the user's intent.
Referring to Fig. 2, a flow chart of the steps of a searching method embodiment of the application is shown.
Step 201: when a query word string is received, perform semantic analysis on the query word string to obtain the semantic expression corresponding to the query word string.
The query word string may be a phrase or sentence that the user enters in a client (for example, a search engine web page or a browser search plug-in) to request a search for related information.
The query word string needs semantic analysis, which may include judging whether it exceeds a preset length, segmenting it into words, and then identifying the entity words and attribute words in it.
In a preferred embodiment of the application, step 201 may comprise the following sub-steps:
Sub-step S11: look up the attribute words corresponding to the query word string in the attribute word list preset in the knowledge base;
Sub-step S12: look up the entity words corresponding to the query word string in the entity word list preset in the knowledge base.
In the embodiment of the application, the knowledge base can be built in advance by analyzing data crawled from across the web. Specifically, the knowledge base may store an entity word list and an attribute word list.
The entity word list records entity words collected in advance; the attribute word list records attribute words collected in advance.
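Sub-steps S11 and S12 amount to membership lookups against the two preset lists. The sketch below assumes the query has already been segmented into tokens; the list contents and the function name `tag_tokens` are hypothetical and stand in for whatever the knowledge base actually stores.

```python
# Hypothetical preset lists, standing in for the knowledge base
ATTRIBUTE_WORDS = {"wife", "father", "son", "height"}
ENTITY_WORDS = {"Liu Dehua", "Xie Tingfeng", "Swordman"}

def tag_tokens(tokens):
    """Label each pre-segmented token as an attribute word, an entity word, or neither."""
    tagged = []
    for token in tokens:
        if token in ATTRIBUTE_WORDS:      # sub-step S11: attribute word list
            tagged.append((token, "attribute"))
        elif token in ENTITY_WORDS:       # sub-step S12: entity word list
            tagged.append((token, "entity"))
        else:
            tagged.append((token, "other"))
    return tagged

tagged = tag_tokens(["Liu Dehua", "'s", "wife"])
```

Sets are used so that each lookup is a constant-time check; a real system would also need to handle tokens that appear in both lists.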
Based on the Resource Description Framework (RDF), a data model for network resource objects and the relations between them, resources and their relations can be described with triples of the form "entity-attribute-value".
1. Entity: a concrete individual, for example Liu Dehua, Zhang Baizhi, or Lin Qingxia in the celebrity category. Entities also include broader individuals that stand for a category, such as person, film star, or singer.
2. Attribute: a characteristic of an entity. Besides its name, each attribute has a type variable that reflects the type of its value, for example [height: length], [age: integer], or [date of birth: date].
3. Attribute value: the value corresponding to an attribute, for example 168cm (height) or 87kg (weight). This is the knowledge itself in the knowledge base. An attribute value may also record the source of the knowledge, to help the user judge how reliable it is.
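One way to picture the "entity-attribute-value" triples is as small records with an optional provenance field. The sketch below is an assumption-laden illustration: the class name `Triple`, the `lookup` helper, the entity "Example Person", and the source string are all invented for the example and do not come from the application.

```python
from typing import NamedTuple, Optional

class Triple(NamedTuple):
    """One "entity-attribute-value" fact, with an optional provenance field."""
    entity: str
    attribute: str
    value: str
    source: Optional[str] = None

# Attribute names mapped to the type of their values, e.g. [height: length]
ATTRIBUTE_TYPES = {"height": "length", "age": "integer", "date of birth": "date"}

# Illustrative facts; "Example Person" and the source string are made up
knowledge_base = [
    Triple("Liu Dehua", "wife", "Zhu Liqian", source="hypothetical-crawl"),
    Triple("Example Person", "height", "168cm"),
]

def lookup(kb, entity, attribute):
    """Return every value recorded for the given entity and attribute."""
    return [t.value for t in kb if t.entity == entity and t.attribute == attribute]

wife = lookup(knowledge_base, "Liu Dehua", "wife")
```

Keeping the source alongside the value mirrors the remark that an attribute value may record where the knowledge came from.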
Attribute words can be obtained by mining web pages and search logs.
Taking the RDF triple "entity-attribute-value" with entity "Liu Dehua", attribute "wife relation", and value "Zhu Liqian", the attribute words that describe the "wife relation" can be found as follows:
1. Mine web pages and search logs to obtain the text fragments that connect an entity and a value. For example, "the wife Zhu Liqian of Liu Dehua", "the Mrs Zhu Liqian of Liu Dehua", and "the wife Xu Fan of Feng Xiaogang".
2. Count the usage frequency of the text fragment for each individual "entity-value" pair. For example, the usage frequency of "the wife Zhu Liqian of Liu Dehua" is 2, that of "the Mrs Zhu Liqian of Liu Dehua" is 3, and that of "the wife Xu Fan of Feng Xiaogang" is 2.
3. Sum the usage frequencies of the text fragments across "entity-value" pairs of the same kind. For example, the usage frequency of "the wife <value> of <entity>" is 4, and that of "the Mrs <value> of <entity>" is 3.
4. Extract from the text fragments the attribute words whose usage frequency exceeds a preset threshold. For example, with a threshold of 2, the text fragments with a usage frequency above 2 are extracted as attribute words, so the attribute words found for the "wife relation" are "wife" and "Mrs".
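The four mining steps above can be reproduced with a small counter. The sketch assumes the connecting fragment between entity and value has already been extracted (step 1); the tuple layout and the function name `mine_attribute_words` are hypothetical, while the counts and the threshold are the ones from the worked example.

```python
from collections import Counter

# (entity, connecting fragment, value, usage frequency) from steps 1 and 2
observations = [
    ("Liu Dehua", "wife", "Zhu Liqian", 2),
    ("Liu Dehua", "Mrs", "Zhu Liqian", 3),
    ("Feng Xiaogang", "wife", "Xu Fan", 2),
]

def mine_attribute_words(observations, threshold):
    """Sum frequencies per fragment pattern (step 3) and keep those above the threshold (step 4)."""
    pattern_counts = Counter()
    for _entity, fragment, _value, count in observations:
        # Abstracting entity and value away leaves "<fragment> <value> of <entity>"
        pattern_counts[fragment] += count
    words = sorted(w for w, c in pattern_counts.items() if c > threshold)
    return words, pattern_counts

words, counts = mine_attribute_words(observations, threshold=2)
```

With the example data, "wife" accumulates 2 + 2 = 4 and "Mrs" stays at 3, so both exceed the threshold of 2.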
Step 202: perform matching analysis in combination with the semantic expression to determine the semantic label of each word in the current query word string.
For a query word string in which entity words and attribute words have been identified, syntactic analysis can be performed with a context-free grammar based on the knowledge base, yielding the association between the entity word and the attribute word and, from it, the semantic label of the entity word that fits the current context.
A context-free grammar, also called a Type-2 grammar, is a kind of transformational grammar in formal language theory that describes context-free languages. Specifically, a set of grammar rules is defined and used for syntactic analysis, which yields the sentence structure and the associations between sentence elements. The grammar rules can be stored in the knowledge base.
In a preferred embodiment of the application, step 202 may comprise the following sub-steps:
Sub-step S21: extract the preset semantic label of the attribute word.
An attribute word can have a semantic label with a definite meaning, stored in the knowledge base.
Sub-step S22: mark the entity word with one or more original semantic labels.
An original semantic label is information that expresses a possible meaning of the entity word.
For example, in the query word string "when Swordman is shown", "Swordman" is an entity word that can have many original semantic labels, such as film, TV series, novel, drama, and game.
Sub-step S23: judge, for each original semantic label marked on the entity word, whether the labeled entity word has a predefined association with the attribute word carrying its semantic label; if so, perform sub-step S24.
For example, suppose a grammar rule <entity_people><attribute_wife relation> is defined as having an association. For the query word string "wife of Liu Dehua", the corresponding semantic expression can be "Liu Dehua <entity_people> wife <attribute_wife relation>". Checking shows that <entity_people><attribute_wife relation> satisfies the grammar rule, so the expression is legal, that is, a predefined association exists, and it follows that the <attribute_wife relation> "wife" depends on the <entity_people> "Liu Dehua".
Conversely, suppose <entity_people><attribute_height> is not predefined. Then "Liu Dehua <entity_people> height <attribute_height>", identified from the query word string "height of Liu Dehua", is illegal: it has no predefined association.
Sub-step S24: determine the original semantic label that has the predefined association to be the semantic label of the current entity word.
For the query word string "when Swordman is shown" above, syntactic analysis shows that "when shown" modifies "Swordman", and the grammar rules show that "shown" is an attribute of entities of the "film" class. "Swordman" can therefore be determined to be a film here, rather than a TV series, novel, game, and so on.
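Sub-steps S22 to S24 can be read as filtering an entity word's candidate labels by the predefined grammar rules. The following is a minimal sketch under that reading; the rule set, the label names, and the function `resolve_entity_labels` are hypothetical, and a full context-free parse is replaced here by a simple pair lookup.

```python
# Hypothetical grammar rules: (entity label, attribute label) pairs with a predefined association
GRAMMAR_RULES = {
    ("film", "release_time"),
    ("people", "wife relation"),
}

def resolve_entity_labels(original_labels, attribute_label):
    """Keep the original labels that have a predefined association with the attribute label."""
    return [label for label in original_labels
            if (label, attribute_label) in GRAMMAR_RULES]

# "Swordman" starts with several original semantic labels (sub-step S22)
swordman_labels = ["film", "TV series", "novel", "drama", "game"]
resolved = resolve_entity_labels(swordman_labels, "release_time")

# No rule pairs "people" with "height", so that combination is rejected
height_case = resolve_entity_labels(["people"], "height")
```

An empty result corresponds to the "illegal" case in the text, where no predefined association exists.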
Step 203: rewrite the query word string according to the semantic labels.
In the embodiment of the application, the query word string, annotated with entities and attributes once the semantic labels are determined, can be rewritten: the natural language (query word string) entered by the user is rewritten into search-engine-friendly keywords, so that the search results better match the meaning of the natural-language query, which improves search coverage as well as search efficiency and quality.
Rewriting falls into two classes: replacement of entity words and attribute words, and replacement of sentence patterns.
In a preferred embodiment of the application, step 203 may comprise the following sub-steps:
Sub-step S31: use the semantic label to look up a preset identified entity word;
Sub-step S32: replace the entity word with the preset identified entity word.
In the embodiment of the application, the correspondence between natural-language queries and search engine language is established in advance for the entity words and attribute words in the knowledge base and recorded in a translation dictionary. During rewriting, the search-engine-friendly identified entity word is obtained by looking up the translation dictionary and used as the replacement. The translation dictionary can be stored in the knowledge base.
Because the knowledge base holds knowledge extracted from the Internet, the standard web description of each entity word and attribute word can be determined statistically. Web pages are processed through steps such as web page recognition, text extraction, Chinese word segmentation, entity word recognition, and attribute word recognition, and the number of times each entity word and attribute word appears on the Internet is counted. Among the different expressions of the same entity, the one that appears most often on the Internet is taken as the search-engine-friendly entity word or attribute word, that is, as the identified entity word or identity attribute word, which improves the coverage of entity words and attribute words. For example, the entity words "Shi Hengxia", "Huobing Ke'er", and "Miss Lotus" refer to the same entity, Miss Lotus. Counting, with context, how often these entity words appear in Internet text shows that "Miss Lotus" is used far more often than "Shi Hengxia" and "Huobing Ke'er". "Miss Lotus" is therefore taken as the search-engine-friendly entity word, and the entity words "Shi Hengxia" and "Huobing Ke'er" in a user's natural-language query are replaced with, that is, translated into, the identified entity word "Miss Lotus".
That is, in the embodiment of the application, the identified entity word may be the entity word that has the same semantic label as the entity word and the highest usage frequency.
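Choosing the identified entity word reduces to taking the most frequent alias among the expressions of one entity. The sketch below illustrates this; the counts are invented placeholders (the application gives no figures), and `pick_identified_word` and `build_translation_entries` are hypothetical names.

```python
def pick_identified_word(alias_counts):
    """Return the alias with the highest usage frequency."""
    return max(alias_counts, key=alias_counts.get)

def build_translation_entries(alias_counts):
    """Map every alias of the entity to the identified entity word."""
    identified = pick_identified_word(alias_counts)
    return {alias: identified for alias in alias_counts}

# Made-up frequencies for two aliases of the same entity
alias_counts = {"Shi Hengxia": 120, "Miss Lotus": 9800}
entries = build_translation_entries(alias_counts)
```

The resulting mapping is the kind of entry a translation dictionary could hold: every alias, including the identified word itself, points at the most frequent form.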
And/or,
Sub-step S33: replace the attribute word with a preset identity attribute word.
In the embodiment of the application, the correspondence between natural-language queries and search engine language can be established for attribute words in the same way as for entity words.
The usage frequencies on the Internet of the different descriptions (that is, attribute words) of the same attribute of the same class of entities are counted, and the corresponding search-engine-friendly keyword is taken as the identity attribute word.
That is, in the embodiment of the application, the identity attribute word may be the attribute word that describes the same class of entity words as the attribute word and has the highest usage frequency.
Rewriting is thus a process of looking up the translation dictionary. For example, for the query word string "where Shi Hengxia is born", once the semantic label of the current entity word is determined, the semantic expression can be "Shi Hengxia <entity_people> where born <attribute_birthplace>". Looking up the translation dictionary gives "Miss Lotus" as the identified entity word for the entity word "Shi Hengxia", and "birthplace" as the identity attribute word for the attribute word "where born".
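The dictionary lookup in this example can be sketched as a mapping keyed by a word and its semantic label. The dictionary contents, the key layout, and the function `rewrite_words` are assumptions for illustration; words with no entry are left unchanged.

```python
# Hypothetical translation dictionary keyed by (word, semantic label)
TRANSLATION_DICTIONARY = {
    ("Shi Hengxia", "entity_people"): "Miss Lotus",
    ("where born", "attribute_birthplace"): "birthplace",
}

def rewrite_words(labeled_query):
    """Replace each (word, label) pair with its dictionary entry, if one exists."""
    return [TRANSLATION_DICTIONARY.get((word, label), word)
            for word, label in labeled_query]

labeled_query = [
    ("Shi Hengxia", "entity_people"),
    ("where born", "attribute_birthplace"),
]
rewritten = rewrite_words(labeled_query)
```

Keying on the label as well as the word keeps an ambiguous word from being rewritten under the wrong meaning.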
And/or,
Sub-step S34: judge whether the query word string conforms to the syntactic rule of a reverse expression; if so, perform sub-step S35.
A reverse expression is the counterpart of a forward expression: the two have the same meaning and describe the same thing from two opposite angles.
In a preferred embodiment of the application, sub-step S34 may further comprise the following sub-steps:
Sub-step S341: perform syntactic analysis on the query word string to obtain a subject, a modifier, and the dependency relation between them; the dependency relation includes one in which the subject depends on the modifier.
Syntactic analysis derives the syntactic structure of a sentence according to a given grammar, resolving the sentence units the sentence contains and the relations between them.
In a specific implementation, the syntactic analysis result can be obtained statistically, in three main steps:
1. Manually annotate each sentence in a collected corpus with its syntactic analysis, and gather the annotated sentences into a sentence bank;
2. Learn a PCFG (Probabilistic Context-Free Grammar) model from the sentence bank;
3. Analyze the sentence with the PCFG model to obtain its sentence elements (subject, predicate, object, modifier, and so on) and the dependency relations between them. A dependency relation may be one in which the subject depends on the modifier, or one in which the modifier depends on the subject.
Sub-step S342: when the subject is the entity word, the modifier is the attribute word, and the dependency relation is that the subject depends on the modifier, the query word string conforms to the syntactic rule of a reverse expression.
In this case, the subject depending on the modifier means the entity word depends on the attribute word.
Conversely, when the subject is the entity word, the modifier is the attribute word, and the dependency relation is that the modifier depends on the subject, the query word string conforms to the syntactic rule of a forward expression.
In this case, the modifier depending on the subject means the attribute word depends on the entity word. For example, in the query word string "whom the father of Xie Tingfeng is", the attribute word "father" depends on the entity word "Xie Tingfeng", so "whom the father of Xie Tingfeng is" conforms to the syntactic rule of a forward expression. In the query word string "whose son Xie Tingfeng is", the entity word "Xie Tingfeng" depends on the attribute word "son", so "whose son Xie Tingfeng is" conforms to the syntactic rule of a reverse expression. Here, "depends on" means that, in the PCFG model, the current thing cannot exist independently of some other thing. For example, in "whom the father of Xie Tingfeng is", "father" cannot exist apart from "Xie Tingfeng", so "father" depends on "Xie Tingfeng", whereas "Xie Tingfeng" can exist apart from "father".
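Once a parser has produced the dependency between the entity word and the attribute word, the forward or reverse decision is a check on its direction. The sketch below takes that dependency as given rather than running a PCFG parser; the function name `classify_expression` and the word sets are hypothetical.

```python
ENTITY_WORDS = {"Xie Tingfeng"}
ATTRIBUTE_WORDS = {"father", "son"}

def classify_expression(dependent, head):
    """Classify a dependency in which `dependent` depends on `head`."""
    if dependent in ATTRIBUTE_WORDS and head in ENTITY_WORDS:
        return "forward"   # attribute word depends on entity word
    if dependent in ENTITY_WORDS and head in ATTRIBUTE_WORDS:
        return "reverse"   # entity word depends on attribute word
    return "unknown"

# "whom the father of Xie Tingfeng is": "father" depends on "Xie Tingfeng"
forward_case = classify_expression(dependent="father", head="Xie Tingfeng")
# "whose son Xie Tingfeng is": "Xie Tingfeng" depends on "son"
reverse_case = classify_expression(dependent="Xie Tingfeng", head="son")
```

Only the reverse case moves on to sub-step S35; forward queries already have the preferred shape.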
Sub-step S35: obtain the corresponding preset expression stored on the server that conforms to the syntactic rule of a forward expression; the preset expression has a usage frequency.
In a specific implementation, the correspondence between forward and reverse expressions can be obtained by mining Internet web pages based on the knowledge base. Using the text pairs of entities and attribute values in the knowledge base, a machine translation model mines from the Internet all the forward expressions and reverse expressions of an entity's attributes.
Sub-step S36: when the usage frequency of the preset expression is higher than a preset threshold, rewrite the query word string according to the syntactic rule of the forward expression.
In the embodiment of the application, the usage frequency of each forward expression can be counted, and the forward expressions whose usage frequency exceeds the preset threshold are taken as search-engine-friendly sentence patterns.
In a specific implementation, the dependency in which the entity word depends on the attribute word is rewritten as one in which the attribute word depends on the entity word, so that the query word string is rewritten into one that conforms to the syntactic rule of a forward expression.
For example, as shown in Fig. 3, in the query word string "whose son Xie Tingfeng is", the entity word "Xie Tingfeng" depends on the attribute word "son". Syntax tree analysis shows that the entity word and the attribute word are in a reverse-expression relation, so the corresponding forward expression and its usage frequency are looked up in the mapping table of reverse and forward expressions prepared in advance in the knowledge base. In this example the syntactic pattern of the reverse expression is "whose <attribute_people_son> <entity_people> is", and that of the corresponding forward expression is "whom the <attribute_people_father> of <entity_people> is". Next, looking up the translation dictionary gives "Xie Tingfeng" as the search-engine-friendly identified entity word for the entity word "Xie Tingfeng", and "father" as the search-engine-friendly word (that is, the identity attribute word) for the attribute "<attribute_people_father>". The query is then rewritten with the identified entity word and the identity attribute word according to the syntactic rule of the forward expression, giving the final rewritten query word string "whom the father of Xie Tingfeng is". Searching with "whom the father of Xie Tingfeng is" instead of the original "whose son Xie Tingfeng is" returns web pages about Xie Xian.
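Sub-steps S35 and S36 can be sketched as a table lookup followed by a frequency check. The mapping table layout, the frequency value, the threshold, and the function `rewrite_to_forward` are all hypothetical; only the two sentence patterns come from the example above.

```python
# Hypothetical mapping table: reverse pattern -> (forward attribute, forward template, usage frequency)
REVERSE_TO_FORWARD = {
    ("entity_people", "son"): ("father", "whom the {attribute} of {entity} is", 5000),
}

def rewrite_to_forward(entity, entity_label, attribute, threshold):
    """Return the forward-expression query, or None when no rewrite applies."""
    entry = REVERSE_TO_FORWARD.get((entity_label, attribute))
    if entry is None:
        return None  # no forward counterpart recorded
    forward_attribute, template, frequency = entry
    if frequency <= threshold:
        return None  # forward expression is not common enough to prefer
    return template.format(attribute=forward_attribute, entity=entity)

rewritten = rewrite_to_forward("Xie Tingfeng", "entity_people", "son", threshold=1000)
```

Returning None when the frequency does not clear the threshold leaves the original query untouched, which matches the condition stated in sub-step S36.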
It should be noted that the rewriting of the entity word (corresponding to sub-steps S31 and S32), the rewriting of the attribute word (corresponding to sub-step S33), and the rewriting of the sentence pattern (corresponding to sub-steps S34, S35 and S36) may each be used alone, or any two or all three may be used in combination; the embodiments of the present application place no limitation on this.
Step 204: search with the rewritten query string to obtain matching network information.
After the query string has been rewritten, retrieval and matching of network information can be carried out.
As shown in Fig. 4, by applying the embodiments of the present application, the query string "whose son is Xie Tingfeng" entered by the user can be rewritten as "who is the father of Xie Tingfeng", and the search is then performed on "who is the father of Xie Tingfeng". Compared with the search results shown in Fig. 2, the information returned by the embodiments of the present application better meets the user's needs.
The present application performs semantic analysis on the natural language in the query string to obtain a semantic expression, determines the semantic label to which each word in the semantic expression belongs in the current context, and rewrites the query string based on these semantic labels so that it better matches the user's intent. This raises the success rate of information matching during search and improves search quality and search efficiency, which meets the user's needs and improves the user experience.
The present application can rewrite entity words and attribute words as SEF identifying entity words and identifying attribute words, and can rewrite a query string in an uncommon reverse expression into a query string in a commonly used forward expression. This improves the coverage of the information found by the search engine and further raises the success rate of information matching.
To help those skilled in the art better understand the present application, an example is given below to illustrate the specific process of applying the embodiments of the present application to the query string "where is Renqiu".
1. Perform semantic analysis on the query string "where is Renqiu" in combination with the knowledge base, including:
Entity word analysis: by searching the entity word list in the knowledge base, "Renqiu" is identified as an entity word whose types (original semantic labels) are "person" and "place name", and the semantic expression is "Renqiu <entity_person><entity_place>";
Attribute word analysis: by searching the attribute word list in the knowledge base, "where" is identified as an attribute word whose type is place; after the semantic label is attached, it is expressed as "where <attribute_place_position>".
The semantic expression corresponding to the query string is then "where <attribute_place_position> is Renqiu <entity_person><entity_place>".
3. Perform matching analysis in combination with the semantic expression: syntactic analysis is carried out first, which shows that the attribute word "where" depends on the entity word "Renqiu", and that "Renqiu" has two types: "person" and "place name". Checking type consistency between the entity word and the attribute word shows that the type shared by the attribute word "where" and the entity word "Renqiu" is <place>, so the semantic label of the current entity word "Renqiu" is determined to be "place". The result after semantic label analysis is therefore "where <attribute_place_position> is Renqiu <entity_place>";
4. Rewrite the query string according to the semantic labels:
A) Look up the SEF identifying entity word and identifying attribute word corresponding to the entity word and the attribute word. By searching the translation dictionary, the identifying entity word "Renqiu City" corresponding to the entity word "Renqiu" and the identifying attribute word "geographic position" corresponding to the attribute word "where" are obtained;
B) Replace the entity and the attribute in the query string with search-engine-friendly words (i.e. the identifying entity word and the identifying attribute word) to obtain the rewritten query string "Renqiu City geographic position";
5. Search with "Renqiu City geographic position" as the rewritten query string, and return the results to the user.
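The walk-through above (label lookup, type-consistency check, then replacement with search-engine-friendly words) can be sketched as follows. The dictionaries stand in for the knowledge base and the translation dictionary; their layout, the English rendering "where" of the attribute word, and the function names are assumptions for illustration only.

```python
# Hypothetical knowledge-base lists: word -> candidate semantic labels.
ENTITY_LABELS = {"Renqiu": ["person", "place"]}
ATTRIBUTE_LABELS = {"where": ("place", "position")}

# Hypothetical translation dictionary: labeled word -> search-engine-friendly word.
ENTITY_TRANSLATION = {("Renqiu", "place"): "Renqiu City"}
ATTRIBUTE_TRANSLATION = {("place", "position"): "geographic position"}


def resolve_entity_label(entity, attribute):
    """Pick the entity label whose type matches the attribute's type."""
    attribute_type, _ = ATTRIBUTE_LABELS[attribute]
    for label in ENTITY_LABELS[entity]:
        if label == attribute_type:
            return label
    return None  # no shared type: the entity stays ambiguous


def rewrite_query(entity, attribute):
    """Replace the entity and attribute words with their friendly forms."""
    label = resolve_entity_label(entity, attribute)
    if label is None:
        return f"{entity} {attribute}"  # fall back to the original words
    sef_entity = ENTITY_TRANSLATION.get((entity, label), entity)
    sef_attribute = ATTRIBUTE_TRANSLATION.get(ATTRIBUTE_LABELS[attribute], attribute)
    return f"{sef_entity} {sef_attribute}"


print(resolve_entity_label("Renqiu", "where"))
print(rewrite_query("Renqiu", "where"))
```

Keying the entity translation on the (word, label) pair is one way to make the replacement depend on the disambiguated label rather than on the surface word alone.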
It should be understood that, for simplicity of description, the method embodiments are expressed as a series of combined actions, but those skilled in the art should appreciate that the embodiments of the present application are not limited by the described order of actions, because according to the embodiments of the present application some steps may be performed in other orders or simultaneously. In addition, those skilled in the art should also appreciate that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the embodiments of the present application.
Referring to Fig. 5, a structural block diagram of an embodiment of a search system of the present application is shown, which may specifically include the following modules:
A part-of-speech parsing module 501, configured to, when a query string is received, perform semantic analysis on the query string to obtain the semantic expression corresponding to the query string;
A semantic label determination module 502, configured to perform matching analysis in combination with the semantic expression and determine the semantic label to which each word in the current query string belongs;
A rewriting module 503, configured to rewrite the query string according to the semantic labels;
A query module 504, configured to search with the rewritten query string to obtain matching network information.
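One way to picture how the four modules chain together is the sketch below. The class `SearchPipeline`, its field names, and the stub functions are hypothetical; they only show the data flow from module 501 to module 504 and say nothing about how each module is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SearchPipeline:
    parse: Callable[[str], Dict[str, List[str]]]                   # role of module 501
    determine_labels: Callable[[Dict[str, List[str]]], Dict[str, str]]  # role of module 502
    rewrite: Callable[[str, Dict[str, str]], str]                  # role of module 503
    search: Callable[[str], List[str]]                             # role of module 504

    def run(self, query: str) -> List[str]:
        expression = self.parse(query)              # word -> candidate labels
        labels = self.determine_labels(expression)  # word -> single label
        rewritten = self.rewrite(query, labels)
        return self.search(rewritten)


# Stub implementations, for illustration only.
def stub_parse(query):
    return {word: ["place"] for word in query.split()}


def stub_determine_labels(expression):
    return {word: candidates[0] for word, candidates in expression.items()}


def stub_rewrite(query, labels):
    return " ".join(f"{word}/{labels[word]}" for word in query.split())


def stub_search(query):
    return [f"result for: {query}"]


pipeline = SearchPipeline(stub_parse, stub_determine_labels, stub_rewrite, stub_search)
results = pipeline.run("Renqiu position")
print(results)
```

Passing the stages in as callables keeps each one replaceable, which mirrors the description's point that the rewriting steps can be used alone or in combination.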
In a preferred embodiment of the present application, the part-of-speech parsing module 501 may include the following submodules:
An entity word lookup module, configured to look up the entity word corresponding to the query string in the entity word list preset in the knowledge base;
An attribute word lookup module, configured to look up the attribute word corresponding to the query string in the attribute word list preset in the knowledge base.
In a preferred embodiment of the present application, the semantic label determination module 502 may include the following submodules:
An extraction submodule, configured to extract the preset semantic label of the attribute word;
A labeling submodule, configured to attach one or more original semantic labels to the entity word;
An association relation judging module, configured to judge, for each original semantic label attached to the entity word, whether a predefined association relation exists between the entity word carrying that original semantic label and the attribute word carrying its semantic label; and if so, to call the determination submodule;
A determination submodule, configured to determine the original semantic label for which the predefined association relation exists as the semantic label to which the current entity word belongs.
In a preferred embodiment of the present application, the rewriting module 503 may include the following submodules:
An identifying entity word lookup submodule, configured to look up the preset identifying entity word by using the semantic label;
An identifying entity word replacement submodule, configured to replace the entity word with the preset identifying entity word;
And/or,
An identifying attribute word replacement submodule, configured to replace the attribute word with the preset identifying attribute word;
And/or,
A reverse expression judging submodule, configured to judge whether the query string conforms to the syntactic rule of reverse expression; and if so, to call the preset expression obtaining submodule;
A preset expression obtaining submodule, configured to obtain the corresponding preset expression, stored on the server, that conforms to the syntactic rule of forward expression; the preset expression has a usage frequency;
A forward expression rewriting submodule, configured to rewrite the query string according to the syntactic rule of forward expression when the usage frequency of the preset expression is higher than a preset threshold.
In a preferred embodiment of the present application, the identifying entity word may be the entity word that has the same semantic label as the entity word and has the highest usage frequency;
The identifying attribute word may be the attribute word that describes the same class of entity words as the attribute word and has the highest usage frequency.
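The selection rule just stated (same semantic label, highest usage frequency) might look like the sketch below. The candidate tuples, the frequency counts, and the function name `pick_preferred_word` are hypothetical, and the sketch assumes the candidates have already been narrowed to words that refer to the same thing as the query word.

```python
def pick_preferred_word(candidates, label):
    """Return the candidate word with the given label and the highest usage frequency.

    candidates: iterable of (word, semantic_label, usage_frequency) tuples.
    """
    same_label = [c for c in candidates if c[1] == label]
    if not same_label:
        return None
    return max(same_label, key=lambda c: c[2])[0]


# Hypothetical candidates for the entity word "Renqiu" with illustrative counts.
candidates = [
    ("Renqiu", "place", 120),
    ("Renqiu City", "place", 900),
    ("Renqiu", "person", 15),
]

preferred = pick_preferred_word(candidates, "place")
print(preferred)
```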
In a preferred embodiment of the present application, the reverse expression judging submodule may further include the following submodules:
A syntactic analysis submodule, configured to perform syntactic analysis on the query string to obtain a subject and a modifier, as well as the dependency between the subject and the modifier; the dependency includes a dependency in which the subject depends on the modifier;
A decision submodule, configured to decide that the query string conforms to the syntactic rule of reverse expression when the subject is the entity word, the modifier is the attribute word, and the dependency is one in which the subject depends on the modifier.
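The decision rule above reduces to three conditions on the parse result, sketched below. The `Dependency` type, the word sets, and the function name are assumptions; a real system would take these from its own syntactic analyzer and knowledge base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    head: str       # the word that is depended on
    dependent: str  # the word that depends on the head


def is_reverse_expression(subject, modifier, dependency, entity_words, attribute_words):
    """True when an entity-word subject depends on an attribute-word modifier."""
    return (
        subject in entity_words
        and modifier in attribute_words
        and dependency.dependent == subject
        and dependency.head == modifier
    )


entity_words = {"Xie Tingfeng"}
attribute_words = {"son", "father"}

# "whose son is Xie Tingfeng": the entity word depends on the attribute word.
reverse = Dependency(head="son", dependent="Xie Tingfeng")
# "who is the father of Xie Tingfeng": the attribute word depends on the entity word.
forward = Dependency(head="Xie Tingfeng", dependent="father")

print(is_reverse_expression("Xie Tingfeng", "son", reverse, entity_words, attribute_words))
print(is_reverse_expression("Xie Tingfeng", "father", forward, entity_words, attribute_words))
```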
Since the system embodiment is basically similar to the method embodiment, its description is relatively brief; for relevant details, refer to the corresponding parts of the description of the method embodiment.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar among the embodiments, reference may be made to one another.
The present application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, laptop devices, multiprocessor systems, microprocessor-based systems, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and so on. The present application is preferably applied in embedded systems.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relation or order between these entities or operations.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make further changes and modifications to these embodiments once they grasp the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present application.
The searching method and the search system provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. Meanwhile, those of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (12)

1. A searching method, characterized by comprising:
When a query string is received, performing semantic analysis on the query string to obtain the semantic expression corresponding to the query string;
Performing matching analysis in combination with the semantic expression to determine the semantic label to which each word in the current query string belongs;
Rewriting the query string according to the semantic labels;
Searching with the rewritten query string to obtain matching network information.
2. The method according to claim 1, characterized in that the step of, when a query string is received, performing semantic analysis on the query string to obtain the semantic expression corresponding to the query string comprises:
Looking up the entity word corresponding to the query string in the entity word list preset in the knowledge base;
Looking up the attribute word corresponding to the query string in the attribute word list preset in the knowledge base.
3. The method according to claim 2, characterized in that the step of determining the semantic label to which each word in the current query string belongs comprises:
Extracting the preset semantic label of the attribute word;
Attaching one or more original semantic labels to the entity word;
Judging, for each original semantic label attached to the entity word, whether a predefined association relation exists between the entity word carrying that original semantic label and the attribute word carrying its semantic label; and if so, determining the original semantic label for which the predefined association relation exists as the semantic label to which the current entity word belongs.
4. The method according to claim 1, 2 or 3, characterized in that the step of rewriting the query string according to the semantic labels comprises:
Looking up the preset identifying entity word by using the semantic label;
Replacing the entity word with the preset identifying entity word;
And/or,
Replacing the attribute word with the preset identifying attribute word;
And/or,
Judging whether the query string conforms to the syntactic rule of reverse expression; and if so, obtaining the corresponding preset expression, stored on the server, that conforms to the syntactic rule of forward expression; the preset expression has a usage frequency;
When the usage frequency of the preset expression is higher than a preset threshold, rewriting the query string according to the syntactic rule of forward expression.
5. The method according to claim 4, characterized in that the identifying entity word is the entity word that has the same semantic label as the entity word and has the highest usage frequency;
The identifying attribute word is the attribute word that describes the same class of entity words as the attribute word and has the highest usage frequency.
6. The method according to claim 4, characterized in that the step of judging whether the query string conforms to the syntactic rule of reverse expression comprises:
Performing syntactic analysis on the query string to obtain a subject and a modifier, as well as the dependency between the subject and the modifier; the dependency includes a dependency in which the subject depends on the modifier;
When the subject is the entity word, the modifier is the attribute word, and the dependency is one in which the subject depends on the modifier, deciding that the query string conforms to the syntactic rule of reverse expression.
7. A search system, characterized by comprising:
A part-of-speech parsing module, configured to, when a query string is received, perform semantic analysis on the query string to obtain the semantic expression corresponding to the query string;
A semantic label determination module, configured to perform matching analysis in combination with the semantic expression and determine the semantic label to which each word in the current query string belongs;
A rewriting module, configured to rewrite the query string according to the semantic labels;
A query module, configured to search with the rewritten query string to obtain matching network information.
8. The system according to claim 7, characterized in that the part-of-speech parsing module comprises:
An entity word lookup module, configured to look up the entity word corresponding to the query string in the entity word list preset in the knowledge base;
An attribute word lookup module, configured to look up the attribute word corresponding to the query string in the attribute word list preset in the knowledge base.
9. The system according to claim 8, characterized in that the semantic label determination module comprises:
An extraction submodule, configured to extract the preset semantic label of the attribute word;
A labeling submodule, configured to attach one or more original semantic labels to the entity word;
An association relation judging module, configured to judge, for each original semantic label attached to the entity word, whether a predefined association relation exists between the entity word carrying that original semantic label and the attribute word carrying its semantic label; and if so, to call the determination submodule;
A determination submodule, configured to determine the original semantic label for which the predefined association relation exists as the semantic label to which the current entity word belongs.
10. The system according to claim 7, 8 or 9, characterized in that the rewriting module comprises:
An identifying entity word lookup submodule, configured to look up the preset identifying entity word by using the semantic label;
An identifying entity word replacement submodule, configured to replace the entity word with the preset identifying entity word;
And/or,
An identifying attribute word replacement submodule, configured to replace the attribute word with the preset identifying attribute word;
And/or,
A reverse expression judging submodule, configured to judge whether the query string conforms to the syntactic rule of reverse expression; and if so, to call the preset expression obtaining submodule;
A preset expression obtaining submodule, configured to obtain the corresponding preset expression, stored on the server, that conforms to the syntactic rule of forward expression; the preset expression has a usage frequency;
A forward expression rewriting submodule, configured to rewrite the query string according to the syntactic rule of forward expression when the usage frequency of the preset expression is higher than a preset threshold.
11. The system according to claim 10, characterized in that the identifying entity word is the entity word that has the same semantic label as the entity word and has the highest usage frequency;
The identifying attribute word is the attribute word that describes the same class of entity words as the attribute word and has the highest usage frequency.
12. The system according to claim 10, characterized in that the reverse expression judging submodule comprises:
A syntactic analysis submodule, configured to perform syntactic analysis on the query string to obtain a subject and a modifier, as well as the dependency between the subject and the modifier; the dependency includes a dependency in which the subject depends on the modifier;
A decision submodule, configured to decide that the query string conforms to the syntactic rule of reverse expression when the subject is the entity word, the modifier is the attribute word, and the dependency is one in which the subject depends on the modifier.
CN201410051875.4A 2014-02-14 2014-02-14 Searching method and system Active CN104850554B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410051875.4A CN104850554B (en) 2014-02-14 2014-02-14 Searching method and system

Publications (2)

Publication Number Publication Date
CN104850554A true CN104850554A (en) 2015-08-19
CN104850554B CN104850554B (en) 2020-05-19

Family

ID=53850201

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410051875.4A Active CN104850554B (en) 2014-02-14 2014-02-14 Searching method and system

Country Status (1)

Country Link
CN (1) CN104850554B (en)

Also Published As

Publication number Publication date
CN104850554B (en) 2020-05-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant