CN102999495A - Method and device for determining synonym semantics mapping relations - Google Patents

Info

Publication number
CN102999495A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011102667849A
Other languages
Chinese (zh)
Other versions
CN102999495B (en)
Inventor
方高林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201110266784.9A
Publication of CN102999495A
Application granted
Publication of CN102999495B

Abstract

The invention discloses a method and device for determining synonym semantic mapping relations. The method comprises: obtaining the groups of synonym word pairs mined from document resources; determining, for each synonym word pair, the mapping direction between its two synonyms; traversing all synonym word pairs and building a synonym mapping relation tree according to the determined mapping directions, where the start point and end point of each mapping correspond to a parent node and a child node of the tree, respectively; and judging whether the degree of convergence of the synonym mapping relation tree meets a preset requirement and, if so, determining the leaf node on which the tree converges and determining that synonym mapping relations exist between that leaf node and the other, non-leaf nodes. With this technical solution, more of the synonym mapping relations within a cluster can be obtained from limited document resources, which improves the comprehensiveness of the search results recalled through synonym mapping relations.

Description

Method and device for determining synonym semantic mapping relations
Technical field
The present application relates to the field of computer application technology, and in particular to a method and device for determining synonym semantic mapping relations.
Background
With the development of search engines, the traditional strategy based on keyword matching can no longer satisfy users' search needs, and semantic matching strategies are widely used in modern search engines. Synonyms are entries whose names differ but whose meanings, or whose particular senses, are the same. As a semantic matching resource, synonyms occupy an important position in modern search engines. For example, "Peking University" and "Beijing University" are synonyms, so when a user searches with the keyword "Peking University", the search engine can also present resources containing content related to "Beijing University" as search results.
A synonym relation is often not limited to two words. For example, "Harbin Engineering University", "Harbin Institute of Engineering institute", "Harbin military project university", "Harbin shipping institute", "Harbin Institute of Technology", "Ha military project", "Ha boats and ships" and so on can all be synonyms of one another. In this situation, the set of synonyms sharing the same meaning is called a synonym cluster.
Existing methods for processing synonym resources establish mapping relations pairwise between known synonyms. For the 7 synonyms above, for example, there should in theory be $C_7^2 = 21$ groups of mapping relations. In practice, however, synonym resources have to be mined from large corpora, and many synonym relations may be hard to mine. For example, "Harbin military project university-Ha military project" is a synonym relation that is easy to mine, but a relation such as "Harbin shipping institute-Ha military project" may be hard to mine. As a result, mapping relations are missing within the synonym cluster, which in turn reduces the comprehensiveness of search results.
Summary of the invention
To solve the above technical problem, embodiments of the present application provide a method and device for determining synonym semantic mapping relations, so as to improve the completeness of synonym mapping resources. The technical solution is as follows:
The present application provides a method for determining synonym semantic mapping relations, comprising:
obtaining the groups of synonym word pairs mined from document resources;
for each synonym word pair, determining the mapping direction between the two synonyms;
traversing all synonym word pairs and building a synonym mapping relation tree according to the determined mapping directions, where the start point and end point of each mapping correspond to a parent node and a child node of the tree, respectively;
judging whether the degree of convergence of the synonym mapping relation tree satisfies a preset requirement and, if so, determining the leaf node on which the tree converges and determining that synonym mapping relations exist between that leaf node and the other nodes.
In one implementation of the present application, mining the document resources to obtain synonym word pairs comprises:
obtaining synonym word pairs from the search keywords that a user uses consecutively, as recorded in user behavior logs.
In one implementation of the present application, mining the document resources to obtain synonym word pairs comprises:
obtaining synonym word pairs from the correspondence, recorded in user behavior logs, between search requests and the content of the web pages clicked.
In one implementation of the present application, mining the document resources to obtain synonym word pairs comprises:
obtaining synonym word pairs from the different search requests, recorded in user behavior logs, that led to clicks on the same web page.
In one implementation of the present application, mining the document resources to obtain synonym word pairs comprises:
matching preset synonym templates against document content to obtain synonym word pairs.
In one implementation of the present application, after the synonym word pairs are obtained and before the synonym mapping directions are determined, the method further comprises:
verifying the synonym relation of each obtained synonym word pair.
In one implementation of the present application, verifying the synonym relation of a synonym word pair comprises:
building a feature vector for each of the two synonyms from its context feature words, and verifying the synonym relation according to the similarity of the two feature vectors.
In one implementation of the present application, determining the mapping direction between the two synonyms comprises:
for bidirectionally replaceable synonyms, counting the frequency with which each of the two synonyms occurs in the document resources, and taking the direction from the low-frequency word to the high-frequency word as the mapping direction, where bidirectionally replaceable synonyms are synonyms for which a replacement relation in both directions can be mined from the document resources.
In one implementation of the present application, determining the mapping direction between the two synonyms comprises:
for unidirectionally replaceable synonyms, taking the replacement direction of the synonyms as the mapping direction, where unidirectionally replaceable synonyms are synonyms for which only a replacement relation in one direction can be mined from the document resources.
In one implementation of the present application, judging whether the degree of convergence of the synonym mapping relation tree satisfies the preset requirement comprises:
judging whether the synonym mapping relation tree converges on one and the same leaf node and, if so, determining that the tree converges on that leaf node and that synonym mapping relations exist between that leaf node and the other nodes.
In one implementation of the present application, judging whether the degree of convergence of the synonym mapping relation tree satisfies the preset requirement comprises:
judging whether the ratio of the number of occurrences of the most frequent leaf node to the total number of leaf nodes is greater than a preset threshold; if so, further verifying the synonym relation between that leaf node and each of the other leaf nodes and, if the verification condition is satisfied, determining that the tree converges on the most frequent leaf node.
The present application also provides a device for determining synonym semantic mapping relations, comprising:
a synonym word pair acquisition module, configured to obtain the groups of synonym word pairs mined from document resources;
a mapping direction determination module, configured to determine, for each synonym word pair, the mapping direction between the two synonyms;
a relation tree construction module, configured to traverse all synonym word pairs and build a synonym mapping relation tree according to the determined mapping directions, where the start point and end point of each mapping correspond to a parent node and a child node of the tree, respectively;
a mapping relation determination module, configured to judge whether the degree of convergence of the synonym mapping relation tree satisfies a preset requirement and, if so, to determine the leaf node on which the tree converges and to determine that synonym mapping relations exist between that leaf node and the other nodes.
In one implementation of the present application, the synonym word pair acquisition module is specifically configured to:
obtain synonym word pairs from the search keywords that a user uses consecutively, as recorded in user behavior logs.
In one implementation of the present application, the synonym word pair acquisition module is specifically configured to:
obtain synonym word pairs from the correspondence, recorded in user behavior logs, between search requests and the content of the web pages clicked.
In one implementation of the present application, the synonym word pair acquisition module is specifically configured to:
obtain synonym word pairs from the different search requests, recorded in user behavior logs, that led to clicks on the same web page.
In one implementation of the present application, the synonym word pair acquisition module is specifically configured to:
match preset synonym templates against document content to obtain synonym word pairs.
In one implementation of the present application, the device further comprises:
a synonym relation verification module, configured to verify the synonym relation of the synonym word pairs obtained by the synonym word pair acquisition module, after that module obtains them and before the mapping direction determination module determines the synonym mapping directions.
In one implementation of the present application, the synonym relation verification module is specifically configured to:
build a feature vector for each of the two synonyms from its context feature words, and verify the synonym relation according to the similarity of the two feature vectors.
In one implementation of the present application, the mapping direction determination module is specifically configured to:
for bidirectionally replaceable synonyms, count the frequency with which each of the two synonyms occurs in the document resources, and take the direction from the low-frequency word to the high-frequency word as the mapping direction, where bidirectionally replaceable synonyms are synonyms for which a replacement relation in both directions can be mined from the document resources.
In one implementation of the present application, the mapping direction determination module is specifically configured to:
for unidirectionally replaceable synonyms, take the replacement direction of the synonyms as the mapping direction, where unidirectionally replaceable synonyms are synonyms for which only a replacement relation in one direction can be mined from the document resources.
In one implementation of the present application, the mapping relation determination module is specifically configured to:
judge whether the synonym mapping relation tree converges on one and the same leaf node and, if so, determine that the tree converges on that leaf node.
In one implementation of the present application, the mapping relation determination module is specifically configured to:
judge whether the ratio of the number of occurrences of the most frequent leaf node to the total number of leaf nodes is greater than a preset threshold; if so, further verify the synonym relation between that leaf node and each of the other leaf nodes and, if the verification condition is satisfied, determine that the tree converges on the most frequent leaf node.
The technical solution provided by the present application builds synonym mapping relation trees according to the mapping directions of synonyms, organizing multiple groups of synonym word pairs into a tree structure so that the latent mapping relations within a synonym cluster can be uncovered. With this solution, more of the mapping relations within a synonym cluster can be obtained from limited document resources, which improves the comprehensiveness of the search results recalled through synonym mapping relations.
Description of drawings
To illustrate the technical solutions of the embodiments of the present application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the embodiments recorded in the present application; a person of ordinary skill in the art can derive other drawings from them.
Fig. 1 is a first schematic flowchart of the method for determining synonym semantic mapping relations according to an embodiment of the present application;
Fig. 2 is a first schematic diagram of a synonym mapping relation tree according to an embodiment of the present application;
Fig. 3 is a second schematic flowchart of the method for determining synonym semantic mapping relations according to an embodiment of the present application;
Fig. 4 is a second schematic diagram of a synonym mapping relation tree according to an embodiment of the present application;
Fig. 5 is a third schematic diagram of a synonym mapping relation tree according to an embodiment of the present application;
Fig. 6 is a first schematic structural diagram of the device for determining synonym semantic mapping relations according to an embodiment of the present application;
Fig. 7 is a second schematic structural diagram of the device for determining synonym semantic mapping relations according to an embodiment of the present application.
Embodiments
The method for determining synonym semantic mapping relations provided by the present application is described first. The method may comprise the following steps:
obtaining the groups of synonym word pairs mined from document resources;
for each synonym word pair, determining the mapping direction between the two synonyms;
traversing all synonym word pairs and building a synonym mapping relation tree according to the determined mapping directions, where the start point and end point of each mapping correspond to a parent node and a child node of the tree, respectively;
judging whether the degree of convergence of the synonym mapping relation tree satisfies a preset requirement and, if so, determining the leaf node on which the tree converges and determining that synonym mapping relations exist between that leaf node and the other nodes.
The technical solution provided by the present application builds synonym mapping relation trees according to the mapping directions of synonyms, organizing multiple groups of synonym word pairs into a tree structure so that the latent mapping relations within a synonym cluster can be uncovered. With this solution, more of the mapping relations within a synonym cluster can be obtained from limited document resources, which improves the comprehensiveness of the search results recalled through synonym mapping relations.
To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments are described in detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present application fall within the scope of protection of the present application.
Fig. 1 is a flowchart of a method for determining synonym semantic mapping relations according to an embodiment of the present application, comprising the following steps:
S101, obtain the groups of synonym word pairs mined from document resources;
The technical solution provided by the present application first obtains synonym word pair resources from existing document resources. The document resources may be content on web pages, content in text files, user behavior logs, and so on. By mining this document content, a large number of synonym word pairs can be obtained.
The embodiments of the present application provide the following ways of mining document content to obtain synonym word pairs automatically:
1) Obtain synonym word pairs from the search keywords that a user uses consecutively, as recorded in user behavior logs.
In general, to obtain more search results for the same topic, a user may try searching with keywords in different forms, and synonyms may exist among these different forms. In the behavior log, this search behavior shows up as searches separated by short time intervals whose keywords share an identical or similar part. For example, if a user searches consecutively with the keywords "Nike sport footwear" and "NIKE sport footwear", "NIKE" and "Nike" can be considered a group of synonyms. Mining this kind of data in user behavior logs yields a large number of potential synonym word pairs.
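The consecutive-query heuristic above can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the function name `candidate_pairs`, the 60-second gap, whitespace tokenisation, and the rule "exactly one differing token" are all assumptions introduced here.

```python
from typing import List, Set, Tuple

def candidate_pairs(session: List[Tuple[float, str]],
                    max_gap: float = 60.0) -> Set[Tuple[str, str]]:
    """Return candidate synonym pairs from one user's time-ordered queries.

    session: (timestamp_in_seconds, query) tuples sorted by time.
    max_gap: hypothetical upper bound on the interval between two searches.
    """
    pairs = set()
    for (t1, q1), (t2, q2) in zip(session, session[1:]):
        if t2 - t1 > max_gap:
            continue  # searches too far apart to count as consecutive
        a, b = q1.split(), q2.split()
        if len(a) != len(b) or len(a) < 2:
            continue  # assumed: same length and at least one shared token
        diff = [(x, y) for x, y in zip(a, b) if x != y]
        if len(diff) == 1:
            pairs.add(diff[0])  # the single differing token is the candidate
    return pairs

session = [(0.0, "Nike sport footwear"),
           (20.0, "NIKE sport footwear"),
           (500.0, "weather today")]
print(candidate_pairs(session))
```

Candidates found this way are only potential synonyms and would still go through the verification described later in the text.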
2) Obtain synonym word pairs from the correspondence, recorded in user behavior logs, between search requests and the content of the web pages clicked.
After a user submits a search request, the system presents the corresponding search results, and the user's click on a search result indicates that the user considers the result relevant to the request. Synonyms can then be assumed to exist between the user's search request and the web page content. For example, if a user searches with the keyword "BJ Univ Hospital" and then clicks a web page whose subject is "hospital of Peking University", then "BJ Univ Hospital" and "hospital of Peking University" can be considered a group of synonyms. In practice, therefore, potential synonym word pairs can be obtained by collecting search requests together with a specific part of the clicked web page (for example the title) and then performing word alignment.
3) Obtain synonym word pairs from the different search requests, recorded in user behavior logs, that led to clicks on the same web page.
Users may reach a given web page by different routes. When the page is reached through search, different users obtain the link to the page with different search requests and click through to it. Synonyms may therefore also exist among these different search requests.
For example, for the Baidupedia web page "Sai Er number", statistics over the behavior logs of a large number of users show that the high-frequency search keywords used to click through to this page include "Sai Er number", "Sai Er number", "plug inferior number" and so on. Search keywords of this kind whose frequency of use exceeds a certain threshold can all be regarded as potential synonyms.
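A minimal sketch of this grouping, assuming the click log is available as `(query, url)` records. The function name `queries_by_page`, the record layout, and the `min_count` threshold are hypothetical; the source only says that keywords above "a certain threshold" are kept.

```python
from collections import Counter, defaultdict
from itertools import combinations

def queries_by_page(click_log, min_count=2):
    """Group queries by the clicked page and pair up the frequent ones.

    click_log: iterable of (query, url) records (assumed log format).
    min_count: hypothetical frequency threshold for keeping a query.
    """
    counts = defaultdict(Counter)
    for query, url in click_log:
        counts[url][query] += 1

    result = {}
    for url, counter in counts.items():
        frequent = sorted(q for q, n in counter.items() if n >= min_count)
        if len(frequent) >= 2:
            # every pair of frequent queries for one page is a candidate
            result[url] = list(combinations(frequent, 2))
    return result

log = ([("q_a", "u1")] * 3 + [("q_b", "u1")] * 2 +
       [("q_c", "u1")] + [("q_a", "u2")] * 2)
print(queries_by_page(log))
```

Here `q_c` is dropped because it falls below the threshold, and page `u2` yields nothing because only one query led to it.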
4) Match preset synonym templates against document content to obtain synonym word pairs.
Besides mining synonyms from user behavior, templates that are commonly used to express synonymy can be defined in advance, for example "A is called for short B" or "the A full name is B". These templates are then matched against the content of documents to obtain synonym word pairs.
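Template matching of this kind can be sketched with regular expressions. The two English patterns below are hypothetical stand-ins for the templates quoted above (the original templates are Chinese phrasings), and the helper name `match_templates` is introduced here for illustration only.

```python
import re

# Hypothetical English stand-ins for the "abbreviation" and "full name" templates.
TEMPLATES = [
    re.compile(r"(?P<a>[A-Za-z ]+?) is abbreviated as (?P<b>[A-Za-z ]+?)[.,;]"),
    re.compile(r"the full name of (?P<a>[A-Za-z ]+?) is (?P<b>[A-Za-z ]+?)[.,;]"),
]

def match_templates(text):
    """Return (A, B) pairs, in the order the two slots appear in the text."""
    pairs = []
    for pattern in TEMPLATES:
        for m in pattern.finditer(text):
            pairs.append((m.group("a").strip(), m.group("b").strip()))
    return pairs

print(match_templates(
    "hospital of Peking University is abbreviated as BJ Univ Hospital."))
```

A pair found only through such a template has evidence for one replacement direction, which matters when mapping directions are determined in step S102.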
The above are several example schemes for mining synonyms from existing document resources. Those skilled in the art can of course obtain synonym word pairs in other ways, and the present application places no limitation on this.
In one embodiment of the present application, the synonym relation can be further verified after the synonym word pairs are obtained. Verification can be manual, excluding word pairs that are obviously unsuitable. The synonym relation can also be verified automatically by comparing how similar the language environments of the two words are.
In a specific implementation, a feature vector can be built for each of the two synonyms from its context feature words, and the similarity of the two vectors can then be computed with the cosine formula; if the similarity is greater than a preset threshold, the verification passes. For example, the high-frequency feature words appearing before "Nike" include {likes, buy, certified products, on the net}, and the high-frequency feature words appearing after it include {sport footwear, plate footwear, basketball shoes, brand, company, brand shop}. The preceding and following context of "NIKE" is largely similar to that of "Nike", so the synonym relation between "NIKE" and "Nike" can be considered verified.
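The cosine check can be illustrated with a small sketch. Representing each context as a bag of feature words with raw counts, and the threshold value 0.7, are assumptions made here; the text does not specify how the feature vectors are weighted.

```python
import math
from collections import Counter

def cosine(u: Counter, v: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def verify_synonyms(context_a, context_b, threshold=0.7):
    """Pass verification when context similarity exceeds the (assumed) threshold."""
    return cosine(Counter(context_a), Counter(context_b)) > threshold

ctx_nike = ["buy", "certified products", "sport footwear", "brand"]
ctx_NIKE = ["buy", "certified products", "sport footwear", "company"]
print(cosine(Counter(ctx_nike), Counter(ctx_NIKE)))  # 3 shared of 4 words each
print(verify_synonyms(ctx_nike, ctx_NIKE))
```

In this toy case three of the four feature words are shared, so the similarity is 3 / (2 × 2) = 0.75.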
Besides comparing context features, those skilled in the art can verify the synonym relation in other ways, for example by searching with each of the two candidate words and comparing the similarity of the feature words in the search results.
S102, for each synonym word pair, determine the mapping direction between the two synonyms;
The scheme of the present application represents the relations among multiple synonyms with a tree structure. Because a tree is directed, the mapping direction of the two synonyms within the tree must first be determined for each synonym word pair.
In general, for most of the synonyms mined in step S101, a replacement relation in both directions can be mined from the document resources. Synonyms of this kind, for example "NIKE" and "Nike", are called bidirectionally replaceable synonyms. For bidirectionally replaceable synonyms, the frequency with which each of the two synonyms occurs in the document resources is counted first, and the direction from the low-frequency word to the high-frequency word is then taken as the mapping direction.
If only a replacement relation in one direction can be mined from the document resources in step S101, for example for a synonym word pair mined with a template such as "abbreviation" or "full name", and no resource supports the reverse replacement relation for that pair, the synonyms are called unidirectionally replaceable synonyms. For unidirectionally replaceable synonyms, the replacement direction is taken directly as the mapping direction. For example, the replacement relation "Peking University hospital → BJ Univ Hospital" can be determined from "hospital of Peking University is called for short the BJ Univ Hospital", and no other information in the document resources supports the reverse replacement relation. The two words are therefore considered unidirectionally replaceable synonyms, and "Peking University hospital → BJ Univ Hospital" is taken as their mapping direction.
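The two direction rules can be written down as a small sketch. The function name, the boolean evidence flags, and the corpus frequencies are hypothetical; how ties between equally frequent words are broken is not stated in the text and is an arbitrary choice here.

```python
def mapping_direction(a, b, a_to_b, b_to_a, freq):
    """Return (start, end) of the mapping, or None if no evidence exists.

    a_to_b / b_to_a: whether a replacement relation in that direction was mined.
    freq: word -> occurrence count in the document resources (assumed input).
    """
    if a_to_b and b_to_a:
        # bidirectionally replaceable: low-frequency word maps to high-frequency word
        # (ties keep the order given, which is an assumption of this sketch)
        return (a, b) if freq.get(a, 0) <= freq.get(b, 0) else (b, a)
    if a_to_b:
        return (a, b)  # unidirectionally replaceable: keep the mined direction
    if b_to_a:
        return (b, a)
    return None

freq = {"Nike": 100, "NIKE": 900}  # made-up counts for illustration
print(mapping_direction("NIKE", "Nike", True, True, freq))
print(mapping_direction("Peking University hospital", "BJ Univ Hospital",
                        True, False, {}))
```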
In practice, some synonyms involve ambiguous replacement. For example, "Shandong University" and "University Of Shanxi" can both be replaced with "mountain is large", but in the reverse direction "mountain is large" is ambiguous when a synonym replacement is performed, and synonyms of this kind reduce the accuracy of search results. Therefore, in a preferred embodiment of the present application, it can also be judged whether a mined synonym involves ambiguous replacement, that is, whether a candidate word has multiple replaceable synonyms. If so, the synonym word pair is discarded and not used in the subsequent generation of the synonym mapping relation tree.
S103, traverse all synonym word pairs and build a synonym mapping relation tree according to the determined mapping directions, where the start point and end point of each mapping correspond to a parent node and a child node of the tree, respectively;
In this step, for the synonym word pairs whose mapping directions have been determined, the synonym mapping relation tree is built by cascading synonyms with a depth-first traversal. Specifically: first select any synonym word pair that does not yet belong to a synonym mapping relation tree and, according to the determined mapping direction, take the mapping start point as the root node and the mapping end point as a child node of the root. If the root node has other synonyms and is the start point of those mappings, further branches are built. Likewise, if a child node has other synonyms and is the start point of those mappings, the tree is extended from that child node. These steps are repeated until all synonym word pairs have been traversed. The start point and end point of every mapping correspond to a parent node and a child node of the tree, respectively, so that a synonym semantic mapping relation tree is finally formed.
Suppose the following eight synonym word pairs currently exist (the mapping direction runs from the first word to the second):
A-B, B-C, B-D, C-E, C-F, D-G, F-E, G-E.
The pair A-B is selected first and, according to the mapping relation, A becomes the root node and B becomes a child node of A. Then, from B-C and B-D, the parent-child relations between B and C and between B and D can be determined, and so on. The synonym mapping relation tree finally built is shown in Fig. 2.
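The construction described above can be sketched for the eight example pairs. This is an illustrative reading, not the patent's code: the tree is represented implicitly by its root-to-leaf paths, a node reached along several branches appears once per branch (which is consistent with the later statement that the tree of Fig. 2 has three leaf nodes, all E), and the cycle guard is an added assumption.

```python
from collections import defaultdict

def build_children(pairs):
    """Map each mapping start point to the list of its mapping end points."""
    children = defaultdict(list)
    for start, end in pairs:
        children[start].append(end)
    return children

def leaf_paths(node, children, path=()):
    """Depth-first enumeration of root-to-leaf paths starting at `node`."""
    path = path + (node,)
    # Skip nodes already on the current path (assumed guard against cycles).
    kids = [k for k in children.get(node, []) if k not in path]
    if not kids:
        return [path]  # `node` is not the start point of any mapping: a leaf
    paths = []
    for kid in kids:
        paths.extend(leaf_paths(kid, children, path))
    return paths

PAIRS = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"),
         ("C", "F"), ("D", "G"), ("F", "E"), ("G", "E")]
for p in leaf_paths("A", build_children(PAIRS)):
    print(" -> ".join(p))
```

For these pairs the three paths are A-B-C-E, A-B-C-F-E and A-B-D-G-E, so every branch ends in E.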
S104, judge whether the synonym mapping relation tree converges on one and the same leaf node; if so, determine that the tree converges on that leaf node, and determine that synonym mapping relations exist between that leaf node and the other nodes.
This step judges whether the leaf node of the synonym mapping relation tree built in S103 is unique. If it is, the tree converges on that unique leaf node, and it can be determined that synonym mapping relations exist between that leaf node and the other nodes.
Taking Fig. 2 as an example, the synonym mapping relation tree has 3 leaf nodes, all of which are E, so the tree converges on leaf node E, and it can be determined that synonym mapping relations exist for A-E, B-E, C-E, D-E, F-E and G-E. It can further be determined that mapping relations exist between every pair of nodes in the tree. Thus, for the 7 synonyms A through G, there should in theory be $C_7^2 = 21$ groups of mapping relations. Besides the 8 groups that can be mined from the existing text, the present technical solution can further uncover other implicit synonym relations, for example A-C, A-E, A-D, B-E, B-F and B-G.
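The arithmetic can be checked with a short sketch: 7 nodes give 21 unordered pairs, 8 of which were mined directly, leaving 13 implied pairs. The helper name `implied_pairs` is introduced here for illustration.

```python
from itertools import combinations

def implied_pairs(nodes, mined):
    """Unordered node pairs of a converged tree that were not mined directly."""
    mined_set = {frozenset(p) for p in mined}
    return sorted(tuple(sorted(pair))
                  for pair in map(frozenset, combinations(nodes, 2))
                  if pair not in mined_set)

MINED = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"),
         ("C", "F"), ("D", "G"), ("F", "E"), ("G", "E")]
extra = implied_pairs("ABCDEFG", MINED)
print(len(extra))
print(extra)
```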
Using the above method, different trees are built for all synonyms and their convergence is judged. For each converged synonym mapping relation tree, the correspondence of every "non-leaf node-leaf node" group can be saved, and after deduplication the complete semantic mapping file is generated. For example, for the synonym mapping relation tree shown in Fig. 2, the 6 groups of relations "A-E, B-E, C-E, D-E, F-E, G-E" are saved. The 21 groups of relations can thus be described completely by these 6 groups alone, which also effectively reduces the size of the semantic mapping file.
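A minimal sketch of deriving the saved relations from a converged tree, assuming the tree is given as its root-to-leaf paths (a representation chosen here, not one prescribed by the text). Using a dictionary gives the deduplication for free, since a node that sits on several branches is stored once.

```python
def leaf_mapping(paths):
    """Return {non_leaf_node: leaf} for a tree whose branches share one leaf.

    paths: root-to-leaf paths of one synonym mapping relation tree.
    Returns None when the branches end in different leaves.
    """
    leaves = {path[-1] for path in paths}
    if len(leaves) != 1:
        return None
    leaf = next(iter(leaves))
    return {node: leaf for path in paths for node in path if node != leaf}

FIG2_PATHS = [("A", "B", "C", "E"),
              ("A", "B", "C", "F", "E"),
              ("A", "B", "D", "G", "E")]
print(leaf_mapping(FIG2_PATHS))
```

For the example tree this yields the six entries A, B, C, D, F and G, each mapped to E.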
For a web page in which A, B, C, D, F or G appears, the system can correspondingly build a semantic index for E. At search time, if the user searches for keyword A, the system first maps the search to E according to the relation A-E and then follows the other synonym relations of E, so that all resources containing content related to A, B, C, D, E, F or G are presented to the user as search results, which improves the comprehensiveness of the search results.
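One possible reading of this indexing scheme is sketched below: every term is normalised to its leaf node at index time and at query time. The data layout, the function names, and the toy documents are assumptions for illustration; the patent does not describe the index structure.

```python
from collections import defaultdict

def build_index(docs, mapping):
    """Index each document under the leaf node of every term it contains.

    docs: doc_id -> list of terms (assumed, already tokenised).
    mapping: non-leaf synonym -> leaf node, e.g. {"A": "E", ...}.
    """
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[mapping.get(term, term)].add(doc_id)
    return index

def search(term, index, mapping):
    """Map the query term to its leaf node, then look that node up."""
    return index.get(mapping.get(term, term), set())

MAPPING = {n: "E" for n in "ABCDFG"}
DOCS = {1: ["A"], 2: ["G"], 3: ["E"], 4: ["X"]}
index = build_index(DOCS, MAPPING)
print(search("A", index, MAPPING))
```

A query for A therefore also returns the documents that mention only G or only E.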
Fig. 3 is another schematic flowchart of the method for determining synonym semantic mapping relations provided by the present application, comprising the following steps:
S201, obtain the groups of synonym word pairs mined from document resources;
S202, for each synonym word pair, determine the mapping direction between the two synonyms;
S203, traverse all synonym word pairs and build a synonym mapping relation tree according to the determined mapping directions, where the start point and end point of each mapping correspond to a parent node and a child node of the tree, respectively;
Steps S201-S203 are similar to steps S101-S103 and are not described again here.
S204, judge whether the synonym mapping relation tree converges on one and the same leaf node; if so, perform S205, otherwise perform S206.
S205, determine that the synonym mapping relation tree converges on that leaf node, and determine that synonym mapping relations exist between that leaf node and the other nodes.
Whether S206 judges the ratio of leaf node number that occurrence number is maximum and leaf node sum greater than the threshold value that presets, and if so, carries out 207;
As shown in Figure 4, the synonym mapping relation tree does not converge to a single node. In this case, if the most frequently occurring leaf node accounts for a sufficient proportion of all leaf nodes, the nodes that do not converge can be processed further. If the required proportion is not reached, the current synonym mapping relation tree can be discarded.
Suppose the preset threshold is 0.7. The synonym mapping relation tree shown in Figure 4 has 4 leaf nodes, among which E occurs most often (3 times), so its proportion among all leaf nodes is 0.75. This satisfies the threshold requirement, so S207 can be performed.
S207: perform synonym relation verification between this leaf node and each of the other leaf nodes.
Synonym relation verification is performed between the most frequently occurring leaf node and each of the other leaf nodes. In the tree shown in Figure 4, the most frequently occurring leaf node is E and the other leaf node is H. A synonym verification method similar to that in step S101 can be used here; if the verification condition is satisfied, S208 is performed.
S208: determine that the synonym mapping relation tree converges to the most frequently occurring leaf node, and determine that synonym mapping relations exist between this leaf node and the other nodes.
If, in S207, the synonym verification condition is satisfied between the most frequently occurring leaf node and every other leaf node, it can be determined that synonym mapping relations exist between this leaf node and the other leaf nodes, and also between this leaf node and the non-leaf nodes. As shown in Figure 4, if leaf nodes E and H satisfy the synonym verification condition, it can be determined that a synonym mapping relation exists between E and H, and in addition that the synonym mapping relations A-E, B-E, C-E, D-E, F-E and G-E exist. In effect, this scheme treats a non-converging synonym mapping relation tree that satisfies certain conditions as a converging tree: it determines the synonym mapping relation between the non-converging leaf node (for example H) and the converging leaf node (for example E), and confirms the relations between the converging leaf node (for example E) and the non-leaf nodes (A, B, C, D, F, G).
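The decision logic of steps S204 to S208 can be sketched as follows. This is an illustration under stated assumptions: the tree is reduced to a list with one word per leaf position, `converged_leaf` is a hypothetical name, and `verify` stands in for whichever synonym relation verification is actually used.

```python
from collections import Counter

def converged_leaf(leaves, verify, threshold=0.7):
    """Return the leaf word the tree converges to, or None to discard it.

    leaves: one word per leaf position (a word may occur several times).
    verify: callable (word_a, word_b) -> bool, a stand-in for the synonym
            relation verification step.
    """
    if not leaves:
        return None
    counts = Counter(leaves)
    if len(counts) == 1:                      # S204/S205: a single leaf word
        return leaves[0]
    top, occurrences = counts.most_common(1)[0]
    if occurrences / len(leaves) <= threshold:  # S206: ratio must exceed threshold
        return None
    others = [word for word in counts if word != top]
    if all(verify(top, other) for other in others):  # S207
        return top                            # S208
    return None

# Leaf positions of the tree in Figure 4: E occurs 3 times, H once.
figure4_leaves = ["E", "E", "E", "H"]
result = converged_leaf(figure4_leaves, verify=lambda a, b: True)
```

With the threshold 0.7 from the example above, the ratio 3/4 = 0.75 passes S206, and the tree converges to E only if E and H also pass verification.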
In another implementation of the present application, if in step S206 the most frequently occurring leaf node does not reach the required proportion of all leaf nodes, the subtrees whose leaf nodes are relatively dispersed can be traced back upward, usually by 1 to 2 levels. The nodes reached after tracing back are then treated as leaf nodes, and the threshold is checked again. If it is satisfied, the subsequent steps can be performed on the tree as it stands after tracing back, and the nodes discarded during tracing back can be processed as independent synonyms.
As shown on the left of Figure 5, E occurs most often, but its proportion among all leaf nodes is only 0.6, which does not satisfy the threshold requirement. The relatively dispersed leaf nodes I and J are therefore traced back, giving the tree on the right of Figure 5. E's proportion among all leaf nodes is now 0.75, which satisfies the threshold requirement, so the subsequent steps can be performed. H-I and H-J can be processed as independent synonyms.
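One level of tracing back can be sketched as below. The root-to-leaf path representation, the names `top_ratio` and `backtrack_once`, and the example tree are all assumptions: the example reproduces only the ratios 0.6 and 0.75 and the parent H of leaves I and J from the text, not the exact shape of Figure 5. As a simplification, every leaf other than the most frequent one is traced back.

```python
from collections import Counter

def top_ratio(paths):
    """Share of leaf positions occupied by the most frequent leaf word."""
    counts = Counter(path[-1] for path in paths)
    return counts.most_common(1)[0][1] / len(paths)

def backtrack_once(paths):
    """Trace every non-dominant leaf back one level.

    paths: root-to-leaf word lists. Returns (new_paths, discarded), where
    discarded holds the (parent, leaf) pairs kept as independent synonyms.
    """
    top = Counter(path[-1] for path in paths).most_common(1)[0][0]
    kept, discarded = [], []
    for path in paths:
        if path[-1] != top and len(path) > 2:   # never trace back to the root
            discarded.append((path[-2], path[-1]))
            path = path[:-1]
        if path not in kept:                    # siblings collapse to one leaf
            kept.append(path)
    return kept, discarded

# Hypothetical tree: E occupies 3 of 5 leaf positions; I and J hang under H.
paths = [["A", "E"], ["B", "E"], ["C", "E"], ["D", "H", "I"], ["D", "H", "J"]]
before = top_ratio(paths)
new_paths, independent = backtrack_once(paths)
after = top_ratio(new_paths)
```

Applying the function twice would give the two-level tracing back that the text mentions as the usual upper bound.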
Corresponding to the above method embodiments, the present application also provides a synonym semantic mapping relation determination device. Referring to Figure 6, the device may comprise:
A synonym word pair acquisition module 610, configured to obtain the synonym word pairs mined from document resources;
A mapping direction determination module 620, configured to determine, for each synonym word pair, the mapping direction of the two synonyms;
A relation tree construction module 630, configured to traverse all synonym word pairs and, according to the determined mapping directions, build a synonym mapping relation tree, wherein the start point and end point of each mapping correspond to a parent node and a child node of the tree structure, respectively;
A mapping relation determination module 640, configured to judge whether the degree of convergence of the synonym mapping relation tree satisfies a preset requirement and, if so, to determine the leaf node to which the synonym mapping relation tree converges and determine that synonym mapping relations exist between this leaf node and the other nodes.
The working principle of the synonym semantic mapping relation determination device provided by the present application is described in detail below:
The synonym word pair acquisition module 610 first obtains synonym word pair resources from existing document resources. The document resources here may be content on web pages, content in text files, user behavior logs, and so on. By mining this document content, a large number of synonym word pairs can be obtained.
In one embodiment of the present application, the synonym word pair acquisition module 610 may be specifically configured to:
Obtain synonym word pairs from the search keywords used consecutively by a user, as recorded in a user behavior log.
In general, to obtain more search results, a user may try different forms of search keyword for the same topic during a search, so synonyms may exist among these different forms. In the behavior log, this search behavior shows up as searches separated by a short time interval whose keywords share an identical or similar part. For example, if a user searches consecutively with the keywords "Nike sports shoes" and "NIKE sports shoes", then "NIKE" and "Nike" can be considered a group of synonyms. By mining this kind of data in the user behavior log, a large number of potential synonym word pairs can be obtained.
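Mining consecutive queries can be sketched as follows. The log format (a time-ordered list of (timestamp in seconds, query) for one user), the 60-second gap, whitespace tokenization and the names `differing_part` and `mine_consecutive` are assumptions for illustration; Chinese queries in particular would need a word segmenter instead of `split()`.

```python
def differing_part(q1, q2):
    """Strip the shared leading and trailing words of two queries.

    Returns the remaining (part1, part2), or None when the queries share
    nothing or are identical.
    """
    a, b = q1.split(), q2.split()
    total = len(a)
    while a and b and a[0] == b[0]:      # drop common prefix words
        a, b = a[1:], b[1:]
    while a and b and a[-1] == b[-1]:    # drop common suffix words
        a, b = a[:-1], b[:-1]
    if a and b and len(a) < total:       # something was shared, something differs
        return " ".join(a), " ".join(b)
    return None

def mine_consecutive(log, max_gap=60):
    """Candidate synonym pairs from queries issued within max_gap seconds."""
    pairs = []
    for (t1, q1), (t2, q2) in zip(log, log[1:]):
        if t2 - t1 <= max_gap:
            pair = differing_part(q1, q2)
            if pair:
                pairs.append(pair)
    return pairs

log = [(0, "Nike sports shoes"), (20, "NIKE sports shoes"), (500, "weather today")]
candidates = mine_consecutive(log)
```

The pairs found this way are only candidates; they would still go through synonym relation verification.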
In one embodiment of the present application, the synonym word pair acquisition module 610 may be specifically configured to:
Obtain synonym word pairs from the correspondence between search requests and the content of clicked web pages, as recorded in a user behavior log.
After a user submits a search request, the system presents the corresponding search results, and the user's click on a search result indicates that the user accepts the result as relevant to the request. Synonyms can then be considered to exist between the user's search request and the web page content. For example, if a user searches with the keyword "BJ Univ Hospital" and then clicks a web page whose subject is "Peking University Hospital", then "BJ Univ Hospital" and "Peking University Hospital" can be considered a group of synonyms. In practice, therefore, potential synonym word pairs can be obtained by collecting the search request and a specific part of the clicked web page (for example the title) and then performing word alignment.
In one embodiment of the present application, the synonym word pair acquisition module 610 may be specifically configured to:
Obtain synonym word pairs from the different search requests through which the same web page was clicked and entered, as recorded in a user behavior log.
Users may reach a given web page by different routes. When the page is reached through search, different users may use different search requests to obtain the link to this page and click through to it. Synonyms may therefore also exist among these different search requests.
For example, for the Baidu Baike web page of "Sai Er number", statistics over the behavior logs of a large number of users show that the high-frequency search keywords used to click through to this page include "Sai Er number", "Sai Er number", "plug inferior number" and so on. Search keywords of this kind whose frequency of use exceeds a certain threshold can all be regarded as potential synonyms.
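Grouping search requests by the clicked page can be sketched as below. The click-log format (query, page identifier), the function name `mine_same_page` and the threshold value are assumptions; the queries in the example are invented placeholders.

```python
from collections import Counter, defaultdict
from itertools import combinations

def mine_same_page(clicks, threshold=2):
    """Pair up queries that led to the same page more than threshold times.

    clicks: iterable of (query, page_id) click records (hypothetical format).
    """
    by_page = defaultdict(Counter)
    for query, page in clicks:
        by_page[page][query] += 1
    pairs = set()
    for counts in by_page.values():
        frequent = sorted(q for q, n in counts.items() if n > threshold)
        pairs.update(combinations(frequent, 2))  # every pair of frequent queries
    return pairs

clicks = ([("Nike shoes", "page1")] * 3
          + [("NIKE shoes", "page1")] * 3
          + [("red shoes", "page1")])
candidates = mine_same_page(clicks)
```

The rarely used query "red shoes" stays below the threshold and is not paired with the others.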
In one embodiment of the present application, the synonym word pair acquisition module 610 may be specifically configured to:
Match preset synonym templates against document content to obtain synonym word pairs.
Besides mining synonyms from user behavior, templates that are commonly used to express synonymy can be defined in advance, for example "A is abbreviated as B" and "the full name of A is B". These templates are then matched against the content of documents to obtain synonym word pairs.
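Template matching can be sketched with regular expressions as below. The two English patterns are stand-ins for the abbreviation and full-name templates mentioned in the text, and the name `match_templates` and the one-sentence-per-string input are assumptions.

```python
import re

# Hypothetical English renderings of the two templates from the text.
TEMPLATES = [
    re.compile(r"^(?P<a>.+?) is abbreviated as (?P<b>.+?)\.?$"),
    re.compile(r"^the full name of (?P<a>.+?) is (?P<b>.+?)\.?$"),
]

def match_templates(sentences):
    """Return (A, B) pairs, in textual order, for sentences matching a template."""
    pairs = []
    for sentence in sentences:
        for template in TEMPLATES:
            match = template.match(sentence.strip())
            if match:
                pairs.append((match.group("a"), match.group("b")))
                break  # one template per sentence is enough
    return pairs

sentences = [
    "Peking University Hospital is abbreviated as BJ Univ Hospital.",
    "the full name of BJ Univ Hospital is Peking University Hospital",
    "This sentence matches no template.",
]
pairs = match_templates(sentences)
```

Because the pairs keep the order in which A and B appear in the template, they can later serve as the replacement direction of a unidirectional synonym.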
Referring to Figure 7, the synonym semantic mapping relation determination device provided by the present application may further comprise:
A synonym relation verification module 650, configured to perform synonym relation verification on the synonym word pairs obtained by the synonym word pair acquisition module 610, after that module has obtained the pairs and before the mapping direction determination module 620 determines the synonym mapping directions.
The synonym relation verification module 650 may be specifically configured to:
Construct a feature vector from the contextual feature words of each of the two synonyms, and verify the synonym relation according to the similarity of the two feature vectors.
In a specific implementation, a feature vector is constructed from the contextual feature words of each of the two synonyms, and the similarity of the two vectors is then calculated with the cosine formula; if the similarity is greater than a preset threshold, the verification passes. For example, the high-frequency feature words that appear before "Nike" include {like, buy, genuine, online}, and those that appear after it include {sports shoes, skate shoes, basketball shoes, brand, company, brand store}. The preceding and following context of "NIKE" is substantially similar to that of "Nike", so the synonym relation verification of "NIKE" and "Nike" can be considered passed.
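The cosine check can be sketched as follows, using the similarity cos(u, v) = (u · v) / (|u| |v|) over feature-word counts. The threshold 0.8, the function names and the context list for "NIKE" are assumptions for illustration; the text does not give concrete values.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity of two sparse count vectors (dict-like)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def is_synonym(context_a, context_b, threshold=0.8):
    """Verification passes when the similarity exceeds the threshold."""
    return cosine(Counter(context_a), Counter(context_b)) > threshold

# Context feature words of "Nike" (from the example above, abridged).
nike_ctx = ["like", "buy", "genuine", "online",
            "sports shoes", "basketball shoes", "brand", "company"]
# Hypothetical context of "NIKE": seven of eight feature words shared.
NIKE_ctx = ["like", "buy", "genuine", "brand store",
            "sports shoes", "basketball shoes", "brand", "company"]

similarity = cosine(Counter(nike_ctx), Counter(NIKE_ctx))
passed = is_synonym(nike_ctx, NIKE_ctx)
```

In practice the vectors would be weighted by how often each feature word occurs around the candidate word, which the `Counter` representation already allows.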
Besides comparing contextual features, those skilled in the art, and likewise the synonym relation verification module 650, can verify the synonym relation in other ways. For example, the two candidate words can each be used to run a search, and the synonym relation can be verified by comparing the similarity of the feature words in the search results.
After the synonym word pair acquisition module 610 has obtained the synonym word pairs, the mapping direction determination module 620 determines, for each synonym word pair, the mapping direction of the two synonyms in the tree structure.
In one embodiment of the present application, the mapping direction determination module 620 may be specifically configured to:
For bidirectionally replaceable synonyms, count the frequency of occurrence of the two synonyms in the document resources, and take the direction from the low-frequency word to the high-frequency word as the mapping direction of the two synonyms, wherein bidirectionally replaceable synonyms are synonyms for which a replacement relation in both directions can be mined from the document resources.
In general, for most of the mined synonyms a replacement relation in both directions can be mined from the document resources. Such synonyms are called bidirectionally replaceable synonyms, for example "NIKE" and "Nike". For bidirectionally replaceable synonyms, the mapping direction determination module 620 first counts the frequency of occurrence of the two synonyms in the document resources and then takes the direction from the low-frequency word to the high-frequency word as the mapping direction of the two synonyms.
In one embodiment of the present application, the mapping direction determination module 620 may also be specifically configured to:
For unidirectionally replaceable synonyms, take the replacement direction of the synonyms as the mapping direction of the two synonyms, wherein unidirectionally replaceable synonyms are synonyms for which only a replacement relation in one direction can be mined from the document resources.
If only a replacement relation in one direction can be mined from the document resources, for example for a synonym word pair mined with a template such as "abbreviated as" or "full name", and no corresponding resource supports the reverse replacement relation for this pair, the synonyms are called unidirectionally replaceable synonyms. For unidirectionally replaceable synonyms, the replacement direction of the synonyms is taken directly as the mapping direction of the two synonyms. For example, the replacement relation "Peking University Hospital → BJ Univ Hospital" can be determined from "Peking University Hospital is abbreviated as BJ Univ Hospital", and no other information in the document resources supports the reverse replacement relation. The two words are therefore considered unidirectionally replaceable synonyms, and "Peking University Hospital → BJ Univ Hospital" is taken as the mapping direction of the two synonyms.
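The two rules for choosing a mapping direction can be sketched as below. The function name `mapping_direction`, the `one_way` parameter and the frequency counts are assumptions made for the example; the text does not say how equal frequencies are handled, so the tie-break here is arbitrary.

```python
def mapping_direction(a, b, freq, one_way=None):
    """Return the mapping as (start, end).

    freq:    word -> frequency of occurrence in the document resources.
    one_way: for a unidirectionally replaceable pair, the mined
             (replaced word, replacement word); None for a bidirectional pair.
    """
    if one_way is not None:
        return one_way                    # keep the mined replacement direction
    # Bidirectional pair: low-frequency word maps to high-frequency word.
    # Equal frequencies fall through to (b, a), an arbitrary choice.
    return (a, b) if freq[a] < freq[b] else (b, a)

# Invented frequency counts for illustration only.
freq = {"NIKE": 120, "Nike": 900}
two_way = mapping_direction("Nike", "NIKE", freq)

one_way = mapping_direction(
    "BJ Univ Hospital", "Peking University Hospital", freq={},
    one_way=("Peking University Hospital", "BJ Univ Hospital"),
)
```

The start of each returned pair becomes a parent node and the end becomes its child node when the tree is built.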
In one embodiment of the present application, the mapping direction determination module 620 can further judge whether a mined synonym involves ambiguous replacement, that is, whether a candidate word has multiple replaceable synonyms. If so, the synonym word pair is discarded and is not used in the subsequent generation of the synonym mapping relation tree.
For the synonym word pairs whose mapping directions have been determined, the relation tree construction module 630 builds a synonym mapping relation tree by cascading synonyms, using a depth-first traversal. The specific method is as follows. First, choose any synonym word pair that does not yet belong to a synonym mapping relation tree and, according to the determined mapping direction, take the mapping start point as the root node and the mapping end point as a child node of this root node. If the root node has other synonyms and the root node is the start point of those mappings, continue by building the other branches. Likewise, if a child node has other synonyms and this child node is the start point of those mappings, continue by extending the tree from this child node. These steps are repeated until all synonym word pairs have been traversed. The start point and end point of each mapping correspond to a parent node and a child node of the tree structure, respectively, and a synonym semantic mapping relation tree is finally formed.
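The depth-first construction can be sketched as below, with each tree represented by its root-to-leaf paths. This is a simplified illustration: `build_trees` is a hypothetical name, the example pairs are invented, and the sketch departs from the description above by starting only from words that are never a mapping end point, rather than from an arbitrary unused pair. A guard against revisiting a word on the current path is added so that cyclic input cannot loop forever.

```python
from collections import defaultdict

def build_trees(directed_pairs):
    """Build one tree per root word from (start, end) mappings.

    Returns {root: [root-to-leaf paths]}.
    """
    children = defaultdict(list)
    ends = set()
    for start, end in directed_pairs:
        children[start].append(end)   # start is the parent, end the child
        ends.add(end)
    roots = [word for word in children if word not in ends]

    def paths_from(path):
        # Depth-first: extend the current path through each child in turn.
        nexts = [c for c in children.get(path[-1], []) if c not in path]
        if not nexts:
            return [path]             # path[-1] is a leaf node
        return [p for child in nexts for p in paths_from(path + [child])]

    return {root: paths_from([root]) for root in roots}

# Invented mappings: A -> B -> E and A -> C -> E.
pairs = [("A", "B"), ("B", "E"), ("A", "C"), ("C", "E")]
trees = build_trees(pairs)
leaf_words = {path[-1] for path in trees["A"]}
```

In this example every path ends in E, so the tree converges to a single leaf word.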
The mapping relation determination module 640 is further configured to judge whether the degree of convergence of the synonym mapping relation tree satisfies a preset requirement and, if so, to determine the leaf node to which the synonym mapping relation tree converges and determine that synonym mapping relations exist between this leaf node and the other nodes.
In one embodiment of the present application, the mapping relation determination module 640 may be specifically configured to:
Judge whether the synonym mapping relation tree converges to the same leaf node and, if so, determine that the synonym mapping relation tree converges to this leaf node. In other words, judge whether the leaf node of the constructed synonym mapping relation tree is unique; if so, the tree converges to this unique leaf node, and it can then be determined that synonym mapping relations exist between this leaf node and the other nodes.
In another embodiment of the present application, the mapping relation determination module 640 may also be specifically configured to:
Judge whether the ratio of the number of occurrences of the most frequently occurring leaf node to the total number of leaf nodes is greater than a preset threshold; if so, further perform synonym relation verification between this leaf node and each of the other leaf nodes and, if the verification condition is satisfied, determine that the synonym mapping relation tree converges to the most frequently occurring leaf node.
The mapping relation determination module 640 judges whether the most frequently occurring leaf node accounts for a sufficient proportion of all leaf nodes. If the required proportion is not reached, the current synonym mapping relation tree can be discarded. If the required proportion is reached, synonym relation verification is further performed between the most frequently occurring leaf node and each of the other leaf nodes; if the verification condition is satisfied, it can be determined that the synonym mapping relation tree converges to the most frequently occurring leaf node and that synonym mapping relations exist between this leaf node and the other nodes.
In another implementation of the present application, if the most frequently occurring leaf node does not reach the required proportion of all leaf nodes, the mapping relation determination module can also trace back upward the subtrees whose leaf nodes are relatively dispersed, usually by 1 to 2 levels. The nodes reached after tracing back are then treated as leaf nodes, and the threshold is checked again. If it is satisfied, the subsequent steps can be performed on the tree as it stands after tracing back, and the nodes discarded during tracing back can be processed as independent synonyms.
With the synonym semantic mapping relation determination device provided by the present application, a synonym mapping relation tree is built according to the mapping directions of the synonyms, so that multiple groups of synonym word pairs are organized in a tree structure and the potential mapping relations within a synonym cluster are mined. More mapping relations within synonym clusters can thus be obtained from limited document resources, which improves the comprehensiveness of the search results recalled using synonym mapping relations.
For convenience of description, the above device is described in terms of separate modules divided by function. Of course, when the present application is implemented, the functions of the modules can be realized in one or more pieces of software and/or hardware.
From the above description of the embodiments, those skilled in the art can clearly understand that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the essence of the technical solution of the present application, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, the device or system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant parts, refer to the description of the method embodiments. The device and system embodiments described above are only schematic: the modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, that is, they may be located in one place or distributed over multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The present application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The present application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
The above are only specific embodiments of the present application. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be regarded as falling within the protection scope of the present application.

Claims (22)

1. A synonym semantic mapping relation determination method, characterized by comprising:
Obtaining the synonym word pairs mined from document resources;
For each synonym word pair, determining the mapping direction of the two synonyms;
Traversing all synonym word pairs and, according to the determined mapping directions, building a synonym mapping relation tree, wherein the start point and end point of each mapping correspond to a parent node and a child node of the tree structure, respectively;
Judging whether the degree of convergence of the synonym mapping relation tree satisfies a preset requirement and, if so, determining the leaf node to which the synonym mapping relation tree converges, and determining that synonym mapping relations exist between this leaf node and the other nodes.
2. The method according to claim 1, characterized in that mining the document resources to obtain synonym word pairs comprises:
Obtaining synonym word pairs from the search keywords used consecutively by a user, as recorded in a user behavior log.
3. The method according to claim 1, characterized in that mining the document resources to obtain synonym word pairs comprises:
Obtaining synonym word pairs from the correspondence between search requests and the content of clicked web pages, as recorded in a user behavior log.
4. The method according to claim 1, characterized in that mining the document resources to obtain synonym word pairs comprises:
Obtaining synonym word pairs from the different search requests through which the same web page was clicked and entered, as recorded in a user behavior log.
5. The method according to claim 1, characterized in that mining the document resources to obtain synonym word pairs comprises:
Matching preset synonym templates against document content to obtain synonym word pairs.
6. The method according to any one of claims 1 to 5, characterized in that, after the synonym word pairs are obtained and before the synonym mapping directions are determined, the method further comprises:
Performing synonym relation verification on the obtained synonym word pairs.
7. The method according to claim 6, characterized in that performing synonym relation verification on the synonym word pairs comprises:
Constructing a feature vector from the contextual feature words of each of the two synonyms, and verifying the synonym relation according to the similarity of the two feature vectors.
8. The method according to claim 1, characterized in that determining the mapping direction of the two synonyms comprises:
For bidirectionally replaceable synonyms, counting the frequency of occurrence of the two synonyms in the document resources, and taking the direction from the low-frequency word to the high-frequency word as the mapping direction of the two synonyms, wherein the bidirectionally replaceable synonyms are synonyms for which a replacement relation in both directions can be mined from the document resources.
9. The method according to claim 1, characterized in that determining the mapping direction of the two synonyms comprises:
For unidirectionally replaceable synonyms, taking the replacement direction of the synonyms as the mapping direction of the two synonyms, wherein the unidirectionally replaceable synonyms are synonyms for which only a replacement relation in one direction can be mined from the document resources.
10. The method according to claim 1, characterized in that judging whether the degree of convergence of the synonym mapping relation tree satisfies a preset requirement comprises:
Judging whether the synonym mapping relation tree converges to the same leaf node and, if so, determining that the synonym mapping relation tree converges to this leaf node.
11. The method according to claim 1, characterized in that, when the synonym mapping relation tree does not converge to the same leaf node, judging whether the degree of convergence of the synonym mapping relation tree satisfies a preset requirement comprises:
Judging whether the ratio of the number of occurrences of the most frequently occurring leaf node to the total number of leaf nodes is greater than a preset threshold; if so, further performing synonym relation verification between this leaf node and each of the other leaf nodes and, if the verification condition is satisfied, determining that the synonym mapping relation tree converges to the most frequently occurring leaf node.
12. A synonym semantic mapping relation determination device, characterized by comprising:
A synonym word pair acquisition module, configured to obtain the synonym word pairs mined from document resources;
A mapping direction determination module, configured to determine, for each synonym word pair, the mapping direction of the two synonyms;
A relation tree construction module, configured to traverse all synonym word pairs and, according to the determined mapping directions, build a synonym mapping relation tree, wherein the start point and end point of each mapping correspond to a parent node and a child node of the tree structure, respectively;
A mapping relation determination module, configured to judge whether the degree of convergence of the synonym mapping relation tree satisfies a preset requirement and, if so, to determine the leaf node to which the synonym mapping relation tree converges and determine that synonym mapping relations exist between this leaf node and the other nodes.
13. The device according to claim 12, characterized in that the synonym word pair acquisition module is specifically configured to:
Obtain synonym word pairs from the search keywords used consecutively by a user, as recorded in a user behavior log.
14. The device according to claim 12, characterized in that the synonym word pair acquisition module is specifically configured to:
Obtain synonym word pairs from the correspondence between search requests and the content of clicked web pages, as recorded in a user behavior log.
15. The device according to claim 12, characterized in that the synonym word pair acquisition module is specifically configured to:
Obtain synonym word pairs from the different search requests through which the same web page was clicked and entered, as recorded in a user behavior log.
16. The device according to claim 12, characterized in that the synonym word pair acquisition module is specifically configured to:
Match preset synonym templates against document content to obtain synonym word pairs.
17. The device according to any one of claims 12 to 16, characterized in that the device further comprises:
A synonym relation verification module, configured to perform synonym relation verification on the synonym word pairs obtained by the synonym word pair acquisition module, after that module has obtained the pairs and before the mapping direction determination module determines the synonym mapping directions.
18. The device according to claim 17, characterized in that the synonym relation verification module is specifically configured to:
Construct a feature vector from the contextual feature words of each of the two synonyms, and verify the synonym relation according to the similarity of the two feature vectors.
19. The device according to claim 12, characterized in that the mapping direction determination module is specifically configured to:
For bidirectionally replaceable synonyms, count the frequency of occurrence of the two synonyms in the document resources, and take the direction from the low-frequency word to the high-frequency word as the mapping direction of the two synonyms, wherein the bidirectionally replaceable synonyms are synonyms for which a replacement relation in both directions can be mined from the document resources.
20. The device according to claim 12, characterized in that the mapping direction determination module is specifically configured to:
For unidirectionally replaceable synonyms, take the replacement direction of the synonyms as the mapping direction of the two synonyms, wherein the unidirectionally replaceable synonyms are synonyms for which only a replacement relation in one direction can be mined from the document resources.
21. The device according to claim 12, characterized in that the mapping relation determination module is specifically configured to:
Judge whether the synonym mapping relation tree converges to the same leaf node and, if so, determine that the synonym mapping relation tree converges to this leaf node.
22. The device according to claim 12, characterized in that the mapping relation determination module is specifically configured to:
Judge whether the ratio of the number of occurrences of the most frequently occurring leaf node to the total number of leaf nodes is greater than a preset threshold; if so, further perform synonym relation verification between this leaf node and each of the other leaf nodes and, if the verification condition is satisfied, determine that the synonym mapping relation tree converges to the most frequently occurring leaf node.
CN201110266784.9A 2011-09-09 2011-09-09 A kind of synonym Semantic mapping relation determines method and device Active CN102999495B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110266784.9A CN102999495B (en) 2011-09-09 2011-09-09 A kind of synonym Semantic mapping relation determines method and device


Publications (2)

Publication Number Publication Date
CN102999495A true CN102999495A (en) 2013-03-27
CN102999495B CN102999495B (en) 2016-08-03

Family

ID=47928076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110266784.9A Active CN102999495B (en) 2011-09-09 2011-09-09 Method and device for determining synonym semantic mapping relations

Country Status (1)

Country Link
CN (1) CN102999495B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070239742A1 (en) * 2006-04-06 2007-10-11 Oracle International Corporation Determining data elements in heterogeneous schema definitions for possible mapping
US7890521B1 (en) * 2007-02-07 2011-02-15 Google Inc. Document-based synonym generation
CN101079026A (en) * 2007-07-02 2007-11-28 北京百问百答网络技术有限公司 Text similarity, acceptation similarity calculating method and system and application system
CN101630314A (en) * 2008-07-16 2010-01-20 中国科学院自动化研究所 Semantic query expansion method based on domain knowledge

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
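Wu Yunfang et al.: "Graph-based method for automatic acquisition of synonym sets" (基于图的同义词集自动获取方法), Journal of Computer Research and Development (计算机研究与发展), vol. 48, no. 4, 15 April 2011 (2011-04-15), pages 610-616 *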

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202038A (en) * 2016-06-29 2016-12-07 北京智能管家科技有限公司 Synonym method for digging based on iteration and device
CN106294784A (en) * 2016-08-12 2017-01-04 合智能科技(深圳)有限公司 Resource search method and device
CN106294784B (en) * 2016-08-12 2019-12-17 合一智能科技(深圳)有限公司 resource searching method and device
CN106446018A (en) * 2016-08-29 2017-02-22 北京百度网讯科技有限公司 Artificial intelligence-based query information processing method and device
CN106354715A (en) * 2016-09-28 2017-01-25 医渡云(北京)技术有限公司 Method and device for medical word processing
CN106354715B (en) * 2016-09-28 2019-04-16 医渡云(北京)技术有限公司 Medical vocabulary processing method and processing device
CN106777283A (en) * 2016-12-29 2017-05-31 北京奇虎科技有限公司 The method for digging and device of a kind of synonym
CN111428476A (en) * 2019-01-09 2020-07-17 百度在线网络技术(北京)有限公司 Synonym generation method and device, electronic equipment and storage medium
CN111428476B (en) * 2019-01-09 2023-03-31 百度在线网络技术(北京)有限公司 Synonym generation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN102999495B (en) 2016-08-03

Similar Documents

Publication Publication Date Title
CN102799647B (en) Method and device for webpage reduplication deletion
CN102063469B (en) Method and device for acquiring relevant keyword message and computer equipment
US20100228742A1 (en) Categorizing Queries and Expanding Keywords with a Coreference Graph
US20110179002A1 (en) System and Method for a Vector-Space Search Engine
CN102999495A (en) Method and device for determining synonym semantics mapping relations
Reinanda et al. Mining, ranking and recommending entity aspects
CN103631929A (en) Intelligent prompt method, module and system for search
Li et al. Bursty event detection from microblog: a distributed and incremental approach
CN103530402A (en) Method for identifying microblog key users based on improved Page Rank
CN102163226A (en) Adjacent sorting repetition-reducing method based on Map-Reduce and segmentation
KR102600018B1 (en) Method and apparatus for mining entity relationship, electronic device, storage medium and program
US20130066898A1 (en) Matching target strings to known strings
CN112115232A (en) Data error correction method and device and server
Elshater et al. godiscovery: Web service discovery made efficient
CN104834736A (en) Method and device for establishing index database and retrieval method, device and system
CN103927177A (en) Characteristic-interface digraph establishment method based on LDA model and PageRank algorithm
Yun et al. An efficient approach for mining weighted approximate closed frequent patterns considering noise constraints
CN104281275A (en) Method and device for inputting English
Yang et al. On characterizing and computing the diversity of hyperlinks for anti-spamming page ranking
Nguyen et al. A method for mining top-rank-k frequent closed itemsets
US10235432B1 (en) Document retrieval using multiple sort orders
Setayesh et al. Presentation of an Extended Version of the PageRank Algorithm to Rank Web Pages Inspired by Ant Colony Algorithm
Wang et al. Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task.
Wang et al. Robust word-network topic model for short texts
CN113420219A (en) Method and device for correcting query information, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant