CN107451130A - Method and device for recognizing Chinese word semantic relations by combining Chinese and English knowledge resources - Google Patents

Info

Publication number
CN107451130A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710706832.9A
Other languages
Chinese (zh)
Other versions
CN107451130B (en
Inventor
鹿文鹏
孟凡擎
张玉腾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Original Assignee
Qilu University of Technology
Application filed by Qilu University of Technology
Priority to CN201710706832.9A
Publication of CN107451130A
Application granted
Publication of CN107451130B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/55 Rule-based translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

The invention discloses a method and device for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources. The method includes: obtaining antonym sets from multiple Chinese knowledge resources and judging from the antonym sets whether two words are antonyms; extracting part-word sets from multiple Chinese knowledge resources and judging from the part-word sets whether the words stand in a whole-part relation; extracting synonym sets from multiple Chinese knowledge resources and judging from the synonym sets whether the words are synonyms; extracting hyponym sets from multiple Chinese knowledge resources and judging from the hyponym sets whether the words stand in a hypernym-hyponym relation; translating the Chinese word pair into English with a Chinese-English dictionary; and applying English knowledge resources to the translated English word pairs to recognize their semantic relation, thereby determining the semantic relation of the original Chinese word pair. The invention makes full use of multiple Chinese knowledge resources and identifies semantic relations between Chinese words more accurately and effectively.

Description

Method and device for recognizing Chinese word semantic relations by combining Chinese and English knowledge resources
Technical field
The present invention relates to the technical field of natural language processing, and in particular to a method and device for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources.
Background
Semantic relation recognition means automatically determining the semantic relation that holds between the words of a given word pair. Typical semantic relations include antonymy, the whole-part relation, synonymy and the hypernym-hyponym relation. Semantic relation recognition is a basic task in natural language processing and directly affects word sense disambiguation, ontology construction, machine translation, information retrieval, text classification and similar tasks.
Most current research on semantic relation recognition targets English. It typically relies on one or more knowledge resources and uses statistical learning methods such as support vector machines or Bayes classifiers to classify or recognize English semantic relations, with good results. Research on semantic relation recognition for Chinese is comparatively scarce, and most related work applies a statistical learning method to a single knowledge resource. Existing work thus uses only one knowledge resource and neglects the knowledge resources of other languages. Statistical learning methods are also inevitably constrained by the size of the annotated corpus, so accuracy is hard to guarantee. As language knowledge resources of all kinds are built and refined, they complement one another and can supply more reliable knowledge for semantic relation recognition.
To address the above technical problems in Chinese word semantic relation recognition, the present invention fully exploits the semantic relations inherent in multiple knowledge resources and realizes a method and device for recognizing Chinese word semantic relations based on multiple Chinese knowledge resources, with the aim of advancing the solution of these problems to some extent.
Summary of the invention
To overcome the shortcomings of the prior art, the invention discloses a method and device for recognizing Chinese word semantic relations by combining Chinese and English knowledge resources, so as to judge the semantic relation between Chinese words more accurately and effectively.
To this end, the present invention provides the following technical solution:
A method for recognizing Chinese word semantic relations by combining Chinese and English knowledge resources, comprising the following steps:
Step 1: obtain antonym sets from multiple Chinese knowledge resources, and judge from the antonym sets whether the two words are antonyms;
Step 2: extract part-word sets from multiple Chinese knowledge resources, and judge from the part-word sets whether the two words stand in a whole-part relation;
Step 3: extract synonym sets from multiple Chinese knowledge resources, and judge from the synonym sets whether the two words are synonyms;
Step 4: extract hyponym sets from multiple Chinese knowledge resources, and judge from the hyponym sets whether the two words stand in a hypernym-hyponym relation;
Step 5: translate the Chinese word pair into English with a Chinese-English dictionary;
Step 6: apply English knowledge resources to the English word pairs obtained in step 5 to recognize their semantic relation, thereby determining the semantic relation of the original Chinese word pair.
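The six steps form a cascade: each check runs only if every earlier one failed. The following is a minimal sketch of that control flow, not the patented implementation. The function `identify_relation`, the toy `HYPONYMS` table and the use of English glosses in place of Chinese words are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Set, Tuple

Check = Callable[[str, str], bool]  # hypothetical: does the relation hold for (a, b)?
Step = Tuple[str, Check]            # (relation name, check)


def identify_relation(a: str, b: str,
                      chinese_steps: List[Step],
                      translate: Callable[[str], Set[str]],
                      english_steps: List[Step]) -> Optional[str]:
    # Steps 1-4: the first Chinese check that succeeds decides the relation.
    for name, holds in chinese_steps:
        if holds(a, b):
            return name
    # Step 5: translate each word into a set of English candidates.
    en_a, en_b = translate(a), translate(b)
    # Step 6: repeat the checks, relation by relation, over every English pair.
    for name, holds in english_steps:
        if any(holds(x, y) for x in en_a for y in en_b):
            return name
    return None


# Toy data; English glosses stand in for the Chinese words.
HYPONYMS = {"motor vehicle": {"truck", "bus"}}


def is_hypernym_pair(a: str, b: str) -> bool:
    return b in HYPONYMS.get(a, set()) or a in HYPONYMS.get(b, set())


def never(a: str, b: str) -> bool:
    return False


chinese_steps = [("antonymy", never), ("whole-part", never),
                 ("synonymy", never), ("hypernym-hyponym", is_hypernym_pair)]
result = identify_relation("motor vehicle", "truck",
                           chinese_steps, lambda w: set(), [])
```

With this toy data the first three checks fail and the fourth succeeds, so the translation fallback is never reached.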
Further, in step 1, antonymy is judged as follows:
Step 1-1) Using the antonymy explicitly defined in HowNet, extract the antonym set ASET_A of word A for the given words A and B. If B ∈ ASET_A, the two words are antonyms; otherwise go to step 1-2). The converse relation defined in HowNet is also treated as a kind of antonymy;
Step 1-2) Use Baidu Hanyu to extract the antonym set ASET_A of the given word A, and use HIT Tongyici Cilin (Extended) to extract the synonym set SSET_A of word A. For each word W ∈ SSET_A, extract its antonyms and merge them into ASET_A. If B ∈ ASET_A, words A and B are antonyms; otherwise go to step 1-3);
Step 1-3) Use Baidu Baike to extract the antonym set ASET_A of word A. If B ∈ ASET_A, the two words are antonyms; otherwise go to step 2-1).
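The antonymy cascade of steps 1-1) to 1-3) can be sketched as below. Each knowledge resource is modelled as a plain dictionary from a word to a set of words, which is an assumption; the real resources would need their own loaders. The text does not say which resource supplies the antonyms of each synonym W in step 1-2), so the sketch assumes the same Baidu Hanyu table.

```python
from typing import Dict, Set

Lexicon = Dict[str, Set[str]]  # hypothetical in-memory view of one knowledge resource


def is_antonym_pair(a: str, b: str, hownet_ant: Lexicon, hanyu_ant: Lexicon,
                    cilin_syn: Lexicon, baike_ant: Lexicon) -> bool:
    # Step 1-1: HowNet antonyms (converse words assumed already merged in).
    if b in hownet_ant.get(a, set()):
        return True
    # Step 1-2: Baidu Hanyu antonyms of A, widened by the antonyms of A's synonyms.
    aset = set(hanyu_ant.get(a, set()))
    for w in cilin_syn.get(a, set()):
        aset |= hanyu_ant.get(w, set())  # assumption: same resource for W's antonyms
    if b in aset:
        return True
    # Step 1-3: Baidu Baike antonyms.
    return b in baike_ant.get(a, set())


# Toy data: "happy" has no antonym of its own, but its synonym "glad" does.
hanyu = {"glad": {"unhappy"}}
cilin = {"happy": {"glad"}}
found = is_antonym_pair("happy", "unhappy", {}, hanyu, cilin, {})
```

Here the pair is accepted only through the synonym expansion in step 1-2), which is the case that expansion is meant to cover.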
Further, in step 2, the whole-part relation is judged as follows:
Step 2-1) Use HowNet to extract the part-word sets MSET_A and MSET_B of words A and B. If B ∈ MSET_A or A ∈ MSET_B, the two words stand in a whole-part relation; otherwise go to step 2-2);
Step 2-2) Use the HowNet sememe definitions. A word whose definition contains the sememe "part|part" is a part word (part) of some other word, and the value of the "whole" attribute in that definition gives the sememe definition of its whole word. Accordingly, extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains a "whole" attribute whose value is DEF_B, or DEF_B contains a "whole" attribute whose value is DEF_A, words A and B stand in a whole-part relation; otherwise go to step 3-1);
In addition, for some word pairs the sememe definitions alone cannot identify the whole-part relation. Such pairs can be handled by generalization: the value of the "whole" attribute described above is generalized to its superordinate concept, and the rest of the procedure is unchanged.
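Step 2-2) and its generalization fallback can be illustrated as follows. This is a sketch under stated assumptions: the definition strings, the `whole={...}` syntax and the `broader` table that maps a concept to its superordinate concept are hypothetical stand-ins, not the actual HowNet format.

```python
import re
from typing import Dict, Optional, Set

WHOLE_ATTR = re.compile(r"whole=\{([^{}]*)\}")  # assumed attribute syntax


def whole_value(definition: str) -> Optional[str]:
    """Return the 'whole' attribute value if the definition marks a part word."""
    if not definition.startswith("part|"):
        return None
    match = WHOLE_ATTR.search(definition)
    return match.group(1) if match else None


def _part_of(part_defs: Set[str], whole_defs: Set[str],
             broader: Dict[str, str]) -> bool:
    for definition in part_defs:
        value, seen = whole_value(definition), set()
        while value is not None and value not in seen:
            if value in whole_defs:
                return True
            seen.add(value)
            value = broader.get(value)  # generalize to the superordinate concept
    return False


def is_whole_part_pair(defs_a: Set[str], defs_b: Set[str],
                       broader: Optional[Dict[str, str]] = None) -> bool:
    broader = broader or {}
    return _part_of(defs_a, defs_b, broader) or _part_of(defs_b, defs_a, broader)


wheel = {"part|part:whole={LandVehicle|car}"}
direct = is_whole_part_pair(wheel, {"LandVehicle|car"})
generalized = is_whole_part_pair(wheel, {"vehicle|vehicle"},
                                 {"LandVehicle|car": "vehicle|vehicle"})
```

Without the `broader` table the second pair would be rejected, which is the gap the generalization step is described as closing.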
Further, in step 3, synonymy is judged as follows:
Step 3-1) In HIT Tongyici Cilin (Extended), rows marked with "=" list synonyms. Obtain from these rows the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-2);
Step 3-2) Use HowNet to extract the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-3);
Step 3-3) Use Baidu Hanyu to extract the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-4);
Step 3-4) Following the page links of Baidu Baike, obtain the sets of linked encyclopedia pages PSET_A and PSET_B of words A and B. If PSET_A ∩ PSET_B ≠ ∅, words A and B are synonyms; otherwise go to step 4-1).
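The synonymy cascade can be sketched in two parts: reading the "="-marked rows of the thesaurus, and falling back to a comparison of encyclopedia page sets. The sketch assumes each thesaurus row is a category code followed by space-separated words, and that the condition in step 3-4) is a non-empty intersection of the two page sets. The codes, words and page identifiers below are made up for illustration.

```python
from typing import Dict, Iterable, Set

Lexicon = Dict[str, Set[str]]


def cilin_synonyms(rows: Iterable[str]) -> Lexicon:
    """Build word -> synonyms from rows whose category code ends with '='."""
    table: Lexicon = {}
    for row in rows:
        parts = row.split()
        if not parts or not parts[0].endswith("="):
            continue  # skip blank rows and rows not marked as synonyms
        words = parts[1:]
        for w in words:
            table.setdefault(w, set()).update(x for x in words if x != w)
    return table


def is_synonym_pair(a: str, b: str, cilin: Lexicon, hownet: Lexicon,
                    hanyu: Lexicon, baike_pages: Lexicon) -> bool:
    # Steps 3-1 to 3-3: look B up in A's synonym set, one resource at a time.
    for lexicon in (cilin, hownet, hanyu):
        if b in lexicon.get(a, set()):
            return True
    # Step 3-4 (assumed reading): the two words share at least one linked page.
    return bool(baike_pages.get(a, set()) & baike_pages.get(b, set()))


rows = ["Xx01A01= car automobile sedan", "Xx01A02# apple pear"]
cilin = cilin_synonyms(rows)
pages = {"potato": {"page/1"}, "spud": {"page/1", "page/2"}}
via_cilin = is_synonym_pair("car", "sedan", cilin, {}, {}, {})
via_pages = is_synonym_pair("potato", "spud", {}, {}, {}, pages)
```

The page-set test is deliberately last, since it is the loosest of the four signals.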
Further, in step 4, the hypernym-hyponym relation is judged as follows:
Step 4-1) Use HowNet to extract the hyponym sets HSET_A and HSET_B of words A and B. If B ∈ HSET_A or A ∈ HSET_B, words A and B stand in a hypernym-hyponym relation; otherwise go to step 4-2);
Step 4-2) Use the hypernym-hyponym relation implied by the HowNet sememe definitions. Extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and for which DEF_A ⊂ DEF_B or DEF_B ⊂ DEF_A, words A and B stand in a hypernym-hyponym relation; otherwise go to step 5-1).
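Step 4-2) compares sememe definitions rather than looking words up directly. The sketch below assumes the condition is "same main sememe, and one definition's remaining content strictly contained in the other's". The `Definition` type, which splits a definition into a main sememe and a set of feature strings, is a hypothetical representation chosen for the example.

```python
from typing import FrozenSet, Iterable, NamedTuple


class Definition(NamedTuple):
    main: str                 # main sememe, e.g. "LandVehicle|car"
    features: FrozenSet[str]  # remaining attributes of the definition


def is_hypernym_pair(defs_a: Iterable[Definition],
                     defs_b: Iterable[Definition]) -> bool:
    defs_b = list(defs_b)
    for da in defs_a:
        for db in defs_b:
            same_main = da.main == db.main
            # '<' on frozensets is the strict-subset test.
            nested = da.features < db.features or db.features < da.features
            if same_main and nested:
                return True
    return False


vehicle = Definition("LandVehicle|car", frozenset({"modifier=automatic"}))
truck = Definition("LandVehicle|car",
                   frozenset({"modifier=automatic", "transport|transport"}))
related = is_hypernym_pair([vehicle], [truck])
```

The word with the shorter definition is the more general one, so a strict subset in either direction is enough to report the relation.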
Further, in step 5, the word pair is translated as follows:
Step 5-1) Use a Chinese-English dictionary to translate words A and B into the corresponding English word sets ENSET(A) and ENSET(B).
Further, in step 6, semantic recognition with English knowledge resources proceeds as follows:
Step 6-1) For each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the antonym set ENASET_A of word EN_A from the English knowledge resource. If EN_B ∈ ENASET_A, the English words EN_A and EN_B are antonyms, and so the original Chinese word pair is an antonym pair; otherwise go to step 6-2);
Step 6-2) For each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the part-word sets ENMSET_A and ENMSET_B of words EN_A and EN_B from the English knowledge resource. If EN_B ∈ ENMSET_A or EN_A ∈ ENMSET_B, the English words EN_A and EN_B stand in a whole-part relation, and so does the original Chinese word pair; otherwise go to step 6-3);
Step 6-3) For each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the synonym set ENSSET_A of word EN_A from the English knowledge resource. If EN_B ∈ ENSSET_A, the English words EN_A and EN_B are synonyms, and so the original Chinese word pair is a synonym pair; otherwise go to step 6-4);
Step 6-4) For each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the hyponym sets ENHSET_A and ENHSET_B of words EN_A and EN_B from the English knowledge resource. If EN_B ∈ ENHSET_A or EN_A ∈ ENHSET_B, the English words EN_A and EN_B stand in a hypernym-hyponym relation, and so does the original Chinese word pair.
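The English fallback of steps 5 and 6 can be sketched as a search over the cross product of the two translation sets. The text does not name the Chinese-English dictionary or the English knowledge resource, so both are modelled here as toy dictionaries. The `both_directions` flag mirrors the text: the whole-part and hypernym-hyponym checks test membership in either word's set, while the antonym and synonym checks test only the set of EN_A.

```python
from itertools import product
from typing import Dict, List, Optional, Set, Tuple

Lexicon = Dict[str, Set[str]]
Resource = Tuple[str, Lexicon, bool]  # (relation name, table, both_directions)


def english_relation(a: str, b: str, zh_en: Lexicon,
                     resources: List[Resource]) -> Optional[str]:
    # Step 5: ENSET(A) and ENSET(B) from the (toy) Chinese-English dictionary.
    pairs = list(product(zh_en.get(a, set()), zh_en.get(b, set())))
    # Steps 6-1 to 6-4: try each relation in order over all candidate pairs.
    for name, table, both_directions in resources:
        for en_a, en_b in pairs:
            if en_b in table.get(en_a, set()):
                return name
            if both_directions and en_a in table.get(en_b, set()):
                return name
    return None


zh_en = {"大": {"big", "large"}, "小": {"small", "little"}}
resources = [
    ("antonymy", {"big": {"small"}}, False),
    ("whole-part", {}, True),
    ("synonymy", {}, False),
    ("hypernym-hyponym", {}, True),
]
relation = english_relation("大", "小", zh_en, resources)
```

A single matching English pair is enough: "big" and "small" decide the result even though "large" and "little" have no entry.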
A device for recognizing Chinese word semantic relations by combining Chinese and English knowledge resources, comprising:
An antonymy recognition unit, for obtaining antonym sets from multiple Chinese knowledge resources and judging from the antonym sets whether the two words are antonyms;
A whole-part relation recognition unit, for extracting part-word sets from multiple Chinese knowledge resources and judging from the part-word sets whether the two words stand in a whole-part relation;
A synonymy recognition unit, for extracting synonym sets from multiple Chinese knowledge resources and judging from the synonym sets whether the two words are synonyms;
A hypernym-hyponym relation recognition unit, for extracting hyponym sets from multiple Chinese knowledge resources and judging from the hyponym sets whether the two words stand in a hypernym-hyponym relation;
A Chinese-English translation unit, for translating the Chinese word pair into English with a Chinese-English dictionary;
An English word semantic relation recognition unit, for applying English knowledge resources to the English word pairs produced by the Chinese-English translation unit to recognize their semantic relation, thereby determining the semantic relation of the original Chinese word pair.
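One way to picture the device is as a chain of units, each of which either reports its relation or hands the word pair to the next unit. The sketch below is an illustrative assumption about how such units could be wired together in software. The class `RecognitionUnit` and its fields are hypothetical, and the lambda checks stand in for real resource lookups.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RecognitionUnit:
    relation: str                                  # relation this unit reports
    holds: Callable[[str, str], bool]              # the unit's own check
    next_unit: Optional["RecognitionUnit"] = None  # unit to hand over to

    def recognize(self, a: str, b: str) -> Optional[str]:
        if self.holds(a, b):
            return self.relation
        if self.next_unit is None:
            return None
        return self.next_unit.recognize(a, b)


# Build the chain back to front; only the last unit matches the toy pair.
hypernym_unit = RecognitionUnit(
    "hypernym-hyponym", lambda a, b: (a, b) == ("motor vehicle", "truck"))
synonym_unit = RecognitionUnit("synonymy", lambda a, b: False, hypernym_unit)
whole_part_unit = RecognitionUnit("whole-part", lambda a, b: False, synonym_unit)
antonym_unit = RecognitionUnit("antonymy", lambda a, b: False, whole_part_unit)

outcome = antonym_unit.recognize("motor vehicle", "truck")
```

The translation unit and the English recognition unit would be appended to the same chain in the same way.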
Further, the antonymy recognition unit also includes:
A HowNet antonymy recognition unit, for using the antonymy explicitly defined in HowNet to extract the antonym set ASET_A of word A for the given words A and B. If B ∈ ASET_A, the two words are antonyms; otherwise control passes to the Baidu Hanyu antonymy recognition unit. The converse relation defined in HowNet is also treated as a kind of antonymy;
A Baidu Hanyu antonymy recognition unit, for using Baidu Hanyu to extract the antonym set ASET_A of the given word A and using HIT Tongyici Cilin (Extended) to extract the synonym set SSET_A of word A. For each word W ∈ SSET_A, its antonyms are extracted and merged into ASET_A. If B ∈ ASET_A, words A and B are antonyms; otherwise control passes to the Baidu Baike antonymy recognition unit;
A Baidu Baike antonymy recognition unit, for using Baidu Baike to extract the antonym set ASET_A of word A. If B ∈ ASET_A, the two words are antonyms; otherwise control passes to the whole-part relation recognition unit.
Further, the whole-part relation recognition unit also includes:
A HowNet whole-part relation recognition unit, for using HowNet to extract the part-word sets MSET_A and MSET_B of words A and B. If B ∈ MSET_A or A ∈ MSET_B, the two words stand in a whole-part relation; otherwise control passes to the sememe-definition whole-part relation recognition unit;
A sememe-definition whole-part relation recognition unit, for processing the HowNet sememe definitions. A word whose definition contains the sememe "part|part" is a part word (part) of some other word, and the value of the "whole" attribute in that definition gives the sememe definition of its whole word. Accordingly, the sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains a "whole" attribute whose value is DEF_B, or DEF_B contains a "whole" attribute whose value is DEF_A, words A and B stand in a whole-part relation; otherwise control passes to the synonymy recognition unit;
In addition, in the sememe-definition whole-part relation recognition unit, for some word pairs the sememe definitions alone cannot identify the whole-part relation. Such pairs can be handled by generalization: the value of the "whole" attribute described above is generalized to its superordinate concept, and the rest of the procedure is unchanged.
Further, the synonymy recognition unit also includes:
A Cilin synonymy recognition unit, for obtaining the synonym set SSET_A of word A from the rows marked with "=" in HIT Tongyici Cilin (Extended), which list synonyms. If B ∈ SSET_A, words A and B are synonyms; otherwise control passes to the HowNet synonymy recognition unit;
A HowNet synonymy recognition unit, for using HowNet to extract the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise control passes to the Baidu Hanyu synonymy recognition unit;
A Baidu Hanyu synonymy recognition unit, for using Baidu Hanyu to extract the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise control passes to the Baidu Baike synonymy recognition unit;
A Baidu Baike synonymy recognition unit, for following the page links of Baidu Baike to obtain the sets of linked encyclopedia pages PSET_A and PSET_B of words A and B. If PSET_A ∩ PSET_B ≠ ∅, words A and B are synonyms; otherwise control passes to the hypernym-hyponym relation recognition unit.
Further, the hypernym-hyponym relation recognition unit also includes:
A HowNet hypernym-hyponym relation recognition unit, for using HowNet to extract the hyponym sets HSET_A and HSET_B of words A and B. If B ∈ HSET_A or A ∈ HSET_B, words A and B stand in a hypernym-hyponym relation; otherwise control passes to the sememe-definition hypernym-hyponym relation recognition unit;
A sememe-definition hypernym-hyponym relation recognition unit, for using the hypernym-hyponym relation implied by the HowNet sememe definitions. The sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and for which DEF_A ⊂ DEF_B or DEF_B ⊂ DEF_A, words A and B stand in a hypernym-hyponym relation; otherwise control passes to the Chinese-English translation unit.
Further, the Chinese-English translation unit also includes:
A Chinese-English translation unit, for using a Chinese-English dictionary to translate words A and B into the corresponding English word sets ENSET(A) and ENSET(B).
Further, the English word semantic relation recognition unit also includes:
An English antonymy recognition unit, for extracting, for each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the antonym set ENASET_A of word EN_A from the English knowledge resource. If EN_B ∈ ENASET_A, the English words EN_A and EN_B are antonyms, and so the original Chinese word pair is an antonym pair; otherwise control passes to the English whole-part relation recognition unit;
An English whole-part relation recognition unit, for extracting, for each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the part-word sets ENMSET_A and ENMSET_B of words EN_A and EN_B from the English knowledge resource. If EN_B ∈ ENMSET_A or EN_A ∈ ENMSET_B, the English words EN_A and EN_B stand in a whole-part relation, and so does the original Chinese word pair; otherwise control passes to the English synonymy recognition unit;
An English synonymy recognition unit, for extracting, for each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the synonym set ENSSET_A of word EN_A from the English knowledge resource. If EN_B ∈ ENSSET_A, the English words EN_A and EN_B are synonyms, and so the original Chinese word pair is a synonym pair; otherwise control passes to the English hypernym-hyponym relation recognition unit;
An English hypernym-hyponym relation recognition unit, for extracting, for each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the hyponym sets ENHSET_A and ENHSET_B of words EN_A and EN_B from the English knowledge resource. If EN_B ∈ ENHSET_A or EN_A ∈ ENHSET_B, the English words EN_A and EN_B stand in a hypernym-hyponym relation, and so does the original Chinese word pair.
Beneficial effects of the present invention:
1. The present invention performs word semantic relation recognition with several different Chinese knowledge resources and makes full use of each of them.
2. In whole-part relation recognition, the present invention supplements the HowNet sememe definitions with a generalization method suited to their characteristics, which makes the recognition method more widely applicable.
3. In hypernym-hyponym relation recognition, the present invention fully exploits the information implied by the sememe definitions in HowNet, which effectively improves recognition accuracy.
4. The present invention combines Chinese and English knowledge resources and uses the English knowledge resources to supply word semantic relations that the Chinese knowledge resources do not cover, which improves the recognition rate.
5. The method and device proposed by the present invention can automatically recognize the semantic relation of a given word pair, including antonymy, the whole-part relation, synonymy and the hypernym-hyponym relation, with a high recognition accuracy.
Brief description of the drawings
Fig. 1 is a flowchart of the method for recognizing Chinese word semantic relations by combining Chinese and English knowledge resources according to an embodiment of the present invention;
Fig. 2 is a structural diagram of the device for recognizing Chinese word semantic relations by combining Chinese and English knowledge resources according to an embodiment of the present invention;
Fig. 3 is a structural diagram of the antonymy recognition unit according to an embodiment of the present invention;
Fig. 4 is a structural diagram of the whole-part relation recognition unit according to an embodiment of the present invention;
Fig. 5 is a structural diagram of the synonymy recognition unit according to an embodiment of the present invention;
Fig. 6 is a structural diagram of the hypernym-hyponym relation recognition unit according to an embodiment of the present invention;
Fig. 7 is a structural diagram of the Chinese-English translation unit according to an embodiment of the present invention;
Fig. 8 is a structural diagram of the English word semantic relation recognition unit according to an embodiment of the present invention.
Detailed description of the embodiments
To help those skilled in the art better understand the solution of the embodiments of the present invention, the embodiments are described in further detail below with reference to the drawings.
The semantic recognition process is illustrated with the word pair formed by word A "motor vehicle" and word B "truck".
The flowchart of the method for recognizing Chinese word semantic relations by combining Chinese and English knowledge resources according to the embodiment of the present invention is shown in Fig. 1. The method comprises the following steps:
Step 101, antonymy recognition.
Obtain antonym sets from multiple Chinese knowledge resources and judge from the antonym sets whether the two words are antonyms, specifically:
Step 1-1) Using the antonymy explicitly defined in HowNet, extract the antonym set ASET_A of word A for the given words A and B. If B ∈ ASET_A, the two words are antonyms; otherwise go to step 1-2). The converse relation defined in HowNet is also treated as a kind of antonymy;
Extracting the antonyms (including converse words) of word A "motor vehicle" from HowNet gives ASET_A = {"wooden handcart", "large flatbed tricycle", "cart", "bicycle", "old rickshaw", "wheelbarrow", "rickshaw", "yellow croaker car", "handcart", "rubber car", "bicycle", "donkey cart", "carriage", "donkey car", "ox cart", "large handcart", "flat car", "flat board three-wheel", "flatcar", "rickshaw", "three-wheel", "tricycle", "mountain bike", "handcart", "trolley", "animal-drawn vehicle", "stroller", "dolly", "ocean car", "vehicle using motor", "bicycle", "a light horse cart", "chariot", "hub", "halter strap"}. Clearly word B "truck" ∉ ASET_A, so go to step 1-2).
Step 1-2) Use Baidu Hanyu to extract the antonym set ASET_A of the given word A, and use HIT Tongyici Cilin (Extended) to extract the synonym set SSET_A of word A. For each word W ∈ SSET_A, extract its antonyms and merge them into ASET_A. If B ∈ ASET_A, words A and B are antonyms; otherwise go to step 1-3);
Extracting the antonym set of word A "motor vehicle" from Baidu Hanyu gives ASET_A = ∅. Since word B "truck" ∉ ASET_A, go to step 1-3).
Step 1-3) Use Baidu Baike to extract the antonym set ASET_A of word A. If B ∈ ASET_A, the two words are antonyms; otherwise go to step 2-1).
Extracting the antonym set of word A "motor vehicle" from Baidu Baike gives ASET_A = ∅. Since word B "truck" ∉ ASET_A, go to step 2-1).
Step 102, whole-part relation recognition.
Extract part-word sets from multiple Chinese knowledge resources and judge from the part-word sets whether the two words stand in a whole-part relation, specifically:
Step 2-1) Use HowNet to extract the part-word sets MSET_A and MSET_B of words A and B. If B ∈ MSET_A or A ∈ MSET_B, the two words stand in a whole-part relation; otherwise go to step 2-2);
Extracting the part-word sets of word A "motor vehicle" and word B "truck" from HowNet gives MSET_A = {"headlight", "steering wheel", "sun visor", "boot", "vehicle rear window", "rear baffle", "back light", "rearview mirror", "driver's cabin", "sidecar", "sidecar", "automobile engine", "car horn", "auto parts machinery", "cylinder", "headlight", "fuel gauge", "taillight", "luggage case", "throttle", "sunshading board", "sun proof"} and MSET_B = ∅. Since B "truck" ∉ MSET_A and A "motor vehicle" ∉ MSET_B, go to step 2-2).
Step 2-2) Use the HowNet sememe definitions. A word whose definition contains the sememe "part|part" is a part word (part) of some other word, and the value of the "whole" attribute in that definition gives the sememe definition of its whole word. Accordingly, extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains a "whole" attribute whose value is DEF_B, or DEF_B contains a "whole" attribute whose value is DEF_A, words A and B stand in a whole-part relation; otherwise go to step 3-1);
In addition, for some word pairs the sememe definitions alone cannot identify the whole-part relation. Such pairs can be handled by generalization: the value of the "whole" attribute described above is generalized to its superordinate concept, and the rest of the procedure is unchanged.
Extracting the sememe definition sets of word A "motor vehicle" and word B "truck" from HowNet gives DEFSET_A = {"LandVehicle|car: modifier=automatic|automatic"} and DEFSET_B = {"LandVehicle|car: modifier=automatic|automatic, transport|transport: instrument={~}, patient=physical|material"}. Clearly there is no DEF_A ∈ DEFSET_A or DEF_B ∈ DEFSET_B containing the sememe "part|part", so go to step 3-1).
Step 103, synonymy recognition.
Extract synonym sets from multiple Chinese knowledge resources and judge from the synonym sets whether the two words are synonyms, specifically:
Step 3-1) In HIT Tongyici Cilin (Extended), rows marked with "=" list synonyms. Obtain from these rows the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-2);
Extracting the synonym set of word A "motor vehicle" from HIT Tongyici Cilin (Extended) gives SSET_A = ∅. Since B "truck" ∉ SSET_A, go to step 3-2).
Step 3-2) Use HowNet to extract the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-3);
Extracting the synonym set of word A "motor vehicle" from HowNet gives SSET_A = {"motor vehicle", "motor vehicle", "automobile", "car", "automobile", "dolly", "car", "car", "small sleeping carriage"}. Since B "truck" ∉ SSET_A, go to step 3-3).
Step 3-3) Use Baidu Hanyu to extract the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-4);
Extracting the synonym set of word A "motor vehicle" from Baidu Hanyu gives SSET_A = ∅. Since B "truck" ∉ SSET_A, go to step 3-4).
Step 3-4) Following the page links of Baidu Baike, obtain the sets of linked encyclopedia pages PSET_A and PSET_B of words A and B. If PSET_A ∩ PSET_B ≠ ∅, words A and B are synonyms; otherwise go to step 4-1).
Extracting the linked encyclopedia page sets of word A "motor vehicle" and word B "truck" from Baidu Baike gives PSET_A = {"https://baike.baidu.com/item/motor vehicles"} and PSET_B = {"https://baike.baidu.com/item/truck/4339", "https://baike.baidu.com/item/truck/15281831", "https://baike.baidu.com/item/truck/622401", "https://baike.baidu.com/item/trucies/3697802", "https://baike.baidu.com/item/truck/7109303", "https://baike.baidu.com/item/truck/3697784"}. Since PSET_A ∩ PSET_B = ∅, go to step 4-1).
Step 104: hyponymy recognition.
Hyponym sets are extracted from multiple Chinese knowledge resources, and whether two words have a hyponymy (hypernym-hyponym) relation is judged from these sets, specifically:
Step 4-1) Use HowNet to extract the hyponym sets HSET_A and HSET_B of words A and B respectively. If B ∈ HSET_A or A ∈ HSET_B, then words A and B have a hyponymy relation; otherwise go to step 4-2);
Extracting the hyponym sets of word A "motor vehicle" and word B "truck" from HowNet gives HSET_A = {"Audi", "bus", "regular bus", "hired car", "BMW", "Benz", "Honda", "Buick", "long-distance bus", "taxi", "taxicab", "bus", "motor bus", "Daewoo", "taxi", "electric car", "Toyota", "Ford", "bus", "bus", "public bus", "illegal vehicle", "lorry", "container car", "shuttle bus", "emergency tender", "taxi", "special bus", "learner-driven vehicle", "police car", "ambulance", "fire engine", "old-fashioned automobile", "truck", "Cadillac", "empty wagons", "Lincoln", "hopper wagon", "station wagon", "touring car", "Mercedes", "minibus", "local train", "double deck bus", "private car", "private car", "scheduled bus", "round bus", "trolleybus", "Hyundai", "fire fighting truck", "minibus", "mini-bus", "mini-bus car", "small passenger car", "minibus", "cruiser", "tourist coach", "off-road vehicle", "transport vehicle", "truck", "self-dumping truck", "dumper", "dump truck"}, together with HSET_B. Because word B "truck" ∈ HSET_A, word A "motor vehicle" and word B "truck" have a hyponymy relation, which completes the semantic relation recognition for this pair.
Step 4-2) Based on the hyponymy relations implied by HowNet sememe definitions, extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B respectively. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and that satisfy […] or […], then words A and B have a hyponymy relation.
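The set-membership test of step 4-1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `has_hyponymy` and the dictionary `HYPONYMS` are assumptions made here, and the dictionary holds a small excerpt of the HSET_A listed above rather than a real HowNet query.

```python
def has_hyponymy(a, b, hyponyms):
    """Step 4-1): A and B are related if either word is in the other's hyponym set."""
    hset_a = hyponyms.get(a, set())
    hset_b = hyponyms.get(b, set())
    return b in hset_a or a in hset_b


# Hypothetical stand-in for HowNet: a small excerpt of HSET_A from the example.
HYPONYMS = {"motor vehicle": {"bus", "taxi", "truck", "ambulance", "police car"}}

print(has_hyponymy("motor vehicle", "truck", HYPONYMS))
print(has_hyponymy("truck", "vinegar", HYPONYMS))
```

The check is deliberately symmetric, because the step accepts either B ∈ HSET_A or A ∈ HSET_B.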
Step 105: Chinese-English translation.
The Chinese word pair is translated into English using Chinese-English dictionaries, specifically:
Step 5-1) Use Chinese-English dictionaries to translate words A and B into the corresponding English word sets ENSET(A) and ENSET(B).
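As a rough illustration of step 5-1), the English set of a word can be built as the union of the translations offered by each dictionary. The function name `translate_to_english`, the two toy dictionaries and the way translations are split between them are assumptions made for this sketch; the keys are English glosses standing in for the Chinese headwords.

```python
def translate_to_english(word, dictionaries):
    """Return ENSET(word): the union of the translations found in all dictionaries."""
    enset = set()
    for dictionary in dictionaries:
        enset.update(dictionary.get(word, ()))
    return enset


# Hypothetical dictionary contents, for illustration only.
DICT_ONE = {"seasoning": ["condiments", "seasoning"], "vinegar": ["vinegar"]}
DICT_TWO = {"seasoning": ["accessory food"], "vinegar": ["vinegar", "jealousy"]}

print(sorted(translate_to_english("seasoning", [DICT_ONE, DICT_TWO])))
print(sorted(translate_to_english("vinegar", [DICT_ONE, DICT_TWO])))
```

Taking the union keeps every candidate translation, so that a later step can test all English word pairs.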
Step 106: English word semantic relation recognition.
English knowledge resources are used to recognize the semantic relation of the English word pairs obtained in step 5, thereby determining the semantic relation of the original Chinese word pair, specifically:
Step 6-1) For each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the antonym set ENASET_A of EN_A from the English knowledge resource. If EN_B ∈ ENASET_A, then EN_A and EN_B have an antonymy relation, that is, the original Chinese word pair has an antonymy relation; otherwise go to step 6-2);
Step 6-2) For each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the part-word sets ENMSET_A and ENMSET_B of EN_A and EN_B respectively from the English knowledge resource. If EN_B ∈ ENMSET_A or EN_A ∈ ENMSET_B, then EN_A and EN_B have a whole-part relation, that is, the original Chinese word pair has a whole-part relation; otherwise go to step 6-3);
Step 6-3) For each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the synonym set ENSSET_A of EN_A from the English knowledge resource. If EN_B ∈ ENSSET_A, then EN_A and EN_B have a synonymy relation, that is, the original Chinese word pair has a synonymy relation; otherwise go to step 6-4);
Step 6-4) For each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the hyponym sets ENHSET_A and ENHSET_B of EN_A and EN_B respectively from the English knowledge resource. If EN_B ∈ ENHSET_A or EN_A ∈ ENHSET_B, then EN_A and EN_B have a hyponymy relation, that is, the original Chinese word pair has a hyponymy relation.
Similarly, the semantic relation of the words "people" and "head" can be recognized. To illustrate the generalization operation specifically, the description moves directly to step 2-2):
In HowNet, extracting the sememe definition sets of word A "people" and word B "head" gives DEFSET_A = {"{Behavior|manner:host={human|people}}", "{Physique|physique:host={AnimalHuman|animal}}", "{Strength|strength:host={community|group}}", "{human|people}", "{human|people:PersonPro={3rdPerson|he}}", "{human|people:PersonPro={3rdPerson|he},quantity={mass|many}}", "{human|people:modifier={adult|adult}}", "{human|people:quantity={mass|many}}", "{human|people:{engage|engage in:agent={~},content={fact|thing:modifier={specific|specific}}}}"} and DEFSET_B = {"{part|part:PartPosition={head|head},whole={AnimalHuman|animal}}"}. Clearly there are no DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains a "whole" attribute whose value is DEF_B, or DEF_B contains a "whole" attribute whose value is DEF_A, so the generalization operation is applied. DEF_A = "{human|people}" is generalized to the sememe definition of its hypernym concept, "{AnimalHuman|animal}". Now there exists DEF_B ∈ DEFSET_B that contains a "whole" attribute whose value is "{AnimalHuman|animal}", so the words "people" and "head" have a whole-part relation.
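The generalization in this example can be sketched as follows. The sketch makes several simplifying assumptions that are not part of the source: definitions are plain strings in the brace notation used above, the "whole" value is compared with the main sememe of the other word's definitions, and the hypernym lookup is a hand-written dictionary (`HYPERNYM_OF`) rather than the HowNet taxonomy. All function names are hypothetical.

```python
import re

MAIN = re.compile(r"^\{([^{}:,]+)")          # main sememe at the start of a definition
WHOLE = re.compile(r"whole=\{([^{}:,]+)")    # value of the "whole" attribute


def main_sememe(definition):
    m = MAIN.match(definition)
    return m.group(1) if m else None


def whole_value(definition):
    m = WHOLE.search(definition)
    return m.group(1) if m else None


def generalizations(sememe, hypernym_of):
    """Yield the sememe itself, then each successive hypernym (generalization)."""
    seen = set()
    while sememe is not None and sememe not in seen:
        seen.add(sememe)
        yield sememe
        sememe = hypernym_of.get(sememe)


def is_whole_part(defset_a, defset_b, hypernym_of):
    """True if some definition's "whole" value matches the other word, possibly generalized."""
    for whole_defs, part_defs in ((defset_a, defset_b), (defset_b, defset_a)):
        wholes = {g for d in whole_defs
                  for g in generalizations(main_sememe(d), hypernym_of)}
        if any(whole_value(d) in wholes for d in part_defs):
            return True
    return False


DEFSET_A = ["{human|people}", "{human|people:modifier={adult|adult}}"]
DEFSET_B = ["{part|part:PartPosition={head|head},whole={AnimalHuman|animal}}"]
HYPERNYM_OF = {"human|people": "AnimalHuman|animal"}  # hypothetical taxonomy excerpt

print(is_whole_part(DEFSET_A, DEFSET_B, {}))
print(is_whole_part(DEFSET_A, DEFSET_B, HYPERNYM_OF))
```

With an empty hypernym table the pair is not recognized, which mirrors the direct comparison failing; with the hypernym entry the pair is recognized, which mirrors the generalized comparison succeeding.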
For word pairs that the Chinese knowledge resources cannot recognize, steps 5-1) to 6-4) complete the semantic relation recognition. This is illustrated below with word A "seasoning" and word B "vinegar":
According to step 5-1), the Iciba Chinese-English dictionary and HowNet are used to translate word A "seasoning" and word B "vinegar" into the corresponding English sets ENSET(A) = {"accessory food", "condiments", "seasoning"} and ENSET(B) = {"vinegar", "jealousy"}.
According to step 6-1), the antonym sets of EN_A "accessory food", "condiments" and "seasoning" are extracted from the English knowledge resource BabelNet. No word EN_B ∈ ENSET(B) belongs to any of these sets, so go to step 6-2).
According to step 6-2), the part-word sets of EN_A "accessory food", "condiments" and "seasoning" and of EN_B "vinegar" and "jealousy" are extracted from BabelNet. No EN_B belongs to the part-word set of any EN_A, and no EN_A belongs to the part-word set of any EN_B, so go to step 6-3).
According to step 6-3), the synonym sets of EN_A "accessory food" and "condiments" are extracted from BabelNet, and the synonym set of EN_A "seasoning" is ENSSET_A = {"flavorer", "flavourer", "flavoring", "flavouring", "seasoner", "seasoning"}. No word EN_B belongs to any of these sets, so go to step 6-4).
According to step 6-4), the hyponym set of EN_A "accessory food" is extracted from BabelNet, and the hyponym set of EN_A "condiments" is ENHSET_A = {"relish", "dip", "mustard", "table mustard", "catsup", "ketchup", "cetchup", "tomato ketchup", "chili sauce", "chutney", "Indian relish", "steak sauce", "taco sauce", "salsa", "mint sauce", "cranberry sauce", "duck sauce", "hoisin sauce", "horseradish", "marinade", "soy sauce", "soy", "vinegar", "acetum", "sauce", "spread", "paste", "wasabi"}. Now EN_B "vinegar" ∈ ENHSET_A, so the English words EN_A and EN_B have a hyponymy relation, that is, the original Chinese word pair has a hyponymy relation.
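The cascade of steps 6-1) to 6-4), applied to the seasoning/vinegar pair above, might be sketched as below. The `RESOURCE` dictionary is a hand-made stand-in for lookups in an English knowledge resource and contains only excerpts of the sets quoted in the example; it does not use any real BabelNet API, and the names `CASCADE` and `recognize_english` are hypothetical.

```python
from itertools import product

# (relation name, lookup table key, check both directions?) in the order of steps 6-1) to 6-4).
CASCADE = [
    ("antonymy", "antonyms", False),
    ("whole-part", "parts", True),
    ("synonymy", "synonyms", False),
    ("hyponymy", "hyponyms", True),
]


def recognize_english(enset_a, enset_b, resource):
    """Return the first relation found for any pair (EN_A, EN_B), or None."""
    for relation, key, both_directions in CASCADE:
        table = resource.get(key, {})
        for en_a, en_b in product(sorted(enset_a), sorted(enset_b)):
            if en_b in table.get(en_a, set()):
                return relation
            if both_directions and en_a in table.get(en_b, set()):
                return relation
    return None


# Hypothetical excerpts of the sets quoted in the example.
RESOURCE = {
    "synonyms": {"seasoning": {"flavorer", "flavoring", "seasoner", "seasoning"}},
    "hyponyms": {"condiments": {"relish", "mustard", "ketchup", "soy sauce", "vinegar", "wasabi"}},
}
ENSET_A = {"accessory food", "condiments", "seasoning"}
ENSET_B = {"vinegar", "jealousy"}

print(recognize_english(ENSET_A, ENSET_B, RESOURCE))
```

Each relation type is tested over all word pairs before the next type is tried, which follows the order in which the worked example moves from step 6-1) to step 6-4).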
Through the above steps, the semantic relation recognition of a given word pair can be completed.
Correspondingly, an embodiment of the present invention also provides a Chinese word semantic relation recognition device combining Chinese and English knowledge resources, whose structural diagram is shown in Fig. 2.
In this embodiment, the device includes:
Antonymy recognition unit 201, configured to obtain antonym sets using multiple Chinese knowledge resources and to judge from the antonym sets whether two words have an antonymy relation;
Whole-part relation recognition unit 202, configured to extract part-word sets using multiple Chinese knowledge resources and to judge from the part-word sets whether two words have a whole-part relation;
Synonymy recognition unit 203, configured to extract synonym sets using multiple Chinese knowledge resources and to judge from the synonym sets whether two words have a synonymy relation;
Hyponymy recognition unit 204, configured to extract hyponym sets using multiple Chinese knowledge resources and to judge from the hyponym sets whether two words have a hyponymy relation;
Chinese-English translation unit 205, configured to translate the Chinese word pair into English using Chinese-English dictionaries;
English word semantic relation recognition unit 206, configured to use English knowledge resources to recognize the semantic relation of the English word pairs produced by the Chinese-English translation unit, thereby determining the semantic relation of the original Chinese word pair.
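One way to picture how units 201 to 206 cooperate is as a chain of callables that are tried in order, with the translation and English units acting as a fallback. The following Python sketch is an assumption about how such a chain could be wired; the stub units, `make_unit` and `recognize` are hypothetical names and do not consult any real knowledge resource.

```python
def recognize(a, b, chinese_units, translate, english_unit):
    """Try each Chinese unit in order; fall back to translation plus the English unit."""
    for unit in chinese_units:
        relation = unit(a, b)
        if relation is not None:
            return relation
    return english_unit(translate(a), translate(b))


def make_unit(relation, pairs):
    """Build a stub unit that reports `relation` for the listed word pairs."""
    def unit(a, b):
        return relation if (a, b) in pairs or (b, a) in pairs else None
    return unit


# Stub units in the order 201, 202, 203, 204 (toy data only).
CHINESE_UNITS = [
    make_unit("antonymy", set()),
    make_unit("whole-part", {("people", "head")}),
    make_unit("synonymy", set()),
    make_unit("hyponymy", {("motor vehicle", "truck")}),
]


def translate(word):
    # Stub for unit 205: returns the word itself as its only translation.
    return {word}


def english_unit(enset_a, enset_b):
    # Stub for unit 206: knows a single hyponymy pair.
    return "hyponymy" if "seasoning" in enset_a and "vinegar" in enset_b else None


print(recognize("motor vehicle", "truck", CHINESE_UNITS, translate, english_unit))
print(recognize("seasoning", "vinegar", CHINESE_UNITS, translate, english_unit))
```

Keeping every unit behind the same two-argument interface means a resource can be added or replaced without touching the control flow.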
The structural diagram of the antonymy recognition unit 201 of the device shown in Fig. 2 is shown in Fig. 3. It includes:
HowNet antonymy recognition unit 301, configured to use the antonymy relations explicitly defined in HowNet to extract the antonym set ASET_A of word A for the given words A and B; if B ∈ ASET_A, the two words have an antonymy relation, otherwise control passes to the Baidu Hanyu antonymy recognition unit; in addition, the converse relations defined in HowNet are also treated as a kind of antonymy;
Baidu Hanyu antonymy recognition unit 302, configured to extract the antonym set ASET_A of the given word A using Baidu Hanyu, to extract the synonym set SSET_A of word A using the extended edition of the Harbin Institute of Technology Tongyici Cilin thesaurus, and, for each word W ∈ SSET_A, to extract its antonyms and merge them into ASET_A; if B ∈ ASET_A, words A and B have an antonymy relation, otherwise control passes to the Baidu Baike antonymy recognition unit;
Baidu Baike antonymy recognition unit 303, configured to extract the antonym set ASET_A of word A using Baidu Baike; if B ∈ ASET_A, the two words have an antonymy relation, otherwise control passes to the whole-part relation recognition unit.
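The three sub-units above can be sketched as a short fallback chain. In this sketch each resource is a plain dictionary from a word to a set of words, and the antonyms of each synonym W are assumed to come from the same resource as the antonyms of A; both points are assumptions, as are the function names and the English toy data.

```python
def expanded_antonyms(word, antonyms, synonyms):
    """ASET_A merged with the antonyms of every synonym W of the word."""
    aset = set(antonyms.get(word, ()))
    for w in synonyms.get(word, ()):
        aset |= set(antonyms.get(w, ()))
    return aset


def is_antonym(a, b, hownet, hanyu, cilin, baike):
    """Check the three resources in the order of units 301, 302 and 303."""
    if b in hownet.get(a, ()):
        return True
    if b in expanded_antonyms(a, hanyu, cilin):
        return True
    return b in baike.get(a, ())


# Hypothetical toy data: "happy" has no direct antonym entry, but its synonym "glad" does.
HANYU = {"glad": {"sad"}}
CILIN = {"happy": {"glad", "cheerful"}}

print(is_antonym("happy", "sad", {}, HANYU, CILIN, {}))
print(is_antonym("happy", "tall", {}, HANYU, CILIN, {}))
```

The synonym expansion is what lets the second resource find pairs that none of the resources lists directly.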
The structural diagram of the whole-part relation recognition unit 202 of the device shown in Fig. 2 is shown in Fig. 4. It includes:
HowNet whole-part relation recognition unit 401, configured to extract the part-word sets MSET_A and MSET_B of words A and B respectively using HowNet; if B ∈ MSET_A or A ∈ MSET_B, the two words have a whole-part relation, otherwise control passes to the sememe-definition whole-part relation recognition unit;
Sememe-definition whole-part relation recognition unit 402, configured to work with HowNet sememe definitions: a word whose definition contains the sememe "part|part" is a part of some other word, and the value of the "whole" attribute in the definition gives the sememe definition of its whole. Accordingly, the sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains a "whole" attribute whose value is DEF_B, or DEF_B contains a "whole" attribute whose value is DEF_A, then words A and B have a whole-part relation, otherwise control passes to the synonymy recognition unit;
In addition, in the sememe-definition whole-part relation recognition unit, the whole-part relation of some word pairs cannot be recognized effectively from the sememe definitions directly. Such pairs can be handled by generalization: the value of the "whole" attribute described above is generalized to its hypernym concept, and the remaining operations are unchanged.
The structural diagram of the synonymy recognition unit 203 of the device shown in Fig. 2 is shown in Fig. 5. It includes:
Cilin synonymy recognition unit 501, configured to obtain the synonym set SSET_A of word A from the lines marked "=" in the extended edition of the Harbin Institute of Technology Tongyici Cilin thesaurus, which denote synonyms; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the HowNet synonymy recognition unit;
HowNet synonymy recognition unit 502, configured to extract the synonym set SSET_A of word A using HowNet; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the Baidu Hanyu synonymy recognition unit;
Baidu Hanyu synonymy recognition unit 503, configured to extract the synonym set SSET_A of word A using Baidu Hanyu; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the Baidu Baike synonymy recognition unit;
Baidu Baike synonymy recognition unit 504, configured to obtain, from the page links of Baidu Baike, the encyclopedia link page sets PSET_A and PSET_B of words A and B respectively; if PSET_A ∩ PSET_B ≠ ∅, words A and B have a synonymy relation, otherwise control passes to the hyponymy recognition unit.
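The synonymy sub-units can be sketched as a chain of set lookups followed by a link-page comparison. The sketch assumes that the Baidu Baike test is a non-empty intersection of the two link page sets; the function name, the dictionary-based resources and the `example.org` URLs are all placeholders introduced here.

```python
def is_synonym(a, b, synonym_sources, link_pages):
    """Check each synonym resource in order, then compare encyclopedia link page sets."""
    for source in synonym_sources:       # e.g. Cilin, HowNet, Baidu Hanyu, in that order
        if b in source.get(a, ()):
            return True
    pset_a = set(link_pages.get(a, ()))
    pset_b = set(link_pages.get(b, ()))
    return bool(pset_a & pset_b)         # assumed condition: the sets share a page


# Hypothetical toy data with placeholder URLs.
SOURCES = [{"car": {"automobile"}}, {}, {}]
LINK_PAGES = {
    "potato": {"https://example.org/page/1"},
    "spud": {"https://example.org/page/1", "https://example.org/page/2"},
    "truck": {"https://example.org/page/3"},
}

print(is_synonym("car", "automobile", SOURCES, LINK_PAGES))
print(is_synonym("potato", "spud", SOURCES, LINK_PAGES))
print(is_synonym("potato", "truck", SOURCES, LINK_PAGES))
```

Words whose encyclopedia entries resolve to a common page are treated as synonyms even when no thesaurus lists them together.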
The structural diagram of the hyponymy recognition unit 204 of the device shown in Fig. 2 is shown in Fig. 6. It includes:
HowNet hyponymy recognition unit 601, configured to extract the hyponym sets HSET_A and HSET_B of words A and B respectively using HowNet; if B ∈ HSET_A or A ∈ HSET_B, words A and B have a hyponymy relation, otherwise control passes to the sememe-definition hyponymy recognition unit;
Sememe-definition hyponymy recognition unit 602, configured to extract, based on the hyponymy relations implied by HowNet sememe definitions, the sememe definition sets DEFSET_A and DEFSET_B of words A and B respectively; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and that satisfy […] or […], then words A and B have a hyponymy relation.
The structural diagram of the Chinese-English translation unit 205 of the device shown in Fig. 2 is shown in Fig. 7. It includes:
Chinese-English translation unit 701, configured to translate words A and B, using Chinese-English dictionaries, into the corresponding English word sets ENSET(A) and ENSET(B).
The structural diagram of the English word semantic relation recognition unit 206 of the device shown in Fig. 2 is shown in Fig. 8. It includes:
English antonymy recognition unit 801, configured, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), to extract the antonym set ENASET_A of EN_A from the English knowledge resource; if EN_B ∈ ENASET_A, then EN_A and EN_B have an antonymy relation, that is, the original Chinese word pair has an antonymy relation, otherwise control passes to the English whole-part relation recognition unit;
English whole-part relation recognition unit 802, configured, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), to extract the part-word sets ENMSET_A and ENMSET_B of EN_A and EN_B respectively from the English knowledge resource; if EN_B ∈ ENMSET_A or EN_A ∈ ENMSET_B, then EN_A and EN_B have a whole-part relation, that is, the original Chinese word pair has a whole-part relation, otherwise control passes to the English synonymy recognition unit;
English synonymy recognition unit 803, configured, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), to extract the synonym set ENSSET_A of EN_A from the English knowledge resource; if EN_B ∈ ENSSET_A, then EN_A and EN_B have a synonymy relation, that is, the original Chinese word pair has a synonymy relation, otherwise control passes to the English hyponymy recognition unit;
English hyponymy recognition unit 804, configured, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), to extract the hyponym sets ENHSET_A and ENHSET_B of EN_A and EN_B respectively from the English knowledge resource; if EN_B ∈ ENHSET_A or EN_A ∈ ENHSET_B, then EN_A and EN_B have a hyponymy relation, that is, the original Chinese word pair has a hyponymy relation.
The Chinese word semantic relation recognition device combining Chinese and English knowledge resources shown in Figs. 2 to 8 can be integrated into various hardware entities, for example personal computers, smartphones, workstations and similar equipment.
The Chinese word semantic relation recognition method combining Chinese and English knowledge resources proposed by the embodiments of the present invention can be stored on various storage media in the form of instructions or instruction sets. These storage media include, but are not limited to, floppy disks, optical discs, hard disks, memory, USB flash drives, CF cards and SM cards.
In summary, in the embodiments of the present invention, antonym sets are obtained by combining multiple Chinese knowledge resources, and whether two words have an antonymy relation is judged from the antonym sets; part-word sets are extracted using multiple Chinese knowledge resources, and whether two words have a whole-part relation is judged from the part-word sets; synonym sets are extracted using multiple Chinese knowledge resources, and whether two words have a synonymy relation is judged from the synonym sets; hyponym sets are extracted using multiple Chinese knowledge resources, and whether two words have a hyponymy relation is judged from the hyponym sets; the Chinese word pair is translated into English using Chinese-English dictionaries; and English knowledge resources are used to recognize the semantic relation of the translated English word pairs, thereby determining the semantic relation of the original Chinese word pair. The embodiments thus realize Chinese word semantic relation recognition that combines Chinese and English knowledge resources. The embodiments use several different Chinese knowledge resources for semantic relation recognition and so make full use of each resource. In whole-part recognition, the invention supplements the HowNet sememe definitions with a generalization method, which makes the recognition method more adaptable. In hyponymy recognition, the invention exploits the information implied by the HowNet sememe definitions, which improves recognition accuracy. By combining Chinese and English knowledge resources, the invention uses English knowledge resources to supply word semantic relations that the Chinese knowledge resources do not cover, which improves the recognition rate. The proposed method and device can automatically recognize the semantic relation of a given word pair, including antonymy, whole-part, synonymy and hyponymy relations, with a high recognition accuracy.
The embodiments in this specification are described progressively; for identical or similar parts, the embodiments may be referred to one another. Because the device embodiment is substantially similar to the method embodiment, it is described relatively briefly; for related details, see the corresponding parts of the method embodiment.
The embodiments of the present invention have been described in detail above, and specific examples have been used to explain the invention. The description of the above embodiments is only intended to help in understanding the method and device of the invention. For a person of ordinary skill in the art, the specific implementation and scope of application may vary according to the idea of the invention. Therefore, the content of this specification should not be construed as limiting the invention.

Claims (7)

  1. A Chinese word semantic relation recognition method combining Chinese and English knowledge resources, characterized in that the method comprises the following steps:
    Step 1: obtain antonym sets by combining multiple Chinese knowledge resources, and judge from the antonym sets whether two words have an antonymy relation;
    Step 2: extract part-word sets using multiple Chinese knowledge resources, and judge from the part-word sets whether two words have a whole-part relation;
    Step 3: extract synonym sets using multiple Chinese knowledge resources, and judge from the synonym sets whether two words have a synonymy relation;
    Step 4: extract hyponym sets using multiple Chinese knowledge resources, and judge from the hyponym sets whether two words have a hyponymy relation;
    Step 5: translate the Chinese word pair into English using Chinese-English dictionaries;
    Step 6: use English knowledge resources to recognize the semantic relation of the English word pairs obtained in step 5, thereby determining the semantic relation of the original Chinese word pair.
  2. The Chinese word semantic relation recognition method combining Chinese and English knowledge resources according to claim 1, characterized in that, in step 1, the antonymy relation is judged as follows:
    Step 1-1) Use the antonymy relations explicitly defined in HowNet to extract the antonym set ASET_A of word A for the given words A and B; if B ∈ ASET_A, the two words have an antonymy relation, otherwise go to step 1-2); in addition, the converse relations defined in HowNet are also treated as a kind of antonymy;
    Step 1-2) Use Baidu Hanyu to extract the antonym set ASET_A of the given word A, use the extended edition of the Harbin Institute of Technology Tongyici Cilin thesaurus to extract the synonym set SSET_A of word A, and, for each word W ∈ SSET_A, extract its antonyms and merge them into ASET_A; if B ∈ ASET_A, words A and B have an antonymy relation, otherwise go to step 1-3);
    Step 1-3) Use Baidu Baike to extract the antonym set ASET_A of word A; if B ∈ ASET_A, the two words have an antonymy relation, otherwise go to step 2-1).
  3. The Chinese word semantic relation recognition method combining Chinese and English knowledge resources according to claim 1, characterized in that, in step 2, the whole-part relation is judged as follows:
    Step 2-1) Use HowNet to extract the part-word sets MSET_A and MSET_B of words A and B respectively; if B ∈ MSET_A or A ∈ MSET_B, the two words have a whole-part relation, otherwise go to step 2-2);
    Step 2-2) Work with HowNet sememe definitions: a word whose definition contains the sememe "part|part" is a part of some other word, and the value of the "whole" attribute in the definition gives the sememe definition of its whole; accordingly, extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains a "whole" attribute whose value is DEF_B, or DEF_B contains a "whole" attribute whose value is DEF_A, then words A and B have a whole-part relation, otherwise go to step 3-1);
    In addition, for some word pairs the whole-part relation cannot be recognized effectively from the sememe definitions directly; such pairs can be handled by generalization, in which the value of the "whole" attribute described above is generalized to its hypernym concept and the remaining operations are unchanged.
  4. The Chinese word semantic relation recognition method combining Chinese and English knowledge resources according to claim 1, characterized in that, in step 3, the synonymy relation is judged as follows:
    Step 3-1) Obtain the synonym set SSET_A of word A from the lines marked "=" in the extended edition of the Harbin Institute of Technology Tongyici Cilin thesaurus, which denote synonyms; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise go to step 3-2);
    Step 3-2) Use HowNet to extract the synonym set SSET_A of word A; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise go to step 3-3);
    Step 3-3) Use Baidu Hanyu to extract the synonym set SSET_A of word A; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise go to step 3-4);
    Step 3-4) From the page links of Baidu Baike, obtain the encyclopedia link page sets PSET_A and PSET_B of words A and B respectively; if PSET_A ∩ PSET_B ≠ ∅, words A and B have a synonymy relation, otherwise go to step 4-1).
  5. The Chinese word semantic relation recognition method combining Chinese and English knowledge resources according to claim 1, characterized in that, in step 4, the hyponymy relation is judged as follows:
    Step 4-1) Use HowNet to extract the hyponym sets HSET_A and HSET_B of words A and B respectively; if B ∈ HSET_A or A ∈ HSET_B, words A and B have a hyponymy relation, otherwise go to step 4-2);
    Step 4-2) Based on the hyponymy relations implied by HowNet sememe definitions, extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B respectively; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and that satisfy […] or […], then words A and B have a hyponymy relation.
  6. A Chinese word semantic relation recognition device combining Chinese and English knowledge resources, characterized in that the device includes an antonymy recognition unit, a whole-part relation recognition unit, a synonymy recognition unit and a hyponymy recognition unit, wherein:
    the antonymy recognition unit is configured to obtain antonym sets using multiple Chinese knowledge resources and to judge from the antonym sets whether two words have an antonymy relation;
    the whole-part relation recognition unit is configured to extract part-word sets using multiple Chinese knowledge resources and to judge from the part-word sets whether two words have a whole-part relation;
    the synonymy recognition unit is configured to extract synonym sets using multiple Chinese knowledge resources and to judge from the synonym sets whether two words have a synonymy relation;
    the hyponymy recognition unit is configured to extract hyponym sets using multiple Chinese knowledge resources and to judge from the hyponym sets whether two words have a hyponymy relation;
    a Chinese-English translation unit is configured to translate the Chinese word pair into English using Chinese-English dictionaries;
    an English word semantic relation recognition unit is configured to use English knowledge resources to recognize the semantic relation of the English word pairs produced by the Chinese-English translation unit, thereby determining the semantic relation of the original Chinese word pair.
  7. The Chinese word semantic relation recognition device combining Chinese and English knowledge resources according to claim 6, characterized in that the device includes:
    a HowNet antonymy recognition unit, configured to use the antonymy relations explicitly defined in HowNet to extract the antonym set ASET_A of word A for the given words A and B; if B ∈ ASET_A, the two words have an antonymy relation, otherwise control passes to the Baidu Hanyu antonymy recognition unit; in addition, the converse relations defined in HowNet are also treated as a kind of antonymy;
    a Baidu Hanyu antonymy recognition unit, configured to extract the antonym set ASET_A of the given word A using Baidu Hanyu, to extract the synonym set SSET_A of word A using the extended edition of the Harbin Institute of Technology Tongyici Cilin thesaurus, and, for each word W ∈ SSET_A, to extract its antonyms and merge them into ASET_A; if B ∈ ASET_A, words A and B have an antonymy relation, otherwise control passes to the Baidu Baike antonymy recognition unit;
    a Baidu Baike antonymy recognition unit, configured to extract the antonym set ASET_A of word A using Baidu Baike; if B ∈ ASET_A, the two words have an antonymy relation, otherwise control passes to the whole-part relation recognition unit;
    a HowNet whole-part relation recognition unit, configured to extract the part-word sets MSET_A and MSET_B of words A and B respectively using HowNet; if B ∈ MSET_A or A ∈ MSET_B, the two words have a whole-part relation, otherwise control passes to the sememe-definition whole-part relation recognition unit;
    a sememe-definition whole-part relation recognition unit, configured to work with HowNet sememe definitions: a word whose definition contains the sememe "part|part" is a part of some other word, and the value of the "whole" attribute in the definition gives the sememe definition of its whole; accordingly, the sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains a "whole" attribute whose value is DEF_B, or DEF_B contains a "whole" attribute whose value is DEF_A, then words A and B have a whole-part relation, otherwise control passes to the synonymy recognition unit;
    in addition, in the sememe-definition whole-part relation recognition unit, for some word pairs the whole-part relation cannot be recognized effectively from the sememe definitions directly; such pairs can be handled by generalization, in which the value of the "whole" attribute described above is generalized to its hypernym concept and the remaining operations are unchanged;
    a Cilin synonymy recognition unit, configured to obtain the synonym set SSET_A of word A from the lines marked "=" in the extended edition of the Harbin Institute of Technology Tongyici Cilin thesaurus, which denote synonyms; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the HowNet synonymy recognition unit;
    a HowNet synonymy recognition unit, configured to extract the synonym set SSET_A of word A using HowNet; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the Baidu Hanyu synonymy recognition unit;
    a Baidu Hanyu synonymy recognition unit, configured to extract the synonym set SSET_A of word A using Baidu Hanyu; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the Baidu Baike synonymy recognition unit;
    a Baidu Baike synonymy recognition unit, configured to obtain, from the page links of Baidu Baike, the encyclopedia link page sets PSET_A and PSET_B of words A and B respectively; if PSET_A ∩ PSET_B ≠ ∅, words A and B have a synonymy relation, otherwise control passes to the hyponymy recognition unit;
    a HowNet hyponymy recognition unit, configured to extract the hyponym sets HSET_A and HSET_B of words A and B respectively using HowNet; if B ∈ HSET_A or A ∈ HSET_B, words A and B have a hyponymy relation, otherwise control passes to the sememe-definition hyponymy recognition unit;
    a sememe-definition hyponymy recognition unit, configured to extract, based on the hyponymy relations implied by HowNet sememe definitions, the sememe definition sets DEFSET_A and DEFSET_B of words A and B respectively; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and that satisfy […] or […], then words A and B have a hyponymy relation;
    a Chinese-English translation unit, configured to translate words A and B, using Chinese-English dictionaries, into the corresponding English word sets ENSET(A) and ENSET(B);
    an English antonymy recognition unit, configured, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), to extract the antonym set ENASET_A of EN_A from the English knowledge resource; if EN_B ∈ ENASET_A, then EN_A and EN_B have an antonymy relation, that is, the original Chinese word pair has an antonymy relation, otherwise control passes to the English whole-part relation recognition unit;
    an English whole-part relation recognition unit, configured, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), to extract the part-word sets ENMSET_A and ENMSET_B of EN_A and EN_B respectively from the English knowledge resource; if EN_B ∈ ENMSET_A or EN_A ∈ ENMSET_B, then EN_A and EN_B have a whole-part relation, that is, the original Chinese word pair has a whole-part relation, otherwise control passes to the English synonymy recognition unit;
    an English synonymy recognition unit, configured, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), to extract the synonym set ENSSET_A of EN_A from the English knowledge resource; if EN_B ∈ ENSSET_A, then EN_A and EN_B have a synonymy relation, that is, the original Chinese word pair has a synonymy relation, otherwise control passes to the English hyponymy recognition unit;
    an English hyponymy recognition unit, configured, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), to extract the hyponym sets ENHSET_A and ENHSET_B of EN_A and EN_B respectively from the English knowledge resource; if EN_B ∈ ENHSET_A or EN_A ∈ ENHSET_B, then EN_A and EN_B have a hyponymy relation, that is, the original Chinese word pair has a hyponymy relation.
CN201710706832.9A 2017-08-17 2017-08-17 Chinese word semantic relation recognition method and device combining Chinese and English knowledge resources Active CN107451130B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710706832.9A CN107451130B (en) 2017-08-17 2017-08-17 Chinese word semantic relation recognition method and device combining Chinese and English knowledge resources

Publications (2)

Publication Number Publication Date
CN107451130A (en) 2017-12-08
CN107451130B (en) 2021-04-02

Family

ID=60492720

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710706832.9A Active CN107451130B (en) 2017-08-17 2017-08-17 Chinese word semantic relation recognition method and device combining Chinese and English knowledge resources

Country Status (1)

Country Link
CN (1) CN107451130B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109902673A (en) * 2019-01-28 2019-06-18 北京明略软件系统有限公司 Table Header information identification and method for sorting, system, terminal and storage medium in table

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2280159A1 (en) * 2009-07-31 2011-02-02 International Engine Intellectual Property Company, LLC. Exhaust gas cooler
CN103473222A (en) * 2013-09-16 2013-12-25 中央民族大学 Semantic ontology creation and vocabulary expansion method for Tibetan language
CN104484411A (en) * 2014-12-16 2015-04-01 中国科学院自动化研究所 Building method for semantic knowledge base based on a dictionary
CN106202034A (en) * 2016-06-29 2016-12-07 齐鲁工业大学 A kind of adjective word sense disambiguation method based on interdependent constraint and knowledge and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
IRIS HENDRICKX et al.: "SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals", 《SEW '09: PROCEEDINGS OF THE WORKSHOP ON SEMANTIC EVALUATIONS: RECENT ACHIEVEMENTS AND FUTURE DIRECTIONS》 *
SHUTIAN MA et al.: "NLPCC 2016 Shared Task Chinese Words Similarity Measure via Ensemble Learning Based on Multiple Resources", 《NLPCC 2016: NATURAL LANGUAGE UNDERSTANDING AND INTELLIGENT APPLICATIONS》 *
范庆虎 (Fan Qinghu) et al.: "基于词典和Web的词汇关系抽取" (Lexical Relation Extraction Based on Dictionaries and the Web), 《HTTP://WWW.DOC88.COM/P-1146077617476.HTML》 *

Also Published As

Publication number Publication date
CN107451130B (en) 2021-04-02

Similar Documents

Publication Publication Date Title
Amsler The structure of the Merriam-Webster pocket dictionary
CN104516947B (en) A kind of Chinese microblog emotional analysis method for merging dominant and recessive character
CN105988990A (en) Device and method for resolving zero anaphora in Chinese language, as well as training method
CN105787461A (en) Text-classification-and-condition-random-field-based adverse reaction entity identification method in traditional Chinese medicine literature
CN107451130A (en) Chinese word semantic relation recognition method and device combining Chinese and English knowledge resources
CN111488427A (en) Vehicle interaction method, vehicle interaction system, computing device and storage medium
CN113723528A (en) Vehicle-mounted voice-video fusion multi-mode interaction method, system, device and storage medium
CN111784526A (en) Personalized recommendation method for personal accident risk
CN107451123A (en) Chinese word semantic relation recognition method and device based on multiple Chinese knowledge resources
CN108563647A (en) Automobile sales forecasting method based on comment sentiment analysis
CN114419589A (en) Road target detection method based on attention feature enhancement module
Coxon et al. Urban mobility design
Sytsma Ordinary meaning and consilience of evidence
CN102929863A (en) Method for intelligently analyzing Chinese character emotional tendency through computer
CN109670480A (en) Image discriminating method, device, equipment and storage medium
van Dulken Do you know English? The challenge of the English language for patent searchers
CN202911888U (en) Folding luggage box type bicycle
Williams Motoring: Swanky, safe, and more affordable
CN105955993B (en) Search result ordering method and device
CN109241013A (en) A kind of method of book content audit in shared book system
Lee Random Forest with Transfer Learning: An Application to Vehicle Valuation
Belletti Felicio et al. Classification of Motorcycles using Extracted Images of Traffic Monitoring Videos
Cialdai et al. Motorcycle-to-car impact: influence of the mass of the rider in the calculation of the relative impact velocity
JP2002117401A (en) Adult and sex image detection system
Vadivel et al. FINE-GRAINED MULTI-CLASS ROAD SEGMENTATION USING MULTISCALE PROBABILITY LEARNING

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant