CN107451130A - Method and device for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources - Google Patents
- Publication number
- CN107451130A (application number CN201710706832.9A)
- Authority
- CN
- China
- Prior art keywords
- word
- chinese
- english
- words
- recognition unit
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/55—Rule-based translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Abstract
The invention discloses a method and device for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources. The method includes: obtaining antonym sets from multiple Chinese knowledge resources and judging from them whether two words are antonyms; extracting part-word sets from multiple Chinese knowledge resources and judging from them whether the words stand in a whole-part relation; extracting synonym sets from multiple Chinese knowledge resources and judging from them whether the words are synonyms; extracting hyponym sets from multiple Chinese knowledge resources and judging from them whether the words stand in a hypernym-hyponym relation; translating the Chinese word pair into English with a Chinese-English dictionary; and applying English knowledge resources to the translated English word pairs to determine the semantic relation of the original Chinese word pair. The invention makes full use of multiple Chinese knowledge resources and recognizes semantic relations between Chinese words more accurately and effectively.
Description
Technical field
The present invention relates to the technical field of natural language processing, and in particular to a method and device for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources.
Background
Semantic relation recognition is the task of automatically determining the semantic relation that holds between two given words. Typical semantic relations include antonymy, the whole-part relation, synonymy, and the hypernym-hyponym relation. Semantic relation recognition is a basic task in natural language processing and directly affects word sense disambiguation, ontology construction, machine translation, information retrieval, text classification, and similar applications.
Most current work on semantic relation recognition targets English. It typically relies on one or more knowledge resources and uses statistical learning methods such as support vector machines or Bayes classifiers to classify or recognize English semantic relations, with fairly good results. Work on Chinese semantic relation recognition is comparatively scarce, and most of it uses a single knowledge resource together with a statistical learning method. Such work draws on only one knowledge resource and ignores the knowledge resources available in other languages. Statistical learning methods are also inevitably limited by the size of the annotated corpus, so their accuracy is hard to guarantee. As linguistic knowledge resources of all kinds are built and refined, they complement one another and provide more reliable knowledge for semantic relation recognition.
To address these problems in Chinese word semantic relation recognition, the present invention fully mines the semantic relations inherent in multiple knowledge resources and implements a method and device for recognizing Chinese word semantic relations based on multiple Chinese knowledge resources, aiming to advance the solution of these problems to some extent.
Summary of the invention
To overcome the shortcomings of the prior art, the invention discloses a method and device for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources, so as to judge the semantic relation between Chinese words more accurately and effectively.
To this end, the invention provides the following technical solution:
A method for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources, comprising the following steps:
Step 1: obtain antonym sets from multiple Chinese knowledge resources, and judge from the antonym sets whether the two words are antonyms;
Step 2: extract part-word sets from multiple Chinese knowledge resources, and judge from the part-word sets whether the words stand in a whole-part relation;
Step 3: extract synonym sets from multiple Chinese knowledge resources, and judge from the synonym sets whether the words are synonyms;
Step 4: extract hyponym sets from multiple Chinese knowledge resources, and judge from the hyponym sets whether the words stand in a hypernym-hyponym relation;
Step 5: translate the Chinese word pair into English with a Chinese-English dictionary;
Step 6: apply English knowledge resources to the English word pairs obtained in step 5 to recognize their semantic relation, and thereby determine the semantic relation of the original Chinese word pair.
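The six steps above form a cascade: each recognizer is tried in turn, and the first one that succeeds decides the relation. The following is a minimal Python sketch of that control flow only. The names `recognize_relation`, `lookup`, and `CASCADE`, the relation labels, and the toy tables are hypothetical stand-ins and are not the actual knowledge resources.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# A recognizer answers: does this relation hold for the pair (a, b)?
Recognizer = Callable[[str, str], bool]


def recognize_relation(a: str, b: str,
                       cascade: List[Tuple[str, Recognizer]]) -> Optional[str]:
    """Try each recognizer in order and return the label of the first hit."""
    for label, holds in cascade:
        if holds(a, b):
            return label
    return None  # no resource covered this pair


def lookup(table: Dict[str, Set[str]]) -> Recognizer:
    """Build a recognizer that tests b ∈ table[a] (hypothetical helper)."""
    return lambda a, b: b in table.get(a, set())


# Toy tables standing in for sets extracted from Chinese resources.
ANTONYMS = {"hot": {"cold"}}
PARTS = {"car": {"wheel"}}
SYNONYMS = {"car": {"automobile"}}
HYPONYMS = {"motor vehicle": {"truck", "bus"}}

CASCADE = [
    ("antonymy", lookup(ANTONYMS)),          # step 1
    ("whole-part", lookup(PARTS)),           # step 2
    ("synonymy", lookup(SYNONYMS)),          # step 3
    ("hypernym-hyponym", lookup(HYPONYMS)),  # step 4
    # steps 5-6 (translation plus English resources) would be appended here
]

print(recognize_relation("motor vehicle", "truck", CASCADE))
```

Keeping the cascade as an ordered list makes the priority of the relations explicit and lets the English stage be added as one more entry.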
Further, in step 1, antonymy is judged as follows:
Step 1-1) Using the antonymy relations explicitly defined in HowNet, extract the antonym set ASET_A of word A for the given words A and B. If B ∈ ASET_A, the two words are antonyms; otherwise go to step 1-2). The converse relation defined in HowNet is also treated as a kind of antonymy.
Step 1-2) Use Baidu Hanyu to extract the antonym set ASET_A of the given word A, and use Harbin Institute of Technology's Tongyici Cilin (Extended) thesaurus to extract the synonym set SSET_A of word A. For each word W ∈ SSET_A, extract its antonyms and merge them into ASET_A. If B ∈ ASET_A, words A and B are antonyms; otherwise go to step 1-3).
Step 1-3) Use Baidu Baike to extract the antonym set ASET_A of word A. If B ∈ ASET_A, the two words are antonyms; otherwise go to step 2-1).
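Steps 1-1) to 1-3) can be sketched as three extractors consulted in order. This is only an illustration under stated assumptions: all tables hold hypothetical toy data, the function names are invented, and the sketch assumes that the antonyms of each synonym W in step 1-2) are looked up in the same resource as the antonyms of A, which the text does not state.

```python
from typing import Dict, Set

# Hypothetical toy data; real sets would be extracted from each resource.
HOWNET_ANTONYMS: Dict[str, Set[str]] = {"big": {"small"}}
HOWNET_CONVERSES: Dict[str, Set[str]] = {"buy": {"sell"}}
HANYU_ANTONYMS: Dict[str, Set[str]] = {"glad": {"sad"}}
CILIN_SYNONYMS: Dict[str, Set[str]] = {"happy": {"glad"}}
BAIKE_ANTONYMS: Dict[str, Set[str]] = {"rise": {"fall"}}


def aset_hownet(a: str) -> Set[str]:
    """Step 1-1: explicit antonyms, with converse words treated as antonyms."""
    return HOWNET_ANTONYMS.get(a, set()) | HOWNET_CONVERSES.get(a, set())


def aset_hanyu(a: str) -> Set[str]:
    """Step 1-2: antonyms of A merged with antonyms of each synonym W of A."""
    aset = set(HANYU_ANTONYMS.get(a, set()))
    for w in CILIN_SYNONYMS.get(a, set()):
        aset |= HANYU_ANTONYMS.get(w, set())  # assumed lookup resource
    return aset


def aset_baike(a: str) -> Set[str]:
    """Step 1-3: antonyms listed by the encyclopedia resource."""
    return BAIKE_ANTONYMS.get(a, set())


def is_antonym(a: str, b: str) -> bool:
    """Return True as soon as one resource yields B ∈ ASET_A."""
    extractors = (aset_hownet, aset_hanyu, aset_baike)
    # any() stops at the first hit, so later resources are consulted
    # only when the earlier ones fail, as in the step sequence.
    return any(b in extract(a) for extract in extractors)
```

For example, `is_antonym("happy", "sad")` succeeds in the second extractor because "glad" is a synonym of "happy" in the toy thesaurus table.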
Further, in step 2, the whole-part relation is judged as follows:
Step 2-1) Use HowNet to extract the part-word sets MSET_A and MSET_B of words A and B respectively. If B ∈ MSET_A or A ∈ MSET_B, the two words stand in a whole-part relation; otherwise go to step 2-2).
Step 2-2) Use the HowNet sememe definitions. A word whose definition contains the sememe "part | part" is a part word of some other word, and the value of the "whole" attribute in that definition indicates the sememe definition of its whole word. Accordingly, extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains a "whole" attribute whose value is DEF_B, or DEF_B contains a "whole" attribute whose value is DEF_A, then words A and B stand in a whole-part relation; otherwise go to step 3-1).
In addition, for some word pairs the whole-part relation cannot be recognized directly from the sememe definitions. These pairs are handled by generalization: the value of the "whole" attribute above is generalized to its hypernym concept, and the remaining operations are unchanged.
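The sememe-based test of step 2-2) and its generalization can be illustrated with a small Python sketch. Several things here are assumptions: the `Definition` encoding, the reading that a "whole" value matches the main sememe of the other word's definition, the reading that generalization also accepts any hypernym of that main sememe, and the toy taxonomy link from "human" to "AnimalHuman". The text does not spell out how generalization is carried out, so this is one possible interpretation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Definition:
    main: str                    # main sememe, e.g. "part" or "human"
    whole: Optional[str] = None  # value of the "whole" attribute, if any


# Hypothetical fragment of a sememe taxonomy: child -> parent.
HYPERNYM_OF: Dict[str, str] = {"human": "AnimalHuman"}


def ancestors(sememe: str) -> List[str]:
    """Return the sememe followed by its hypernyms, nearest first."""
    chain = [sememe]
    while chain[-1] in HYPERNYM_OF:
        chain.append(HYPERNYM_OF[chain[-1]])
    return chain


def part_of(part: Definition, whole: Definition, generalize: bool) -> bool:
    """Does `part` name `whole` (or, if generalizing, a hypernym) as its whole?"""
    if part.main != "part" or part.whole is None:
        return False
    candidates = ancestors(whole.main) if generalize else [whole.main]
    return part.whole in candidates


def whole_part(defset_a: List[Definition], defset_b: List[Definition],
               generalize: bool = False) -> bool:
    """Check every pair of definitions in both directions."""
    return any(part_of(da, db, generalize) or part_of(db, da, generalize)
               for da in defset_a for db in defset_b)


DEFSET_PEOPLE = [Definition("human")]
DEFSET_HEAD = [Definition("part", whole="AnimalHuman")]
```

With these toy definitions the direct test fails, because "AnimalHuman" is not the main sememe "human", while the generalized test succeeds through the assumed taxonomy link.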
Further, in step 3, synonymy is judged as follows:
Step 3-1) In Harbin Institute of Technology's Tongyici Cilin (Extended), rows marked with "=" list synonyms. Obtain the synonym set SSET_A of word A from these rows. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-2).
Step 3-2) Use HowNet to extract the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-3).
Step 3-3) Use Baidu Hanyu to extract the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-4).
Step 3-4) From the page links of Baidu Baike, obtain the encyclopedia link page sets PSET_A and PSET_B of words A and B respectively. If they satisfy [formula missing], words A and B are synonyms; otherwise go to step 4-1).
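Two parts of step 3 lend themselves to a short sketch: reading synonym rows from the thesaurus and comparing encyclopedia page sets. Both parts rest on assumptions. The row layout (a code ending in a marker character, followed by space-separated words) and the codes in `ROWS` are hypothetical, and because the page-set condition of step 3-4) is not reproduced in the text, the sketch assumes that sharing at least one page counts as a match.

```python
from typing import Dict, Iterable, Set


def synonym_sets_from_cilin(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each word to its synonyms, using only rows whose code ends in '='."""
    synonyms: Dict[str, Set[str]] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or not fields[0].endswith("="):
            continue  # skip malformed rows and rows not marked as synonyms
        group = set(fields[1:])
        for word in group:
            synonyms.setdefault(word, set()).update(group - {word})
    return synonyms


def share_page(pset_a: Set[str], pset_b: Set[str]) -> bool:
    """Assumed step 3-4 test: the two words link to a common page."""
    return bool(pset_a & pset_b)


ROWS = [
    "Xx01A01= tomato love-apple",  # '=' row: the words are synonyms
    "Xx01A02# apple pear",         # other marker: not treated as synonyms
]
SSET = synonym_sets_from_cilin(ROWS)
```

Here `SSET["tomato"]` contains "love-apple", while "apple" and "pear" get no entry because their row is not marked with "=".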
Further, in step 4, the hypernym-hyponym relation is judged as follows:
Step 4-1) Use HowNet to extract the hyponym sets HSET_A and HSET_B of words A and B respectively. If B ∈ HSET_A or A ∈ HSET_B, words A and B stand in a hypernym-hyponym relation; otherwise go to step 4-2).
Step 4-2) Based on the hypernym-hyponym relations implied by the HowNet sememe definitions, extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B respectively. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and which satisfy [formula missing] or [formula missing], then words A and B stand in a hypernym-hyponym relation; otherwise go to step 5-1).
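The two sub-steps of step 4 can be sketched as a set-membership test followed by a comparison of sememe definitions. The second condition of step 4-2) is not reproduced in the text, so this sketch assumes it means that one definition's features are a proper subset of the other's. The `Definition` tuple encoding and the feature strings are also hypothetical simplifications.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Simplified, hypothetical encoding: (main sememe, further features).
Definition = Tuple[str, FrozenSet[str]]


def implies_hyponymy(def_a: Definition, def_b: Definition) -> bool:
    """Assumed step 4-2 test: same main sememe, one feature set strictly
    contained in the other (the more specific word is the hyponym)."""
    main_a, feats_a = def_a
    main_b, feats_b = def_b
    return main_a == main_b and (feats_a < feats_b or feats_b < feats_a)


def is_hypernym_hyponym(a: str, b: str,
                        hset_a: Set[str], hset_b: Set[str],
                        defset_a: Iterable[Definition],
                        defset_b: Iterable[Definition]) -> bool:
    # Step 4-1: explicit hyponym sets, checked in both directions.
    if b in hset_a or a in hset_b:
        return True
    # Step 4-2: fall back to the sememe definitions.
    defset_b = list(defset_b)
    return any(implies_hyponymy(da, db) for da in defset_a for db in defset_b)


DEF_MOTOR_VEHICLE: Definition = (
    "LandVehicle", frozenset({"modifier=automatic"}))
DEF_TRUCK: Definition = (
    "LandVehicle", frozenset({"modifier=automatic", "transport"}))
```

Under the subset assumption, the two definitions above yield a hypernym-hyponym relation even when both hyponym sets are empty.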
Further, in step 5, the word pair is translated as follows:
Step 5-1) Use a Chinese-English dictionary to translate words A and B into the corresponding English word sets ENSET(A) and ENSET(B).
Further, in step 6, semantic relations are recognized with English knowledge resources as follows:
Step 6-1) For each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the antonym set ENASET_A of EN_A from the English knowledge resource. If EN_B ∈ ENASET_A, then EN_A and EN_B are antonyms, and therefore so is the original Chinese word pair; otherwise go to step 6-2).
Step 6-2) For each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the part-word sets ENMSET_A and ENMSET_B of EN_A and EN_B from the English knowledge resource. If EN_B ∈ ENMSET_A or EN_A ∈ ENMSET_B, then EN_A and EN_B stand in a whole-part relation, and therefore so does the original Chinese word pair; otherwise go to step 6-3).
Step 6-3) For each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the synonym set ENSSET_A of EN_A from the English knowledge resource. If EN_B ∈ ENSSET_A, then EN_A and EN_B are synonyms, and therefore so is the original Chinese word pair; otherwise go to step 6-4).
Step 6-4) For each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the hyponym sets ENHSET_A and ENHSET_B of EN_A and EN_B from the English knowledge resource. If EN_B ∈ ENHSET_A or EN_A ∈ ENHSET_B, then EN_A and EN_B stand in a hypernym-hyponym relation, and therefore so does the original Chinese word pair.
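Steps 5-1) and 6-1) to 6-4) amount to translating both words into sets of English candidates and then testing every candidate pair, one relation at a time. A minimal Python sketch follows, assuming toy data throughout: the dictionary `ZH_EN`, the English relation tables, and all function names are hypothetical, and no particular English knowledge resource is implied.

```python
from itertools import product
from typing import Callable, Dict, Optional, Set

Table = Dict[str, Set[str]]

# Hypothetical Chinese-English dictionary and English relation tables.
ZH_EN: Table = {"热": {"hot", "warm"}, "冷": {"cold", "chilly"}}
EN_ANTONYMS: Table = {"hot": {"cold"}}
EN_PARTS: Table = {}
EN_SYNONYMS: Table = {}
EN_HYPONYMS: Table = {}


def directed(table: Table) -> Callable[[str, str], bool]:
    """Test EN_B ∈ table[EN_A] (steps 6-1 and 6-3)."""
    return lambda x, y: y in table.get(x, set())


def either(table: Table) -> Callable[[str, str], bool]:
    """Test membership in either direction (steps 6-2 and 6-4)."""
    return lambda x, y: y in table.get(x, set()) or x in table.get(y, set())


ENGLISH_CASCADE = [
    ("antonymy", directed(EN_ANTONYMS)),
    ("whole-part", either(EN_PARTS)),
    ("synonymy", directed(EN_SYNONYMS)),
    ("hypernym-hyponym", either(EN_HYPONYMS)),
]


def recognize_via_english(zh_a: str, zh_b: str) -> Optional[str]:
    """Translate both words, then try each relation over all English pairs."""
    enset_a = ZH_EN.get(zh_a, set())  # step 5-1: ENSET(A)
    enset_b = ZH_EN.get(zh_b, set())  # step 5-1: ENSET(B)
    for label, holds in ENGLISH_CASCADE:
        for en_a, en_b in product(enset_a, enset_b):
            if holds(en_a, en_b):
                return label  # relation carried back to the Chinese pair
    return None
```

The outer loop runs over relations and the inner loop over word pairs, which mirrors the text: a relation is abandoned only after every translated pair has failed it.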
A device for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources, comprising:
An antonymy recognition unit, for obtaining antonym sets from multiple Chinese knowledge resources and judging from the antonym sets whether the two words are antonyms;
A whole-part relation recognition unit, for extracting part-word sets from multiple Chinese knowledge resources and judging from the part-word sets whether the words stand in a whole-part relation;
A synonymy recognition unit, for extracting synonym sets from multiple Chinese knowledge resources and judging from the synonym sets whether the words are synonyms;
A hypernym-hyponym relation recognition unit, for extracting hyponym sets from multiple Chinese knowledge resources and judging from the hyponym sets whether the words stand in a hypernym-hyponym relation;
A Chinese-English translation unit, for translating the Chinese word pair into English with a Chinese-English dictionary;
An English word semantic relation recognition unit, for applying English knowledge resources to the English word pairs produced by the Chinese-English translation unit to recognize their semantic relation, and thereby determine the semantic relation of the original Chinese word pair.
Further, the antonymy recognition unit comprises:
A HowNet antonymy recognition unit, for using the antonymy relations explicitly defined in HowNet to extract the antonym set ASET_A of word A for the given words A and B; if B ∈ ASET_A, the two words are antonyms, otherwise control passes to the Baidu Hanyu antonymy recognition unit. The converse relation defined in HowNet is also treated as a kind of antonymy;
A Baidu Hanyu antonymy recognition unit, for using Baidu Hanyu to extract the antonym set ASET_A of the given word A and using Harbin Institute of Technology's Tongyici Cilin (Extended) to extract the synonym set SSET_A of word A; for each word W ∈ SSET_A, its antonyms are extracted and merged into ASET_A; if B ∈ ASET_A, words A and B are antonyms, otherwise control passes to the Baidu Baike antonymy recognition unit;
A Baidu Baike antonymy recognition unit, for using Baidu Baike to extract the antonym set ASET_A of word A; if B ∈ ASET_A, the two words are antonyms, otherwise control passes to the whole-part relation recognition unit.
Further, the whole-part relation recognition unit comprises:
A HowNet whole-part relation recognition unit, for using HowNet to extract the part-word sets MSET_A and MSET_B of words A and B respectively; if B ∈ MSET_A or A ∈ MSET_B, the two words stand in a whole-part relation, otherwise control passes to the sememe-definition whole-part relation recognition unit;
A sememe-definition whole-part relation recognition unit, for using the HowNet sememe definitions, in which a word whose definition contains the sememe "part | part" is a part word of some other word and the value of the "whole" attribute in that definition indicates the sememe definition of its whole word; the sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted accordingly; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains a "whole" attribute whose value is DEF_B, or DEF_B contains a "whole" attribute whose value is DEF_A, then words A and B stand in a whole-part relation, otherwise control passes to the synonymy recognition unit;
In addition, in the sememe-definition whole-part relation recognition unit, for some word pairs the whole-part relation cannot be recognized directly from the sememe definitions. These pairs are handled by generalization: the value of the "whole" attribute above is generalized to its hypernym concept, and the remaining operations are unchanged.
Further, the synonymy recognition unit comprises:
A Cilin synonymy recognition unit, for obtaining the synonym set SSET_A of word A from the rows marked with "=" in Harbin Institute of Technology's Tongyici Cilin (Extended), which list synonyms; if B ∈ SSET_A, words A and B are synonyms, otherwise control passes to the HowNet synonymy recognition unit;
A HowNet synonymy recognition unit, for using HowNet to extract the synonym set SSET_A of word A; if B ∈ SSET_A, words A and B are synonyms, otherwise control passes to the Baidu Hanyu synonymy recognition unit;
A Baidu Hanyu synonymy recognition unit, for using Baidu Hanyu to extract the synonym set SSET_A of word A; if B ∈ SSET_A, words A and B are synonyms, otherwise control passes to the Baidu Baike synonymy recognition unit;
A Baidu Baike synonymy recognition unit, for obtaining from the page links of Baidu Baike the encyclopedia link page sets PSET_A and PSET_B of words A and B respectively; if they satisfy [formula missing], words A and B are synonyms, otherwise control passes to the hypernym-hyponym relation recognition unit.
Further, the hypernym-hyponym relation recognition unit comprises:
A HowNet hypernym-hyponym relation recognition unit, for using HowNet to extract the hyponym sets HSET_A and HSET_B of words A and B respectively; if B ∈ HSET_A or A ∈ HSET_B, words A and B stand in a hypernym-hyponym relation, otherwise control passes to the sememe-definition hypernym-hyponym relation recognition unit;
A sememe-definition hypernym-hyponym relation recognition unit, for extracting the sememe definition sets DEFSET_A and DEFSET_B of words A and B respectively, based on the hypernym-hyponym relations implied by the HowNet sememe definitions; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and which satisfy [formula missing] or [formula missing], then words A and B stand in a hypernym-hyponym relation, otherwise control passes to the Chinese-English translation unit.
Further, the Chinese-English translation unit comprises:
A Chinese-English translation unit, for using a Chinese-English dictionary to translate words A and B into the corresponding English word sets ENSET(A) and ENSET(B).
Further, the English word semantic relation recognition unit comprises:
An English antonymy recognition unit, for extracting, for each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the antonym set ENASET_A of EN_A from the English knowledge resource; if EN_B ∈ ENASET_A, then EN_A and EN_B are antonyms, and therefore so is the original Chinese word pair, otherwise control passes to the English whole-part relation recognition unit;
An English whole-part relation recognition unit, for extracting, for each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the part-word sets ENMSET_A and ENMSET_B of EN_A and EN_B from the English knowledge resource; if EN_B ∈ ENMSET_A or EN_A ∈ ENMSET_B, then EN_A and EN_B stand in a whole-part relation, and therefore so does the original Chinese word pair, otherwise control passes to the English synonymy recognition unit;
An English synonymy recognition unit, for extracting, for each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the synonym set ENSSET_A of EN_A from the English knowledge resource; if EN_B ∈ ENSSET_A, then EN_A and EN_B are synonyms, and therefore so is the original Chinese word pair, otherwise control passes to the English hypernym-hyponym relation recognition unit;
An English hypernym-hyponym relation recognition unit, for extracting, for each pair of English words EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the hyponym sets ENHSET_A and ENHSET_B of EN_A and EN_B from the English knowledge resource; if EN_B ∈ ENHSET_A or EN_A ∈ ENHSET_B, then EN_A and EN_B stand in a hypernym-hyponym relation, and therefore so does the original Chinese word pair.
Beneficial effects of the invention:
1. The invention recognizes word semantic relations using multiple different Chinese knowledge resources and makes full use of each of them.
2. In whole-part relation recognition, the invention supplements the HowNet sememe definitions with a generalization method suited to their characteristics, which makes the recognition method more adaptable.
3. In hypernym-hyponym relation recognition, the invention fully mines the information implied by the HowNet sememe definitions, which effectively improves recognition accuracy.
4. The invention combines Chinese and English knowledge resources, using English knowledge resources to supply word semantic relations that the Chinese knowledge resources do not cover, which improves the recognition rate.
5. The proposed method and device can automatically recognize the semantic relation of a given word pair, including antonymy, the whole-part relation, synonymy, and the hypernym-hyponym relation, with high recognition accuracy.
Brief description of the drawings
Fig. 1 is a flowchart of the method for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources, according to an embodiment of the invention;
Fig. 2 is a structural diagram of the device for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources, according to an embodiment of the invention;
Fig. 3 is a structural diagram of the antonymy recognition unit according to an embodiment of the invention;
Fig. 4 is a structural diagram of the whole-part relation recognition unit according to an embodiment of the invention;
Fig. 5 is a structural diagram of the synonymy recognition unit according to an embodiment of the invention;
Fig. 6 is a structural diagram of the hypernym-hyponym relation recognition unit according to an embodiment of the invention;
Fig. 7 is a structural diagram of the Chinese-English translation unit according to an embodiment of the invention;
Fig. 8 is a structural diagram of the English word semantic relation recognition unit according to an embodiment of the invention.
Embodiment:
To help those skilled in the art better understand the solution of the embodiments of the invention, the embodiments are described in further detail below with reference to the drawings and implementations.
The word pair formed by word A "motor vehicle" and word B "truck" is used as an example of semantic relation recognition.
The flowchart of the method for recognizing semantic relations between Chinese words by combining Chinese and English knowledge resources is shown in Fig. 1 and comprises the following steps:
Step 101, antonymy recognition.
Obtain antonym sets from multiple Chinese knowledge resources, and judge from the antonym sets whether the two words are antonyms, specifically:
Step 1-1) Using the antonymy relations explicitly defined in HowNet, extract the antonym set ASET_A of word A for the given words A and B. If B ∈ ASET_A, the two words are antonyms; otherwise go to step 1-2). The converse relation defined in HowNet is also treated as a kind of antonymy.
Extracting the antonym set (including converse words) of word A "motor vehicle" from HowNet yields ASET_A = {"wooden handcart", "large flatbed tricycle", "cart", "bicycle", "old rickshaw", "wheelbarrow", "rickshaw", "yellow croaker car", "handcart", "rubber car", "bicycle", "donkey cart", "carriage", "donkey car", "ox cart", "large handcart", "flat car", "flat board three-wheel", "flatcar", "rickshaw", "three-wheel", "tricycle", "mountain bike", "handcart", "trolley", "animal-drawn vehicle", "stroller", "dolly", "ocean car", "vehicle using motor", "bicycle", "a light horse cart", "chariot", "hub", "halter strap"}. Clearly word B "truck" ∉ ASET_A, so go to step 1-2).
Step 1-2) Use Baidu Hanyu to extract the antonym set ASET_A of the given word A, and use Harbin Institute of Technology's Tongyici Cilin (Extended) thesaurus to extract the synonym set SSET_A of word A. For each word W ∈ SSET_A, extract its antonyms and merge them into ASET_A. If B ∈ ASET_A, words A and B are antonyms; otherwise go to step 1-3).
Extracting the antonym set of word A "motor vehicle" from Baidu Hanyu yields ASET_A. Since word B "truck" ∉ ASET_A, go to step 1-3).
Step 1-3) Use Baidu Baike to extract the antonym set ASET_A of word A. If B ∈ ASET_A, the two words are antonyms; otherwise go to step 2-1).
Extracting the antonym set of word A "motor vehicle" from Baidu Baike yields ASET_A. Since word B "truck" ∉ ASET_A, go to step 2-1).
Step 102, whole-part relation recognition.
Extract part-word sets from multiple Chinese knowledge resources, and judge from the part-word sets whether the words stand in a whole-part relation, specifically:
Step 2-1) Use HowNet to extract the part-word sets MSET_A and MSET_B of words A and B respectively. If B ∈ MSET_A or A ∈ MSET_B, the two words stand in a whole-part relation; otherwise go to step 2-2).
Extracting the part-word sets of word A "motor vehicle" and word B "truck" from HowNet yields MSET_A = {"headlight", "steering wheel", "sun visor", "boot", "vehicle rear window", "rear baffle", "back light", "rearview mirror", "driver's cab", "sidecar", "across bucket", "automobile engine", "car horn", "auto parts machinery", "cylinder", "headlight", "fuel gauge", "taillight", "luggage case", "throttle", "sunshading board", "sun proof"} and MSET_B. Because B "truck" ∉ MSET_A and A "motor vehicle" ∉ MSET_B, go to step 2-2).
Step 2-2) Use the HowNet sememe definitions. A word whose definition contains the sememe "part | part" is a part word of some other word, and the value of the "whole" attribute in that definition indicates the sememe definition of its whole word. Accordingly, extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains a "whole" attribute whose value is DEF_B, or DEF_B contains a "whole" attribute whose value is DEF_A, then words A and B stand in a whole-part relation; otherwise go to step 3-1).
In addition, for some word pairs the whole-part relation cannot be recognized directly from the sememe definitions. These pairs are handled by generalization: the value of the "whole" attribute above is generalized to its hypernym concept, and the remaining operations are unchanged.
Using HowNet to extract the sememe definition sets of word A "motor vehicle" and word B "truck" yields DEFSET_A = {"{LandVehicle|car: modifier={automatic|automatic}}"} and DEFSET_B = {"{LandVehicle|car: modifier={automatic|automatic}, {transport|transport: instrument={~}, patient={physical|material}}}"}. Clearly no DEF_A ∈ DEFSET_A or DEF_B ∈ DEFSET_B contains the sememe "part | part", so go to step 3-1).
Step 103, synonymy recognition.
Extract synonym sets from multiple Chinese knowledge resources, and judge from the synonym sets whether the words are synonyms, specifically:
Step 3-1) In Harbin Institute of Technology's Tongyici Cilin (Extended), rows marked with "=" list synonyms. Obtain the synonym set SSET_A of word A from these rows. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-2).
Extracting the synonym set of word A "motor vehicle" from Harbin Institute of Technology's Tongyici Cilin (Extended) yields SSET_A. B "truck" ∉ SSET_A, so go to step 3-2).
Step 3-2) Use HowNet to extract the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-3).
Extracting the synonym set of word A "motor vehicle" from HowNet yields SSET_A = {"motor vehicle", "motor vehicle", "automobile", "car", "automobile", "dolly", "car", "car", "small sleeping carriage"}. Because B "truck" ∉ SSET_A, go to step 3-3).
Step 3-3) Use Baidu Hanyu to extract the synonym set SSET_A of word A. If B ∈ SSET_A, words A and B are synonyms; otherwise go to step 3-4).
Extracting the synonym set of word A "motor vehicle" from Baidu Hanyu yields SSET_A. Because B "truck" ∉ SSET_A, go to step 3-4).
Step 3-4) From the page links of Baidu Baike, obtain the encyclopedia link page sets PSET_A and PSET_B of words A and B respectively. If they satisfy [formula missing], words A and B are synonyms; otherwise go to step 4-1).
Extracting the encyclopedia link page sets of word A "motor vehicle" and word B "truck" from Baidu Baike yields PSET_A = {"https://baike.baidu.com/item/ motor vehicles"} and PSET_B = {"https://baike.baidu.com/item/ truck/4339", "https://baike.baidu.com/item/ truck/15281831", "https://baike.baidu.com/item/ truck/622401", "https://baike.baidu.com/item/ trucies/3697802", "https://baike.baidu.com/item/ truck/7109303", "https://baike.baidu.com/item/ truck/3697784"}. Since the condition [formula missing] is not met, go to step 4-1).
Step 104, hypernym-hyponym relation recognition.
Extract hyponym sets from multiple Chinese knowledge resources, and judge from the hyponym sets whether the words stand in a hypernym-hyponym relation, specifically:
Step 4-1) Use HowNet to extract the hyponym sets HSET_A and HSET_B of words A and B respectively. If B ∈ HSET_A or A ∈ HSET_B, words A and B stand in a hypernym-hyponym relation; otherwise go to step 4-2).
Extracting the hyponym sets of word A "motor vehicle" and word B "truck" from HowNet yields HSET_A = {"Audi", "bus", "regular bus", "hired car", "BMW", "benz", "Honda", "Buick", "long-distance bus", "taxi", "hire-out automobile", "bus", "motor bus", "Daewoo", "taxi", "electric car", "Toyota", "Ford", "bus", "bus", "public transit bus", "illegal vehicle", "lorry", "container car", "Shuttle Bus", "emergency tender", "taxi", "special bus", "learner-driven vehicle", "police car", "ambulance", "fire engine", "old-fashioned automobile", "truck", "Cadillac", "empty wagons", "Lincoln", "hopper wagon", "station wagon", "touring car", "Mercedes", "minibus", "local train", "double deck bus", "private car", "private car", "scheduled bus", "round bus", "trolleybus", "modern times", "fire fighting truck", "minibus", "mini-bus", "mini-bus automobile", "small visitor", "minibus", "cruiser", "tourist coach", "offroad vehicle", "transport vehicle", "truck", "self-dumping truck", "dumper", "dump truck"} and HSET_B. Because word B "truck" ∈ HSET_A, word A "motor vehicle" and word B "truck" stand in a hypernym-hyponym relation, and the semantic relation recognition is complete at this point.
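The walk-through above for "motor vehicle" and "truck" can be replayed with a short Python sketch. The sets are small excerpts of the lists quoted above. The `trace` function and the step labels are hypothetical, and the sketch collapses each step's sub-steps into a single membership test on A's set.

```python
from typing import List, Optional, Tuple

# Excerpts of the sets listed above for word A "motor vehicle".
ASET_A = {"bicycle", "tricycle", "handcart"}
MSET_A = {"headlight", "rearview mirror", "throttle"}
SSET_A = {"automobile", "car"}
HSET_A = {"bus", "taxi", "truck", "fire engine"}

CHECKS = [
    ("101 antonymy", ASET_A),
    ("102 whole-part", MSET_A),
    ("103 synonymy", SSET_A),
    ("104 hypernym-hyponym", HSET_A),
]


def trace(b: str) -> Tuple[List[str], Optional[str]]:
    """Return the steps tried for word B and the step that matched, if any."""
    tried: List[str] = []
    for step, candidates in CHECKS:
        tried.append(step)
        if b in candidates:
            return tried, step
    return tried, None


steps_tried, matched = trace("truck")
print(steps_tried, matched)
```

For "truck" the first three checks fail and the fourth succeeds, so the Chinese-English translation stage is never reached for this pair.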
Step 4-2) Based on the hypernym-hyponym relations implied by the HowNet sememe definitions, extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B respectively. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and which satisfy [formula missing] or [formula missing], then words A and B stand in a hypernym-hyponym relation.
Step 105, Chinese-English translation.
Chinese word is converted into English to translation using Chinese-English Dictionary, is specially:
Step 5-1) word A and B translated respectively using Chinese-English Dictionary be converted to corresponding English set ENSET (A) and
ENSET(B)。
Step 106, English word semantic relation recognition.
Semantic relation recognition is performed with an English knowledge resource on the English word pairs obtained in step 5, in order to determine
the semantic relation of the original Chinese word pair, specifically:
Step 6-1) For each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the antonym set ENASET_A of word EN_A
from the English knowledge resource. If EN_B ∈ ENASET_A, then the English words EN_A and EN_B have an antonymy relation,
that is, the original Chinese word pair has an antonymy relation; otherwise go to step 6-2);
Step 6-2) For each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the part-word sets ENMSET_A and ENMSET_B
of words EN_A and EN_B from the English knowledge resource. If EN_B ∈ ENMSET_A or EN_A ∈ ENMSET_B,
then the English words EN_A and EN_B have a whole-part relation, that is, the original Chinese word pair has a whole-part relation;
otherwise go to step 6-3);
Step 6-3) For each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the synonym set ENSSET_A of word EN_A
from the English knowledge resource. If EN_B ∈ ENSSET_A, then the English words EN_A and EN_B have a synonymy relation,
that is, the original Chinese word pair has a synonymy relation; otherwise go to step 6-4);
Step 6-4) For each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), extract the hyponym sets ENHSET_A and ENHSET_B
of words EN_A and EN_B from the English knowledge resource. If EN_B ∈ ENHSET_A or EN_A ∈ ENHSET_B,
then the English words EN_A and EN_B have a hyponymy relation, that is, the original Chinese word pair has a hyponymy relation.
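The cascade of steps 6-1) to 6-4) can be sketched as a single loop over the four relation types. This is an illustrative reading only: `RESOURCE` is a hypothetical in-memory stand-in for the English knowledge resource, filled with a few words from the "seasoning"/"vinegar" example given later in this description, and the key names are assumptions.

```python
from itertools import product


def recognize_english(enset_a, enset_b, resource):
    """Apply steps 6-1 to 6-4 in order and return the first relation found."""
    # (relation name, lookup table key, whether both directions are tested)
    checks = [
        ("antonymy", "antonyms", False),    # step 6-1: EN_B in ENASET_A
        ("whole-part", "meronyms", True),   # step 6-2: either direction
        ("synonymy", "synonyms", False),    # step 6-3: EN_B in ENSSET_A
        ("hyponymy", "hyponyms", True),     # step 6-4: either direction
    ]
    for relation, key, both_directions in checks:
        table = resource.get(key, {})
        for en_a, en_b in product(enset_a, enset_b):
            if en_b in table.get(en_a, set()):
                return relation
            if both_directions and en_a in table.get(en_b, set()):
                return relation
    return None


# Assumed toy data; a real system would query the knowledge resource instead.
RESOURCE = {
    "synonyms": {"seasoning": {"flavorer", "flavoring", "seasoner"}},
    "hyponyms": {"condiments": {"relish", "mustard", "vinegar", "soy sauce"}},
}

print(recognize_english({"accessory food", "condiments", "seasoning"},
                        {"vinegar", "jealousy"}, RESOURCE))
```

Because the relation types are tried in a fixed order, a pair that would satisfy several relations is reported under the earliest one, matching the "otherwise go to step" wording above.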
Similarly, the semantic relation recognition of the words "people" and "head" can be completed. To illustrate the generalization
operation, we go directly to step 2-2):
In HowNet, the sememe definition sets of word A "people" and word B "head" are extracted, giving DEFSET_A =
{"{Behavior|manner:host={human|people}}", "{Physique|physique:host={AnimalHuman|animal}}",
"{Strength|strength:host={community|group}}", "{human|people}", "{human|people:PersonPro={3rdPerson|he}}",
"{human|people:PersonPro={3rdPerson|he},quantity={mass|many}}", "{human|people:modifier={adult|adult}}",
"{human|people:quantity={mass|many}}",
"{human|people:{engage|engage in:agent={~},content={fact|thing:modifier={specific|specific}}}}"} and
DEFSET_B = {"{part|part:PartPosition={head|head},whole={AnimalHuman|animal}}"}.
Clearly there are no DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains
the "whole" attribute with value DEF_B, or DEF_B contains the "whole" attribute with value DEF_A. The generalization operation is therefore carried out:
DEF_A = "{human|people}" is generalized to the sememe of its hypernym concept, "{AnimalHuman|animal}". Now there exists DEF_B
∈ DEFSET_B that contains the "whole" attribute with value "{AnimalHuman|animal}", so the words "people" and "head" have
a whole-part relation.
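The whole-part test with its generalization fallback can be sketched as below. This is a simplified reading under stated assumptions: each sememe definition is modelled as a small dictionary with a `main` sememe and optional attributes, only the main sememe of the candidate "whole" is compared against the `whole` attribute, and `HYPERNYM` is a hypothetical one-entry hypernym table. None of these structures come from HowNet itself.

```python
def has_whole_part(defs_a, defs_b, hypernym):
    """Return True if a definition of one word names the other word as its 'whole'."""

    def part_of(part_defs, whole_defs, generalize):
        for whole in whole_defs:
            concept = whole.get("main")
            if generalize:
                # Generalization: replace the concept by its hypernym concept.
                concept = hypernym.get(concept)
            if concept is None:
                continue
            for part in part_defs:
                if part.get("main") == "part" and part.get("whole") == concept:
                    return True
        return False

    # First try the direct test, then retry after generalization.
    for generalize in (False, True):
        if part_of(defs_b, defs_a, generalize) or part_of(defs_a, defs_b, generalize):
            return True
    return False


# Assumed toy encodings of the "people" / "head" example.
PEOPLE = [{"main": "human"}, {"main": "human", "modifier": "adult"}]
HEAD = [{"main": "part", "PartPosition": "head", "whole": "AnimalHuman"}]
HYPERNYM = {"human": "AnimalHuman"}

print(has_whole_part(PEOPLE, HEAD, HYPERNYM))
```

Without the hypernym table the direct test fails for this pair, which is exactly the situation the generalization step is meant to cover.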
Word pairs that cannot be recognized with the Chinese knowledge resources have their semantic relation recognized through steps 5-1) to 6-4).
This is illustrated below with word A "seasoning" and word B "vinegar":
According to step 5-1), word A "seasoning" and word B "vinegar" are translated with the Iciba Chinese-English dictionary and HowNet
into the corresponding English sets ENSET(A) = {"accessory food", "condiments", "seasoning"} and
ENSET(B) = {"vinegar", "jealousy"}.
According to step 6-1), the antonym sets of EN_A "accessory food", EN_A "condiments" and EN_A "seasoning" are extracted
from the English knowledge resource BabelNet. No word EN_B ∈ ENSET(B) belongs to any of these antonym sets,
so the method goes to step 6-2).
According to step 6-2), the part-word sets of EN_A "accessory food", EN_A "condiments" and EN_A "seasoning", and of
EN_B "vinegar" and EN_B "jealousy", are extracted from the English knowledge resource BabelNet.
No EN_B belongs to a part-word set of any EN_A, and no EN_A belongs to a part-word set of any EN_B,
so the method goes to step 6-3).
According to step 6-3), the synonym sets of EN_A "accessory food" and EN_A "condiments" are extracted
from the English knowledge resource BabelNet, and the synonym set of
EN_A "seasoning" is obtained as ENSSET_A = {"flavorer", "flavourer",
"flavoring", "flavouring", "seasoner", "seasoning"}. No word EN_B ∈ ENSET(B) belongs to any of these synonym sets, so the method goes to step
6-4);
According to step 6-4), the hyponyms of EN_A "accessory food" are extracted
from the English knowledge resource BabelNet, and the hyponyms of EN_A "condiments" are obtained as ENHSET_A = {"relish",
"dip", "mustard", "table mustard", "catsup", "ketchup", "cetchup", "tomato
ketchup", "chili sauce", "chutney", "Indian relish", "steak sauce", "taco sauce",
"salsa", "mint sauce", "cranberry sauce", "duck sauce", "hoisin sauce",
"horseradish", "marinade", "soy sauce", "soy", "vinegar", "acetum", "sauce",
"spread", "paste", "wasabi"}. Now EN_B "vinegar" ∈ ENHSET_A holds, so the English words EN_A and EN_B have
a hyponymy relation, that is, the original Chinese word pair has a hyponymy relation.
Through the above operating steps, the semantic relation recognition of a given word pair can be completed.
Correspondingly, the embodiment of the present invention also provides a Chinese word semantic relation recognition device combining Chinese and English knowledge resources,
whose structure is shown schematically in Fig. 2.
In this embodiment, the device includes:
Antonymy recognition unit 201, configured to obtain antonym sets using multiple Chinese knowledge resources and to judge
from the antonym sets whether the words have an antonymy relation;
Whole-part relation recognition unit 202, configured to extract part-word sets using multiple Chinese knowledge resources and to judge
from the part-word sets whether the words have a whole-part relation;
Synonymy recognition unit 203, configured to extract synonym sets using multiple Chinese knowledge resources and to judge
from the synonym sets whether the words have a synonymy relation;
Hyponymy recognition unit 204, configured to extract hyponym sets using multiple Chinese knowledge resources and to judge
from the hyponym sets whether the words have a hyponymy relation;
Chinese-English translation unit 205, configured to translate the Chinese word pair into English using a Chinese-English dictionary;
English word semantic relation recognition unit 206, configured to perform semantic relation recognition with an English knowledge resource on the English
word pairs produced by the Chinese-English translation unit, in order to determine the semantic relation of the original Chinese word pair.
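The way units 201 to 206 cooperate can be sketched as a small dispatcher: the Chinese-resource units are tried in order, and the translation and English units act as a fallback. The function and parameter names are hypothetical, and the toy units in the usage lines are placeholders rather than real lookups.

```python
def recognize(word_a, word_b, chinese_units, translate, english_unit):
    """Try units 201-204 in order; fall back to units 205 and 206."""
    for relation, unit in chinese_units:
        if unit(word_a, word_b):
            return relation
    # No Chinese resource recognized the pair: translate and use the English resource.
    return english_unit(translate(word_a), translate(word_b))


# Assumed toy units: each takes the word pair and returns True or False.
CHINESE_UNITS = [
    ("antonymy", lambda a, b: (a, b) == ("hot", "cold")),
    ("whole-part", lambda a, b: (a, b) == ("people", "head")),
    ("synonymy", lambda a, b: False),
    ("hyponymy", lambda a, b: (a, b) == ("motor vehicle", "truck")),
]


def toy_translate(word):
    return {word}


def toy_english_unit(enset_a, enset_b):
    return "hyponymy" if ("condiments" in enset_a and "vinegar" in enset_b) else None


print(recognize("people", "head", CHINESE_UNITS, toy_translate, toy_english_unit))
```

Keeping the units in an ordered list makes the fixed priority (antonymy, whole-part, synonymy, hyponymy) explicit and easy to change.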
The structure of the antonymy recognition unit 201 of the device in Fig. 2 is shown schematically in Fig. 3. It includes:
HowNet antonymy recognition unit 301, configured to use the antonymy relations explicitly defined in HowNet: for given words
A and B, the antonym set ASET_A of word A is extracted; if B ∈ ASET_A, the two words have an antonymy relation,
otherwise control passes to the Baidu Hanyu antonymy recognition unit. In addition, the converse relation defined in HowNet is also treated
as a kind of antonymy;
Baidu Hanyu antonymy recognition unit 302, configured to extract the antonym set ASET_A of the given word A using Baidu Hanyu,
and to extract the synonym set SSET_A of word A using the Harbin Institute of Technology Tongyici Cilin (Extended); for each word W ∈
SSET_A, its antonyms are extracted and merged into ASET_A. If B ∈ ASET_A, words A and B have an antonymy relation, otherwise control passes to
the Baidu Baike antonymy recognition unit;
Baidu Baike antonymy recognition unit 303, configured to extract the antonym set ASET_A of word A using Baidu Baike.
If B ∈ ASET_A, the two words have an antonymy relation, otherwise control passes to the whole-part relation recognition unit.
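The three-stage antonymy check of units 301 to 303, including the synonym expansion in unit 302, can be sketched as follows. All four lookup tables are hypothetical stand-ins with English toy words, and the sketch assumes that the antonyms of each synonym W are taken from the same resource as the antonyms of A, which the text does not state explicitly.

```python
def is_antonym(word_a, word_b, hownet, hanyu, cilin_synonyms, baike):
    """Return True if any of the three resources lists B as an antonym of A."""
    # Unit 301: antonyms explicitly defined in HowNet.
    if word_b in hownet.get(word_a, set()):
        return True
    # Unit 302: antonyms of A, extended with the antonyms of each synonym W of A.
    aset = set(hanyu.get(word_a, set()))
    for w in cilin_synonyms.get(word_a, set()):
        aset |= set(hanyu.get(w, set()))
    if word_b in aset:
        return True
    # Unit 303: antonyms listed in the encyclopedia resource.
    return word_b in baike.get(word_a, set())


# Assumed toy data (English words stand in for Chinese entries).
HOWNET = {}
HANYU = {"happy": {"sad"}, "glad": {"unhappy"}}
CILIN = {"happy": {"glad"}}
BAIKE = {"happy": {"miserable"}}

print(is_antonym("happy", "unhappy", HOWNET, HANYU, CILIN, BAIKE))
```

The synonym expansion lets "unhappy" be found for "happy" even though it is listed only under the synonym "glad".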
The structure of the whole-part relation recognition unit 202 of the device in Fig. 2 is shown schematically in Fig. 4. It includes:
HowNet whole-part relation recognition unit 401, configured to extract the part-word sets MSET_A and MSET_B of words A and B
using HowNet. If B ∈ MSET_A or A ∈ MSET_B, the two words have a whole-part relation, otherwise control passes to the sememe
definition whole-part relation recognition unit;
Sememe definition whole-part relation recognition unit 402, configured to work with the HowNet sememe definitions. A word whose definition
contains the sememe "part|part" is a part word of some other word, and the value of the "whole" attribute in the definition
gives the sememe definition of its whole word. The sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted accordingly.
If there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains the "whole" attribute with value
DEF_B, or DEF_B contains the "whole" attribute with value DEF_A, then words A and B have a whole-part relation, otherwise control passes to the synonymy
recognition unit;
In addition, in the sememe definition whole-part relation recognition unit, for some word pairs the sememe definitions cannot be used directly
to recognize the whole-part relation effectively. Such pairs can be handled by generalization: the value of the "whole" attribute above is generalized
to its hypernym concept, and the remaining operations are unchanged.
The structure of the synonymy recognition unit 203 of the device in Fig. 2 is shown schematically in Fig. 5. It includes:
Cilin synonymy recognition unit 501, configured to obtain the synonym set SSET_A of word A from the rows marked "=" in the Harbin Institute of Technology
Tongyici Cilin (Extended), which denote synonyms. If B ∈ SSET_A, words A and B have a synonymy relation, otherwise
control passes to the HowNet synonymy recognition unit;
HowNet synonymy recognition unit 502, configured to extract the synonym set SSET_A of word A using HowNet.
If B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the Baidu Hanyu synonymy recognition unit;
Baidu Hanyu synonymy recognition unit 503, configured to extract the synonym set SSET_A of word A using Baidu Hanyu.
If B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the Baidu Baike synonymy recognition unit;
Baidu Baike synonymy recognition unit 504, configured to obtain, from the page links of Baidu Baike, the encyclopedia link page sets
PSET_A and PSET_B of words A and B respectively. If [formula missing in source] is satisfied, words A and B have
a synonymy relation, otherwise control passes to the hyponymy recognition unit.
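Extracting SSET_A from rows marked "=" can be sketched as a small parser. The row layout used here (a category code ending in "=", followed by space-separated words) is an assumption about the thesaurus file format, and the sample rows and codes are made up for illustration.

```python
def load_synonym_sets(lines):
    """Map each word to the other words sharing a row whose code ends with '='."""
    index = {}
    for line in lines:
        parts = line.split()
        # Skip empty rows and rows whose code is not marked "=".
        if len(parts) < 2 or not parts[0].endswith("="):
            continue
        group = set(parts[1:])
        for word in group:
            index.setdefault(word, set()).update(group - {word})
    return index


# Assumed sample rows; only the first is marked "=" (synonyms).
ROWS = ["X01A01= 人 人士 人物", "X01A02# 甲 乙"]
SYNONYMS = load_synonym_sets(ROWS)

print(sorted(SYNONYMS["人"]))
```

With such an index, the check B ∈ SSET_A of unit 501 becomes a single set membership test.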
The structure of the hyponymy recognition unit 204 of the device in Fig. 2 is shown schematically in Fig. 6. It includes:
HowNet hyponymy recognition unit 601, configured to extract the hyponym sets HSET_A and HSET_B of words A and B
using HowNet. If B ∈ HSET_A or A ∈ HSET_B, words A and B have a hyponymy relation, otherwise control passes to the sememe definition
hyponymy recognition unit;
Sememe definition hyponymy recognition unit 602, configured to use the hyponymy implied by the HowNet sememe definitions:
the sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted respectively. If there exist DEF_A ∈ DEFSET_A and DEF_B ∈
DEFSET_B whose main sememes are identical and which satisfy [formula missing in source] or [formula missing in source], then words A and B have a hyponymy
relation.
The structure of the Chinese-English translation unit 205 of the device in Fig. 2 is shown schematically in Fig. 7. It includes:
Chinese-English translation unit 701, configured to translate words A and B with a Chinese-English dictionary into the corresponding English sets
ENSET(A) and ENSET(B).
The structure of the English word semantic relation recognition unit 206 of the device in Fig. 2 is shown schematically in Fig. 8. It includes:
English antonymy recognition unit 801, configured to extract, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B),
the antonym set ENASET_A of word EN_A from the English knowledge resource. If EN_B ∈ ENASET_A, the English words
EN_A and EN_B have an antonymy relation, that is, the original Chinese word pair has an antonymy relation; otherwise control passes to the English whole-part relation recognition
unit;
English whole-part relation recognition unit 802, configured to extract, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B),
the part-word sets ENMSET_A and ENMSET_B of words EN_A and EN_B from the English knowledge resource. If
EN_B ∈ ENMSET_A or EN_A ∈ ENMSET_B, the English words EN_A and EN_B have a whole-part relation, that is, the original Chinese word pair
has a whole-part relation; otherwise control passes to the English synonymy recognition unit;
English synonymy recognition unit 803, configured to extract, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B),
the synonym set ENSSET_A of word EN_A from the English knowledge resource. If EN_B ∈ ENSSET_A, the English words
EN_A and EN_B have a synonymy relation, that is, the original Chinese word pair has a synonymy relation; otherwise control passes to the English hyponymy recognition
unit;
English hyponymy recognition unit 804, configured to extract, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B),
the hyponym sets ENHSET_A and ENHSET_B of words EN_A and EN_B from the English knowledge resource. If
EN_B ∈ ENHSET_A or EN_A ∈ ENHSET_B, the English words EN_A and EN_B have a hyponymy relation, that is, the original Chinese word pair has
a hyponymy relation.
The Chinese word semantic relation recognition device combining Chinese and English knowledge resources shown in Figs. 2 to 8 can be integrated into
various hardware entities. For example, the device can be integrated into
personal computers, smartphones, workstations and similar equipment.
The Chinese word semantic relation recognition method combining Chinese and English knowledge resources proposed by the embodiment of the present invention
can be stored on various storage media in the form of instructions or instruction sets. These storage media include, but are not limited
to: floppy disks, optical discs, hard disks, memory, USB flash drives, CF cards, SM cards, etc.
In summary, in the embodiments of the present invention, antonym sets are obtained by combining multiple Chinese knowledge resources, and
the antonym sets are used to judge whether the words have an antonymy relation; part-word sets are extracted using multiple Chinese knowledge resources,
and the part-word sets are used to judge whether the words have a whole-part relation; synonym sets are extracted using multiple Chinese knowledge resources,
and the synonym sets are used to judge whether the words have a synonymy relation; hyponym sets are extracted using multiple Chinese knowledge resources,
and the hyponym sets are used to judge whether the words have a hyponymy relation; the Chinese word pair is translated into English
using a Chinese-English dictionary; and semantic relation recognition is performed with an English knowledge resource on the English word pairs obtained by translation,
in order to determine the semantic relation of the original Chinese word pair. The embodiments of the present invention thus realize
Chinese word semantic relation recognition combining Chinese and English knowledge resources. The embodiments can use several different Chinese knowledge resources
for word semantic relation recognition, drawing on each of them. In whole-part recognition, in view of
the characteristics of HowNet sememe definitions, the present invention adds a generalization method, which improves the adaptability of the recognition method.
In hyponymy recognition, the present invention exploits the information implied by the sememe definitions in HowNet, which
improves recognition accuracy. The present invention combines Chinese and English knowledge resources and uses the English knowledge resource to supplement the word semantic relations
not covered by the Chinese knowledge resources, which improves the recognition rate. The Chinese word semantic relation recognition method and device
combining Chinese and English knowledge resources proposed by the present invention can automatically recognize the semantic relation of a given word pair, including antonymy, whole-part
relation, synonymy and hyponymy, with a relatively high recognition accuracy.
The embodiments in this specification are described progressively, and identical or similar parts of the embodiments may be referred to one another.
Because the device embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant
details, refer to the corresponding parts of the method embodiment.
The embodiments of the present invention have been described in detail above, and specific embodiments are used herein to illustrate
the present invention. The description of the above embodiments is only intended to help in understanding the method and device of the present invention. Meanwhile, for a person of ordinary
skill in the art, changes may be made to the specific embodiments and the scope of application in accordance with the idea of the present invention. Therefore, this specification
should not be construed as limiting the present invention.
Claims (7)
- 1. A Chinese word semantic relation recognition method combining Chinese and English knowledge resources, characterized in that the method comprises the following steps: Step 1: obtain antonym sets by combining multiple Chinese knowledge resources, and judge from the antonym sets whether the words have an antonymy relation; Step 2: extract part-word sets using multiple Chinese knowledge resources, and judge from the part-word sets whether the words have a whole-part relation; Step 3: extract synonym sets using multiple Chinese knowledge resources, and judge from the synonym sets whether the words have a synonymy relation; Step 4: extract hyponym sets using multiple Chinese knowledge resources, and judge from the hyponym sets whether the words have a hyponymy relation; Step 5: translate the Chinese word pair into English using a Chinese-English dictionary; Step 6: perform semantic relation recognition with an English knowledge resource on the English word pairs obtained in step 5, to determine the semantic relation of the original Chinese word pair.
- 2. The Chinese word semantic relation recognition method combining Chinese and English knowledge resources according to claim 1, characterized in that in step 1 the antonymy relation is judged specifically as follows: Step 1-1) using the antonymy relations explicitly defined in HowNet, extract for given words A and B the antonym set ASET_A of word A; if B ∈ ASET_A, the two words have an antonymy relation, otherwise go to step 1-2); in addition, the converse relation defined in HowNet is also treated as a kind of antonymy; Step 1-2) extract the antonym set ASET_A of the given word A using Baidu Hanyu, and extract the synonym set SSET_A of word A using the Harbin Institute of Technology Tongyici Cilin (Extended); for each word W ∈ SSET_A, extract its antonyms and merge them into ASET_A; if B ∈ ASET_A, words A and B have an antonymy relation, otherwise go to step 1-3); Step 1-3) extract the antonym set ASET_A of word A using Baidu Baike; if B ∈ ASET_A, the two words have an antonymy relation, otherwise go to step 2-1).
- 3. The Chinese word semantic relation recognition method combining Chinese and English knowledge resources according to claim 1, characterized in that in step 2 the whole-part relation is judged specifically as follows: Step 2-1) extract the part-word sets MSET_A and MSET_B of words A and B using HowNet; if B ∈ MSET_A or A ∈ MSET_B, the two words have a whole-part relation, otherwise go to step 2-2); Step 2-2) work with the HowNet sememe definitions: a word whose definition contains the sememe "part|part" is a part word of some other word, and the value of the "whole" attribute in the definition gives the sememe definition of its whole word; extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B accordingly; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains the "whole" attribute with value DEF_B, or DEF_B contains the "whole" attribute with value DEF_A, then words A and B have a whole-part relation, otherwise go to step 3-1); in addition, for some word pairs the sememe definitions cannot be used directly to recognize the whole-part relation effectively; such pairs can be handled by generalization: the value of the "whole" attribute above is generalized to its hypernym concept, and the remaining operations are unchanged.
- 4. The Chinese word semantic relation recognition method combining Chinese and English knowledge resources according to claim 1, characterized in that in step 3 the synonymy relation is judged specifically as follows: Step 3-1) obtain the synonym set SSET_A of word A from the rows marked "=" in the Harbin Institute of Technology Tongyici Cilin (Extended), which denote synonyms; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise go to step 3-2); Step 3-2) extract the synonym set SSET_A of word A using HowNet; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise go to step 3-3); Step 3-3) extract the synonym set SSET_A of word A using Baidu Hanyu; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise go to step 3-4); Step 3-4) from the page links of Baidu Baike, obtain the encyclopedia link page sets PSET_A and PSET_B of words A and B respectively; if [formula missing in source] is satisfied, words A and B have a synonymy relation, otherwise go to step 4-1).
- 5. The Chinese word semantic relation recognition method combining Chinese and English knowledge resources according to claim 1, characterized in that in step 4 the hyponymy relation is judged specifically as follows: Step 4-1) extract the hyponym sets HSET_A and HSET_B of words A and B using HowNet; if B ∈ HSET_A or A ∈ HSET_B, words A and B have a hyponymy relation, otherwise go to step 4-2); Step 4-2) based on the hyponymy implied by the HowNet sememe definitions, extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B respectively; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and which satisfy [formula missing in source] or [formula missing in source], then words A and B have a hyponymy relation.
- 6. A Chinese word semantic relation recognition device combining Chinese and English knowledge resources, characterized in that the device comprises an antonymy recognition unit, a whole-part relation recognition unit, a synonymy recognition unit, a hyponymy recognition unit, a Chinese-English translation unit and an English word semantic relation recognition unit, wherein: the antonymy recognition unit is configured to obtain antonym sets using multiple Chinese knowledge resources and to judge from the antonym sets whether the words have an antonymy relation; the whole-part relation recognition unit is configured to extract part-word sets using multiple Chinese knowledge resources and to judge from the part-word sets whether the words have a whole-part relation; the synonymy recognition unit is configured to extract synonym sets using multiple Chinese knowledge resources and to judge from the synonym sets whether the words have a synonymy relation; the hyponymy recognition unit is configured to extract hyponym sets using multiple Chinese knowledge resources and to judge from the hyponym sets whether the words have a hyponymy relation; the Chinese-English translation unit is configured to translate the Chinese word pair into English using a Chinese-English dictionary; the English word semantic relation recognition unit is configured to perform semantic relation recognition with an English knowledge resource on the English word pairs produced by the Chinese-English translation unit, to determine the semantic relation of the original Chinese word pair.
- 7. The Chinese word semantic relation recognition device combining Chinese and English knowledge resources according to claim 6, characterized in that the device comprises: a HowNet antonymy recognition unit, configured to use the antonymy relations explicitly defined in HowNet: for given words A and B, the antonym set ASET_A of word A is extracted; if B ∈ ASET_A, the two words have an antonymy relation, otherwise control passes to the Baidu Hanyu antonymy recognition unit; in addition, the converse relation defined in HowNet is also treated as a kind of antonymy; a Baidu Hanyu antonymy recognition unit, configured to extract the antonym set ASET_A of the given word A using Baidu Hanyu, and to extract the synonym set SSET_A of word A using the Harbin Institute of Technology Tongyici Cilin (Extended); for each word W ∈ SSET_A, its antonyms are extracted and merged into ASET_A; if B ∈ ASET_A, words A and B have an antonymy relation, otherwise control passes to the Baidu Baike antonymy recognition unit; a Baidu Baike antonymy recognition unit, configured to extract the antonym set ASET_A of word A using Baidu Baike; if B ∈ ASET_A, the two words have an antonymy relation, otherwise control passes to the whole-part relation recognition unit. A HowNet whole-part relation recognition unit, configured to extract the part-word sets MSET_A and MSET_B of words A and B using HowNet; if B ∈ MSET_A or A ∈ MSET_B, the two words have a whole-part relation, otherwise control passes to the sememe definition whole-part relation recognition unit; a sememe definition whole-part relation recognition unit, configured to work with the HowNet sememe definitions: a word whose definition contains the sememe "part|part" is a part word of some other word, and the value of the "whole" attribute in the definition gives the sememe definition of its whole word; the sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted accordingly; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains the "whole" attribute with value DEF_B, or DEF_B contains the "whole" attribute with value DEF_A, then words A and B have a whole-part relation, otherwise control passes to the synonymy recognition unit; in addition, in the sememe definition whole-part relation recognition unit, for some word pairs the sememe definitions cannot be used directly to recognize the whole-part relation effectively; such pairs can be handled by generalization: the value of the "whole" attribute above is generalized to its hypernym concept, and the remaining operations are unchanged. A Cilin synonymy recognition unit, configured to obtain the synonym set SSET_A of word A from the rows marked "=" in the Harbin Institute of Technology Tongyici Cilin (Extended), which denote synonyms; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the HowNet synonymy recognition unit; a HowNet synonymy recognition unit, configured to extract the synonym set SSET_A of word A using HowNet; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the Baidu Hanyu synonymy recognition unit; a Baidu Hanyu synonymy recognition unit, configured to extract the synonym set SSET_A of word A using Baidu Hanyu; if B ∈ SSET_A, words A and B have a synonymy relation, otherwise control passes to the Baidu Baike synonymy recognition unit; a Baidu Baike synonymy recognition unit, configured to obtain, from the page links of Baidu Baike, the encyclopedia link page sets PSET_A and PSET_B of words A and B respectively; if [formula missing in source] is satisfied, words A and B have a synonymy relation, otherwise control passes to the hyponymy recognition unit. A HowNet hyponymy recognition unit, configured to extract the hyponym sets HSET_A and HSET_B of words A and B using HowNet; if B ∈ HSET_A or A ∈ HSET_B, words A and B have a hyponymy relation, otherwise control passes to the sememe definition hyponymy recognition unit; a sememe definition hyponymy recognition unit, configured to use the hyponymy implied by the HowNet sememe definitions: the sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted respectively; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose main sememes are identical and which satisfy [formula missing in source] or [formula missing in source], then words A and B have a hyponymy relation. A Chinese-English translation unit, configured to translate words A and B with a Chinese-English dictionary into the corresponding English sets ENSET(A) and ENSET(B). An English antonymy recognition unit, configured to extract, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the antonym set ENASET_A of word EN_A from the English knowledge resource; if EN_B ∈ ENASET_A, the English words EN_A and EN_B have an antonymy relation, that is, the original Chinese word pair has an antonymy relation; otherwise control passes to the English whole-part relation recognition unit; an English whole-part relation recognition unit, configured to extract, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the part-word sets ENMSET_A and ENMSET_B of words EN_A and EN_B from the English knowledge resource; if EN_B ∈ ENMSET_A or EN_A ∈ ENMSET_B, the English words EN_A and EN_B have a whole-part relation, that is, the original Chinese word pair has a whole-part relation; otherwise control passes to the English synonymy recognition unit; an English synonymy recognition unit, configured to extract, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the synonym set ENSSET_A of word EN_A from the English knowledge resource; if EN_B ∈ ENSSET_A, the English words EN_A and EN_B have a synonymy relation, that is, the original Chinese word pair has a synonymy relation; otherwise control passes to the English hyponymy recognition unit; an English hyponymy recognition unit, configured to extract, for each English word EN_A ∈ ENSET(A) and EN_B ∈ ENSET(B), the hyponym sets ENHSET_A and ENHSET_B of words EN_A and EN_B from the English knowledge resource; if EN_B ∈ ENHSET_A or EN_A ∈ ENHSET_B, the English words EN_A and EN_B have a hyponymy relation, that is, the original Chinese word pair has a hyponymy relation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710706832.9A CN107451130B (en) | 2017-08-17 | 2017-08-17 | Chinese word semantic relation recognition method and device combining Chinese and English knowledge resources |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710706832.9A CN107451130B (en) | 2017-08-17 | 2017-08-17 | Chinese word semantic relation recognition method and device combining Chinese and English knowledge resources |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107451130A (en) | 2017-12-08 |
CN107451130B (en) | 2021-04-02 |
Family
ID=60492720
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710706832.9A, Active (CN107451130B) | Chinese word semantic relation recognition method and device combining Chinese and English knowledge resources | 2017-08-17 | 2017-08-17 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107451130B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109902673A (en) * | 2019-01-28 | 2019-06-18 | 北京明略软件系统有限公司 | Table Header information identification and method for sorting, system, terminal and storage medium in table |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2280159A1 (en) * | 2009-07-31 | 2011-02-02 | International Engine Intellectual Property Company, LLC. | Exhaust gas cooler |
CN103473222A (en) * | 2013-09-16 | 2013-12-25 | 中央民族大学 | Semantic ontology creation and vocabulary expansion method for Tibetan language |
CN104484411A (en) * | 2014-12-16 | 2015-04-01 | 中国科学院自动化研究所 | Building method for semantic knowledge base based on a dictionary |
CN106202034A (en) * | 2016-06-29 | 2016-12-07 | 齐鲁工业大学 | A kind of adjective word sense disambiguation method based on interdependent constraint and knowledge and device |
Non-Patent Citations (3)
Title |
---|
IRIS HENDRICKX et al.: "SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals", 《SEW '09: PROCEEDINGS OF THE WORKSHOP ON SEMANTIC EVALUATIONS: RECENT ACHIEVEMENTS AND FUTURE DIRECTIONS》 * |
SHUTIAN MA et al.: "NLPCC 2016 Shared Task Chinese Words Similarity Measure via Ensemble Learning Based on Multiple Resources", 《NLPCC 2016: NATURAL LANGUAGE UNDERSTANDING AND INTELLIGENT APPLICATIONS》 * |
范庆虎 et al.: "基于词典和Web的词汇关系抽取" (Lexical relation extraction based on dictionaries and the Web), 《HTTP://WWW.DOC88.COM/P-1146077617476.HTML》 * |
Also Published As
Publication number | Publication date |
---|---|
CN107451130B (en) | 2021-04-02 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Amsler | The structure of the Merriam-Webster pocket dictionary | |
CN104516947B (en) | A kind of Chinese microblog emotional analysis method for merging dominant and recessive character | |
CN105988990A (en) | Device and method for resolving zero anaphora in Chinese language, as well as training method | |
CN105787461A (en) | Text-classification-and-condition-random-field-based adverse reaction entity identification method in traditional Chinese medicine literature | |
CN107451130A (en) | A kind of Chinese word semantic relation recognition methods of combination China and Britain knowledge resource and device | |
CN111488427A (en) | Vehicle interaction method, vehicle interaction system, computing device and storage medium | |
CN113723528A (en) | Vehicle-mounted voice-video fusion multi-mode interaction method, system, device and storage medium | |
CN111784526A (en) | Personalized recommendation method for personal accident risk | |
CN107451123A (en) | A kind of Chinese word semantic relation recognition methods and device based on a variety of Chinese knowledge resources | |
CN108563647A (en) | A kind of automobile Method for Sales Forecast method based on comment sentiment analysis | |
CN114419589A (en) | Road target detection method based on attention feature enhancement module | |
Coxon et al. | Urban mobility design | |
Sytsma | Ordinary meaning and consilience of evidence | |
CN102929863A (en) | Method for intelligently analyzing Chinese character emotional tendency through computer | |
CN109670480A (en) | Image discriminating method, device, equipment and storage medium | |
van Dulken | Do you know English? The challenge of the English language for patent searchers | |
CN202911888U (en) | Folding luggage box type bicycle | |
Williams | Motoring: Swanky, safe, and more affordable | |
CN105955993B (en) | Search result ordering method and device | |
CN109241013A (en) | A kind of method of book content audit in shared book system | |
Lee | Random Forest with Transfer Learning: An Application to Vehicle Valuation | |
Belletti Felicio et al. | Classification of Motorcycles using Extracted Images of Traffic Monitoring Videos | |
Cialdai et al. | Motorcycle-to-car impact: influence of the mass of the rider in the calculation of the relative impact velocity | |
JP2002117401A (en) | Adult and sex image detection system | |
Vadivel et al. | FINE-GRAINED MULTI-CLASS ROAD SEGMENTATION USING MULTISCALE PROBABILITY LEARNING |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||