Disclosure of Invention
To overcome the shortcomings of the prior art, the invention discloses a method and a device for recognizing semantic relations between Chinese words based on multiple Chinese knowledge resources, so that the semantic relation between Chinese words can be judged more accurately and effectively.
To this end, the invention provides the following technical solution:
A Chinese word semantic relation recognition method based on multiple Chinese knowledge resources comprises the following steps:
step one, acquiring an antonym set by combining multiple Chinese knowledge resources, and judging from the antonym set whether an antonym relation exists between the words;
step two, extracting a part-word set using multiple Chinese knowledge resources, and judging from the part-word set whether a whole-part relation exists between the words;
step three, extracting a synonym set using multiple Chinese knowledge resources, and judging from the synonym set whether a synonymy relation exists between the words;
step four, extracting a hyponym set using multiple Chinese knowledge resources, and judging from the hyponym set whether a hypernym-hyponym relation exists between the words.
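The four steps form an ordered cascade: each relation is tested in turn, and the first test that succeeds determines the result. The following is a minimal sketch of that control flow only. The function name `recognize_relation`, the label strings and the toy `HYPONYMS` table are illustrative assumptions, not part of the disclosed method.

```python
from typing import Callable, List, Optional, Tuple

# A checker takes the word pair (a, b) and reports whether its relation holds.
Checker = Callable[[str, str], bool]


def recognize_relation(a: str, b: str,
                       checkers: List[Tuple[str, Checker]]) -> Optional[str]:
    """Try each relation in the given order; return the first matching label."""
    for label, check in checkers:
        if check(a, b):
            return label
    return None  # no relation recognized


# Toy stand-in for a hyponym resource (hypothetical data).
HYPONYMS = {"motor vehicle": {"truck", "bus"}}


def is_hypernym_hyponym(a: str, b: str) -> bool:
    # The relation is checked in both directions, as in step four.
    return b in HYPONYMS.get(a, set()) or a in HYPONYMS.get(b, set())


# Order follows steps one to four; the first three are placeholders here.
CHECKERS: List[Tuple[str, Checker]] = [
    ("antonym", lambda a, b: False),
    ("whole-part", lambda a, b: False),
    ("synonym", lambda a, b: False),
    ("hypernym-hyponym", is_hypernym_hyponym),
]

result = recognize_relation("motor vehicle", "truck", CHECKERS)
```

Keeping the order in a list makes the precedence between relations explicit and easy to change.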
Further, in step one, the antonym relation is determined as follows:
step 1-1) for given words A and B, extracting the antonym set ASET_A of word A using the antonym relation explicitly defined in HowNet; if B ∈ ASET_A, the two words have an antonym relation; otherwise, go to step 1-2). In addition, the converse relation defined in HowNet is also treated as an antonym relation;
step 1-2) extracting the antonym set ASET_A of the given word A using Baidu Chinese; extracting the synonym set SSET_A of word A using the extended edition of the HIT (Harbin Institute of Technology) synonym thesaurus Tongyici Cilin; for each word W ∈ SSET_A, extracting its antonyms and merging them into ASET_A; if B ∈ ASET_A, the words A and B have an antonym relation; otherwise, go to step 1-3);
step 1-3) extracting the antonym set ASET_A of word A using Baidu Encyclopedia; if B ∈ ASET_A, the two words have an antonym relation; otherwise, go to step 2-1).
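Steps 1-1) to 1-3) can be sketched as a sequence of set lookups. The sketch below uses plain dictionaries as stand-ins for HowNet, Baidu Chinese, Tongyici Cilin and Baidu Encyclopedia; all names and the toy data are hypothetical. It also assumes that in step 1-2) the antonyms of each synonym W are looked up in Baidu Chinese, which the text does not state explicitly.

```python
from typing import Dict, Set

Lexicon = Dict[str, Set[str]]  # word -> set of related words


def has_antonym_relation(a: str, b: str,
                         hownet_ant: Lexicon,
                         baidu_ant: Lexicon,
                         cilin_syn: Lexicon,
                         baike_ant: Lexicon) -> bool:
    # Step 1-1: antonyms (and converses) explicitly defined in HowNet.
    if b in hownet_ant.get(a, set()):
        return True

    # Step 1-2: Baidu Chinese antonyms of A, expanded with the antonyms of
    # every Cilin synonym W of A (assumed to come from Baidu Chinese too).
    aset = set(baidu_ant.get(a, set()))
    for w in cilin_syn.get(a, set()):
        aset |= baidu_ant.get(w, set())
    if b in aset:
        return True

    # Step 1-3: antonyms listed in Baidu Encyclopedia.
    return b in baike_ant.get(a, set())


# Toy resources (hypothetical data, for illustration only).
HOWNET_ANT: Lexicon = {"up": {"down"}}
BAIDU_ANT: Lexicon = {"cold": {"hot"}, "chilly": {"warm"}}
CILIN_SYN: Lexicon = {"cold": {"chilly"}}
BAIKE_ANT: Lexicon = {"open": {"closed"}}
```

With this data, "cold" and "warm" are recognized as antonyms only through the synonym expansion of step 1-2), since "warm" is listed as an antonym of "chilly" rather than of "cold".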
Further, in step two, the whole-part relation is determined as follows:
step 2-1) extracting the part-word sets MSET_A and MSET_B of words A and B, respectively, using HowNet; if B ∈ MSET_A or A ∈ MSET_B, the two words have a whole-part relation; otherwise, go to step 2-2);
step 2-2) processing with HowNet sememe definitions: a definition that contains the sememe "part" indicates that the word is a part of some whole, and the value of the "whole" attribute in that definition gives the sememe definition of its whole word. On this basis, the sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains the attribute "whole" with value DEF_B, or DEF_B contains the attribute "whole" with value DEF_A, the words A and B have a whole-part relation; otherwise, go to step 3-1);
in addition, for word pairs whose whole-part relation cannot be recognized by using the sememe definitions directly, a generalization operation can be applied: the value of the "whole" attribute above is matched at the level of its superordinate concept, and the remaining operations are unchanged.
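The direct check of step 2-2) can be illustrated as follows. This is a sketch under stated assumptions: definitions are written as simplified English-only strings (real HowNet definitions pair each sememe with a Chinese gloss), the "whole" value is assumed not to contain nested braces, and the helper names are hypothetical.

```python
import re
from typing import Optional, Set

# Matches a non-nested value of the "whole" attribute, e.g. whole={AnimalHuman}.
WHOLE_RE = re.compile(r"whole=(\{[^{}]*\})")


def whole_value(definition: str) -> Optional[str]:
    """Return the value of the 'whole' attribute of a 'part' definition."""
    if not definition.startswith("{part"):
        return None  # only definitions headed by the sememe "part" qualify
    match = WHOLE_RE.search(definition)
    return match.group(1) if match else None


def is_whole_part(defset_a: Set[str], defset_b: Set[str]) -> bool:
    """True if some DEF_A names a DEF_B as its whole, or the other way round."""
    wholes_a = {whole_value(d) for d in defset_a} - {None}
    wholes_b = {whole_value(d) for d in defset_b} - {None}
    return bool(wholes_a & defset_b) or bool(wholes_b & defset_a)


# Simplified, hypothetical definition sets.
DEFSET_HEAD = {"{part:PartPosition={head},whole={AnimalHuman}}"}
DEFSET_ANIMAL = {"{AnimalHuman}"}
DEFSET_HUMAN = {"{human}"}
```

Here `is_whole_part(DEFSET_ANIMAL, DEFSET_HEAD)` succeeds, while `is_whole_part(DEFSET_HUMAN, DEFSET_HEAD)` fails because "{human}" is not literally the "whole" value; that second case is the one the generalization operation is meant to cover.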
Further, in step three, the synonymy relation is determined as follows:
step 3-1) in Tongyici Cilin (Extended), the lines marked with "=" list synonyms; obtaining the synonym set SSET_A of word A from these lines; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, go to step 3-2);
step 3-2) extracting the synonym set SSET_A of word A using HowNet; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, go to step 3-3);
step 3-3) extracting the synonym set SSET_A of word A using Baidu Chinese; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, go to step 3-4);
step 3-4) obtaining the encyclopedia link page sets PSET_A and PSET_B of words A and B, respectively, from the page links of Baidu Encyclopedia; if PSET_A and PSET_B satisfy the condition [formula missing in the source], the words A and B have a synonymy relation; otherwise, go to step 4-1).
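Step 3-1) relies on the line markers of Tongyici Cilin (Extended). In the commonly distributed text form of that resource, each line starts with a category code whose last character is "=" (the words are synonyms), "#" (same category, but not synonyms) or "@" (the word stands alone). The sketch below builds synonym sets from "=" lines only; the codes and English words in the sample are invented for illustration, and `build_synonym_sets` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_synonym_sets(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each word to the other words sharing an '='-marked line with it."""
    synsets: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # need a code plus at least two words
        code, words = fields[0], fields[1:]
        if not code.endswith("="):
            continue  # '#' and '@' lines do not express synonymy
        for word in words:
            synsets[word].update(w for w in words if w != word)
    return synsets


# Invented sample lines in the Cilin layout (not real entries).
SAMPLE_LINES = [
    "Xx01A01= car automobile",
    "Xx01A02# truck lorry van",
    "Xx01A03@ sleeper",
]

SYNSETS = build_synonym_sets(SAMPLE_LINES)
```

The membership test of step 3-1) then reduces to `b in SYNSETS.get(a, set())`.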
Further, in step four, the hypernym-hyponym relation is determined as follows:
step 4-1) extracting the hyponym sets HSET_A and HSET_B of words A and B, respectively, using HowNet; if B ∈ HSET_A or A ∈ HSET_B, the words A and B have a hypernym-hyponym relation; otherwise, go to step 4-2);
step 4-2) extracting the sememe definition sets DEFSET_A and DEFSET_B of words A and B, respectively, according to the hypernym-hyponym relation implied by HowNet sememe definitions; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose primary sememes are consistent and which satisfy either of two further conditions [formulas missing in the source], the words A and B have a hypernym-hyponym relation.
A Chinese word semantic relation recognition device based on multiple Chinese knowledge resources comprises the following units:
an antonym relation recognition unit, configured to acquire an antonym set using multiple Chinese knowledge resources and to judge from the antonym set whether an antonym relation exists between the words;
a whole-part relation recognition unit, configured to extract a part-word set using multiple Chinese knowledge resources and to judge from the part-word set whether a whole-part relation exists between the words;
a synonymy relation recognition unit, configured to extract a synonym set using multiple Chinese knowledge resources and to judge from the synonym set whether a synonymy relation exists between the words;
and a hypernym-hyponym relation recognition unit, configured to extract a hyponym set using multiple Chinese knowledge resources and to judge from the hyponym set whether a hypernym-hyponym relation exists between the words.
Further, the antonym relation recognition unit comprises:
a HowNet antonym relation recognition unit, configured to extract, for given words A and B, the antonym set ASET_A of word A using the antonym relation explicitly defined in HowNet; if B ∈ ASET_A, the two words have an antonym relation; otherwise, control passes to the Baidu Chinese antonym relation recognition unit. In addition, the converse relation defined in HowNet is also treated as an antonym relation;
a Baidu Chinese antonym relation recognition unit, configured to extract the antonym set ASET_A of the given word A using Baidu Chinese, to extract the synonym set SSET_A of word A using Tongyici Cilin (Extended), and, for each word W ∈ SSET_A, to extract its antonyms and merge them into ASET_A; if B ∈ ASET_A, the words A and B have an antonym relation; otherwise, control passes to the Baidu Encyclopedia antonym relation recognition unit;
a Baidu Encyclopedia antonym relation recognition unit, configured to extract the antonym set ASET_A of word A using Baidu Encyclopedia; if B ∈ ASET_A, the two words have an antonym relation; otherwise, control passes to the whole-part relation recognition unit.
Further, the whole-part relation recognition unit comprises:
a HowNet whole-part relation recognition unit, configured to extract the part-word sets MSET_A and MSET_B of words A and B, respectively, using HowNet; if B ∈ MSET_A or A ∈ MSET_B, the two words have a whole-part relation; otherwise, control passes to the sememe-definition whole-part relation recognition unit;
a sememe-definition whole-part relation recognition unit, configured to process HowNet sememe definitions, in which a definition containing the sememe "part" indicates that the word is a part of some whole and the value of the "whole" attribute gives the sememe definition of its whole word, and thereby to extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains the attribute "whole" with value DEF_B, or DEF_B contains the attribute "whole" with value DEF_A, the words A and B have a whole-part relation; otherwise, control passes to the synonymy relation recognition unit;
in addition, in the sememe-definition whole-part relation recognition unit, word pairs whose whole-part relation cannot be recognized by using the sememe definitions directly can be handled by generalization: the value of the "whole" attribute above is matched at the level of its superordinate concept, and the remaining operations are unchanged.
Further, the synonymy relation recognition unit comprises:
a Cilin synonymy relation recognition unit, configured to obtain the synonym set SSET_A of word A from the lines marked with "=" in Tongyici Cilin (Extended), which list synonyms; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, control passes to the HowNet synonymy relation recognition unit;
a HowNet synonymy relation recognition unit, configured to extract the synonym set SSET_A of word A using HowNet; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, control passes to the Baidu Chinese synonymy relation recognition unit;
a Baidu Chinese synonymy relation recognition unit, configured to extract the synonym set SSET_A of word A using Baidu Chinese; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, control passes to the Baidu Encyclopedia synonymy relation recognition unit;
a Baidu Encyclopedia synonymy relation recognition unit, configured to obtain the encyclopedia link page sets PSET_A and PSET_B of words A and B, respectively, from the page links of Baidu Encyclopedia; if PSET_A and PSET_B satisfy the condition [formula missing in the source], the words A and B have a synonymy relation; otherwise, control passes to the hypernym-hyponym relation recognition unit.
Further, the hypernym-hyponym relation recognition unit comprises:
a HowNet hypernym-hyponym relation recognition unit, configured to extract the hyponym sets HSET_A and HSET_B of words A and B, respectively, using HowNet; if B ∈ HSET_A or A ∈ HSET_B, the words A and B have a hypernym-hyponym relation; otherwise, control passes to the sememe-definition hypernym-hyponym relation recognition unit;
a sememe-definition hypernym-hyponym relation recognition unit, configured to extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B, respectively, according to the hypernym-hyponym relation implied by HowNet sememe definitions; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose primary sememes are consistent and which satisfy either of two further conditions [formulas missing in the source], the words A and B have a hypernym-hyponym relation.
The invention has the beneficial effects that:
1. The invention uses multiple Chinese knowledge resources to recognize the semantic relations of words, making full use of each resource.
2. In whole-part relation recognition, the method takes the characteristics of HowNet sememe definitions into account and supplements them with a generalization operation, which improves the adaptability of the recognition method.
3. In hypernym-hyponym relation recognition, the invention exploits the information contained in HowNet sememe definitions, which improves recognition accuracy.
4. The method and device can automatically recognize the semantic relation of a given word pair, covering antonym, whole-part, synonymy and hypernym-hyponym relations, with high recognition accuracy.
Detailed Description
To help those skilled in the art better understand the solution of the embodiments of the invention, the embodiments are described in detail below with reference to the accompanying drawings and implementation modes.
The semantic relation recognition process is illustrated with a word pair consisting of the word A "motor vehicle" and the word B "truck".
The flow chart of the Chinese word semantic relation recognition method based on multiple Chinese knowledge resources in the embodiment of the invention is shown in FIG. 1 and comprises the following steps:
Step 101: antonym relation recognition.
An antonym set is acquired by combining multiple Chinese knowledge resources, and whether an antonym relation exists between the words is judged from the antonym set, specifically as follows:
Step 1-1) for given words A and B, extracting the antonym set ASET_A of word A using the antonym relation explicitly defined in HowNet; if B ∈ ASET_A, the two words have an antonym relation; otherwise, go to step 1-2). In addition, the converse relation defined in HowNet is also treated as an antonym relation;
Extracting the antonyms (including the converse words) of the word A "motor vehicle" from HowNet gives ASET_A = { "trailer", "cart", "dongfu car", "wheelbarrow", "rickshaw", "yellow croaker", "skeleton car", "rubber car", "bicycle", "horse car", "donkey car", "cow car", "volleyball", "flatbed car", "flatbed tricycle", "rickshaw", "tricycle", "mountain bike", "handcart", "animal car", "cart", "trolley", "ocean car", "moped", "bicycle", "a light horse cart", "chariot", "hub", "halter strap" }. Clearly the word B "truck" ∉ ASET_A, so step 1-2) is performed.
Step 1-2) extracting the antonym set ASET_A of the given word A using Baidu Chinese; extracting the synonym set SSET_A of word A using Tongyici Cilin (Extended); for each word W ∈ SSET_A, extracting its antonyms and merging them into ASET_A; if B ∈ ASET_A, the words A and B have an antonym relation; otherwise, go to step 1-3);
The antonym set ASET_A of the word A "motor vehicle" is extracted from Baidu Chinese. Since the word B "truck" ∉ ASET_A, step 1-3) is performed.
Step 1-3) extracting the antonym set ASET_A of word A using Baidu Encyclopedia; if B ∈ ASET_A, the two words have an antonym relation; otherwise, go to step 2-1).
The antonym set ASET_A of the word A "motor vehicle" is extracted from Baidu Encyclopedia. Since the word B "truck" ∉ ASET_A, go to step 2-1).
Step 102: whole-part relation recognition.
A part-word set is extracted using multiple Chinese knowledge resources, and whether a whole-part relation exists between the words is judged from the part-word set, specifically as follows:
Step 2-1) extracting the part-word sets MSET_A and MSET_B of words A and B, respectively, using HowNet; if B ∈ MSET_A or A ∈ MSET_B, the two words have a whole-part relation; otherwise, go to step 2-2);
Extracting the part-word sets of the words A "motor vehicle" and B "truck" from HowNet gives MSET_A = { "headlight", "steering wheel", "sun visor", "trunk", "rear window", "tailgate", "rear lamp", "rear mirror", "cab", "sidecar", "straddle bucket", "automobile engine", "automobile horn", "automobile accessory", "cylinder", "headlight", "fuel gauge", "tail lamp", "trunk", "throttle", "sun visor roof" } and the corresponding set MSET_B. Since B "truck" ∉ MSET_A and A "motor vehicle" ∉ MSET_B, go to step 2-2).
Step 2-2) processing with HowNet sememe definitions: a definition that contains the sememe "part" indicates that the word is a part of some whole, and the value of the "whole" attribute in that definition gives the sememe definition of its whole word. On this basis, the sememe definition sets DEFSET_A and DEFSET_B of words A and B are extracted; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains the attribute "whole" with value DEF_B, or DEF_B contains the attribute "whole" with value DEF_A, the words A and B have a whole-part relation; otherwise, go to step 3-1);
in addition, for word pairs whose whole-part relation cannot be recognized by using the sememe definitions directly, a generalization operation can be applied: the value of the "whole" attribute above is matched at the level of its superordinate concept, and the remaining operations are unchanged.
Using HowNet, the sememe definition sets DEFSET_A and DEFSET_B of the words A "motor vehicle" and B "truck" are extracted. The definitions in both sets have the primary sememe "LandVehicle" (qualified by "automatic" for A, and by a transport role over "physical" objects for B). Clearly there is no DEF_A ∈ DEFSET_A or DEF_B ∈ DEFSET_B that contains the sememe "part", so go to step 3-1).
Step 103: synonymy relation recognition.
A synonym set is extracted using multiple Chinese knowledge resources, and whether a synonymy relation exists between the words is judged from the synonym set, specifically as follows:
Step 3-1) in Tongyici Cilin (Extended), the lines marked with "=" list synonyms; obtaining the synonym set SSET_A of word A from these lines; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, go to step 3-2);
The synonym set SSET_A of the word A "motor vehicle" is extracted from Tongyici Cilin (Extended). Since B "truck" ∉ SSET_A, go to step 3-2).
Step 3-2) extracting the synonym set SSET_A of word A using HowNet; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, go to step 3-3);
In HowNet, extracting the synonyms of the word A "motor vehicle" gives SSET_A = { "motor vehicle", "automobile", "car", "sleeper" }. Since B "truck" ∉ SSET_A, go to step 3-3).
Step 3-3) extracting the synonym set SSET_A of word A using Baidu Chinese; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, go to step 3-4);
In Baidu Chinese, the synonym set SSET_A of the word A "motor vehicle" is extracted. Since B "truck" ∉ SSET_A, go to step 3-4).
Step 3-4) obtaining the encyclopedia link page sets PSET_A and PSET_B of words A and B, respectively, from the page links of Baidu Encyclopedia; if PSET_A and PSET_B satisfy the condition [formula missing in the source], the words A and B have a synonymy relation; otherwise, go to step 4-1).
In Baidu Encyclopedia, the encyclopedia link page sets of the words A "motor vehicle" and B "truck" are extracted, giving PSET_A = { "https://baike [truncated in the source] and PSET_B = { "https://baike.baidu.com/item/truck/4339", "https://baike.baidu.com/item/truck/15281831", "https://baike.baidu.com/item/truck/622401", "https://baike.baidu.com/item/truck/3697802", "https://baike.baidu.com/item/truck/7109303", "https://baike.baidu.com/item/truck/3697784" }. Since the condition is not satisfied, go to step 4-1).
Step 104: hypernym-hyponym relation recognition.
A hyponym set is extracted using multiple Chinese knowledge resources, and whether a hypernym-hyponym relation exists between the words is judged from the hyponym set, specifically as follows:
Step 4-1) extracting the hyponym sets HSET_A and HSET_B of words A and B, respectively, using HowNet; if B ∈ HSET_A or A ∈ HSET_B, the words A and B have a hypernym-hyponym relation; otherwise, go to step 4-2);
Extracting the hyponyms of the words A "motor vehicle" and B "truck" from HowNet gives HSET_A = { "Audi", "bus", "regular bus", "charter", "BMW", "galloping", "honk", "coach", "taxi", "coach", "bus", "universe", "taxi", "tram", "Toyota", "Ford", "bus", "black car", "truck", "container car", "airport bus", "emergency tender", "taxi", "traffic van", "coach", "police car", "ambulance", "old car", "truck", "cadilac", "empty car", "linken", "leak car", "truck", "kadi lac", "taxi", "parking in a car", "trolley", "station wagon", "tourist coach", "merseidess", "minibus", "shuttle coach", "double bus", "private car", "commuter coach", "shuttle bus", "trolley bus", "modern", "fire truck", "minibus", "mini-bus", "minibus", "cruiser", "patrol car", "tourist coach", "cross-country vehicle", "transport vehicle", "truck", "dump truck", "dump truck" } and the corresponding set HSET_B.
Since the word B "truck" ∈ HSET_A, the words A "motor vehicle" and B "truck" have a hypernym-hyponym relation, and the semantic relation recognition for this word pair is complete at this point.
Step 4-2) extracting the sememe definition sets DEFSET_A and DEFSET_B of words A and B, respectively, according to the hypernym-hyponym relation implied by HowNet sememe definitions; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose primary sememes are consistent and which satisfy either of two further conditions [formulas missing in the source], the words A and B have a hypernym-hyponym relation.
In the same way, semantic relation recognition can be completed for the word pair "person" and "head". To illustrate the generalization operation, the description below starts directly from step 2-2).
In HowNet, the sememe definition sets of the words A "person" and B "head" are extracted. DEFSET_A contains the definition "{human|人}" together with a number of other definitions built on the sememe "human" (for example with third-person, "mass" or "adult" qualifiers) and attribute definitions whose host is human. DEFSET_B contains a definition whose primary sememe is "part|部件", which denotes the head and whose "whole" attribute has the value "{AnimalHuman|动物}". Clearly there is no DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains the attribute "whole" with value DEF_B, or DEF_B contains the attribute "whole" with value DEF_A. A generalization operation is therefore performed: the definition DEF_A "{human|人}" is generalized to its superordinate concept "{AnimalHuman|动物}". There now exists DEF_B ∈ DEFSET_B that contains the attribute "whole" with the value "{AnimalHuman|动物}", so the words "person" and "head" have a whole-part relation.
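The generalization in this worked example can be sketched as a walk up a sememe hierarchy. The code below is illustrative only: definitions are simplified English-only strings, the `HYPERNYM` map is a hypothetical one-entry stand-in for the HowNet sememe taxonomy, and the function names are assumptions rather than part of any HowNet API.

```python
import re
from typing import Dict, Iterable, Set

# Matches a non-nested value of the "whole" attribute.
WHOLE_RE = re.compile(r"whole=(\{[^{}]*\})")


def whole_values(defset: Iterable[str]) -> Set[str]:
    """Collect the 'whole' attribute values found in a definition set."""
    found: Set[str] = set()
    for definition in defset:
        match = WHOLE_RE.search(definition)
        if match:
            found.add(match.group(1))
    return found


def generalizations(definition: str, hypernym: Dict[str, str]) -> Set[str]:
    """Return the definition itself plus all of its superordinate concepts."""
    seen: Set[str] = set()
    current = definition
    while current is not None and current not in seen:  # guards against cycles
        seen.add(current)
        current = hypernym.get(current)
    return seen


def whole_part_with_generalization(defset_whole: Iterable[str],
                                   defset_part: Iterable[str],
                                   hypernym: Dict[str, str]) -> bool:
    """True if a 'whole' value of the part word matches a definition of the
    candidate whole word or any of its superordinate concepts."""
    targets = whole_values(defset_part)
    return any(generalizations(d, hypernym) & targets for d in defset_whole)


# Hypothetical data mirroring the "person" / "head" example.
HYPERNYM = {"{human}": "{AnimalHuman}"}
DEFSET_PERSON = {"{human}"}
DEFSET_HEAD = {"{part:PartPosition={head},whole={AnimalHuman}}"}
```

With the hierarchy supplied, `whole_part_with_generalization(DEFSET_PERSON, DEFSET_HEAD, HYPERNYM)` holds; with an empty hierarchy it does not, which corresponds to the failed direct match described above.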
Through the above operation steps, the semantic relation recognition work of the given word pair can be completed.
Correspondingly, the embodiment of the invention also provides a Chinese word semantic relation recognition device based on multiple Chinese knowledge resources, whose structure is shown schematically in FIG. 2.
In this embodiment, the device comprises:
an antonym relation recognition unit 201, configured to acquire an antonym set using multiple Chinese knowledge resources and to judge from the antonym set whether an antonym relation exists between the words;
a whole-part relation recognition unit 202, configured to extract a part-word set using multiple Chinese knowledge resources and to judge from the part-word set whether a whole-part relation exists between the words;
a synonymy relation recognition unit 203, configured to extract a synonym set using multiple Chinese knowledge resources and to judge from the synonym set whether a synonymy relation exists between the words;
and a hypernym-hyponym relation recognition unit 204, configured to extract a hyponym set using multiple Chinese knowledge resources and to judge from the hyponym set whether a hypernym-hyponym relation exists between the words.
The structure of the antonym relation recognition unit 201 of the device shown in FIG. 2 is shown schematically in FIG. 3. It comprises:
a HowNet antonym relation recognition unit 301, configured to extract, for given words A and B, the antonym set ASET_A of word A using the antonym relation explicitly defined in HowNet; if B ∈ ASET_A, the two words have an antonym relation; otherwise, control passes to the Baidu Chinese antonym relation recognition unit. In addition, the converse relation defined in HowNet is also treated as an antonym relation;
a Baidu Chinese antonym relation recognition unit 302, configured to extract the antonym set ASET_A of the given word A using Baidu Chinese, to extract the synonym set SSET_A of word A using Tongyici Cilin (Extended), and, for each word W ∈ SSET_A, to extract its antonyms and merge them into ASET_A; if B ∈ ASET_A, the words A and B have an antonym relation; otherwise, control passes to the Baidu Encyclopedia antonym relation recognition unit;
a Baidu Encyclopedia antonym relation recognition unit 303, configured to extract the antonym set ASET_A of word A using Baidu Encyclopedia; if B ∈ ASET_A, the two words have an antonym relation; otherwise, control passes to the whole-part relation recognition unit.
FIG. 4 is a schematic structural diagram of the whole-part relation recognition unit 202 of the device shown in FIG. 2. It comprises:
a HowNet whole-part relation recognition unit 401, configured to extract the part-word sets MSET_A and MSET_B of words A and B, respectively, using HowNet; if B ∈ MSET_A or A ∈ MSET_B, the two words have a whole-part relation; otherwise, control passes to the sememe-definition whole-part relation recognition unit;
a sememe-definition whole-part relation recognition unit 402, configured to process HowNet sememe definitions, in which a definition containing the sememe "part" indicates that the word is a part of some whole and the value of the "whole" attribute gives the sememe definition of its whole word, and thereby to extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B such that DEF_A contains the attribute "whole" with value DEF_B, or DEF_B contains the attribute "whole" with value DEF_A, the words A and B have a whole-part relation; otherwise, control passes to the synonymy relation recognition unit;
in addition, in the sememe-definition whole-part relation recognition unit, word pairs whose whole-part relation cannot be recognized by using the sememe definitions directly can be handled by generalization: the value of the "whole" attribute above is matched at the level of its superordinate concept, and the remaining operations are unchanged.
FIG. 5 is a schematic structural diagram of the synonymy relation recognition unit 203 of the device shown in FIG. 2. It comprises:
a Cilin synonymy relation recognition unit 501, configured to obtain the synonym set SSET_A of word A from the lines marked with "=" in Tongyici Cilin (Extended), which list synonyms; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, control passes to the HowNet synonymy relation recognition unit;
a HowNet synonymy relation recognition unit 502, configured to extract the synonym set SSET_A of word A using HowNet; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, control passes to the Baidu Chinese synonymy relation recognition unit;
a Baidu Chinese synonymy relation recognition unit 503, configured to extract the synonym set SSET_A of word A using Baidu Chinese; if B ∈ SSET_A, the words A and B have a synonymy relation; otherwise, control passes to the Baidu Encyclopedia synonymy relation recognition unit;
a Baidu Encyclopedia synonymy relation recognition unit 504, configured to obtain the encyclopedia link page sets PSET_A and PSET_B of words A and B, respectively, from the page links of Baidu Encyclopedia; if PSET_A and PSET_B satisfy the condition [formula missing in the source], the words A and B have a synonymy relation; otherwise, control passes to the hypernym-hyponym relation recognition unit.
FIG. 6 is a schematic structural diagram of the hypernym-hyponym relation recognition unit 204 of the device shown in FIG. 2. It comprises:
a HowNet hypernym-hyponym relation recognition unit 601, configured to extract the hyponym sets HSET_A and HSET_B of words A and B, respectively, using HowNet; if B ∈ HSET_A or A ∈ HSET_B, the words A and B have a hypernym-hyponym relation; otherwise, control passes to the sememe-definition hypernym-hyponym relation recognition unit;
a sememe-definition hypernym-hyponym relation recognition unit 602, configured to extract the sememe definition sets DEFSET_A and DEFSET_B of words A and B, respectively, according to the hypernym-hyponym relation implied by HowNet sememe definitions; if there exist DEF_A ∈ DEFSET_A and DEF_B ∈ DEFSET_B whose primary sememes are consistent and which satisfy either of two further conditions [formulas missing in the source], the words A and B have a hypernym-hyponym relation.
The Chinese word semantic relation recognition device based on various Chinese knowledge resources shown in fig. 2 to 6 can be integrated into various hardware entities. For example, a Chinese term semantic relationship recognition device based on multiple Chinese knowledge resources can be integrated into: personal computers, smart phones, workstations, and the like.
The Chinese word semantic relation recognition method based on multiple Chinese knowledge resources provided by the embodiment of the invention can be stored on various storage media as instructions or instruction sets. Such storage media include, but are not limited to: floppy disk, optical disk, hard disk, memory, U disk, CF card, SM card, etc.
In summary, in the embodiment of the invention, an antonym set is acquired by combining multiple Chinese knowledge resources, and whether an antonym relation exists between the words is judged from the antonym set; a part-word set is extracted using multiple Chinese knowledge resources, and whether a whole-part relation exists between the words is judged from the part-word set; a synonym set is extracted using multiple Chinese knowledge resources, and whether a synonymy relation exists between the words is judged from the synonym set; a hyponym set is extracted using multiple Chinese knowledge resources, and whether a hypernym-hyponym relation exists between the words is judged from the hyponym set. Applying the embodiment of the invention therefore realizes Chinese word semantic relation recognition based on multiple Chinese knowledge resources. The implementation uses multiple different Chinese knowledge resources to recognize the semantic relations of words, making full use of each resource. In whole-part relation recognition, it takes the characteristics of HowNet sememe definitions into account and supplements them with a generalization operation, which improves the adaptability of the recognition method. In hypernym-hyponym relation recognition, it exploits the information contained in HowNet sememe definitions, which improves recognition accuracy. The method and device can automatically recognize the semantic relation of a given word pair, covering antonym, whole-part, synonymy and hypernym-hyponym relations, with high recognition accuracy.
The embodiments in this specification are described in a progressive manner, and like parts may be referred to each other. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points.
The foregoing detailed description of the embodiments of the present invention has been presented for purposes of illustration and description and is intended to be exemplary only of the method and apparatus for practicing the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and therefore the present specification should not be construed as limiting the present invention.