CN109726385A - Word sense disambiguation method and device, word sense expansion method and apparatus - Google Patents

Word sense disambiguation method and device, word sense expansion method and apparatus

Publication number
CN109726385A
CN109726385A CN201711048364.7A CN201711048364A CN109726385A CN 109726385 A CN109726385 A CN 109726385A CN 201711048364 A CN201711048364 A CN 201711048364A CN 109726385 A CN109726385 A CN 109726385A
Authority
CN
China
Prior art keywords
word
training
related word
input sentence
disambiguation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711048364.7A
Other languages
Chinese (zh)
Inventor
张驰
郭心语
李安新
陈岚
礒田佳德
小野隆哉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NTT Docomo Inc
Original Assignee
NTT Docomo Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NTT Docomo Inc
Priority to CN201711048364.7A (published as CN109726385A)
Priority to JP2020524159A (published as JP2021501420A)
Priority to CN201880071178.1A (published as CN111295661A)
Priority to PCT/CN2018/104334 (published as WO2019085640A1)
Publication of CN109726385A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a hypernym-based word sense disambiguation method and device, and to a word sense expansion method and device that use the word sense disambiguation method. The word sense disambiguation method includes: receiving an input sentence; determining a disambiguation target word in the input sentence based on a predetermined ambiguity dictionary; determining a related word of the target word based on syntactic analysis and context information analysis of the input sentence; determining one or more hypernyms of the related word; and determining the word sense of the target word in the input sentence based on the word form and part of speech of the related word and of the one or more hypernyms, and on their syntactic relation with the target word.

Description

Word sense disambiguation method and device, word sense expansion method and apparatus
Technical field
The present invention relates to the field of artificial intelligence and, more particularly, to a word sense disambiguation method and device, a word sense expansion method and apparatus that use the word sense disambiguation method, and a computer-readable storage medium.
Background
Word sense disambiguation (WSD) refers to determining the sense of a polysemous word in a specific natural-language context. It is a basic problem in the field of natural language processing. When a sentence to be processed contains a polysemous word and the correct sense of that word in the sentence context cannot be determined, word ambiguity arises and seriously impairs the machine's understanding and processing of natural language. Application fields based on natural language, such as language identification, machine translation, information retrieval, text classification, and automatic summarization, all need to solve the word sense disambiguation problem for polysemous words.
Currently, corpus-based word sense disambiguation schemes mainly include supervised and unsupervised methods. Unsupervised methods need no training corpus, but their disambiguation accuracy cannot meet practical requirements. Current supervised methods need a large-scale, high-quality corpus to train the disambiguation model, and once a sentence to be disambiguated contains a word that the corpus does not cover, the sense of the ambiguous word may be impossible to determine.
Summary of the invention
In view of the above problems, the present invention provides a word sense disambiguation method and device, a word sense expansion method and apparatus that use the word sense disambiguation method, and a computer-readable storage medium.
According to one embodiment of the present invention, a word sense disambiguation method is provided, including: receiving an input sentence; determining a disambiguation target word in the input sentence based on a predetermined ambiguity dictionary; determining a related word of the target word based on syntactic analysis and context information analysis of the input sentence; determining one or more hypernyms of the related word; and determining the word sense of the target word in the input sentence based on the related word and the one or more hypernyms.
In addition, in the word sense disambiguation method according to an embodiment of the present invention, determining the related word of the target word based on syntactic analysis and context information analysis of the input sentence includes: determining the part of speech of each word in the input sentence based on part-of-speech tagging of the input sentence; and determining the related word of the target word according to a predetermined rule, based on the parts of speech, the result of the syntactic analysis, and the result of the context analysis of the target word.
In addition, the word sense disambiguation method according to an embodiment of the present invention further includes training in advance a word sense disambiguation module that executes the word sense disambiguation method, wherein training the word sense disambiguation module includes: annotating training data for training; performing data processing on the training data and obtaining the predetermined ambiguity dictionary; for each training sentence in the training data, determining a disambiguation training target word in the training sentence based on the predetermined ambiguity dictionary; determining a training related word of the training target word based on syntactic analysis and context information analysis of the training sentence; determining, as training features, the training target word, the training related word, the hypernyms of the training target word and the training related word, and their word forms, parts of speech, and syntactic relations with the target word; and training the word sense disambiguation module using the training features.
According to another embodiment of the present invention, a word sense expansion method is provided, including: receiving an input sentence; determining a disambiguation target word and non-ambiguous words in the input sentence based on a predetermined ambiguity dictionary; determining the word sense of the disambiguation target word in the input sentence using a word sense disambiguation module; determining, based on a predetermined thesaurus, synonyms and hypernyms corresponding respectively to the non-ambiguous words and to the word sense of the disambiguation target word; and expanding the input sentence using the synonyms and hypernyms, wherein determining the word sense of the disambiguation target word in the input sentence using the word sense disambiguation module includes: determining a related word of the target word based on syntactic analysis and context information analysis of the input sentence; determining one or more hypernyms of the related word; and determining the word sense of the target word in the input sentence based on the related word and the one or more hypernyms.
In addition, in the word sense expansion method according to another embodiment of the present invention, determining the related word of the target word based on syntactic analysis and context information analysis of the input sentence includes: determining the part of speech of each word in the input sentence based on part-of-speech tagging of the input sentence; and determining the related word of the target word according to a predetermined rule, based on the parts of speech, the result of the syntactic analysis, and the result of the context analysis of the target word.
In addition, the word sense expansion method according to another embodiment of the present invention further includes training in advance the word sense disambiguation module that executes the word sense disambiguation method, wherein training the word sense disambiguation module includes: annotating training data for training; performing data processing on the training data and obtaining the predetermined ambiguity dictionary; for each training sentence in the training data, determining a disambiguation training target word in the training sentence based on the predetermined ambiguity dictionary; determining a training related word of the training target word based on syntactic analysis and context information analysis of the training sentence; determining, as training features, the training target word, the training related word, the hypernyms of the training target word and the training related word, and their word forms, parts of speech, and syntactic relations with the target word; and training the word sense disambiguation module using the training features.
According to still another embodiment of the present invention, a word sense disambiguation device is provided, including: a receiving unit configured to receive an input sentence; a target word determination unit configured to determine a disambiguation target word in the input sentence based on a predetermined ambiguity dictionary; a related word determination unit configured to determine a related word of the target word based on syntactic analysis and context information analysis of the input sentence; a hypernym determination unit configured to determine one or more hypernyms of the related word; and a word sense disambiguation unit configured to determine the word sense of the target word in the input sentence based on the related word and the one or more hypernyms.
In addition, in the word sense disambiguation device according to still another embodiment of the present invention, the related word determination unit is further configured to: determine the part of speech of each word in the input sentence based on part-of-speech tagging of the input sentence; and determine the related word of the target word according to a predetermined rule, based on the parts of speech, the result of the syntactic analysis, and the result of the context analysis of the target word.
In addition, the word sense disambiguation device according to still another embodiment of the present invention further includes a training unit configured to: annotate training data for training; perform data processing on the training data and obtain the predetermined ambiguity dictionary; for each training sentence in the training data, determine a disambiguation training target word in the training sentence based on the predetermined ambiguity dictionary; determine a training related word of the training target word based on syntactic analysis and context information analysis of the training sentence; determine, as training features, the training target word, the training related word, the hypernyms of the training target word and the training related word, and their word forms, parts of speech, and syntactic relations with the target word; and train the word sense disambiguation unit using the training features.
According to yet another embodiment of the present invention, a word sense expansion apparatus is provided, including: a receiving module configured to receive an input sentence; a target word determining module configured to determine a disambiguation target word and non-ambiguous words in the input sentence based on a predetermined ambiguity dictionary; a word sense disambiguation module configured to determine the word sense of the disambiguation target word in the input sentence; and a word sense expansion module configured to determine, based on a predetermined thesaurus, synonyms and hypernyms corresponding respectively to the non-ambiguous words and to the word sense of the disambiguation target word, and to expand the input sentence using the synonyms and hypernyms, wherein the word sense disambiguation module further includes: a related word determination unit configured to determine a related word of the target word based on syntactic analysis and context information analysis of the input sentence; a hypernym determination unit configured to determine one or more hypernyms of the related word; and a word sense disambiguation unit configured to determine the word sense of the target word in the input sentence based on the related word and the one or more hypernyms.
In addition, in the word sense expansion apparatus according to yet another embodiment of the present invention, the related word determination unit is further configured to: determine the part of speech of each word in the input sentence based on part-of-speech tagging of the input sentence; and determine the related word of the target word according to a predetermined rule, based on the parts of speech, the result of the syntactic analysis, and the result of the context analysis of the target word.
In addition, the word sense expansion apparatus according to yet another embodiment of the present invention further includes a training module configured to: annotate training data for training; perform data processing on the training data and obtain the predetermined ambiguity dictionary; for each training sentence in the training data, determine a disambiguation training target word in the training sentence based on the predetermined ambiguity dictionary; determine a training related word of the training target word based on syntactic analysis and context information analysis of the training sentence; determine, as training features, the training target word, the training related word, the hypernyms of the training target word and the training related word, and their word forms, parts of speech, and syntactic relations with the target word; and train the word sense disambiguation unit using the training features.
According to yet another embodiment of the present invention, a word sense disambiguation device is provided, including: a processor; and a memory configured to store computer program instructions, wherein the computer program instructions, when run by the processor, cause the processor to execute the word sense disambiguation method.
According to yet another embodiment of the present invention, a word sense expansion device is provided, including: a processor; and a memory configured to store computer program instructions, wherein the computer program instructions, when run by the processor, cause the processor to execute the word sense expansion method.
According to yet another embodiment of the present invention, a computer-readable storage medium is provided that stores computer program instructions, wherein the computer program instructions, when run by a processor, cause the processor to execute the word sense disambiguation method.
According to yet another embodiment of the present invention, a computer-readable storage medium is provided that stores computer program instructions, wherein the computer program instructions, when run by a processor, cause the processor to execute the word sense expansion method.
The word sense disambiguation method and device according to embodiments of the present invention, and the word sense expansion method and apparatus that use the word sense disambiguation method, determine the related word of the disambiguation target word by syntactic analysis and expand the related word to its hypernyms. The sense of the disambiguation target word is thus determined by considering both the related word and its hypernyms, which greatly reduces the dependence on training corpus size.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and are intended to provide further explanation of the claimed technology.
Brief description of the drawings
The above and other objects, features, and advantages of the present invention will become more apparent from the following more detailed description of embodiments of the present invention taken in conjunction with the accompanying drawings. The drawings provide a further understanding of the embodiments of the present invention, constitute a part of the specification, and serve together with the embodiments to explain the present invention; they do not limit the present invention. In the drawings, the same reference numerals generally denote the same components or steps.
Fig. 1 is a flowchart illustrating a word sense disambiguation method according to an embodiment of the present invention;
Fig. 2 is a flowchart further illustrating the word sense disambiguation method according to an embodiment of the present invention;
Fig. 3 is a flowchart illustrating a training method of a word sense disambiguation module according to an embodiment of the present invention;
Fig. 4 is a block diagram illustrating a word sense disambiguation device according to an embodiment of the present invention;
Fig. 5 is a flowchart illustrating a word sense expansion method according to an embodiment of the present invention;
Fig. 6 is a block diagram illustrating a word sense expansion apparatus according to an embodiment of the present invention;
Fig. 7 is a schematic diagram illustrating a word sense expansion process according to an embodiment of the present invention;
Fig. 8 is a hardware block diagram illustrating a word sense disambiguation device according to an embodiment of the present invention;
Fig. 9 is a hardware block diagram illustrating a word sense expansion device according to an embodiment of the present invention; and
Fig. 10 is a schematic diagram illustrating a computer-readable storage medium according to an embodiment of the present invention.
Detailed description
To make the objects, technical solutions, and advantages of the present invention more apparent, example embodiments of the present invention are described in detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein. All other embodiments obtained by those skilled in the art from the embodiments described herein without creative effort shall fall within the protection scope of the present invention.
Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. First, with reference to Figs. 1 to 4, the word sense disambiguation method according to an embodiment of the present invention, the training method of the word sense disambiguation module that implements the method, and the word sense disambiguation device that uses the method are described.
Fig. 1 is a flowchart illustrating a word sense disambiguation method according to an embodiment of the present invention. As shown in Fig. 1, the word sense disambiguation method according to an embodiment of the present invention includes the following steps.
In step S101, an input sentence is received. In an embodiment of the present invention, for example, the sentence "his skill in martial arts is very high" is received. Thereafter, the process proceeds to step S102.
In step S102, a disambiguation target word in the input sentence is determined based on a predetermined ambiguity dictionary. In an embodiment of the present invention, the ambiguity dictionary is generated from the training corpus in the training stage, which is described later. The words of the input sentence are looked up in the ambiguity dictionary, and any word that appears in the dictionary as an ambiguous word is determined to be a disambiguation target word. For example, for the sentence "his skill in martial arts is very high" received in step S101, "high" is determined to be the disambiguation target word. The disambiguation target word "high" has different senses such as "superb, superpower" and "tall". Thereafter, the process proceeds to step S103.
In step S103, a related word of the target word is determined based on syntactic analysis and context information analysis of the input sentence. How the related word is determined from these analyses is described in detail below with reference to Fig. 2. For example, for the sentence "his skill in martial arts is very high" received in step S101, after "high" has been determined to be the disambiguation target word in step S102, "skill in martial arts" is determined to be the related word of the target word "high" in step S103. Thereafter, the process proceeds to step S104.
In step S104, one or more hypernyms of the related word are determined. For example, where "high" is the disambiguation target word and "skill in martial arts" is its related word, the hypernym "ability, intelligence and art" of the related word "skill in martial arts" is determined. Thereafter, the process proceeds to step S105.
In step S105, the word sense of the target word in the input sentence is determined based on the related word and the one or more hypernyms. For example, based on the related word "skill in martial arts" and the hypernym "ability, intelligence and art", it is easy to determine that the sense of "high" that corresponds to "ability, intelligence and art" is "superb, superpower" rather than "tall".
In the word sense disambiguation method according to the embodiment of the present invention described with reference to Fig. 1, the related word of the target word is determined through syntactic analysis and context information analysis of the input sentence and is then expanded to its hypernyms, so that the sense of the disambiguation target word is determined by considering both the related word and its hypernyms. This greatly reduces the dependence on training corpus size. For example, even if the related word "skill in martial arts" does not occur in a small training corpus, the correct sense of the target word "high" in the sentence can still be determined through the hypernym "ability, intelligence and art" of the related word. Without this expansion of the related word to its hypernyms, the sense of the target word might not be determined correctly, because the related word does not appear in a training corpus of limited size.
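The flow of steps S101 to S105 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the sentence is assumed to be already tokenized and parsed, the dictionaries `AMBIGUITY_DICT`, `HYPERNYMS`, and `SENSE_EVIDENCE` are hypothetical toy stand-ins, and a plain lookup table takes the place of the trained word sense disambiguation module described later.

```python
# Toy resources: all entries are hypothetical stand-ins built from the
# worked example in the text, not data taken from a real dictionary.
AMBIGUITY_DICT = {"high": ["superb, superpower", "tall"]}
HYPERNYMS = {"skill in martial arts": ["ability, intelligence and art"]}
# Which cue word (related word or hypernym) supports which sense of a target.
# In the described method a trained classifier plays this role.
SENSE_EVIDENCE = {("high", "ability, intelligence and art"): "superb, superpower"}


def find_target_words(tokens):
    """S102: every token listed in the ambiguity dictionary is a target word."""
    return [t for t in tokens if t in AMBIGUITY_DICT]


def find_related_word(target, dependencies):
    """S103 (simplified): take the first word syntactically linked to the target."""
    for head, dependent in dependencies:
        if head == target:
            return dependent
        if dependent == target:
            return head
    return None


def disambiguate(tokens, dependencies):
    """S101-S105 for one pre-tokenized, pre-parsed input sentence."""
    senses = {}
    for target in find_target_words(tokens):
        related = find_related_word(target, dependencies)
        # S104: expand the related word to its hypernyms.
        cues = [related] + HYPERNYMS.get(related, [])
        # S105: the first cue with sense evidence decides the sense.
        for cue in cues:
            sense = SENSE_EVIDENCE.get((target, cue))
            if sense is not None:
                senses[target] = sense
                break
    return senses


tokens = ["his", "skill in martial arts", "very", "high"]
dependencies = [("high", "skill in martial arts"), ("high", "very")]
result = disambiguate(tokens, dependencies)
print(result)
```

Note that the related word itself has no entry in `SENSE_EVIDENCE`; the sense is found only through its hypernym, which mirrors the case where the related word is missing from a small training corpus.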
Fig. 2 is a flowchart further illustrating the word sense disambiguation method according to an embodiment of the present invention. As shown in Fig. 2, the word sense disambiguation method according to an embodiment of the present invention includes the following steps.
In step S200, the word sense disambiguation module is trained. In an embodiment of the present invention, a support vector machine (SVM) classifier can be used as the word sense disambiguation module, so before the word sense disambiguation method is executed, the word sense disambiguation module needs to be trained on a training corpus. The training method of the word sense disambiguation module according to an embodiment of the present invention is described in detail below with reference to Fig. 3. After the trained word sense disambiguation module has been obtained, the process proceeds to step S201.
Steps S201 and S202 in Fig. 2 are respectively the same as steps S101 and S102 described above with reference to Fig. 1, and their repeated description is omitted. Thereafter, the process proceeds to step S203. Steps S203 and S204 are the specific steps of determining the related word of the target word in step S103 described with reference to Fig. 1.
In step S203, the part of speech of each word in the input sentence is determined based on part-of-speech tagging of the input sentence. In an embodiment of the present invention, part-of-speech (POS) tagging is used to obtain the parts of speech of the words in the input sentence. Thereafter, the process proceeds to step S204.
In step S204, the related word of the target word is determined according to a predetermined rule, based on the parts of speech, the result of the syntactic analysis, and the result of the context analysis of the target word.
In an embodiment of the present invention, the syntactic relation types obtained by the syntactic analysis can be represented, for example, by Table 1 below:
Table 1
After the part of speech of each word has been determined in step S203 and the syntactic relation types have been determined in step S204, the related word of the target word can be determined according to the predetermined rule. For example, for the input sentence "his skill in martial arts is very high", the target word "high" is an adjective, "he" is a pronoun, "skill in martial arts" is a noun, and "very" is an adverb. In addition, the syntactic analysis shows that a syntactic relation exists between "skill in martial arts" and "high", so "skill in martial arts" is determined to be the related word of the target word "high". After the related word has been determined, the process proceeds to step S205.
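A rule of the kind used in steps S203 and S204 might look like the sketch below. The relation labels ("subject", "attribute", "adverbial", "object") and the contents of `RULES` are assumptions made for illustration, since the relation types of Table 1 and the actual predetermined rule are not given in the text; the part-of-speech tags follow the worked example above.

```python
from typing import NamedTuple, Optional


class Dependency(NamedTuple):
    head: str
    dependent: str
    relation: str  # hypothetical label; the source's Table 1 is not available


# Hypothetical predetermined rule: for a target word with a given part of
# speech, which (relation, part of speech of the other word) pairs qualify
# the other word as the related word.
RULES = {
    "adjective": {("subject", "noun"), ("attribute", "noun")},
    "verb": {("subject", "noun"), ("object", "noun")},
}


def related_word(target, pos_tags, dependencies) -> Optional[str]:
    """Return the first word linked to the target that satisfies a rule."""
    allowed = RULES.get(pos_tags.get(target), set())
    for dep in dependencies:
        if target not in (dep.head, dep.dependent):
            continue
        other = dep.dependent if dep.head == target else dep.head
        if (dep.relation, pos_tags.get(other)) in allowed:
            return other
    return None


pos_tags = {
    "he": "pronoun",
    "skill in martial arts": "noun",
    "very": "adverb",
    "high": "adjective",
}
dependencies = [
    Dependency("high", "very", "adverbial"),
    Dependency("high", "skill in martial arts", "subject"),
    Dependency("skill in martial arts", "he", "attribute"),
]
found = related_word("high", pos_tags, dependencies)
print(found)
```

The adverb "very" is also attached to "high", but no rule admits an adverb, so only the noun "skill in martial arts" is selected as the related word.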
Steps S205 and S206 are respectively the same as steps S104 and S105 described with reference to Fig. 1, in which the word sense of the target word in the input sentence is determined based on the related word and the one or more hypernyms; their repeated description is omitted here.
Fig. 3 is a flowchart illustrating a training method of the word sense disambiguation module according to an embodiment of the present invention. As shown in Fig. 3, the training method of the word sense disambiguation module according to an embodiment of the present invention includes the following steps.
In step S301, training data for training is annotated. Thereafter, the process proceeds to step S302.
In step S302, data processing is performed on the training data, and the predetermined ambiguity dictionary is obtained. In an embodiment of the present invention, the data processing filters the training data and extracts the useful data, and an ambiguity dictionary containing a predetermined number of ambiguous words is obtained. Thereafter, the process proceeds to step S303.
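One possible reading of step S302 is sketched below: a word counts as ambiguous if the annotated data labels it with more than one sense, and the most frequent such words are kept up to the predetermined number. Both this selection criterion and the sample annotations (including the word "bank") are assumptions for illustration; the text does not specify how the dictionary is filtered.

```python
from collections import Counter, defaultdict


def build_ambiguity_dictionary(annotated, max_words):
    """Build {word: sorted senses} from (word, sense_label) annotation pairs.

    A word is treated as ambiguous when it carries more than one sense label.
    Only the `max_words` most frequent ambiguous words are kept.
    """
    senses = defaultdict(set)
    counts = Counter()
    for word, sense in annotated:
        senses[word].add(sense)
        counts[word] += 1
    ambiguous = [w for w in senses if len(senses[w]) > 1]
    # Most frequent first; ties are broken alphabetically for determinism.
    ambiguous.sort(key=lambda w: (-counts[w], w))
    return {w: sorted(senses[w]) for w in ambiguous[:max_words]}


# Hypothetical annotated training data.
annotated = [
    ("high", "superb, superpower"),
    ("high", "tall"),
    ("high", "tall"),
    ("bank", "riverside"),
    ("bank", "institution"),
    ("very", "degree"),
]
ambiguity_dict = build_ambiguity_dictionary(annotated, max_words=1)
print(ambiguity_dict)
```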
In step S303, for each training sentence in the training data, a disambiguation training target word in the training sentence is determined based on the predetermined ambiguity dictionary. The method of determining the disambiguation training target word for each training sentence in step S303 is the same as in step S102 described above with reference to Fig. 1 and step S202 described with reference to Fig. 2; in each case it can be carried out by looking up the ambiguity dictionary. Thereafter, the process proceeds to step S304.
In step S304, a training related word of the training target word is determined based on syntactic analysis and context information analysis of each training sentence. The method of determining the training related word of the disambiguation training target word in each training sentence in step S304 is the same as in step S103 described above with reference to Fig. 1 and steps S203 and S204 described with reference to Fig. 2: the parts of speech of the training sentence are obtained by part-of-speech (POS) tagging, the syntactic relation types are determined by syntactic analysis, and the related word of the target word is determined according to the predetermined rule. Thereafter, the process proceeds to step S305.
In step S305, the training target word, the training related word, the hypernyms of the training target word and the training related word, and their word forms, parts of speech, and syntactic relations with the target word are determined as training features. In an embodiment of the present invention, the training target word, the training related word, the hypernyms of the training target word and the training related word, and the word forms, parts of speech, and so on of these words are extracted as training features, and a conversion (for example, feature hashing) is applied to the features to obtain features suitable for machine learning. Thereafter, the process proceeds to step S306.
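The feature extraction and conversion of step S305 can be illustrated with a small feature hashing sketch. The feature names, the bucket count of 32, and the use of MD5 as the hash function are assumptions chosen for the example; hypernyms of the target word are omitted for brevity, and the relation label "subject" is hypothetical.

```python
import hashlib


def extract_features(target, related, related_hypernyms, pos_tags, relation):
    """Turn the items listed in step S305 into string features."""
    feats = [
        f"target={target}",
        f"target_pos={pos_tags[target]}",
        f"related={related}",
        f"related_pos={pos_tags[related]}",
        f"relation={relation}",
    ]
    feats += [f"related_hypernym={h}" for h in related_hypernyms]
    return feats


def hash_features(features, n_buckets=32):
    """Feature hashing: count each string feature in a fixed-size vector."""
    vector = [0.0] * n_buckets
    for feat in features:
        digest = hashlib.md5(feat.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % n_buckets] += 1.0
    return vector


pos_tags = {"high": "adjective", "skill in martial arts": "noun"}
features = extract_features(
    "high",
    "skill in martial arts",
    ["ability, intelligence and art"],
    pos_tags,
    "subject",
)
vector = hash_features(features)
print(len(vector), sum(vector))
```

A fixed-length vector of this kind can be passed, together with the annotated sense label, to a classifier; with scikit-learn, for instance, `sklearn.svm.SVC` would be one way to train the SVM mentioned in the text, although the source does not name a library.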
In step S306, training characteristics training word sense disambiguation module is utilized.In an embodiment of the present invention, training is utilized Feature SVM classifier, and trained model is saved as word sense disambiguation module.
Fig. 4 is the block diagram for illustrating the word sense disambiguation equipment of embodiment according to the present invention.As shown in figure 4, according to the present invention Embodiment word sense disambiguation equipment 400 include receiving unit 401, target word determination unit 402, related term determination unit 403, Hypernym determination unit 404 and word sense disambiguation unit 405.
Specifically, receiving unit 401 is configured to receive read statement.Target word determination unit 402 is configured to make a reservation for Ambiguity dictionary determines the disambiguation target word in the read statement.Related term determination unit 403 is configured to the input The syntactic analysis of sentence and contextual information analysis, determine the related term of the target word.Hypernym determination unit 404 configures For one or more hypernyms of the determination related term.Word sense disambiguation unit 405 is configured to the related term and institute One or more hypernyms are stated, determine the meaning of a word of the target word in the read statement.The related term determination unit 403 It is further configured to determine the word of each word in the read statement based on the part of speech analysis mark to the read statement Property;And result based on the part of speech and the syntactic analysis and to contextual analysis of target word etc. as a result, according to pre- Set pattern then determines the related term of the target word.Each unit of word sense disambiguation equipment 400 as described above execute referring to Fig.1 and The Word sense disambiguation method of the embodiment according to the present invention of Fig. 2 description.
In addition, the word sense disambiguation device 400 according to an embodiment of the present invention may further include a training unit (not shown). The training unit is configured to: annotate training data for training; perform data processing on the training data to obtain the predetermined ambiguity dictionary; for each training sentence in the training data, determine a disambiguation training target word in that training sentence based on the predetermined ambiguity dictionary; determine a training related word of the training target word based on syntactic analysis and contextual information analysis of each training sentence; determine, as training features, the training target word, the training related word, the hypernyms of the training target word and of the training related word, and their morphology, part of speech, and syntactic relation to the target word; and train the word sense disambiguation unit using the training features.
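One way the ambiguity dictionary might be derived from annotated training data is sketched below. The source does not describe the data processing step in detail, so the criterion used here (a word is ambiguous if it is annotated with more than one sense across the training data), the data layout, and the sense labels are assumptions.

```python
from collections import defaultdict


def build_ambiguity_dictionary(annotated_sentences):
    """Collect words annotated with more than one sense.

    Each sentence is a list of (word, sense) pairs; sense is None when
    the word carries no sense annotation.
    """
    senses = defaultdict(set)
    for sentence in annotated_sentences:
        for word, sense in sentence:
            if sense is not None:
                senses[word].add(sense)
    return {word: sorted(s) for word, s in senses.items() if len(s) > 1}


training_data = [
    [("he", None), ("sat", None), ("by", None), ("the", None), ("bank", "bank/shore")],
    [("the", None), ("bank", "bank/finance"), ("closed", None)],
    [("the", None), ("river", "river/stream"), ("rose", None)],
]
ambiguity_dictionary = build_ambiguity_dictionary(training_data)
```

Here "river" is annotated but has only one sense, so only "bank" ends up in the dictionary.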
The word sense disambiguation method and the word sense disambiguation device according to embodiments of the present invention have been described above with reference to Figs. 1 to 4. Hereinafter, a word sense expansion method and a word sense expansion apparatus that use the word sense disambiguation method according to an embodiment of the present invention will be described with reference to Figs. 5 to 7.
Fig. 5 is a flowchart illustrating a word sense expansion method according to an embodiment of the present invention. As shown in Fig. 5, the word sense expansion method according to an embodiment of the present invention includes the following steps.
In step S501, an input sentence is received. In an embodiment of the present invention, the word sense expansion method performs word sense expansion on the words in the received input sentence. Thereafter, processing proceeds to step S502.
In step S502, the disambiguation target word and the non-ambiguous words in the input sentence are determined based on a predetermined ambiguity dictionary. In an embodiment of the present invention, the predetermined ambiguity dictionary may be determined in the training stage as described above with reference to Fig. 3. Thereafter, processing proceeds to step S503.
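Step S502 amounts to a dictionary lookup over the words of the input sentence. The sketch below assumes the sentence has already been segmented into words and that the ambiguity dictionary is a plain mapping from word to candidate senses; both are assumptions, and the example words are hypothetical.

```python
def split_words(words, ambiguity_dictionary):
    """Return (disambiguation targets, non-ambiguous words), preserving order."""
    targets, non_ambiguous = [], []
    for word in words:
        if word in ambiguity_dictionary:
            targets.append(word)
        else:
            non_ambiguous.append(word)
    return targets, non_ambiguous


ambiguity_dictionary = {"bank": ["bank/finance", "bank/shore"]}
words = ["he", "sat", "by", "the", "bank"]
targets, non_ambiguous = split_words(words, ambiguity_dictionary)
```

Only the targets need to pass through the word sense disambiguation module; the non-ambiguous words can go to the expansion step directly.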
In step S503, the word sense of the disambiguation target word in the input sentence is determined using the word sense disambiguation module. In an embodiment of the present invention, the word sense disambiguation module executes the word sense disambiguation method described with reference to Fig. 1 and Fig. 2: the related word of the target word is determined by syntactic analysis and contextual information analysis of the input sentence, the related word is expanded to its hypernyms, and the word sense of the disambiguation target word is determined by considering the related word and its hypernyms. Thereafter, processing proceeds to step S504.
In step S504, synonyms and hypernyms corresponding respectively to the non-ambiguous words and to the word sense of the disambiguation target word are determined based on a predetermined thesaurus. In an embodiment of the present invention, the predetermined thesaurus may be an existing Chinese thesaurus. Thereafter, processing proceeds to step S505.
In step S505, the input sentence is expanded using the synonyms and hypernyms.
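Steps S504 and S505 can be sketched as a thesaurus lookup followed by string assembly. The thesaurus layout, the sense keys, and the bracketed output format `[word | synonyms | <hypernyms>]` are assumptions made for illustration; the format is loosely modeled on the expanded sentence shown in the Fig. 7 example.

```python
def expand_sentence(words, senses, thesaurus):
    """Expand each word with its synonyms and hypernyms.

    senses maps each disambiguated target word to its chosen sense key;
    non-ambiguous words are looked up in the thesaurus under their own text.
    """
    parts = []
    for word in words:
        key = senses.get(word, word)
        entry = thesaurus.get(key)
        if entry is None:
            parts.append(word)  # no thesaurus entry: keep the word unchanged
            continue
        synonyms = " ".join(entry.get("synonyms", []))
        hypernyms = " ".join(entry.get("hypernyms", []))
        group = f"[{word} | {synonyms}"
        if hypernyms:
            group += f" | <{hypernyms}>"
        parts.append(group + "]")
    return " ".join(parts)


thesaurus = {
    "bank/shore": {"synonyms": ["shore", "riverside"], "hypernyms": ["land"]},
    "river": {"synonyms": ["stream"]},
}
words = ["the", "bank", "of", "the", "river"]
senses = {"bank": "bank/shore"}
expanded = expand_sentence(words, senses, thesaurus)
```

Looking up the target word under its disambiguated sense key, rather than its surface form, is what keeps synonyms of the wrong sense out of the expanded sentence.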
Fig. 6 is a block diagram illustrating a word sense expansion apparatus according to an embodiment of the present invention. As shown in Fig. 6, the word sense expansion apparatus 600 according to an embodiment of the present invention includes a receiving module 601, a target word determination module 602, a word sense disambiguation module 603, and a word sense expansion module 604.
Specifically, the receiving module 601 is configured to receive an input sentence. The target word determination module 602 is configured to determine the disambiguation target word and the non-ambiguous words in the input sentence based on a predetermined ambiguity dictionary. The word sense disambiguation module 603 is configured to determine the word sense of the disambiguation target word in the input sentence. The word sense expansion module 604 is configured to determine, based on a predetermined thesaurus, synonyms and hypernyms corresponding respectively to the non-ambiguous words and to the word sense of the disambiguation target word, and to expand the input sentence using the synonyms and hypernyms.
More specifically, the word sense disambiguation module 603 further includes: a related word determination unit 6031, configured to determine the related word of the target word based on syntactic analysis and contextual information analysis of the input sentence; a hypernym determination unit 6032, configured to determine one or more hypernyms of the related word; and a word sense disambiguation unit 6033, configured to determine the word sense of the target word in the input sentence based on the related word and the one or more hypernyms. The related word determination unit 6031 is further configured to determine the part of speech of each word in the input sentence by part-of-speech tagging of the input sentence, and to determine the related word of the target word according to predetermined rules, based on the part of speech, the result of the syntactic analysis, and the result of the contextual analysis of the target word.
In addition, the word sense expansion apparatus 600 according to an embodiment of the present invention may further include a training module (not shown). The training module is configured to: annotate training data for training; perform data processing on the training data to obtain the predetermined ambiguity dictionary; for each training sentence in the training data, determine a disambiguation training target word in that training sentence based on the predetermined ambiguity dictionary; determine a training related word of the training target word based on syntactic analysis and contextual information analysis of each training sentence; determine, as training features, the training target word, the training related word, the hypernyms of the training target word and of the training related word, and their morphology, part of speech, and syntactic relation to the target word; and train the word sense disambiguation unit using the training features.
Fig. 7 is a schematic diagram illustrating a word sense expansion process according to an embodiment of the present invention. Specifically, Fig. 7 illustrates an example in which the word sense expansion apparatus 600 described with reference to Fig. 6 executes the word sense expansion method described with reference to Fig. 5.
As shown in Fig. 7, the receiving module 601 receives the input sentence "How many competition events does the Olympics respectively have?".
The input sentence enters the target word determination module 602, which determines the disambiguation target word and the non-ambiguous words in the input sentence based on the predetermined ambiguity dictionary. In this example, the target word determination module 602 determines that "respectively" in the input sentence "How many competition events does the Olympics respectively have?" is the disambiguation target word and that the other words are non-ambiguous words.
The target word determination module 602 supplies the determined disambiguation target word "respectively" to the word sense disambiguation module 603. The word sense disambiguation module 603 executes the word sense disambiguation method according to an embodiment of the present invention on the disambiguation target word "respectively" and determines its word sense.
The word "respectively", whose word sense has been determined by the word sense disambiguation module 603, and the words determined to be non-ambiguous by the target word determination module 602 enter the word sense expansion module 604. Using a Chinese thesaurus, the word sense expansion module 604 expands the input sentence "How many competition events does the Olympics respectively have?" into the expanded sentence "[Olympics | the Olympics Olympic Games | < match contest is had a competition competing skill trial of strength >] [Respectively | it respectively exists side by side individually respectively separately] [have | have possess possess possess it is all] how many [match | match contest, which has a competition with racing, to haggle] [project | genre categories category type class]?".
Fig. 8 is a hardware block diagram illustrating a word sense disambiguation device according to an embodiment of the present invention. As shown in Fig. 8, the word sense disambiguation device 800 according to an embodiment of the present invention includes a processor 801 and a memory 802. The memory 802 is configured to store computer program instructions which, when run by the processor 801, execute the word sense disambiguation method described above with reference to the figures.
Fig. 9 is a hardware block diagram illustrating a word sense expansion device according to an embodiment of the present invention. As shown in Fig. 9, the word sense expansion device 900 according to an embodiment of the present invention includes a processor 901 and a memory 902. The memory 902 is configured to store computer program instructions which, when run by the processor 901, execute the word sense expansion method described above with reference to the figures.
Fig. 10 is a schematic diagram illustrating a computer-readable storage medium according to an embodiment of the present invention. As shown in Fig. 10, the computer-readable storage medium 1000 according to an embodiment of the present invention has computer program instructions 1001 stored thereon. When the computer program instructions 1001 are run by a processor, the word sense disambiguation method and the word sense expansion method according to embodiments of the present invention described with reference to the figures above are executed.
The word sense disambiguation method and device according to embodiments of the present invention, and the word sense expansion method and apparatus that use the word sense disambiguation method, have been described above with reference to the accompanying drawings. The related word of the disambiguation target word is determined by syntactic analysis, and the related word is expanded to its hypernyms, so that the word sense of the disambiguation target word is determined by considering the related word and its hypernyms, which greatly reduces the dependence on the size of the training corpus.
The basic principles of the present invention have been described above in conjunction with specific embodiments. It should be noted, however, that the advantages, merits, effects, etc. mentioned in the present invention are merely examples and not limitations, and must not be regarded as prerequisites for every embodiment of the present invention. In addition, the specific details disclosed above are provided only for purposes of illustration and ease of understanding, not limitation; they do not restrict the present invention to being implemented with those specific details.
The block diagrams of components, apparatuses, devices, and systems involved in the present invention are illustrative examples only and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will appreciate, these components, apparatuses, devices, and systems may be connected, arranged, or configured in any manner. Words such as "include", "comprise", and "have" are open-ended terms that mean "including but not limited to" and may be used interchangeably with that phrase. The words "or" and "and" as used herein refer to "and/or" and may be used interchangeably with it, unless the context clearly indicates otherwise. The term "such as" as used herein refers to the phrase "such as, but not limited to" and may be used interchangeably with it.
The step flowcharts and the method descriptions above are illustrative examples only and are not intended to require or imply that the steps of each embodiment must be carried out in the order given; certain steps may be performed in parallel, independently of one another, or in another appropriate order. In addition, words such as "thereafter", "then", and "next" are not intended to limit the order of the steps; they are used only to guide the reader through the description of the methods.
In addition, as used herein, "or" in a list of items beginning with "at least one" indicates a disjunctive list, so that a list such as "at least one of A, B, or C" means A or B or C, or AB or AC or BC, or ABC (i.e., A and B and C). Furthermore, the word "exemplary" does not mean that the described example is preferred or better than other examples.
It should also be noted that in the apparatus and methods of the present invention, the components or steps may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalents of the present invention.
Those of ordinary skill in the art will understand that all or any part of the methods and apparatus of the present invention may be implemented in hardware, firmware, software, or a combination thereof, in any computing device (including processors, storage media, etc.) or network of computing devices. The hardware may be a general-purpose processor, a digital signal processor (DSP), an ASIC, a field programmable gate array (FPGA) or other programmable logic device (PLD), discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. The general-purpose processor may be a microprocessor, but in the alternative, the processor may be any commercially available processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, for example a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors cooperating with a DSP core, or any other such configuration. The software may reside in any form of computer-readable tangible storage medium. By way of example and not limitation, such computer-readable tangible storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other tangible medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. As used herein, discs include compact discs (CD), laser discs, optical discs, digital versatile discs (DVD), floppy disks, and Blu-ray discs.
The technology disclosed in the present invention may also be implemented by running a program or a set of programs on any computing device. The computing device may be a well-known general-purpose device. The technology disclosed in the present invention may also be implemented merely by providing a program product containing program code that implements the method or apparatus, or by any storage medium storing such a program product.
Various changes, substitutions, and alterations may be made to the techniques described herein without departing from the techniques taught as defined by the appended claims. Furthermore, the scope of the claims of the present invention is not limited to the specific aspects of the processes, machines, manufacture, compositions of matter, means, methods, and actions described above. Processes, machines, manufacture, compositions of matter, means, methods, or actions, presently existing or later to be developed, that perform substantially the same function or achieve substantially the same result as the corresponding aspects described herein may be utilized. Accordingly, the appended claims include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or actions.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of the present invention. Therefore, the present invention is not intended to be limited to the aspects shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The above description has been presented for purposes of illustration and description. Furthermore, this description is not intended to restrict the embodiments of the present invention to the forms disclosed herein. Although a number of exemplary aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims (16)

1. A word sense disambiguation method, comprising:
receiving an input sentence;
determining a disambiguation target word in the input sentence based on a predetermined ambiguity dictionary;
determining a related word of the target word based on syntactic analysis and contextual information analysis of the input sentence;
determining one or more hypernyms of the related word; and
determining the word sense of the target word in the input sentence based on the morphology, part of speech, and syntactic relation to the target word of the related word and the one or more hypernyms.
2. The word sense disambiguation method of claim 1, wherein determining the related word of the target word based on syntactic analysis and contextual information analysis of the input sentence comprises:
determining the part of speech of each word in the input sentence by part-of-speech tagging of the input sentence; and
determining the related word of the target word according to predetermined rules, based on the part of speech, the result of the syntactic analysis, and the result of the contextual analysis of the target word.
3. The word sense disambiguation method of claim 1 or 2, further comprising training in advance a word sense disambiguation module that executes the word sense disambiguation method, wherein training the word sense disambiguation module comprises:
annotating training data for training;
performing data processing on the training data to obtain the predetermined ambiguity dictionary;
for each training sentence in the training data, determining a disambiguation training target word in the training sentence based on the predetermined ambiguity dictionary;
determining a training related word of the training target word based on syntactic analysis and contextual information analysis of each training sentence;
determining, as training features, the training target word, the training related word, the hypernyms of the training target word and of the training related word, and their morphology, part of speech, and syntactic relation to the target word; and
training the word sense disambiguation module using the training features.
4. A word sense expansion method, comprising:
receiving an input sentence;
determining a disambiguation target word and non-ambiguous words in the input sentence based on a predetermined ambiguity dictionary;
determining the word sense of the disambiguation target word in the input sentence using a word sense disambiguation module;
determining, based on a predetermined thesaurus, synonyms and hypernyms corresponding respectively to the non-ambiguous words and to the word sense of the disambiguation target word; and
expanding the input sentence using the synonyms and hypernyms,
wherein determining the word sense of the disambiguation target word in the input sentence using the word sense disambiguation module comprises:
determining a related word of the target word based on syntactic analysis and contextual information analysis of the input sentence;
determining one or more hypernyms of the related word; and
determining the word sense of the target word in the input sentence based on the related word and the one or more hypernyms.
5. The word sense expansion method of claim 4, wherein determining the related word of the target word based on syntactic analysis and contextual information analysis of the input sentence comprises:
determining the part of speech of each word in the input sentence by part-of-speech tagging of the input sentence; and
determining the related word of the target word according to predetermined rules, based on the part of speech, the result of the syntactic analysis, and the result of the contextual analysis of the target word.
6. The word sense expansion method of claim 4 or 5, further comprising training in advance the word sense disambiguation module that executes the word sense disambiguation method, wherein training the word sense disambiguation module comprises:
annotating training data for training;
performing data processing on the training data to obtain the predetermined ambiguity dictionary;
for each training sentence in the training data, determining a disambiguation training target word in the training sentence based on the predetermined ambiguity dictionary;
determining a training related word of the training target word based on syntactic analysis and contextual information analysis of each training sentence;
determining, as training features, the training target word, the training related word, the hypernyms of the training target word and of the training related word, and their morphology, part of speech, and syntactic relation to the target word; and
training the word sense disambiguation module using the training features.
7. A word sense disambiguation device, comprising:
a receiving unit, configured to receive an input sentence;
a target word determination unit, configured to determine a disambiguation target word in the input sentence based on a predetermined ambiguity dictionary;
a related word determination unit, configured to determine a related word of the target word based on syntactic analysis and contextual information analysis of the input sentence;
a hypernym determination unit, configured to determine one or more hypernyms of the related word; and
a word sense disambiguation unit, configured to determine the word sense of the target word in the input sentence based on the related word and the one or more hypernyms.
8. The word sense disambiguation device of claim 7, wherein the related word determination unit is further configured to:
determine the part of speech of each word in the input sentence by part-of-speech tagging of the input sentence; and
determine the related word of the target word according to predetermined rules, based on the part of speech, the result of the syntactic analysis, and the result of the contextual analysis of the target word.
9. The word sense disambiguation device of claim 7 or 8, further comprising a training unit configured to:
annotate training data for training;
perform data processing on the training data to obtain the predetermined ambiguity dictionary;
for each training sentence in the training data, determine a disambiguation training target word in the training sentence based on the predetermined ambiguity dictionary;
determine a training related word of the training target word based on syntactic analysis and contextual information analysis of each training sentence;
determine, as training features, the training target word, the training related word, the hypernyms of the training target word and of the training related word, and their morphology, part of speech, and syntactic relation to the target word; and
train the word sense disambiguation unit using the training features.
10. A word sense expansion apparatus, comprising:
a receiving module, configured to receive an input sentence;
a target word determination module, configured to determine a disambiguation target word and non-ambiguous words in the input sentence based on a predetermined ambiguity dictionary;
a word sense disambiguation module, configured to determine the word sense of the disambiguation target word in the input sentence; and
a word sense expansion module, configured to determine, based on a predetermined thesaurus, synonyms and hypernyms corresponding respectively to the non-ambiguous words and to the word sense of the disambiguation target word, and to expand the input sentence using the synonyms and hypernyms,
wherein the word sense disambiguation module further comprises:
a related word determination unit, configured to determine a related word of the target word based on syntactic analysis and contextual information analysis of the input sentence;
a hypernym determination unit, configured to determine one or more hypernyms of the related word; and
a word sense disambiguation unit, configured to determine the word sense of the target word in the input sentence based on the related word and the one or more hypernyms.
11. The word sense expansion apparatus of claim 10, wherein the related word determination unit is further configured to:
determine the part of speech of each word in the input sentence by part-of-speech tagging of the input sentence; and
determine the related word of the target word according to predetermined rules, based on the part of speech, the result of the syntactic analysis, and the result of the contextual analysis of the target word.
12. The word sense expansion apparatus of claim 10 or 11, further comprising a training module configured to:
annotate training data for training;
perform data processing on the training data to obtain the predetermined ambiguity dictionary;
for each training sentence in the training data, determine a disambiguation training target word in the training sentence based on the predetermined ambiguity dictionary;
determine a training related word of the training target word based on syntactic analysis and contextual information analysis of each training sentence;
determine, as training features, the training target word, the training related word, the hypernyms of the training target word and of the training related word, and their morphology, part of speech, and syntactic relation to the target word; and
train the word sense disambiguation unit using the training features.
13. A word sense disambiguation device, comprising:
a processor; and
a memory, configured to store computer program instructions;
wherein the computer program instructions, when run by the processor, cause the processor to execute the word sense disambiguation method of claim 1 or 2.
14. A word sense expansion device, comprising:
a processor; and
a memory, configured to store computer program instructions;
wherein the computer program instructions, when run by the processor, cause the processor to execute the word sense expansion method of claim 4 or 5.
15. A computer-readable storage medium storing computer program instructions, wherein the computer program instructions, when run by a processor, cause the processor to execute the word sense disambiguation method of claim 1 or 2.
16. A computer-readable storage medium storing computer program instructions, wherein the computer program instructions, when run by a processor, cause the processor to execute the word sense expansion method of claim 4 or 5.
CN201711048364.7A 2017-10-31 2017-10-31 Word sense disambiguation method and equipment, meaning of a word extended method and device Pending CN109726385A (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201711048364.7A CN109726385A (en) 2017-10-31 2017-10-31 Word sense disambiguation method and equipment, meaning of a word extended method and device
JP2020524159A JP2021501420A (en) 2017-10-31 2018-09-06 Word sense disambiguation method and device, word sense extension method, device and device, computer readable storage medium
CN201880071178.1A CN111295661A (en) 2017-10-31 2018-09-06 Word sense disambiguation method and apparatus, word sense expansion method, device and apparatus, computer readable storage medium
PCT/CN2018/104334 WO2019085640A1 (en) 2017-10-31 2018-09-06 Word meaning disambiguation method and device, word meaning expansion method, apparatus and device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711048364.7A CN109726385A (en) 2017-10-31 2017-10-31 Word sense disambiguation method and equipment, meaning of a word extended method and device

Publications (1)

Publication Number Publication Date
CN109726385A true CN109726385A (en) 2019-05-07

Family

ID=66293105

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201711048364.7A Pending CN109726385A (en) 2017-10-31 2017-10-31 Word sense disambiguation method and equipment, meaning of a word extended method and device
CN201880071178.1A Pending CN111295661A (en) 2017-10-31 2018-09-06 Word sense disambiguation method and apparatus, word sense expansion method, device and apparatus, computer readable storage medium

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201880071178.1A Pending CN111295661A (en) 2017-10-31 2018-09-06 Word sense disambiguation method and apparatus, word sense expansion method, device and apparatus, computer readable storage medium

Country Status (3)

Country Link
JP (1) JP2021501420A (en)
CN (2) CN109726385A (en)
WO (1) WO2019085640A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110321434A (en) * 2019-06-27 2019-10-11 厦门美域中央信息科技有限公司 A kind of file classification method based on word sense disambiguation convolutional neural networks
CN110991196A (en) * 2019-12-18 2020-04-10 北京百度网讯科技有限公司 Translation method and device for polysemous words, electronic equipment and medium
CN111310481A (en) * 2020-01-19 2020-06-19 百度在线网络技术(北京)有限公司 Speech translation method, device, computer equipment and storage medium
CN111414523A (en) * 2020-03-11 2020-07-14 中国建设银行股份有限公司 Data acquisition method and device
CN112632962A (en) * 2020-05-20 2021-04-09 华为技术有限公司 Method and device for realizing natural language understanding in human-computer interaction system

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110134955A (en) * 2019-05-14 2019-08-16 中电协通科技(张家口)有限公司 A kind of semantic processes method
CN110309318B (en) * 2019-05-29 2022-11-29 西安电子科技大学 Intention representation system and method of information communication network, and information data processing terminal
CN111199149B (en) * 2019-12-17 2023-10-20 航天信息股份有限公司 Sentence intelligent clarification method and system for dialogue system
CN111310475B (en) * 2020-02-04 2023-03-10 支付宝(杭州)信息技术有限公司 Training method and device of word sense disambiguation model
CN112580335B (en) * 2020-12-28 2023-03-24 建信金融科技有限责任公司 Method and device for disambiguating polyphone
CN113204962A (en) * 2021-05-31 2021-08-03 平安科技(深圳)有限公司 Word sense disambiguation method, device, equipment and medium based on graph expansion structure
CN113704416B (en) * 2021-10-26 2022-03-04 深圳市北科瑞声科技股份有限公司 Word sense disambiguation method and device, electronic equipment and computer-readable storage medium
CN115204182B (en) * 2022-09-09 2022-11-25 山东天成书业有限公司 Method and system for identifying e-book data to be corrected

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8504355B2 (en) * 2009-11-20 2013-08-06 Clausal Computing Oy Joint disambiguation of syntactic and semantic ambiguity
CN102306144B (en) * 2011-07-18 2013-05-08 南京邮电大学 Terms disambiguation method based on semantic dictionary
CN105718442A (en) * 2016-01-19 2016-06-29 齐鲁工业大学 Word sense disambiguation method based on syntactic analysis
CN106202036B (en) * 2016-06-29 2019-05-21 齐鲁工业大学 A kind of verb Word sense disambiguation method and device based on interdependent constraint and knowledge
CN106598947A (en) * 2016-12-15 2017-04-26 山西大学 Bayesian word sense disambiguation method based on synonym expansion

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110321434A (en) * 2019-06-27 2019-10-11 厦门美域中央信息科技有限公司 A kind of file classification method based on word sense disambiguation convolutional neural networks
CN110991196A (en) * 2019-12-18 2020-04-10 北京百度网讯科技有限公司 Translation method and device for polysemous words, electronic equipment and medium
CN110991196B (en) * 2019-12-18 2021-10-26 北京百度网讯科技有限公司 Translation method and device for polysemous words, electronic equipment and medium
US11275904B2 (en) 2019-12-18 2022-03-15 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for translating polysemy, and medium
CN111310481A (en) * 2020-01-19 2020-06-19 百度在线网络技术(北京)有限公司 Speech translation method, device, computer equipment and storage medium
CN111414523A (en) * 2020-03-11 2020-07-14 中国建设银行股份有限公司 Data acquisition method and device
CN112632962A (en) * 2020-05-20 2021-04-09 华为技术有限公司 Method and device for realizing natural language understanding in human-computer interaction system
CN112632962B (en) * 2020-05-20 2023-11-17 华为技术有限公司 Method and device for realizing natural language understanding in man-machine interaction system

Also Published As

Publication number Publication date
WO2019085640A1 (en) 2019-05-09
CN111295661A (en) 2020-06-16
JP2021501420A (en) 2021-01-14

Similar Documents

Publication Publication Date Title
CN109726385A (en) Word sense disambiguation method and equipment, meaning of a word extended method and device
CN107818085B (en) Answer selection method and system for reading understanding of reading robot
US11531818B2 (en) Device and method for machine reading comprehension question and answer
CN102479191B (en) Method and device for providing multi-granularity word segmentation result
CN109635273A (en) Text key word extracting method, device, equipment and storage medium
Zhou et al. Chinese named entity recognition via joint identification and categorization
US11010554B2 (en) Method and device for identifying specific text information
CN104573099B (en) The searching method and device of topic
CN106021572B (en) The construction method and device of binary feature dictionary
CN108984661A (en) Entity alignment schemes and device in a kind of knowledge mapping
Zhang et al. HANSpeller++: A unified framework for Chinese spelling correction
JP2019082931A (en) Retrieval device, similarity calculation method, and program
JP2021136027A (en) Analysis of theme coverage of documents
CN107943940A (en) Data processing method, medium, system and electronic equipment
Samih et al. Detecting code-switching in Moroccan Arabic social media
CN110633456B (en) Language identification method, language identification device, server and storage medium
Alambo et al. Topic-centric unsupervised multi-document summarization of scientific and news articles
CN106502988B (en) A kind of method and apparatus that objective attribute target attribute extracts
CN114995903A (en) Class label identification method and device based on pre-training language model
CN109657052A (en) A kind of abstract of a thesis contains the abstracting method and device of fine granularity Knowledge Element
Salesky et al. Exploiting morphological, grammatical, and semantic correlates for improved text difficulty assessment
KR101983477B1 (en) Method and System for zero subject resolution in Korean using a paragraph-based pivotal entity identification
CN110263345A (en) Keyword extracting method, device and storage medium
Chistikov et al. Improving prosodic break detection in a Russian TTS system
US20110106849A1 (en) New case generation device, new case generation method, and new case generation program

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190507
