WO2017197947A1 - Method and apparatus for determining an antecedent - Google Patents
Method and apparatus for determining an antecedent
- Publication number
- WO2017197947A1 (application PCT/CN2017/074800)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- candidate
- pronoun
- antecedent
- word
- antecedents
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/268—Morphological analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Definitions
- the present invention relates to the field of information processing, and in particular to a method and apparatus for determining an antecedent.
- In man-machine dialogue, the machine needs to accurately understand the context information in a statement. If it cannot, the dialogue information becomes ambiguous, and pronoun reference is the main cause of this ambiguity.
- Anaphora resolution (rendered in this translation as "referential digestion") is the problem of determining which noun phrase a pronoun points to in a text.
- There are several kinds of anaphora resolution algorithms in the prior art: (1) searching from left to right and traversing the syntax tree level by level to achieve resolution; this algorithm needs to traverse all the information to be identified, so the traversal workload is large; (2) adding semantic constraints on the basis of syntactic knowledge; this method is effective for English pronouns, but Chinese vocabulary is difficult to handle.
- The embodiments of the invention provide a method and an apparatus for determining an antecedent, to solve at least the technical problem of low processing efficiency of anaphora resolution.
- According to one aspect, a method for determining an antecedent is provided, comprising: obtaining statement information to be recognized; when a pronoun is identified in the statement information, extracting from the statement information a plurality of candidate antecedents and word features of the plurality of candidate antecedents; and determining the target antecedent referred to by the pronoun from the plurality of candidate antecedents based on the word features of the plurality of candidate antecedents.
- According to another aspect, an apparatus for determining an antecedent is provided, comprising: an obtaining unit configured to acquire statement information to be recognized; an extracting unit configured to extract, when a pronoun is identified in the statement information, a plurality of candidate antecedents and word features of the plurality of candidate antecedents from the statement information; and a determining unit configured to determine, based on the word features of the plurality of candidate antecedents, the target antecedent referred to by the pronoun from the plurality of candidate antecedents.
- In the embodiments, the candidate antecedents and the word features of each candidate antecedent are extracted from the statement information, and the target antecedent of the pronoun is determined using the word features of the candidate antecedents.
- The target antecedent referred to by the pronoun can thus be located automatically from the word features of the candidate antecedents extracted from the statement information, which solves the problem of low processing efficiency of anaphora resolution in the prior art and achieves the effect of determining the pronoun's antecedent accurately and efficiently.
- FIG. 1 is a schematic diagram of a network environment of an optional method for determining an antecedent according to an embodiment of the present invention
- FIG. 2 is a first flowchart of a method for determining an antecedent according to an embodiment of the present invention.
- FIG. 3 is a second flowchart of a method for determining an antecedent according to an embodiment of the present invention.
- FIG. 4 is a third flowchart of a method for determining an antecedent according to an embodiment of the present invention.
- FIG. 5 is a first schematic diagram of an apparatus for determining an antecedent according to an embodiment of the present invention
- FIG. 6 is a second schematic diagram of an apparatus for determining an antecedent according to an embodiment of the present invention.
- FIG. 7 is a third schematic diagram of an apparatus for determining an antecedent according to an embodiment of the present invention.
- FIG. 8 is a fourth schematic diagram of an apparatus for determining an antecedent according to an embodiment of the present invention.
- FIG. 9 is a block diagram showing the internal structure of a server according to an embodiment of the present invention.
- Antecedent: a word or phrase that is semantically related to the current pronoun, i.e., the word or phrase referred to by the pronoun.
- Session: the set of session information.
- Predicate: a term used to describe or determine the properties or features of an object, or the relationships between objects. The predicate typically includes verbs and adjectives.
- An embodiment of a method for determining an antecedent is provided. It should be noted that the steps illustrated in the flowcharts of the accompanying drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order from the one described herein.
- the above information processing method is applied to the network environment as shown in FIG. 1.
- the network environment includes a terminal 101 and a server 103 (which may be a server or a cloud platform of a network connection application), wherein the terminal may establish a connection with the server through the network, and the processor may be set on both the terminal and the server.
- the above networks include, but are not limited to, a wide area network, a metropolitan area network, or a local area network.
- the terminal may be a terminal having an input device, such as a mobile terminal (for example, a mobile phone, a tablet, etc.), and the terminal may install an intelligent conversation client.
- the server corresponds to the smart conversation client, and the server can be used to process information sent by the terminal by using the smart conversation client.
- FIG. 2 is a flow chart of a method of determining an antecedent in accordance with an embodiment of the present invention. As shown in FIG. 2, the method may include the following steps:
- Step S202: Acquire statement information to be identified.
- Step S204: When a pronoun exists in the statement information, extract from the statement information a plurality of candidate antecedents and word features of the plurality of candidate antecedents.
- Step S206: Determine the target antecedent referred to by the pronoun from the plurality of candidate antecedents based on the word features of the plurality of candidate antecedents.
- With the above embodiment, the candidate antecedents and the word features of each candidate antecedent are extracted from the statement information, and the target antecedent referred to by the pronoun is determined from the word features of the candidate antecedents.
- The target antecedent referred to by the pronoun can thus be located automatically from the word features of the candidate antecedents extracted from the statement information, which solves the problem of low processing efficiency of anaphora resolution in the prior art and achieves the effect of determining the pronoun's antecedent accurately and efficiently.
- The statement information to be identified in the foregoing embodiment may be sent by the terminal 101 to the server. The statement information may be text information, which may be obtained by converting voice information in the session information, may be text information extracted directly from the session information, or may be information extracted from an article. The source of the information is not limited in this application.
- the statement information is a set of session information generated by a client and a server during a human-machine conversation.
- The candidate antecedents and the word features of each candidate antecedent may be extracted from the statement information in sequence, or the word features of a candidate antecedent may be extracted at the same time as the candidate antecedent itself is extracted from the statement information.
- the words referred to by the pronouns may be nouns or noun phrases, and the candidate antecedents extracted are nouns or noun phrases.
- A word segmenter may be used to segment the statement information, and from the plurality of words obtained by the segmentation, the words whose part of speech is pronoun (i.e., the pronouns) and noun or noun phrase (i.e., the candidate antecedents) are extracted.
- the target antecedent referred to by the pronoun may be determined from the plurality of candidate antecedents based on the word features of the plurality of candidate antecedents, wherein the term features may include semantic features and grammatical features.
- Communication is established between the intelligent conversation client and the server, and the session information is sent to the server through the intelligent conversation client. After the server receives the session information:
- if the session information is text information, the session information is used as the statement information;
- if the session information is voice information, the voice information is converted into text information, and the converted text information is used as the statement information.
- The server identifies the statement information. If a pronoun is found in the statement information, the session set generated during the session (i.e., the above statement information) is obtained, a plurality of candidate antecedents and the word features of each candidate antecedent are extracted from the statement information, and the word features are used to determine the target antecedent of the pronoun.
- the pronoun in the statement information may be replaced with the target antecedent to complete the statement information.
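The substitution step can be sketched as follows. This is a minimal illustration only, assuming the statement has already been segmented into tokens and the target antecedent has been determined; the function name `replace_pronoun` and the token-list representation are hypothetical and do not come from the patent text.

```python
from typing import List


def replace_pronoun(tokens: List[str], pronoun_index: int, antecedent: str) -> List[str]:
    """Return a copy of `tokens` in which the pronoun is replaced by the antecedent.

    `tokens` is assumed to be the output of a word segmenter, and
    `pronoun_index` the position of the pronoun that was resolved.
    """
    if not 0 <= pronoun_index < len(tokens):
        raise IndexError("pronoun_index is outside the token list")
    completed = list(tokens)          # leave the caller's list untouched
    completed[pronoun_index] = antecedent
    return completed


statement = ["increase", "his", "punishment"]
completed_statement = replace_pronoun(statement, 1, "the thief's")
```

Returning a new list rather than modifying the input keeps the original statement available, which may be useful if several pronouns in the same statement are resolved one after another.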
- Determining the target antecedent referred to by the pronoun from the plurality of candidate antecedents based on the word features of the plurality of candidate antecedents may include: determining a referential weight value of each candidate antecedent based on the word features of that candidate antecedent; and selecting the candidate antecedent with the largest referential weight value as the target antecedent of the pronoun.
- The word features in the foregoing embodiment may be semantic features or grammatical features. The semantic features and/or grammatical features are used to determine a referential weight value of each candidate antecedent relative to the pronoun, and the referential weight values obtained are sorted to give a weight value sequence. If the sequence is arranged from large to small, the candidate antecedent corresponding to the first weight value in the sequence is used as the target antecedent of the pronoun; if the sequence is arranged from small to large, the candidate antecedent corresponding to the last weight value in the sequence is used as the target antecedent of the pronoun.
- the largest weighting value of the plurality of referential weight values may be obtained according to a pairwise comparison manner.
- the candidate antecedent corresponding to the largest referential weight value is selected as the target antecedent referred to by the pronoun.
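The two selection strategies described above (sorting the weight values, or comparing them pairwise) can be sketched as follows. This is an illustrative sketch, assuming the referential weight values have already been computed and are held in a dictionary keyed by candidate antecedent; the function names and example values are hypothetical.

```python
from typing import Dict


def select_by_sorting(weights: Dict[str, float]) -> str:
    """Sort candidates by weight from large to small and take the first one."""
    if not weights:
        raise ValueError("no candidate antecedents")
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[0]


def select_by_pairwise_comparison(weights: Dict[str, float]) -> str:
    """Keep the larger of the current best and each next candidate in turn."""
    if not weights:
        raise ValueError("no candidate antecedents")
    best = None
    for candidate, weight in weights.items():
        if best is None or weight > weights[best]:
            best = candidate
    return best


example_weights = {"police": 1.25, "thief": 1.75, "prison": 0.5}
```

Sorting from small to large and taking the last entry gives the same candidate as sorting from large to small and taking the first, provided the largest weight value is unique.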
- Each of the plurality of candidate antecedents includes one or more word features. Where each candidate antecedent includes a single word feature, the word feature of each candidate antecedent is converted into a feature value, and the feature value is used as the referential weight value of that candidate antecedent.
- Where each candidate antecedent includes a plurality of word features, determining the referential weight value of each candidate antecedent based on its word features includes: converting each extracted word feature into a feature value; and performing a linear weighting calculation on the feature values of each candidate antecedent, using preset feature coefficients of the one or more word features, to obtain the referential weight value of each candidate antecedent.
- For example, when there are two word features with feature values $t_1$ and $t_2$, the preset feature coefficients of the two word features (written here as $\lambda_1$ and $\lambda_2$) are acquired, and a linear weighting calculation is performed on the two feature values:
- $\mathrm{Weight} = \lambda_1 \times t_1 + \lambda_2 \times t_2$.
- The feature coefficients may be given initial values based on experience, and their sizes may also be adjusted by training on a corpus.
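A minimal sketch of the linear weighting calculation, assuming the word features have already been converted into numeric feature values; the function name `referential_weight` and the example coefficients are illustrative assumptions rather than values given in the patent.

```python
from typing import Sequence


def referential_weight(feature_values: Sequence[float],
                       coefficients: Sequence[float]) -> float:
    """Linearly weight the feature values: sum of coefficient_k * value_k."""
    if len(feature_values) != len(coefficients):
        raise ValueError("each feature value needs exactly one coefficient")
    return sum(c * t for c, t in zip(coefficients, feature_values))


# Two word features with values t1 = 1.0 and t2 = 2.0, and hypothetical
# coefficients 0.5 and 0.25 chosen only for illustration.
weight = referential_weight([1.0, 2.0], [0.5, 0.25])
```

The same function covers any number of word features, so adding a feature only requires one more value and one more coefficient.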
- Each candidate antecedent of the plurality of candidate antecedents includes one or more word features, and the word features include at least one of: the singular/plural feature of the candidate antecedent, the distance between the candidate antecedent and the pronoun, whether the candidate antecedent appears in a prepositional phrase, and the semantic relevance of the pronoun and the candidate antecedent.
- Where the word feature includes the singular/plural feature of the candidate antecedent: a singular pronoun cannot refer to a plural antecedent, so number is an important feature for judging whether two words have a referential relationship. For example, in “Today the weather is very good, my classmates and I are ready to go out for a walk,” the pronoun "I" is singular while "classmates" is plural, and a singular pronoun cannot refer to a plural.
- Whether the number of the candidate antecedent is consistent with the number of the pronoun can be used to convert the singular/plural feature into a feature value: if they are consistent, the feature value is set to a first constant; if they are not, the feature value is set to a second constant. For example, the first constant may be 1 and the second constant may be 0.
- The distance between the candidate antecedent and the pronoun in the above embodiment generally means the distance between the sentences or paragraphs in which the two words are located, and may also mean the number of characters between the two words. In a multi-round conversation, a complete piece of statement information may need to be expressed in several sentences. The closer the sentence containing the candidate antecedent is to the sentence containing the pronoun, the greater the correlation, so the distance between the pronoun and the antecedent is also a significant feature.
- Where the word feature includes the distance between the candidate antecedent and the pronoun, the number of statements between the sentence containing the candidate antecedent and the sentence containing the pronoun, or the number of characters between the two words, may be used as the feature value.
- Nouns in direct objects and indirect objects are referred to with no significant difference in probability, while nouns in prepositional phrases are referred to with lower probability. Therefore, in the embodiments of the present invention, whether the candidate antecedent appears in a prepositional phrase can be used as a word feature.
- If the candidate antecedent appears in a prepositional phrase, the feature value may be set to one constant, such as 1; if the candidate antecedent does not appear in a prepositional phrase, the feature value may be set to another constant, such as 0.
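The conversion of the number, distance, and prepositional-phrase features into feature values can be sketched as follows. This is a hedged illustration: the `Mention` record and its fields are hypothetical, and the grammatical information (plurality, sentence index, prepositional-phrase membership) is assumed to come from an upstream tagger or parser that is not shown.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    """A pronoun or candidate antecedent with pre-computed grammatical info."""
    text: str
    is_plural: bool
    sentence_index: int                 # which statement of the session it is in
    in_prepositional_phrase: bool = False


def number_agreement_value(candidate: Mention, pronoun: Mention,
                           first_constant: int = 1, second_constant: int = 0) -> int:
    """First constant if singular/plural is consistent, second constant otherwise."""
    return first_constant if candidate.is_plural == pronoun.is_plural else second_constant


def distance_value(candidate: Mention, pronoun: Mention) -> int:
    """Number of statements separating the candidate from the pronoun."""
    return abs(pronoun.sentence_index - candidate.sentence_index)


def prepositional_phrase_value(candidate: Mention) -> int:
    """1 if the candidate appears in a prepositional phrase, 0 otherwise."""
    return 1 if candidate.in_prepositional_phrase else 0


pronoun_i = Mention("I", is_plural=False, sentence_index=0)
classmates = Mention("classmates", is_plural=True, sentence_index=0)
```

With the example sentence from the text, `number_agreement_value(classmates, pronoun_i)` yields the second constant, reflecting that the singular "I" cannot refer to the plural "classmates".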
- The relevance of the semantically dependent words may also be used as a word feature (i.e., the semantic relevance of the pronoun and the candidate antecedent in the above embodiment). For example, take the statement "The police found that the thief had escaped from prison and increased his punishment." Here the candidate antecedent "thief" and the pronoun "he" depend on "escape from prison" and "punishment" respectively. These two dependent words are strongly correlated, which shows that the degree of correlation between the dependent words of the pronoun and of the candidate antecedent can help determine the referential relationship.
- The semantic relevance of the pronoun and the candidate antecedent can therefore be determined from the correlation between the semantically dependent words of the two words.
- P is the pronoun to be resolved
- A is a candidate antecedent
- $(Px_1, Px_2, \ldots, Px_i)$ are the dependent words of the pronoun
- $(Ax_1, Ax_2, \ldots, Ax_j)$ are the dependent words of the candidate antecedent
- i and j are natural numbers
- i is the number of dependent words of the pronoun
- j is the number of dependent words of the candidate antecedent
- WordSence ( P, A) is:
- the feature value may be a value calculated by the above formula.
- In the embodiments, the set of candidate antecedents is first determined for each pronoun to be resolved in the training corpus; whether the pronoun needs to be resolved is then judged according to a consistency constraint rule; and feature extraction is performed. Based on the distance, semantic, and grammatical information of the pronoun and the candidate antecedents, a resolution method for Chinese pronouns in dialogue is proposed, which determines the final antecedent.
- Before the plurality of candidate antecedents and their word features are extracted from the statement information, it is determined whether the pronoun needs to be resolved.
- When it is judged that the pronoun needs to be resolved, the plurality of candidate antecedents and the word features of the plurality of candidate antecedents are extracted from the statement information; when it is judged that the pronoun does not need to be resolved, the candidate antecedents and their word features are not extracted from the statement information.
- Whether the pronoun needs to be resolved can be judged by whether the word adjacent to the pronoun is a noun. If the adjacent word is a noun, it is determined that the pronoun does not need to be resolved; if the adjacent word is not a noun, it is determined that the pronoun needs to be resolved, and the plurality of candidate antecedents and their word features may then be extracted from the statement information.
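The adjacency check can be sketched as follows. This is only a sketch under stated assumptions: tokens are `(word, tag)` pairs from some segmenter and tagger, the tag strings `"NOUN"`, `"PRON"`, and `"VERB"` are hypothetical labels, and the text does not say whether the preceding or the following word is meant, so the sketch checks both neighbours.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); tag names are illustrative


def needs_resolution(tokens: List[Token], pronoun_index: int) -> bool:
    """Return False when a word adjacent to the pronoun is a noun, True otherwise."""
    neighbours = []
    if pronoun_index > 0:
        neighbours.append(tokens[pronoun_index - 1])
    if pronoun_index + 1 < len(tokens):
        neighbours.append(tokens[pronoun_index + 1])
    # A neighbouring noun is taken to mean the pronoun is already specified.
    return not any(tag == "NOUN" for _, tag in neighbours)


specified = [("we", "PRON"), ("students", "NOUN"), ("left", "VERB")]
unspecified = [("he", "PRON"), ("left", "VERB")]
```

Here `needs_resolution(specified, 0)` is false because "students" directly follows the pronoun, while `needs_resolution(unspecified, 0)` is true and candidate extraction would proceed.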
- Extracting the plurality of candidate antecedents and their word features from the statement information includes: searching for a pronoun in the statement information and obtaining the word adjacent to the pronoun found; and, where the adjacent word is not a noun, extracting the plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the statement information.
- extracting a plurality of candidate antecedents from the statement information includes:
- the embodiment of the present invention is described in detail below with reference to FIG. 3. As shown in FIG. 3, the embodiment may include the following steps:
- Step S301: Detect that a pronoun appears in the statement information.
- A step of detecting whether a pronoun appears in the statement information (i.e., step S306 described below) may be performed, and when a pronoun is detected, step S302 is entered.
- Step S302: Determine whether the pronoun needs to be resolved.
- If it is determined that the pronoun needs to be resolved, step S303 is performed; if it is determined that the pronoun does not need to be resolved, step S306 is continued: detecting whether a pronoun appears in the statement information.
- Whether the pronoun needs to be resolved may be judged by whether the word adjacent to the pronoun is a noun. If the adjacent word is a noun, it is judged that the pronoun does not need to be resolved; if the adjacent word is not a noun, it is judged that the pronoun needs to be resolved.
- Step S303 Acquire a plurality of candidate antecedents.
- Whether a word is extracted may be determined by whether a mutual referential relationship can exist between the word and the pronoun: if it can, the word is extracted; otherwise the word is not extracted.
- Alternatively, a plurality of candidate antecedents, such as nouns or noun phrases, may first be extracted and then filtered according to whether each candidate antecedent and the pronoun can refer to each other, giving the filtered candidate antecedents.
- The word features of the filtered candidate antecedents are extracted from the statement information, and the target antecedent is selected from the filtered candidate antecedents based on the extracted word features.
- Step S304 Extract the word features of the candidate antecedent.
- Step S305 determining the target antecedent of the pronoun by using the word feature of the candidate antecedent.
- A noun or noun phrase close to the pronoun may be searched for, that is, a noun phrase whose distance from the pronoun in the statement information is within a preset distance is obtained.
- When such a noun phrase is found, if no referential relationship is possible between the noun phrase and the pronoun, the noun or noun phrase is not extracted, that is, it is not used as a candidate antecedent of the pronoun;
- otherwise, the noun or noun phrase is extracted and used as a candidate antecedent.
- Determining whether the noun phrase and the pronoun can refer to each other includes: determining whether the part connecting the noun phrase and the pronoun is a predicate; if the connecting part is not a predicate, determining that the noun phrase and the pronoun can refer to each other; if the connecting part is a predicate, determining that the noun phrase and the pronoun cannot refer to each other.
- the predicate can be a verb or an adjective.
- the candidate antecedent “juice extractor” and the pronoun “fruit” are also bound by the predicate “squeeze”, and the two belong to a relationship that cannot be referred to each other.
- whether the pronoun and the candidate antecedent can refer to each other can be determined by the output result of the parser.
- the candidate antecedent can be filtered by judging whether the noun phrase and the pronoun refer to each other, and the processing amount of the word and the word feature is reduced.
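The predicate-based filter can be sketched as follows. The data layout is an assumption made for illustration: parser output is represented as a mapping from each token index to the index of its head word, plus the set of token indices that are predicates (verbs or adjectives). Two words are treated as "connected by a predicate" when both depend directly on the same predicate; a real parser's output may need a different test.

```python
from typing import Dict, List, Set


def can_corefer(candidate: int, pronoun: int,
                head_of: Dict[int, int], predicates: Set[int]) -> bool:
    """Return False when candidate and pronoun are bound by the same predicate."""
    candidate_head = head_of.get(candidate)
    pronoun_head = head_of.get(pronoun)
    bound_by_predicate = (
        candidate_head is not None
        and candidate_head == pronoun_head
        and candidate_head in predicates
    )
    return not bound_by_predicate


def filter_candidates(candidates: List[int], pronoun: int,
                      head_of: Dict[int, int], predicates: Set[int]) -> List[int]:
    """Keep only the candidates that can still refer to the same thing as the pronoun."""
    return [c for c in candidates if can_corefer(c, pronoun, head_of, predicates)]


# Hypothetical parse: tokens 0 and 2 both depend on the predicate at index 1,
# while token 4 depends on a different predicate at index 5.
head_of = {0: 1, 2: 1, 4: 5}
predicates = {1, 5}
kept = filter_candidates([0, 4], pronoun=2, head_of=head_of, predicates=predicates)
```

Filtering before feature extraction means word features only need to be computed for the candidates in `kept`, which is where the reduction in processing comes from.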
- The weights of the candidate antecedents (i.e., the referential weight values of the candidates) can be obtained by linearly weighting the different feature weights and then sorted, and the candidate with the highest weight is the antecedent finally selected for the pronoun.
- the embodiment may include the following steps:
- Step S401: When the recognized pronoun needs to be resolved, filter the candidate antecedents using grammatical constraints.
- The grammatical constraint here may be a rule under which the pronoun and the candidate antecedent cannot refer to each other. If the pronoun and a candidate antecedent cannot refer to each other, the candidate antecedent is filtered out directly.
- Step S402 Extract the word features of the remaining candidate antecedents.
- the word features may include: a singular and plural feature, a distance between the candidate antecedent and the pronoun, a semantic relevance of the candidate antecedent and the pronoun, and whether the candidate antecedent is in the prepositional phrase.
- Step S403 Convert the feature into a feature value.
- Singular/plural consistency weight Sp: 1 if the number of the candidate antecedent is consistent with that of the pronoun, and 0 if it is not.
- Distance feature weight Dis: the number of rounds of conversation between the candidate antecedent and the pronoun.
- Grammatical constraint weight Sc: 1 if the candidate antecedent is in a prepositional phrase, and 0 if it is not.
- Semantic Dependency Correlation Feature Ws (ie, the semantic relevance of the candidate antecedent and the pronoun) may be implemented by using the corresponding steps in the foregoing embodiments, and details are not described herein.
- Step S404 Calculate the total weight of the candidate antecedent (ie, the referential weight value in the above embodiment).
- The coefficients of these feature weights (such as $\lambda_1$) are given initial values based on experience, and their sizes are then adjusted by training on a corpus.
- Step S405 Determine the candidate antecedent with the largest weight value as the target antecedent.
- The candidate antecedent with the maximum weight is selected as the resolution result.
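Steps S401 to S405 can be sketched end to end as follows. This is a minimal sketch under several assumptions: the feature values Sp, Dis, Sc, and Ws and the grammatical-constraint result are taken as already computed, the `Candidate` record is hypothetical, and the coefficient values (including the negative signs for Dis and Sc) are illustrative guesses, not values from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:
    text: str
    can_corefer: bool  # result of the grammatical-constraint check (S401)
    sp: float          # 1 if number is consistent with the pronoun, else 0
    dis: float         # rounds of conversation between candidate and pronoun
    sc: float          # 1 if inside a prepositional phrase, else 0
    ws: float          # semantic relevance value, computed elsewhere


# Illustrative coefficients only; in the text they start from experience-based
# initial values and are adjusted by training on a corpus.
COEFFICIENTS: Dict[str, float] = {"sp": 1.0, "dis": -0.5, "sc": -0.5, "ws": 1.0}


def total_weight(c: Candidate, coef: Dict[str, float] = COEFFICIENTS) -> float:
    """S404: linear weighting of the feature values."""
    return (coef["sp"] * c.sp + coef["dis"] * c.dis
            + coef["sc"] * c.sc + coef["ws"] * c.ws)


def resolve(candidates: List[Candidate]) -> Optional[Candidate]:
    """S401 filter, then S405 pick the remaining candidate with the largest weight."""
    remaining = [c for c in candidates if c.can_corefer]
    if not remaining:
        return None
    return max(remaining, key=total_weight)


candidates = [
    Candidate("police", True, sp=1, dis=0, sc=0, ws=0.25),
    Candidate("thief", True, sp=1, dis=0, sc=0, ws=0.75),
    Candidate("prison", False, sp=1, dis=0, sc=1, ws=0.5),
]
target = resolve(candidates)
```

In this toy input the semantic relevance value is what separates "thief" from "police", since the other feature values are identical.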
- The method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented by hardware, but in many cases the former is the better implementation.
- The technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and including a number of instructions for causing a terminal device (which may be a cell phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present invention.
- the apparatus includes:
- the obtaining unit 51 is configured to obtain statement information to be identified
- the extracting unit 53 is configured to extract, from the statement information, a plurality of candidate antecedents and word features of the plurality of candidate antecedents when a pronoun exists in the statement information;
- the determining unit 55 is configured to determine a target antecedent referred to by the pronoun from the plurality of candidate antecedents based on the word features of the plurality of candidate antecedents.
- With the above embodiment, the candidate antecedents and the word features of each candidate antecedent are extracted from the statement information, and the target antecedent referred to by the pronoun is determined from the word features of the candidate antecedents.
- The target antecedent referred to by the pronoun can thus be located automatically from the word features of the candidate antecedents extracted from the statement information, which solves the problem of low processing efficiency of anaphora resolution in the prior art and achieves the effect of determining the pronoun's antecedent accurately and efficiently.
- The statement information to be identified in the foregoing embodiment may be sent by the terminal 101 to the server. The statement information may be text information, which may be obtained by converting voice information in the session information, may be text information extracted directly from the session information, or may be information extracted from an article. The source of the information is not limited in this application.
- the statement information is a set of session information generated by a client and a server during a human-machine conversation.
- The candidate antecedents and the word features of each candidate antecedent may be extracted from the statement information in sequence, or the word features of a candidate antecedent may be extracted at the same time as the candidate antecedent itself is extracted from the statement information.
- the words referred to by the pronouns may be nouns or noun phrases, and the candidate antecedents extracted are nouns or noun phrases.
- A word segmenter may be used to segment the statement information, and from the plurality of words obtained by the segmentation, the words whose part of speech is pronoun (i.e., the pronouns) and noun or noun phrase (i.e., the candidate antecedents) are extracted.
- the target antecedent referred to by the pronoun may be determined from the plurality of candidate antecedents based on the word features of the plurality of candidate antecedents, wherein the term features may include semantic features and grammatical features.
- the pronoun in the statement information may be replaced with the target antecedent to complete the statement information.
- As shown in FIG. 6, the determining unit includes: a determining module 61, configured to determine a referential weight value of each candidate antecedent based on the word features of that candidate antecedent; and a selection module 63, configured to select the candidate antecedent with the largest referential weight value as the target antecedent of the pronoun.
- The word features in the foregoing embodiment may be semantic features or grammatical features. The semantic features and/or grammatical features are used to determine the referential weight value of each candidate antecedent relative to the pronoun, and the referential weight values obtained are sorted to give a weight value sequence. If the sequence is arranged from large to small, the candidate antecedent corresponding to the first weight value in the sequence is used as the target antecedent of the pronoun; if the sequence is arranged from small to large, the candidate antecedent corresponding to the last weight value in the sequence is used as the target antecedent of the pronoun.
- the largest referential weight value among the plurality of referential weight values may be obtained by pairwise comparison.
- the candidate antecedent corresponding to the largest referential weight value is selected as the target antecedent referred to by the pronoun.
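Both selection strategies described above (sorting the referential weight values, or comparing them pairwise) can be sketched as below. The function names and the dictionary representation of the weights are assumptions made for this example.

```python
from typing import Dict


def select_by_sorting(weights: Dict[str, float]) -> str:
    """Sort the weights from large to small and take the first candidate."""
    if not weights:
        raise ValueError("no candidate antecedents")
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return ranked[0][0]


def select_by_pairwise_comparison(weights: Dict[str, float]) -> str:
    """Keep the larger of each compared pair while scanning the candidates once."""
    if not weights:
        raise ValueError("no candidate antecedents")
    best = None
    for candidate, weight in weights.items():
        if best is None or weight > weights[best]:
            best = candidate
    return best
```

The two functions return the same candidate; the pairwise scan simply avoids building the sorted sequence.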
- each of the plurality of candidate antecedents includes one or more word features; in the case where each candidate antecedent includes a single word feature, that word feature is converted into a feature value, and the feature value is used as the referential weight value of the candidate antecedent.
- each candidate antecedent of the plurality of candidate antecedents includes one or more word features
- the determining module 61 shown in FIG. 6 includes:
- a conversion sub-module 611 configured to convert the extracted word features into feature values
- the calculating sub-module 613 is configured to perform linear weighting calculation on the feature values of each candidate antecedent by using feature coefficients of one or more word features set in advance, to obtain a referential weight value of each candidate antecedent.
- each word feature of each candidate antecedent is separately converted into a feature value, and a linear weighting calculation is performed on the plurality of feature values, using the preset feature coefficients of the one or more word features, to obtain the referential weight value of each candidate antecedent.
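The linear weighting calculation described above amounts to a weighted sum of the feature values. A minimal sketch follows; the function name and the coefficient values used in the usage example are illustrative assumptions, since the text only says the coefficients are set in advance.

```python
from typing import Sequence


def referential_weight(feature_values: Sequence[float],
                       feature_coefficients: Sequence[float]) -> float:
    """Linearly weight the feature values of one candidate antecedent."""
    if len(feature_values) != len(feature_coefficients):
        raise ValueError("one coefficient is required per feature value")
    return sum(c * v for c, v in zip(feature_coefficients, feature_values))
```

For example, `referential_weight([1, 2, 0], [0.5, -0.25, 1.0])` combines three feature values with three hypothetical coefficients; a negative coefficient on a distance feature would make more distant candidates score lower.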
- each of the plurality of candidate antecedents includes one or more word features
- the word features include at least one of: the singular/plural feature of the candidate antecedent, the distance between the candidate antecedent and the pronoun, whether the candidate antecedent appears in a prepositional phrase, and the semantic relevance of the pronoun and the candidate antecedent.
- the singular/plural feature is converted into a feature value.
- if the singular/plural form of the candidate antecedent is consistent with that of the pronoun, the feature value is set to the first constant; if the singular/plural form of the candidate antecedent is not consistent with that of the pronoun, the feature value is set to the second constant.
- the first constant may be 1 and the second constant may be 0.
- the word feature includes the distance between the candidate antecedent and the pronoun
- the distance between the sentence in which the candidate antecedent is located and the sentence in which the pronoun is located, or the number of characters or the number of sentences separating the two words, may be used as the feature value.
- if the candidate antecedent appears in a prepositional phrase, the feature value may be set to one constant, such as 1; if the candidate antecedent does not appear in a prepositional phrase, the feature value is set to another constant, such as 0.
- the feature value may be a value calculated by the above formula.
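The conversion of the singular/plural, distance, and prepositional-phrase features into feature values can be sketched as below. The function and parameter names are hypothetical; the defaults 1 and 0 follow the example constants given in the text, and the distance is measured here in sentences, which is only one of the options the text mentions. The semantic relevance feature is omitted because its formula is not reproduced in this section.

```python
def number_agreement_value(candidate_is_plural: bool, pronoun_is_plural: bool,
                           first_constant: float = 1.0,
                           second_constant: float = 0.0) -> float:
    """First constant when singular/plural forms agree, second constant otherwise."""
    return first_constant if candidate_is_plural == pronoun_is_plural else second_constant


def distance_value(candidate_sentence_index: int, pronoun_sentence_index: int) -> float:
    """Number of sentences separating the candidate antecedent from the pronoun."""
    return float(abs(pronoun_sentence_index - candidate_sentence_index))


def prepositional_phrase_value(in_prepositional_phrase: bool,
                               present: float = 1.0, absent: float = 0.0) -> float:
    """One constant when the candidate is inside a prepositional phrase, another otherwise."""
    return present if in_prepositional_phrase else absent
```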
- the extracting unit 53 may include: a searching module 71 for finding a neighboring word of the pronoun in the sentence information; and an extracting module 73 for extracting, in the case where the part of speech of the neighboring word is not a noun, the plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the sentence information.
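The neighboring-word check can be sketched as follows. The text does not say which side of the pronoun the neighboring word is on, so the `offset` parameter (defaulting to the following word), the `"NOUN"` tag name, and the behaviour when no neighboring word exists are all assumptions of this example.

```python
from typing import List, Tuple

# (word, part-of-speech tag); the tag names are assumptions for illustration
Token = Tuple[str, str]


def should_extract_candidates(tokens: List[Token], pronoun_idx: int,
                              offset: int = 1, noun_tag: str = "NOUN") -> bool:
    """Return True when the word neighboring the pronoun is not a noun."""
    neighbor_idx = pronoun_idx + offset
    if not 0 <= neighbor_idx < len(tokens):
        # Assumption: with no neighboring word, nothing blocks the extraction.
        return True
    return tokens[neighbor_idx][1] != noun_tag
```

Under these assumptions, a pronoun directly followed by a noun (as in "this book") is skipped, while a standalone pronoun triggers candidate extraction.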
- the extracting unit may include: an obtaining module 81, configured to acquire a noun phrase whose distance from the pronoun in the sentence information is within a preset distance; and a judging module 83, configured to judge whether the noun phrase and the pronoun refer to each other, and, if the noun phrase and the pronoun refer to each other, to use the noun phrase as the candidate antecedent.
- the judging module includes: a judging sub-module, configured to judge whether the part of speech of the connecting word between the noun phrase and the pronoun is a predicate; if the part of speech of the connecting word is not a predicate, it is judged that the noun phrase and the pronoun can refer to each other; if the part of speech of the connecting word is a predicate, it is judged that the noun phrase and the pronoun cannot refer to each other.
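The two filters described above (a preset distance and the part of speech of the connecting word) can be sketched as below. Measuring distance in token positions, the `"PREDICATE"` tag name, and passing the connecting word's part of speech as a lookup table are all assumptions made for illustration; the text does not specify how the connecting word is identified.

```python
from typing import Dict, List, Tuple

# (noun phrase text, token position); the representation is an assumption
NounPhrase = Tuple[str, int]


def candidate_antecedents(noun_phrases: List[NounPhrase], pronoun_pos: int,
                          preset_distance: int,
                          connecting_pos: Dict[str, str]) -> List[str]:
    """Keep noun phrases near the pronoun whose connecting word is not a predicate."""
    candidates = []
    for text, position in noun_phrases:
        if abs(pronoun_pos - position) > preset_distance:
            continue  # outside the preset distance
        if connecting_pos.get(text) == "PREDICATE":
            continue  # connected through a predicate: cannot refer to each other
        candidates.append(text)
    return candidates
```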
- the candidate antecedent set for each pronoun to be resolved in the training corpus is determined first; whether the pronoun needs to be resolved is then judged according to the consistency constraint rules, and feature extraction is performed.
- based on the distance, semantic, and grammatical information of the pronoun and the candidate antecedents, a Chinese pronoun resolution method for human-machine dialogue is proposed, which determines the final antecedent.
- the modules provided in this embodiment are the same as the methods used in the corresponding steps of the method embodiment, and the application scenarios may be the same.
- the solution involved in the above module may not be limited to the content and scenario in the foregoing embodiment, and the foregoing module may be run on a computer terminal or a mobile terminal, and may be implemented by software or hardware.
- a server for implementing the foregoing method and apparatus for determining an antecedent is further provided.
- the server includes: one or more (only one shown in the figure) processor 901, memory 903, and transmission device 905 (such as the transmitting device in the above embodiment), as shown in FIG.
- the terminal may also include an input and output device 907.
- the memory 903 can be used to store software programs and modules, such as the program instructions/modules corresponding to the method and device for determining an antecedent in the embodiments of the present invention; the processor 901 runs the software programs and modules stored in the memory 903, thereby performing various functional applications and data processing, that is, implementing the above-described method for determining an antecedent.
- Memory 903 can include high speed random access memory, and can also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
- memory 903 can further include memory remotely located relative to processor 901, which can be connected to the terminal over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
- the transmission device 905 described above is for receiving or transmitting data via a network, and can also be used for data transmission between the processor and the memory. Specific examples of the above network may include a wired network and a wireless network.
- the transmission device 905 includes a Network Interface Controller (NIC) that can be connected to other network devices and routers via a network cable to communicate with the Internet or a local area network.
- the transmission device 905 is a Radio Frequency (RF) module for communicating with the Internet wirelessly.
- the memory 903 is used to store an application.
- the processor is configured to: obtain the sentence information to be identified; when a pronoun is identified in the sentence information, extract a plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the sentence information; and determine, based on the word features of the plurality of candidate antecedents, the target antecedent referred to by the pronoun from the plurality of candidate antecedents.
- the processor is further configured to: determine, based on the word features of each candidate antecedent, a referential weight value of each candidate antecedent; and select the candidate antecedent with the largest referential weight value as the target antecedent referred to by the pronoun.
- the processor is further configured to perform the following steps, where each candidate antecedent of the plurality of candidate antecedents includes one or more word features: determining the referential weight value of each candidate antecedent based on the word features of each candidate antecedent includes converting the extracted word features into feature values, and performing a linear weighting calculation on the feature values of each candidate antecedent, using the preset feature coefficients of the one or more word features, to obtain the referential weight value of each candidate antecedent.
- each candidate antecedent of the plurality of candidate antecedents includes one or more word features, and the word features include at least one of: the singular/plural feature of the candidate antecedent, the distance between the candidate antecedent and the pronoun, whether the candidate antecedent appears in a prepositional phrase, and the semantic relevance of the pronoun and the candidate antecedent.
- the processor is further configured to perform the following steps: extracting the plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the sentence information includes finding a neighboring word of the pronoun in the sentence information, and, if the part of speech of the neighboring word is not a noun, extracting the plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the sentence information.
- the processor is further configured to perform the following steps: extracting the plurality of candidate antecedents from the sentence information includes obtaining a noun phrase whose distance from the pronoun in the sentence information is within a preset distance; judging whether the noun phrase and the pronoun refer to each other; and, if the noun phrase and the pronoun refer to each other, using the noun phrase as the candidate antecedent.
- the processor is further configured to perform the following steps: judging whether the noun phrase and the pronoun refer to each other includes judging whether the part of speech of the connecting word between the noun phrase and the pronoun is a predicate; if the part of speech of the connecting word is not a predicate, it is judged that the noun phrase and the pronoun can refer to each other; if the part of speech of the connecting word is a predicate, it is judged that the noun phrase and the pronoun cannot refer to each other.
- the structure shown in FIG. 9 is only illustrative, and the terminal may be a terminal device such as a smartphone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), or a PAD.
- FIG. 9 does not limit the structure of the above electronic device.
- the terminal may also include more or fewer components (such as a network interface, processing device, etc.) than shown in FIG. 9, or have a different configuration than that shown in FIG.
- Embodiments of the present invention also provide a storage medium.
- the foregoing storage medium may be used to store program code for executing the above method.
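- the storage medium is arranged to store program code for performing the following steps: obtaining the sentence information to be identified; when a pronoun is identified in the sentence information, extracting a plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the sentence information; and determining, based on the word features of the plurality of candidate antecedents, the target antecedent referred to by the pronoun from the plurality of candidate antecedents.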
- the storage medium is arranged to store program code for performing the following steps: determining the referential weight value of each candidate antecedent based on the word features of each candidate antecedent; and selecting the candidate antecedent with the largest referential weight value as the target antecedent referred to by the pronoun.
- the storage medium is arranged to store program code for performing the following steps, where each candidate antecedent of the plurality of candidate antecedents includes one or more word features: determining the referential weight value of each candidate antecedent based on the word features of each candidate antecedent includes converting the extracted word features into feature values, and performing a linear weighting calculation on the feature values of each candidate antecedent, using the preset feature coefficients of the one or more word features, to obtain the referential weight value of each candidate antecedent.
- the storage medium is arranged to store program code for performing the following steps, where each candidate antecedent of the plurality of candidate antecedents includes one or more word features, and the word features include at least one of: the singular/plural feature of the candidate antecedent, the distance between the candidate antecedent and the pronoun, whether the candidate antecedent appears in a prepositional phrase, and the semantic relevance of the pronoun and the candidate antecedent.
- the storage medium is configured to store program code for performing the following steps: extracting the plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the sentence information includes finding a neighboring word of the pronoun in the sentence information, and, in the case where the part of speech of the neighboring word is not a noun, extracting the plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the sentence information.
- the storage medium is configured to store program code for performing the following steps: extracting the plurality of candidate antecedents from the sentence information includes obtaining a noun phrase whose distance from the pronoun in the sentence information is within a preset distance; judging whether the noun phrase and the pronoun refer to each other; and, if the noun phrase and the pronoun refer to each other, using the noun phrase as the candidate antecedent.
- the storage medium is configured to store program code for performing the following steps: judging whether the noun phrase and the pronoun refer to each other includes judging whether the part of speech of the connecting word between the noun phrase and the pronoun is a predicate; if the part of speech of the connecting word is not a predicate, it is judged that the noun phrase and the pronoun can refer to each other; if the part of speech of the connecting word is a predicate, it is judged that the noun phrase and the pronoun cannot refer to each other.
- the foregoing storage medium may include, but is not limited to, any medium that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
- the integrated unit in the above embodiment if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in the above-described computer readable storage medium.
- the technical solution of the present invention, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
- the software product includes a number of instructions that cause one or more computer devices (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
- the disclosed client may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- the division of the unit is only a logical function division.
- multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit or module, and may be electrical or otherwise.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
Claims (14)
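- A method for determining an antecedent, comprising: acquiring sentence information to be identified; in a case where a pronoun is identified in the sentence information, extracting a plurality of candidate antecedents and word features of the plurality of candidate antecedents from the sentence information; and determining, based on the word features of the plurality of candidate antecedents, a target antecedent referred to by the pronoun from the plurality of candidate antecedents.
- The method according to claim 1, wherein determining, based on the word features of the plurality of candidate antecedents, the target antecedent referred to by the pronoun from the plurality of candidate antecedents comprises: determining a referential weight value of each candidate antecedent based on the word features of each candidate antecedent; and selecting the candidate antecedent with the largest referential weight value as the target antecedent referred to by the pronoun.
- The method according to claim 2, wherein each candidate antecedent of the plurality of candidate antecedents comprises one or more of the word features, and determining the referential weight value of each candidate antecedent based on the word features of each candidate antecedent comprises: converting the extracted word features into feature values; and performing a linear weighting calculation on the feature values of each candidate antecedent, using preset feature coefficients of the one or more word features, to obtain the referential weight value of each candidate antecedent.
- The method according to claim 2, wherein each candidate antecedent of the plurality of candidate antecedents comprises one or more of the word features, and the word features comprise at least one of: a singular/plural feature of the candidate antecedent, a distance between the candidate antecedent and the pronoun, whether the candidate antecedent appears in a prepositional phrase, and a semantic relevance of the pronoun and the candidate antecedent.
- The method according to claim 1, wherein extracting the plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the sentence information comprises: finding a neighboring word of the pronoun in the sentence information; and, in a case where the part of speech of the neighboring word is not a noun, extracting the plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the sentence information.
- The method according to claim 1 or 5, wherein extracting the plurality of candidate antecedents from the sentence information comprises: acquiring a noun phrase whose distance from the pronoun in the sentence information is within a preset distance; judging whether the noun phrase and the pronoun refer to each other; and, if the noun phrase and the pronoun refer to each other, using the noun phrase as the candidate antecedent.
- The method according to claim 6, wherein judging whether the noun phrase and the pronoun refer to each other comprises: judging whether the part of speech of a connecting word between the noun phrase and the pronoun is a predicate; if the part of speech of the connecting word between the noun phrase and the pronoun is not a predicate, judging that the noun phrase and the pronoun can refer to each other; and, if the part of speech of the connecting word between the noun phrase and the pronoun is a predicate, judging that the noun phrase and the pronoun cannot refer to each other.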
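- An apparatus for determining an antecedent, comprising: an acquiring unit, configured to acquire sentence information to be identified; an extracting unit, configured to extract, in a case where a pronoun is identified in the sentence information, a plurality of candidate antecedents and word features of the plurality of candidate antecedents from the sentence information; and a determining unit, configured to determine, based on the word features of the plurality of candidate antecedents, a target antecedent referred to by the pronoun from the plurality of candidate antecedents.
- The apparatus according to claim 8, wherein the determining unit comprises: a determining module, configured to determine a referential weight value of each candidate antecedent based on the word features of each candidate antecedent; and a selecting module, configured to select the candidate antecedent with the largest referential weight value as the target antecedent referred to by the pronoun.
- The apparatus according to claim 9, wherein each candidate antecedent of the plurality of candidate antecedents comprises one or more of the word features, and the determining module comprises: a conversion sub-module, configured to convert the extracted word features into feature values; and a calculating sub-module, configured to perform a linear weighting calculation on the feature values of each candidate antecedent, using preset feature coefficients of the one or more word features, to obtain the referential weight value of each candidate antecedent.
- The apparatus according to claim 9, wherein each candidate antecedent of the plurality of candidate antecedents comprises one or more of the word features, and the word features comprise at least one of: a singular/plural feature of the candidate antecedent, a distance between the candidate antecedent and the pronoun, whether the candidate antecedent appears in a prepositional phrase, and a semantic relevance of the pronoun and the candidate antecedent.
- The apparatus according to claim 8, wherein the extracting unit comprises: a searching module, configured to find a neighboring word of the pronoun in the sentence information; and an extracting module, configured to extract, in a case where the part of speech of the neighboring word is not a noun, the plurality of candidate antecedents and the word features of the plurality of candidate antecedents from the sentence information.
- The apparatus according to claim 8 or 12, wherein the extracting unit comprises: an obtaining module, configured to acquire a noun phrase whose distance from the pronoun in the sentence information is within a preset distance; and a judging module, configured to judge whether the noun phrase and the pronoun refer to each other, and, if the noun phrase and the pronoun refer to each other, to use the noun phrase as the candidate antecedent.
- The apparatus according to claim 13, wherein the judging module comprises: a judging sub-module, configured to judge whether the part of speech of a connecting word between the noun phrase and the pronoun is a predicate; if the part of speech of the connecting word between the noun phrase and the pronoun is not a predicate, it is judged that the noun phrase and the pronoun can refer to each other; and, if the part of speech of the connecting word between the noun phrase and the pronoun is a predicate, it is judged that the noun phrase and the pronoun cannot refer to each other.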
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP17798514.0A EP3460678A4 (en) | 2016-05-20 | 2017-02-24 | METHOD AND APPARATUS FOR DETERMINING ANCEDENTS |
JP2018529148A JP6752282B2 (ja) | 2016-05-20 | 2017-02-24 | 先行詞の決定方法及び装置 |
KR1020187015847A KR102163549B1 (ko) | 2016-05-20 | 2017-02-24 | 선행사의 결정방법 및 장치 |
US16/009,474 US10810372B2 (en) | 2016-05-20 | 2018-06-15 | Antecedent determining method and apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610341637.6A CN107402913B (zh) | 2016-05-20 | 2016-05-20 | 先行词的确定方法和装置 |
CN201610341637.6 | 2016-05-20 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/009,474 Continuation US10810372B2 (en) | 2016-05-20 | 2018-06-15 | Antecedent determining method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017197947A1 (zh) | 2017-11-23 |
Family
ID=60325646
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/074800 WO2017197947A1 (zh) | 2016-05-20 | 2017-02-24 | 先行词的确定方法和装置 |
Country Status (6)
Country | Link |
---|---|
US (1) | US10810372B2 (zh) |
EP (1) | EP3460678A4 (zh) |
JP (1) | JP6752282B2 (zh) |
KR (1) | KR102163549B1 (zh) |
CN (1) | CN107402913B (zh) |
WO (1) | WO2017197947A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109919161A (zh) * | 2019-04-01 | 2019-06-21 | 成都大学 | 基于图像识别的通信方法及装置 |
CN112733534A (zh) * | 2020-12-25 | 2021-04-30 | 北京左医科技有限公司 | 医患对话中半截词指向症状获取方法及系统 |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108920500B (zh) * | 2018-05-24 | 2022-02-11 | 众安信息技术服务有限公司 | 一种时间解析方法 |
CN108681538B (zh) * | 2018-05-28 | 2022-02-22 | 哈尔滨工业大学 | 一种基于深度学习的动词短语省略消解方法 |
CN109446517B (zh) * | 2018-10-08 | 2022-07-05 | 平安科技(深圳)有限公司 | 指代消解方法、电子装置及计算机可读存储介质 |
CN109325234B (zh) * | 2018-10-10 | 2023-06-20 | 深圳前海微众银行股份有限公司 | 语句处理方法、设备及计算机可读存储介质 |
CN109471919B (zh) * | 2018-11-15 | 2021-08-10 | 北京搜狗科技发展有限公司 | 零代词消解方法及装置 |
CN110162600B (zh) * | 2019-05-20 | 2024-01-30 | 腾讯科技(深圳)有限公司 | 一种信息处理的方法、会话响应的方法及装置 |
CN111984766B (zh) * | 2019-05-21 | 2023-02-24 | 华为技术有限公司 | 缺失语义补全方法及装置 |
CN110705206B (zh) * | 2019-09-23 | 2021-08-20 | 腾讯科技(深圳)有限公司 | 一种文本信息的处理方法及相关装置 |
CN110674630B (zh) * | 2019-09-24 | 2023-03-21 | 北京明略软件系统有限公司 | 指代消解方法和装置、电子设备及存储介质 |
CN111325034A (zh) * | 2020-02-12 | 2020-06-23 | 平安科技(深圳)有限公司 | 多轮对话中语义补齐的方法、装置、设备及存储介质 |
CN113297843B (zh) * | 2020-02-24 | 2023-01-13 | 华为技术有限公司 | 指代消解的方法、装置及电子设备 |
CN111522909B (zh) * | 2020-04-10 | 2024-04-02 | 海信视像科技股份有限公司 | 一种语音交互方法及服务器 |
CN111651578B (zh) * | 2020-06-02 | 2023-10-03 | 北京百度网讯科技有限公司 | 人机对话方法、装置及设备 |
CN112148847B (zh) * | 2020-08-27 | 2024-03-12 | 出门问问创新科技有限公司 | 一种语音信息的处理方法及装置 |
CN112989008A (zh) * | 2021-04-21 | 2021-06-18 | 上海汽车集团股份有限公司 | 一种多轮对话改写方法、装置和电子设备 |
US11848017B2 (en) * | 2021-06-10 | 2023-12-19 | Sap Se | Pronoun-based natural language processing |
US20240073161A1 (en) * | 2022-08-26 | 2024-02-29 | SoundHound AI IP, LLC. | Message processing method, information processing apparatus, and program |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101446943A (zh) * | 2008-12-10 | 2009-06-03 | 苏州大学 | 一种中文处理中基于语义角色信息的指代消解方法 |
CN102110087A (zh) * | 2009-12-24 | 2011-06-29 | 北京大学 | 字符数据中实体消解的方法和装置 |
CN104462053A (zh) * | 2013-09-22 | 2015-03-25 | 江苏金鸽网络科技有限公司 | 一种文本内的基于语义特征的人称代词指代消解方法 |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7383169B1 (en) * | 1994-04-13 | 2008-06-03 | Microsoft Corporation | Method and system for compiling a lexical knowledge base |
US5799268A (en) * | 1994-09-28 | 1998-08-25 | Apple Computer, Inc. | Method for extracting knowledge from online documentation and creating a glossary, index, help database or the like |
US7813916B2 (en) * | 2003-11-18 | 2010-10-12 | University Of Utah | Acquisition and application of contextual role knowledge for coreference resolution |
US7376551B2 (en) * | 2005-08-01 | 2008-05-20 | Microsoft Corporation | Definition extraction |
US8594996B2 (en) * | 2007-10-17 | 2013-11-26 | Evri Inc. | NLP-based entity recognition and disambiguation |
CN103150405B (zh) * | 2013-03-29 | 2014-12-10 | 苏州大学 | 一种分类模型建模方法、中文跨文本指代消解方法和系统 |
US9497153B2 (en) * | 2014-01-30 | 2016-11-15 | Google Inc. | Associating a segment of an electronic message with one or more segment addressees |
US9652453B2 (en) * | 2014-04-14 | 2017-05-16 | Xerox Corporation | Estimation of parameters for machine translation without in-domain parallel data |
CN104281645B (zh) * | 2014-08-27 | 2017-06-16 | 北京理工大学 | 一种基于词汇语义和句法依存的情感关键句识别方法 |
CN105988990B (zh) * | 2015-02-26 | 2021-06-01 | 索尼公司 | 汉语零指代消解装置和方法、模型训练方法和存储介质 |
2016
- 2016-05-20 CN CN201610341637.6A patent/CN107402913B/zh active Active
2017
- 2017-02-24 KR KR1020187015847A patent/KR102163549B1/ko active IP Right Grant
- 2017-02-24 WO PCT/CN2017/074800 patent/WO2017197947A1/zh active Application Filing
- 2017-02-24 JP JP2018529148A patent/JP6752282B2/ja active Active
- 2017-02-24 EP EP17798514.0A patent/EP3460678A4/en not_active Ceased
2018
- 2018-06-15 US US16/009,474 patent/US10810372B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
US20180307671A1 (en) | 2018-10-25 |
CN107402913B (zh) | 2020-10-09 |
EP3460678A1 (en) | 2019-03-27 |
KR102163549B1 (ko) | 2020-10-08 |
JP6752282B2 (ja) | 2020-09-09 |
CN107402913A (zh) | 2017-11-28 |
US10810372B2 (en) | 2020-10-20 |
JP2019504395A (ja) | 2019-02-14 |
KR20180078318A (ko) | 2018-07-09 |
EP3460678A4 (en) | 2019-06-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017197947A1 (zh) | 先行词的确定方法和装置 | |
WO2018157789A1 (zh) | 一种语音识别的方法、计算机、存储介质以及电子装置 | |
WO2017084334A1 (zh) | 一种语种识别方法、装置、设备及计算机存储介质 | |
CN106874441A (zh) | 智能问答方法和装置 | |
CN110347790B (zh) | 基于注意力机制的文本查重方法、装置、设备及存储介质 | |
CN114580382A (zh) | 文本纠错方法以及装置 | |
CN109271524B (zh) | 知识库问答系统中的实体链接方法 | |
US10740570B2 (en) | Contextual analogy representation | |
US11699034B2 (en) | Hybrid artificial intelligence system for semi-automatic patent infringement analysis | |
CN110717021A (zh) | 人工智能面试中获取输入文本和相关装置 | |
US8806455B1 (en) | Systems and methods for text nuclearization | |
CN110659392B (zh) | 检索方法及装置、存储介质 | |
US10055400B2 (en) | Multilingual analogy detection and resolution | |
CN110245361B (zh) | 短语对提取方法、装置、电子设备及可读存储介质 | |
US10061770B2 (en) | Multilingual idiomatic phrase translation | |
US9892112B1 (en) | Machine learning to determine analogy outcomes | |
CN110427626B (zh) | 关键词的提取方法及装置 | |
JP4401269B2 (ja) | 対訳判断装置及びプログラム | |
CN112183117B (zh) | 一种翻译评价的方法、装置、存储介质及电子设备 | |
US10325025B2 (en) | Contextual analogy representation | |
US10503768B2 (en) | Analogic pattern determination | |
CN111401070A (zh) | 词义相似度确定方法及装置、电子设备及存储介质 | |
CN115577090B (zh) | 基于成语理解的语音对话方法、装置、设备及存储介质 | |
US20200142991A1 (en) | Identification of multiple foci for topic summaries in a question answering system | |
CN116306639A (zh) | 疾病名称标准化方法、装置、存储介质及电子设备 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | ENP | Entry into the national phase | Ref document number: 20187015847; Country of ref document: KR; Kind code of ref document: A |
 | WWE | Wipo information: entry into national phase | Ref document number: 2018529148; Country of ref document: JP |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17798514; Country of ref document: EP; Kind code of ref document: A1 |
 | ENP | Entry into the national phase | Ref document number: 2017798514; Country of ref document: EP; Effective date: 20181220 |