CN114548115B - Method and device for explaining compound nouns and electronic equipment - Google Patents


Info

Publication number
CN114548115B
Authority
CN
China
Prior art keywords
noun
semantic
vector
target
angle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210170360.0A
Other languages
Chinese (zh)
Other versions
CN114548115A (en)
Inventor
刘俊涛
王宗宇
谢睿
许慧敏
张福宝
武威
刘井平
肖仰华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN202210170360.0A
Publication of CN114548115A
Application granted
Publication of CN114548115B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/205 - Parsing

Abstract

Embodiments of the present application provide a method and device for interpreting compound nouns, and an electronic device. The method includes the following steps: generating, from different angles, a plurality of semantic feature vectors characterizing the semantics of a target compound noun; splicing the mapping results of the plurality of semantic feature vectors in a target vector space to obtain a multi-angle semantic feature vector; determining, based on the multi-angle semantic feature vector, the semantic relationship between a first noun and a second noun in the target compound noun; and, in the event that the semantic relationship correctly characterizes the association between the semantics of the first noun and the second noun, determining the semantic relationship to be the paraphrase of the target compound noun. Because multiple angles are considered, the accuracy of the multi-angle semantic feature vector is improved, which in turn improves the accuracy of the semantic relationship.

Description

Method and device for explaining compound nouns and electronic equipment
Technical Field
The present application relates to the field of natural language processing technologies, and in particular, to a method and an apparatus for interpreting compound nouns, and an electronic device.
Background
A compound noun is a compound word formed by directly concatenating two nouns; its usage is simple and it can be regarded as a fixed form. The rightmost noun may be referred to as the core word and the other noun as the modifier; for example, in "latex mattress," "latex" is the modifier and "mattress" is the core word. Compound nouns are an important linguistic structure widely present in many languages. Interpreting compound nouns is an important task for understanding their structure and semantics, and has wide application in various natural language processing tasks.
At present, a relationship classifier is usually adopted to interpret compound nouns: after a compound noun is converted into a semantic feature vector representing its semantics, the vector is input into the relationship classifier to determine the semantic relationship between the two nouns. For example, if the compound noun is "beef hamburger," the semantic relationship between "beef" and "hamburger" can be determined to be the food-material relationship, and "beef hamburger" can therefore be interpreted using the food-material relationship.
However, in the above process, the accuracy of the semantic relationship is directly determined by the accuracy of the semantic feature vector. Limited by the degree of optimization of the algorithm that converts words into semantic feature vectors, the accuracy achieved when interpreting compound nouns through the semantic relationship therefore remains low.
Disclosure of Invention
The embodiment of the application provides a method and a device for explaining a compound noun and electronic equipment, so as to at least solve the problem of low accuracy in explaining the compound noun in the prior art.
According to an aspect of an embodiment of the present application, there is provided a method for interpreting a compound noun, the method including:
generating a plurality of semantic feature vectors characterizing semantics of a target compound noun based on different angles, wherein the different angles include at least two of: a first angle of predicting a tagged word based on the context of the tagged word, a second angle of predicting the context of an input word based on the input word, and a third angle of predicting a target word based on a relationship graph of the target word;
splicing mapping results of the plurality of semantic feature vectors in a target vector space to obtain a multi-angle semantic feature vector;
determining the semantic relation between a first noun and a second noun in the target compound noun based on the multi-angle semantic feature vector;
determining the semantic relationship as a paraphrase of the target compound noun if the semantic relationship correctly characterizes the association between the semantics of the first noun and the second noun.
Optionally, the multi-angle semantic feature vector comprises a first vector corresponding to the first noun and a second vector corresponding to the second noun;
the determining the semantic relation between a first noun and a second noun in the target compound noun based on the multi-angle semantic feature vector comprises:
determining respective weights of the first noun and the second noun based on a gate mechanism;
respectively multiplying a first vector and a second vector in the multi-angle semantic feature vector by the weight of the corresponding noun to obtain a processed multi-angle semantic feature vector;
and determining the semantic relation between a first noun and a second noun in the target compound noun based on the processed multi-angle semantic feature vector.
Optionally, the determining the respective weights of the first noun and the second noun based on a gate mechanism includes:
and multiplying the gate vector by the first vector and the second vector respectively to obtain the weight of the first noun and the weight of the second noun, wherein the gate vector is obtained by training based on the influence of two nouns in the compound nouns on the semantic relationship.
Optionally, in a case where the semantic relationship fails to correctly characterize the association between the semantics of the first noun and the second noun, the method further comprises:
obtaining a plurality of candidate sentences based on a sentence template, the first noun and the second noun, wherein each candidate sentence comprises the first noun and the second noun;
determining target candidate sentences based on semantic similarity between the target compound nouns and each candidate sentence;
and determining the target candidate sentence as the paraphrase of the target compound noun.
Optionally, the determining a target candidate sentence based on the semantic similarity between the target compound noun and each candidate sentence includes:
extracting, for each candidate sentence, a first feature vector representing the semantics of the candidate sentence based on target parameters, wherein the target parameters are model parameters in a language model constructed with contrastive learning;
extracting a second feature vector representing the semantics of the target compound noun based on the target parameter;
sequentially calculating the similarity between the second feature vector and each first feature vector;
and determining the candidate sentence corresponding to the first feature vector indicated by the maximum similarity as a target candidate sentence.
Optionally, the obtaining a plurality of candidate sentences based on the sentence template, the first noun and the second noun includes:
replacing a first target word in a sentence template comprising at least one empty slot with the first noun, and replacing a second target word with the second noun to obtain an intermediate sentence;
and filling empty slots in the intermediate sentences based on a pre-trained language model to obtain a plurality of candidate sentences.
Optionally, the filling empty slots in the intermediate sentence based on the pre-trained language model to obtain a plurality of candidate sentences includes:
obtaining a plurality of filling words for filling the empty slots based on a pre-trained language model;
and sequentially filling the filling words into empty slots in the intermediate sentences based on a greedy algorithm to obtain a plurality of candidate sentences.
According to another aspect of the embodiments of the present application, there is provided an apparatus for interpreting a compound noun, the apparatus including:
a semantic feature module, configured to generate a plurality of semantic feature vectors characterizing semantics of a target compound noun based on different angles, where the different angles include at least two of: a first angle of predicting a tagged word based on the context of the tagged word, a second angle of predicting the context of an input word based on the input word, and a third angle of predicting a target word based on a relationship graph of the target word;
the vector processing module is used for splicing each mapping result of the plurality of semantic feature vectors in a target vector space to obtain a multi-angle semantic feature vector;
the semantic relation module is used for determining the semantic relation between a first noun and a second noun in the target compound noun based on the multi-angle semantic feature vector;
a paraphrasing module, configured to determine the semantic relationship as a paraphrase of the target compound noun if the semantic relationship correctly characterizes the association between the semantics of the first noun and the second noun.
According to still another aspect of an embodiment of the present application, there is provided an electronic apparatus including: a processor, a memory and a computer program stored on the memory and executable on the processor, the processor implementing the method of interpreting compound nouns as described above when executing the program.
According to yet another aspect of embodiments of the present application, there is provided a readable storage medium, wherein instructions, when executed by a processor of an electronic device, enable the electronic device to perform the method for interpreting compound nouns as described above.
In the embodiments of the present application, a plurality of semantic feature vectors characterizing the semantics of the target compound noun are generated from different angles, the different angles including at least two of: a first angle of predicting a tagged word based on the context of the tagged word, a second angle of predicting the context of an input word based on the input word, and a third angle of predicting a target word based on a relationship graph of the target word. The plurality of semantic feature vectors can therefore represent the semantics of the target compound noun as accurately as possible. The multi-angle semantic feature vector, which can be used directly for judging the semantic relationship, is then obtained by mapping the vectors into the same vector space and splicing the results, and finally the multi-angle semantic feature vector is used to determine the semantic relationship between the two nouns of the target compound noun, thereby interpreting the target compound noun. In other words, the semantic feature vectors of the target compound noun are generated from multiple angles by multiple algorithms, each of which optimizes the target algorithm, that is, the algorithm for generating semantic feature vectors of words, in a different respect; the accuracy of the semantic relationship is improved by improving the accuracy of the multi-angle semantic feature vector.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments of the present application will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive exercise.
FIG. 1 is a flow chart of steps of a method for explaining compound nouns provided in an embodiment of the present application;
FIG. 2 is a diagram illustrating an example of an application of a method for explaining compound nouns according to an embodiment of the present application;
fig. 3 is a block diagram of an explanation apparatus for compound nouns according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In various embodiments of the present application, it should be understood that the sequence numbers of the following processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Referring to fig. 1, an embodiment of the present application provides a method for interpreting compound nouns, including:
step 101: a plurality of semantic feature vectors characterizing the semantics of the target compound noun are generated based on different angles.
In this step, a semantic feature vector may be understood as a word vector, and generating semantic feature vectors from the target compound noun is the process of converting characters into vectors that can be used for computation. Each of the plurality of semantic feature vectors can represent the semantics of the target compound noun; they differ in the algorithms or model parameters used to generate them. Here, different algorithms or model parameters characterize different angles, where the different angles include at least two of: a first angle of predicting a tagged word based on the context of the tagged word, a second angle of predicting the context of an input word based on the input word, and a third angle of predicting a target word based on a relationship graph of the target word.
it will be appreciated that there are a large number of algorithmic models available for use in generating semantic feature vectors for words or sentences, the emphasis points or angles of which are different from one another, so that different algorithmic models have their own advantages in different respects. Here, one semantic feature vector that characterizes the semantics of the target compound noun will be generated on a per-angle basis, so that a plurality of semantic feature vectors can be obtained. It is noted that the algorithm model used for generating the semantic feature vector may be a model parameter in a trained network model instead of the trained network model, and the part of the model parameter is a model parameter for converting a word or a sentence into a word vector.
For the first angle, the focus is on predicting a tagged word based on its context, i.e., the words before and after the tagged word in the sentence to which it belongs. A network model may therefore be trained from the first angle, and the model parameters required to produce the word vectors are extracted after training is complete. The network model may be, but is not limited to, a BERT (Bidirectional Encoder Representations from Transformers) language model. Taking BERT as an example, a large number of sentences containing compound nouns are prepared as a corpus, the corpus is encoded with the BERT language model, and an entity-marker technique is used to identify the positions of compound nouns in the sentences. Since the corpus contains multiple sentences carrying different levels of semantic information, some even noise, an attention mechanism is used to integrate the semantic information about the compound noun provided by all the corpus sentences.
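As an illustrative sketch only (not the patent's actual implementation), the entity-marker step and the attention-based integration of per-sentence vectors might look as follows; the marker tokens "[E1]"/"[/E1]", the dimensions, and the vectors are hypothetical stand-ins for BERT outputs:

```python
import numpy as np

def mark_compound(tokens, start, end):
    """Insert entity-marker tokens around the compound-noun span
    [start, end) so the encoder can locate it in the sentence."""
    return tokens[:start] + ["[E1]"] + tokens[start:end] + ["[/E1]"] + tokens[end:]

def attention_pool(vectors, query):
    """Integrate per-sentence compound-noun vectors with softmax
    attention, so that noisy corpus sentences are down-weighted."""
    scores = np.array([v @ query for v in vectors])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return sum(w * v for w, v in zip(weights, vectors))

tokens = ["I", "bought", "a", "latex", "mattress", "yesterday"]
marked = mark_compound(tokens, 3, 5)

# Hypothetical 4-dim encodings of "latex mattress" from three corpus
# sentences; the third one plays the role of a noisy sentence.
vecs = [np.array([1.0, 0.0, 0.0, 0.0]),
        np.array([0.9, 0.1, 0.0, 0.0]),
        np.array([0.0, 0.0, 1.0, 0.0])]
pooled = attention_pool(vecs, np.array([1.0, 0.0, 0.0, 0.0]))
```

The pooled vector leans toward the two mutually consistent sentences rather than the outlier, which is the intended effect of the attention integration.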
For the second angle, the emphasis is on predicting the context of an input word based on the input word. Similarly, a network model may be trained from the second angle, and the model parameters required for converting words into vectors are extracted after training is complete. The network model may be, but is not limited to, a Skip-Gram (SG) model. The SG model learns word representations by predicting, for a given compound noun, the surrounding words from the central word. Specifically, user-generated content (UGC) such as user comments may be used as the SG corpus; the details of the training process are not described here.
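A minimal Skip-Gram sketch, assuming a toy vocabulary and hypothetical (center, context) pairs in place of a real UGC corpus, is given below; the center-word matrix `W_in` plays the role of the word vectors kept after training:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["latex", "mattress", "soft", "bed"]
V, D = len(vocab), 8
W_in = rng.normal(scale=0.5, size=(V, D))   # center-word vectors (kept after training)
W_out = rng.normal(scale=0.5, size=(V, D))  # context-word vectors

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical (center, context) index pairs, e.g. extracted from
# user-generated comment text with a sliding window.
pairs = [(0, 1), (1, 0), (1, 2), (3, 2)] * 100
p_before = softmax(W_out @ W_in[0])[1]       # p("mattress" | "latex") before training

lr = 0.1
for center, context in pairs:
    h = W_in[center]
    p = softmax(W_out @ h)                   # predict the context distribution
    grad = p.copy()
    grad[context] -= 1.0                     # softmax cross-entropy gradient
    W_out -= lr * np.outer(grad, h)
    W_in[center] -= lr * (W_out.T @ grad)

p_after = softmax(W_out @ W_in[0])[1]        # probability of the observed context rises
```

This is plain softmax Skip-Gram; a production SG model would use negative sampling or a hierarchical softmax over a large vocabulary.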
For the third angle, the emphasis is on predicting a target word based on its relationship graph. The relationship graph of the target word is a network graph formed from the target word and other words related to it; the related words may be hypernyms or hyponyms of the target word, words that can be collocated with it, and so on. Similarly, a network model can be trained from the third angle, and the model parameters required for converting words into vectors are extracted after training is complete. The network model may be, but is not limited to, a relational graph convolutional network (R-GCN) model. The training process of the R-GCN model is not described in detail here.
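For illustration, one R-GCN layer can be sketched as below; the node names, edge relations (hypernym/collocation), and random weight matrices are hypothetical, standing in for a trained relationship graph and parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4
nodes = ["mattress", "latex", "furniture", "bed"]
# Relationship graph as (src, relation, dst) triples; the relation
# names are illustrative, not the patent's exact schema.
edges = [(0, "hypernym", 2), (0, "collocate", 1), (0, "hypernym", 3)]
relations = ["hypernym", "collocate"]

H = rng.normal(size=(len(nodes), D))                 # input node features
W_self = rng.normal(scale=0.5, size=(D, D))          # self-loop transform
W_rel = {r: rng.normal(scale=0.5, size=(D, D)) for r in relations}

def rgcn_layer(H):
    """One R-GCN layer: each node aggregates its neighbours through a
    relation-specific weight matrix, plus a self-loop transform."""
    out = H @ W_self.T
    for r in relations:
        for s, rel, d in edges:
            if rel != r:
                continue
            # normalise by the number of r-neighbours of node s
            c = sum(1 for s2, r2, _ in edges if s2 == s and r2 == r)
            out[s] += (H[d] @ W_rel[r].T) / c
    return np.maximum(out, 0.0)                      # ReLU

H1 = rgcn_layer(H)
```

Each relation type having its own weight matrix is what distinguishes R-GCN from an ordinary graph convolution.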
Step 102: and splicing each mapping result of the plurality of semantic feature vectors in the target vector space to obtain the multi-angle semantic feature vector.
In this step, because each semantic feature vector is generated independently of the others, the plurality of semantic feature vectors are not unified and cannot be used directly for computation. The semantic feature vectors may therefore be mapped into the same vector space, i.e., the target vector space, making them comparable, and all the mapping results are then spliced to obtain the multi-angle semantic feature vector.
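The mapping-and-splicing step can be sketched as below; the per-angle dimensions, the target dimension, and the random projection matrices are hypothetical (in practice the projections would be learned):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical per-angle vectors for one compound noun; each angle
# produces a vector of a different dimensionality.
v_bert, v_sg, v_rgcn = rng.normal(size=12), rng.normal(size=8), rng.normal(size=4)
TARGET_D = 6

# One projection per angle maps its vector into the shared target
# vector space (random matrices stand in for trained ones).
P = {name: rng.normal(size=(TARGET_D, v.shape[0]))
     for name, v in [("bert", v_bert), ("sg", v_sg), ("rgcn", v_rgcn)]}

mapped = [P["bert"] @ v_bert, P["sg"] @ v_sg, P["rgcn"] @ v_rgcn]
multi_angle = np.concatenate(mapped)   # splice the mapped results
```

After splicing, the multi-angle vector has a fixed length regardless of which angles were used, so it can feed a single downstream relation classifier.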
Step 103: and determining the semantic relation between the first noun and the second noun in the target compound noun based on the multi-angle semantic feature vector.
In this step, the semantic relationship describes the relationship between the semantics of the first noun and the second noun in the compound noun. For example, if the compound noun is "beef hamburger," the relationship between the semantics of the first noun "beef" and the second noun "hamburger" is the food-material relationship. It is understood that a target compound noun is any compound noun consisting of two nouns, a first noun and a second noun.
Step 104: in the event that the semantic relationship correctly characterizes the association between the semantics of the first noun and the second noun, the semantic relationship is determined to be paraphrasing of the target compound noun.
In this step, when the semantic relationship correctly represents the association between the semantics of the first noun and the second noun, the semantic relationship may be used as the definition of the target compound noun, otherwise, the semantic relationship may not be determined as the definition of the target compound noun. Similarly, taking the target compound noun as "beef hamburger" as an example, if the semantic relationship is the food material relationship, the semantic relationship can correctly represent the association between the semantics of "beef" and "hamburger", and the food material relationship can be used as the paraphrase of "beef hamburger". If the semantic relationship is a style relationship, the association between the semantics of beef and hamburger cannot be represented correctly.
In the embodiments of the present application, a plurality of semantic feature vectors characterizing the semantics of the target compound noun are generated from different angles, the different angles including at least two of: a first angle of predicting a tagged word based on the context of the tagged word, a second angle of predicting the context of an input word based on the input word, and a third angle of predicting a target word based on a relationship graph of the target word, so that the plurality of semantic feature vectors can represent the semantics of the target compound noun as accurately as possible. The multi-angle semantic feature vector, which can be used directly for judging the semantic relationship, is then obtained by mapping the vectors into the same vector space and splicing the results, and finally the multi-angle semantic feature vector is used to determine the semantic relationship between the two nouns of the target compound noun, thereby interpreting the target compound noun. In other words, the semantic feature vectors of the target compound noun are generated from multiple angles by multiple algorithms, each of which optimizes the target algorithm, that is, the algorithm for generating semantic feature vectors of words, in a different respect; the accuracy of the semantic relationship is improved by improving the accuracy of the multi-angle semantic feature vector.
Optionally, the multi-angle semantic feature vector includes a first vector corresponding to the first noun and a second vector corresponding to the second noun. Since each semantic feature vector is generated based on the first noun and the second noun, each semantic feature vector contains a feature vector corresponding to the first noun and a feature vector corresponding to the second noun. Mapping the semantic feature vectors into the target vector space and splicing the mapping results does not cause either of these two component feature vectors to disappear. The multi-angle semantic feature vector thus generated therefore retains a part corresponding to the first noun, namely the first vector, and a part corresponding to the second noun, namely the second vector.
Determining the semantic relation between a first noun and a second noun in a target compound noun based on a multi-angle semantic feature vector, comprising:
the respective weights of the first noun and the second noun are determined based on a door mechanism.
It should be noted that, in a compound noun, the two nouns may have the same or different influence on the semantic relationship; in some compound nouns, one noun may even directly determine the semantic relationship. A gate mechanism can therefore be set to determine the weight of each noun in the target compound noun, with a larger weight indicating a larger influence of the corresponding noun on the semantic relationship. Here, the gate mechanism is a gating mechanism in which the weight of the first noun is determined based on the first vector, the weight of the second noun is determined based on the second vector, and the two weights sum to 1.
And respectively multiplying the first vector and the second vector in the multi-angle semantic feature vector by the weight of the corresponding noun to obtain the processed multi-angle semantic feature vector.
In this step, the first vector is multiplied by the weight of the first noun, and the second vector is multiplied by the weight of the second noun.
And determining the semantic relation between the first noun and the second noun in the target compound noun based on the processed multi-angle semantic feature vector.
In the embodiment of the application, the weight of the two nouns is dynamically adjusted by using a door mechanism, so that the processed multi-angle semantic feature vector carries more information for determining the semantic relationship, and the accuracy of the semantic relationship is further improved.
Optionally, determining the respective weights of the first noun and the second noun based on a gate mechanism comprises:
and multiplying the gate vector by the first vector and the second vector respectively to obtain the weight of the first noun and the weight of the second noun, wherein the gate vector is obtained by training based on the influence of two nouns in the compound nouns on the semantic relation.
It should be noted that the gate vector is a trainable parameter learned as part of the gate mechanism. Training may be performed based on the influence of the two nouns of a compound noun on the semantic relationship, so that the noun with greater influence on the semantic relationship receives a larger weight and the noun with less influence a smaller weight; a sigmoid function is used so that each weight lies in the range 0 to 1 and the two weights sum to 1.
In the embodiment of the application, the influence condition of two nouns in the compound nouns on the semantic relationship is utilized to determine the gate vector, so that the nouns which have larger influence on the semantic relationship have larger weight, and the nouns which have smaller influence on the semantic relationship have smaller weight, and the accuracy of the semantic relationship is further improved.
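One plausible reading of the gate described above can be sketched as follows: a sigmoid score over the gate vector and the first vector gives the first noun's weight, and the second noun takes the complement so the weights sum to 1. The vectors and the gate vector below are random stand-ins for trained parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(3)
D = 6
v1 = rng.normal(size=D)   # first vector (modifier, e.g. "beef")
v2 = rng.normal(size=D)   # second vector (core word, e.g. "hamburger")
g = rng.normal(size=D)    # gate vector (random stand-in for a trained parameter)

w1 = sigmoid(g @ v1)      # weight of the first noun, in (0, 1)
w2 = 1.0 - w1             # weight of the second noun; the two sum to 1

# Processed multi-angle semantic feature vector: each part scaled by
# the weight of its noun.
gated = np.concatenate([w1 * v1, w2 * v2])
```

The sigmoid guarantees the range constraint from the text; how exactly the gate combines the two vectors is an assumption here, since the patent does not spell out the formula.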
Optionally, in the case that the semantic relationship is determined with a classification model, the classification result is a semantic relationship label; the semantic relationship failing to correctly characterize the association between the semantics of the first noun and the second noun then shows up as low probability values for all semantic relationship labels in the classification result. Alternatively, a special label, namely an NA label, may be set in the classification model; when no reasonable semantic relationship label can explain the target compound noun, the probability value of the NA label in the classification result is large. In the event that the semantic relationship fails to correctly characterize the association between the semantics of the first noun and the second noun, the method further comprises:
a plurality of candidate sentences are obtained based on the sentence template, the first noun and the second noun, wherein each candidate sentence comprises the first noun and the second noun.
It should be noted that the sentence template may be a preset phrase template, for example "may be ___ n2 of n1" or "n2 is ___ n1", where n1 and n2 are replaced by the first noun and the second noun and "___" denotes an empty slot. Taking "fruit bouquet" as an example, intermediate sentences such as "may be ___ bouquet of fruit" can first be created; words are then filled into the "___" slot to obtain a plurality of candidate sentences, each filling producing one candidate sentence.
And determining the target candidate sentences based on the semantic similarity between the target compound nouns and each candidate sentence.
In this step, semantic similarity measures how close the two are in semantics: the greater the similarity, the closer the semantics, and the smaller the similarity, the greater the semantic difference. The target candidate sentence is the candidate sentence with the greatest semantic similarity to the target compound noun.
And determining the target candidate sentence as the paraphrase of the target compound noun.
In the embodiment of the application, when the semantic relation can not correctly explain the compound nouns, the compound nouns are explained by a paraphrasing-based method, so that the coverage rate is improved while the accuracy is ensured.
Optionally, determining the target candidate sentence based on the semantic similarity between the target compound noun and each candidate sentence includes:
for each candidate sentence, a first feature vector characterizing the semantics of the candidate sentence is extracted based on the target parameters.
It should be noted that the target parameters are model parameters in a language model constructed with contrastive learning. The first feature vector may be understood as a feature vector composed of the word vectors of the words in the candidate sentence, and it characterizes the semantics of the candidate sentence. Since the first feature vector of each candidate sentence needs to be compared with the semantic feature vector of the target compound noun, the language model is constructed with contrastive learning and then trained, so that the feature vectors extracted with these model parameters are better suited for comparison.
Extracting a second feature vector representing the semantics of the target compound noun based on the target parameter;
sequentially calculating the similarity between the second feature vector and each first feature vector;
and determining the candidate statement corresponding to the first feature vector indicated by the maximum similarity as a target candidate statement.
In the embodiment of the application, the language model is constructed with contrastive learning and then trained, so that the feature vectors extracted with these model parameters are better suited for comparison, improving the accuracy of the target candidate sentence.
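Selecting the target candidate sentence can be sketched as a nearest-neighbour search under cosine similarity; the 3-dim feature vectors and candidate sentences below are hypothetical stand-ins for the contrastively trained encoder's outputs:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity, a common choice for comparing contrastively
    learned sentence embeddings."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical second feature vector for "fruit bouquet".
f_compound = np.array([0.9, 0.1, 0.0])

# Hypothetical candidate sentences and their first feature vectors.
candidates = ["a bouquet made of fruit",
              "a bouquet sold with fruit",
              "fruit placed near a bouquet"]
f_candidates = [np.array([0.8, 0.2, 0.1]),
                np.array([0.3, 0.9, 0.2]),
                np.array([0.1, 0.2, 0.9])]

sims = [cosine(f_compound, f) for f in f_candidates]
target = candidates[int(np.argmax(sims))]   # candidate with maximum similarity
```

The patent does not name the similarity function; cosine is used here as a representative choice.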
Optionally, obtaining a plurality of candidate sentences based on the sentence template, the first noun, and the second noun, where each candidate sentence includes the first noun and the second noun, includes:
and replacing the first target word in the sentence template comprising at least one empty slot with a first noun, and replacing the second target word with a second noun to obtain the intermediate sentence.
In this step, the sentence template may be, for example, "n2 of n1 that can ___" or "n2 is ___ n1", where "___" denotes the empty slot, n1 is the first target word, and n2 is the second target word. Then, for the compound noun "fruit bouquet", the intermediate sentences may be "bouquet of fruit that can ___" and "bouquet is ___ fruit".
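This replacement step can be sketched as follows; the template strings below, with "___" marking the empty slot and "n1"/"n2" as the target words, are hypothetical stand-ins for the templates actually used.

```python
def build_intermediate_sentences(templates, first_noun, second_noun):
    """Replace the target words n1/n2 in each slot-bearing template with
    the two nouns of the compound; '___' marks the empty slot."""
    return [
        t.replace("n1", first_noun).replace("n2", second_noun)
        for t in templates
    ]

# Hypothetical templates (assumption, not the templates of the application):
templates = ["n2 of n1 that can ___", "n2 is ___ n1"]
intermediates = build_intermediate_sentences(templates, "fruit", "bouquet")
```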
And filling empty slots in the intermediate sentences based on a pre-trained language model to obtain a plurality of candidate sentences.
In this step, the BERT language model may be used to fill the empty slot in the intermediate sentence, and each filling yields one candidate sentence.
In the embodiment of the application, the language model is used for generating the candidate sentences, so that the method is fast and convenient, the accuracy of the candidate sentences is improved, and excessive grammar errors of the candidate sentences are avoided.
Optionally, filling empty slots in the intermediate sentence based on a pre-trained language model to obtain a plurality of candidate sentences, including:
obtaining a plurality of filling words for filling the empty slots based on a pre-trained language model;
and sequentially filling a plurality of filling words into empty slots in the intermediate sentences based on a greedy algorithm to obtain a plurality of candidate sentences.
It should be noted that filling each filling word into the empty slot yields one candidate sentence; however, a filled candidate sentence may contain grammatical errors and therefore not be fluent. For this reason, when a plurality of filling words are filled, the filling is performed based on a greedy algorithm. The greedy algorithm itself is not described in detail here.
In the embodiment of the application, the filling words are filled by utilizing a greedy algorithm, so that the rationality of the candidate sentences can be improved.
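The greedy filling can be sketched as below. `toy_score` is a hypothetical stand-in for the masked-language-model probabilities that BERT would supply, and the tokenized sentence is illustrative.

```python
def greedy_fill(tokens, vocab, score):
    """Fill all '___' slots greedily: at each step, choose the
    (slot, word) pair with the highest model score, fill it, repeat."""
    tokens = list(tokens)
    while "___" in tokens:
        best = None
        for i, tok in enumerate(tokens):
            if tok != "___":
                continue
            for w in vocab:
                s = score(tokens, i, w)
                if best is None or s > best[0]:
                    best = (s, i, w)
        _, i, w = best
        tokens[i] = w
    return tokens

def toy_score(tokens, position, word):
    """Toy scorer standing in for BERT masked-LM probabilities (assumption)."""
    preferred = {"hold": 2.0, "made": 1.5}
    return preferred.get(word, 0.0)

filled = greedy_fill(
    ["bouquet", "that", "can", "___", "fruit"],
    ["hold", "made", "eat"],
    toy_score,
)
```

With several slots, the same loop fills the most confident position first, which is what keeps the resulting sentence coherent.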
As shown in fig. 2, which is a schematic diagram of a practical application of the method for explaining compound nouns provided in the embodiment of the present application, the number of compound nouns as input may be one or more. In this embodiment there are several, including but not limited to "beef hamburger", "fruit bouquet", "language barrier", and "XX coffee".
In stage 1: each input compound noun is converted into semantic feature vectors that characterize its semantics from different angles, where the different angles are the first angle of predicting a tagged word based on the context of the tagged word, the second angle of predicting the context of an input word based on the input word, and the third angle of predicting a target word based on the relationship graph of the target word. These angles are the same as the different angles in the above embodiment of the invention, and the description is omitted here.
The plurality of semantic feature vectors are mapped to the same vector space for unification, and the mapping results are then spliced. After splicing, a gate mechanism processes the splicing result: for each compound noun, the weight of each constituent noun is calculated, and the vector part corresponding to each noun is multiplied by the corresponding weight to obtain the multi-angle semantic feature vector. Finally, a relation classifier with softmax as the activation function determines, for each compound noun, the probability distribution over the semantic relation labels. Among the semantic relation labels there is a special label, the NA label, which indicates that no semantic relationship correctly characterizes the association between the semantics of the two nouns in the compound noun. As shown in fig. 2, non-NA semantic relation labels are determined for "beef hamburger" and "XX coffee" in the input (for "beef hamburger", for example, the label is food material). The semantic labels of the remaining two compound nouns in the input are both NA labels, so the remaining two compound nouns are processed in stage 2.
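The gate weighting and softmax classification of stage 1 can be sketched as follows. The gate vector, the toy two-dimensional noun vectors, and the tiny weight matrix (two relation labels, one of which could be NA) are all illustrative assumptions; in the application, the gate vector and classifier weights are learned.

```python
import math

def gate_weights(gate_vec, v1, v2):
    """Sigmoid of (gate . v) gives each noun's weight (a sketch of the
    gate mechanism; the real gate vector is trained)."""
    sig = lambda x: 1.0 / (1.0 + math.exp(-x))
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    return sig(dot(gate_vec, v1)), sig(dot(gate_vec, v2))

def classify(feature, weight_matrix):
    """Linear relation classifier with softmax as the activation function."""
    logits = [sum(w * f for w, f in zip(row, feature)) for row in weight_matrix]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

v1, v2 = [0.2, 0.8], [0.5, 0.1]           # vector parts for the two nouns
w1, w2 = gate_weights([1.0, -1.0], v1, v2)
feature = [w1 * a for a in v1] + [w2 * b for b in v2]   # weighted splice
probs = classify(feature, [[1, 0, 0, 1], [0, 1, 1, 0]])  # 2 relation labels
```

The label with the highest probability is taken as the semantic relation; if it is the NA label, the compound noun falls through to stage 2.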
Stage 2: 4 paraphrase-generation templates (corresponding to the sentence templates in the embodiment of the invention) may be designed in advance, each with an empty slot to be filled. The empty slot is filled using a BERT language model and may be filled with words of several lengths. When more than one word is to be filled, a greedy strategy may be used, i.e., the positions with the highest probability are filled in turn. Eventually, several candidate definitions are generated for each remaining compound noun. It will be appreciated that a good candidate definition should be semantically consistent with the compound noun, with very close semantic similarity, whereas a poor one is less semantically similar to the compound noun. Accordingly, a contrastive-learning method is used to model the feature representations of compound nouns and paraphrases, and candidate paraphrases are then selected according to semantic similarity. The resulting explanation of "fruit bouquet" is "bouquet on which fruits can be placed", and the explanation of "language barrier" is "language barrier".
And finally, splicing or combining the outputs of the two stages to be output as a whole.
Compared with existing methods, the method provided by the embodiment of the application has higher accuracy; a comparison of its effect with that of the existing methods is shown in table 1 below:
Method Macro-Precision Macro-Recall Macro-F1
SVM 64.64 38.54 44.69
MaxEntropy 64.22 53.02 56.49
MLP 64.07 50.95 54.90
PCNN+ATT 67.85 53.40 58.25
BERT 82.05 80.18 80.54
BERT+CP 84.48 85.31 84.31
Embodiment of the present application 86.33 87.68 86.82
TABLE 1
In table 1, SVM (support vector machine), MaxEntropy (maximum entropy), MLP (multilayer perceptron), PCNN (Piece-Wise CNN) + ATT (attention), BERT, and BERT + CP are all existing methods; Macro-Precision, Macro-Recall, and Macro-F1 are three model indexes, namely macro-averaged precision, recall, and F1 score.
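The three indexes in table 1 are macro-averaged: precision, recall, and F1 are computed per relation label and then averaged over labels. A minimal sketch follows; the per-class counts are invented for illustration.

```python
def macro_prf(per_class):
    """per_class: list of (tp, fp, fn) counts, one tuple per relation label.
    Returns macro-averaged (precision, recall, F1)."""
    ps, rs, fs = [], [], []
    for tp, fp, fn in per_class:
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        ps.append(p)
        rs.append(r)
        fs.append(f)
    n = len(per_class)
    return sum(ps) / n, sum(rs) / n, sum(fs) / n

# Two hypothetical relation labels:
p, r, f = macro_prf([(8, 2, 2), (5, 5, 0)])
```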
In the embodiments of the present application, the explanation of compound nouns is divided into two stages. The first stage performs a relation classification task and adds an NA relation label to the predefined relation set, so that every compound noun can be classified into some relation; this stage ensures the accuracy of the model. The second stage paraphrases those compound nouns whose relation label is NA, thus ensuring that the system produces an explanation for any compound noun entered. This allows both high accuracy and high coverage in compound noun interpretation.
Referring to fig. 3, an embodiment of the present application further provides an apparatus for interpreting compound nouns, where the apparatus includes:
a semantic feature module 31, configured to generate a plurality of semantic feature vectors characterizing the semantics of the target compound noun based on different angles, where the different angles include at least two of: a first angle of predicting a tagged word based on the context of the tagged word, a second angle of predicting the context of an input word based on the input word, and a third angle of predicting a target word based on the relationship graph of the target word;
the vector processing module 32 is configured to splice mapping results of the plurality of semantic feature vectors in the target vector space to obtain a multi-angle semantic feature vector;
the semantic relation module 33 is configured to determine a semantic relation between a first noun and a second noun in the target compound noun based on the multi-angle semantic feature vector;
and a paraphrasing module 34 for determining the semantic relationship as a paraphrasing of the target compound noun in case the semantic relationship correctly characterizes the association between the semantics of the first noun and the second noun.
Optionally, the multi-angle semantic feature vector comprises a first vector corresponding to the first noun and a second vector corresponding to the second noun;
a semantic relationship module comprising:
a weighting unit for determining respective weights of the first noun and the second noun based on a gate mechanism;
the processing unit is used for multiplying the first vector and the second vector in the multi-angle semantic feature vector by the weight of the corresponding noun respectively to obtain a processed multi-angle semantic feature vector;
and the determining unit is used for determining the semantic relation between the first noun and the second noun in the target compound noun based on the processed multi-angle semantic feature vector.
Optionally, the weighting unit is specifically configured to multiply the gate vector by the first vector and the second vector respectively to obtain a weight of the first noun and a weight of the second noun, where the gate vector is a vector trained based on influence of two nouns in the compound noun on the semantic relationship.
Optionally, in a case where the semantic relationship fails to correctly characterize the association between the semantics of the first noun and the second noun, the apparatus further comprises:
the sentence structure comprises a sentence template, a first noun, a second noun, a first template paraphrasing module and a second template paraphrasing module, wherein the sentence template comprises a plurality of candidate sentences;
the second template definition module is used for determining target candidate sentences based on the semantic similarity between the target compound nouns and each candidate sentence;
and the third template paraphrasing module is used for determining the target candidate sentence as the paraphrasing of the target compound word.
Optionally, a second template paraphrasing module comprising:
the first extraction unit is used for extracting a first feature vector for representing the semantics of the candidate sentences based on target parameters aiming at each candidate sentence, wherein the target parameters are model parameters in a language model constructed by adopting contrast learning;
a second extraction unit, configured to extract a second feature vector representing semantics of the target compound noun based on the target parameter;
the similarity unit is used for sequentially calculating the similarity between the second characteristic vector and each first characteristic vector;
and the similarity determining unit is used for determining the candidate sentence corresponding to the first feature vector indicated by the maximum similarity as the target candidate sentence.
Optionally, the first template paraphrasing module comprises:
the replacing unit is used for replacing a first target word in the sentence template comprising at least one empty slot with a first noun and replacing a second target word with a second noun to obtain an intermediate sentence;
and the filling unit is used for filling the empty slots in the intermediate sentences based on the pre-trained language model to obtain a plurality of candidate sentences.
Optionally, the filling unit is specifically configured to obtain a plurality of filling words for filling the empty slots based on a pre-trained language model; and sequentially filling a plurality of filling words into empty slots in the intermediate sentences based on a greedy algorithm to obtain a plurality of candidate sentences.
The device for explaining compound nouns provided in the embodiment of the present application can implement each process implemented by the method for explaining compound nouns in the method embodiments of fig. 1 to fig. 2, and is not described herein again to avoid repetition.
In the embodiment of the application, a plurality of semantic feature vectors characterizing the semantics of the target compound noun are generated based on different angles, where the different angles include at least two of a first angle of predicting a tagged word based on the context of the tagged word, a second angle of predicting the context of an input word based on the input word, and a third angle of predicting a target word based on the relationship graph of the target word, so that the plurality of semantic feature vectors can characterize the semantics of the target compound noun as accurately as possible. The multi-angle semantic feature vector, which can be used directly for judging the semantic relationship, is then obtained by mapping into the same vector space and splicing. Finally, the multi-angle semantic feature vector is used to determine the semantic relationship between the two nouns in the target compound noun, thereby realizing the interpretation of the target compound noun. In the embodiment of the application, the semantic feature vectors of the target compound noun generated from multiple angles are generated by multiple algorithms that optimize a target algorithm from different aspects, the target algorithm being an algorithm for generating semantic feature vectors of words; the accuracy of the semantic relationship is improved by improving the accuracy of the multi-angle semantic feature vector.
On the other hand, the embodiment of the present application further provides an electronic device, which includes a processor, a memory, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the processor implements the method for interpreting compound nouns provided by the embodiments of the above-mentioned applications.
In still another aspect, an embodiment of the present application further provides a readable storage medium; when the instructions in the readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method for interpreting compound nouns provided in the embodiments of the present application.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. Based on this understanding, the parts of the above technical solutions that in essence contribute to the prior art may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a ROM/RAM, magnetic disk, or optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the various embodiments or of some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (8)

1. A method for interpreting compound nouns, the method comprising:
generating a plurality of semantic feature vectors characterizing semantics of a target compound noun based on different angles, wherein the different angles include at least two of: a first angle of predicting a tagged word based on a context of the tagged word, a second angle of predicting the context of an input word based on the input word, and a third angle of predicting a target word based on a relationship graph of the target word;
splicing mapping results of the plurality of semantic feature vectors in a target vector space to obtain a multi-angle semantic feature vector;
determining the semantic relation between a first noun and a second noun in the target compound noun based on the multi-angle semantic feature vector;
determining the semantic relationship as the paraphrase of the target compound noun if the semantic relationship correctly characterizes the association between the semantics of the first noun and the second noun;
wherein the multi-angle semantic feature vector comprises a first vector corresponding to the first noun and a second vector corresponding to the second noun;
the determining the semantic relationship between a first noun and a second noun in the target compound noun based on the multi-angle semantic feature vector comprises:
determining respective weights of the first noun and the second noun based on a gate mechanism;
respectively multiplying a first vector and a second vector in the multi-angle semantic feature vector by the weight of the corresponding noun to obtain a processed multi-angle semantic feature vector;
determining the semantic relation between a first noun and a second noun in the target compound noun based on the processed multi-angle semantic feature vector;
wherein the determining respective weights of the first noun and the second noun based on a gate mechanism comprises:
and multiplying the gate vector by the first vector and the second vector respectively to obtain the weight of the first noun and the weight of the second noun, wherein the gate vector is obtained by training based on the influence of two nouns in the compound noun on the semantic relationship.
2. The method of claim 1, wherein in the event that the semantic relationship fails to correctly characterize the association between the semantics of the first noun and the second noun, the method further comprises:
obtaining a plurality of candidate sentences based on a sentence template, the first nouns and the second nouns, wherein each candidate sentence comprises the first nouns and the second nouns;
determining target candidate sentences based on semantic similarity between the target compound nouns and each candidate sentence;
and determining the target candidate sentence as the paraphrase of the target compound word.
3. The method of claim 2, wherein determining target candidate sentences based on semantic similarity between the target compound noun and each of the candidate sentences comprises:
extracting a first feature vector representing the semantics of the candidate sentences based on target parameters for each candidate sentence, wherein the target parameters are model parameters in a language model constructed by adopting contrast learning;
extracting a second feature vector representing the semantics of the target compound noun based on the target parameter;
sequentially calculating the similarity between the second feature vector and each first feature vector;
and determining the candidate statement corresponding to the first feature vector indicated by the maximum similarity as a target candidate statement.
4. The method of claim 2, wherein deriving a plurality of candidate sentences based on the sentence template, the first noun, and the second noun comprises:
replacing a first target word in the sentence template comprising at least one empty slot with the first noun, and replacing a second target word with the second noun to obtain an intermediate sentence;
and filling empty slots in the intermediate sentences based on a pre-trained language model to obtain a plurality of candidate sentences.
5. The method of claim 4, wherein the filling empty slots in the intermediate sentence based on a pre-trained language model to obtain a plurality of candidate sentences comprises:
obtaining a plurality of filling words for filling the empty slots based on a pre-trained language model;
and sequentially filling the filling words into empty slots in the intermediate sentences based on a greedy algorithm to obtain a plurality of candidate sentences.
6. An apparatus for interpreting compound nouns, the apparatus comprising:
a semantic feature module, configured to generate a plurality of semantic feature vectors characterizing semantics of a target compound noun based on different angles, where the different angles include at least two of: a first angle of predicting a tagged word based on a context of the tagged word, a second angle of predicting the context of an input word based on the input word, and a third angle of predicting a target word based on a relationship graph of the target word;
the vector processing module is used for splicing each mapping result of the plurality of semantic feature vectors in a target vector space to obtain a multi-angle semantic feature vector;
the semantic relation module is used for determining the semantic relation between a first noun and a second noun in the target compound noun based on the multi-angle semantic feature vector;
a paraphrasing module for determining the semantic relationship as a paraphrase of the target compound noun if the semantic relationship correctly characterizes the association between the semantics of the first noun and the second noun;
wherein the multi-angle semantic feature vector includes a first vector corresponding to the first noun and a second vector corresponding to the second noun;
wherein, the semantic relation module includes:
a weighting unit configured to determine a weight of each of the first noun and the second noun based on a gate mechanism; the method specifically includes multiplying a gate vector by the first vector and the second vector respectively to obtain a weight of the first noun and a weight of the second noun, where the gate vector is a vector obtained by training based on influence of two nouns in a compound noun on a semantic relationship;
the processing unit is used for multiplying the first vector and the second vector in the multi-angle semantic feature vector by the weight of the corresponding noun respectively to obtain a processed multi-angle semantic feature vector;
and the determining unit is used for determining the semantic relation between the first noun and the second noun in the target compound noun based on the processed multi-angle semantic feature vector.
7. An electronic device, comprising: processor, memory and computer program stored on the memory and executable on the processor, which when executing the program implements a method of interpreting a compound noun according to one or more of claims 1-5.
8. A readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform a method of interpreting compound nouns as claimed in one or more of claims 1-5.
CN202210170360.0A 2022-02-23 2022-02-23 Method and device for explaining compound nouns and electronic equipment Active CN114548115B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210170360.0A CN114548115B (en) 2022-02-23 2022-02-23 Method and device for explaining compound nouns and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210170360.0A CN114548115B (en) 2022-02-23 2022-02-23 Method and device for explaining compound nouns and electronic equipment

Publications (2)

Publication Number Publication Date
CN114548115A CN114548115A (en) 2022-05-27
CN114548115B true CN114548115B (en) 2023-01-06

Family

ID=81678100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210170360.0A Active CN114548115B (en) 2022-02-23 2022-02-23 Method and device for explaining compound nouns and electronic equipment

Country Status (1)

Country Link
CN (1) CN114548115B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20100073163A (en) * 2008-12-22 2010-07-01 한국전자통신연구원 Compound noun recognition apparatus and its method
KR20110057631A (en) * 2009-11-24 2011-06-01 한국전자통신연구원 Compound noun range determination apparatus and its method
CN107894979A (en) * 2017-11-21 2018-04-10 北京百度网讯科技有限公司 The compound process method, apparatus and its equipment excavated for semanteme
CN109697286A (en) * 2018-12-18 2019-04-30 众安信息技术服务有限公司 A kind of diagnostic standardization method and device based on term vector
CN110457692A (en) * 2019-07-26 2019-11-15 清华大学 Compound word indicates learning method and device
CN110569498A (en) * 2018-12-26 2019-12-13 东软集团股份有限公司 Compound word recognition method and related device
CN110597961A (en) * 2019-09-18 2019-12-20 腾讯科技(深圳)有限公司 Text category labeling method and device, electronic equipment and storage medium
CN111324699A (en) * 2020-02-20 2020-06-23 广州腾讯科技有限公司 Semantic matching method and device, electronic equipment and storage medium
CN112163416A (en) * 2020-10-09 2021-01-01 北京理工大学 Event joint extraction method for merging syntactic and entity relation graph convolution network
CN112364648A (en) * 2020-12-02 2021-02-12 中金智汇科技有限责任公司 Keyword extraction method and device, electronic equipment and storage medium
CN113591462A (en) * 2021-07-28 2021-11-02 咪咕数字传媒有限公司 Bullet screen reply generation method and device and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113254616B (en) * 2021-06-07 2021-10-19 佰聆数据股份有限公司 Intelligent question-answering system-oriented sentence vector generation method and system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Automatic term recognition based on statistics of compound nouns and their components;Hirosi Nakagawa;《International Journal of Theoretical and Applied Issues in Specialized Communication》;20000131;第6卷(第2期);第192-210页 *

Also Published As

Publication number Publication date
CN114548115A (en) 2022-05-27

Similar Documents

Publication Publication Date Title
EP4060565A1 (en) Method and apparatus for acquiring pre-trained model
US9977778B1 (en) Probabilistic matching for dialog state tracking with limited training data
US11586814B2 (en) Paraphrase sentence generation method and apparatus
US20130179169A1 (en) Chinese text readability assessing system and method
CN110334186A (en) Data query method, apparatus, computer equipment and computer readable storage medium
CN113743099B (en) System, method, medium and terminal for extracting terms based on self-attention mechanism
US11461613B2 (en) Method and apparatus for multi-document question answering
US20150161109A1 (en) Reordering words for machine translation
CN114385806A (en) Text summarization method and system based on deep learning
CN111079418A (en) Named body recognition method and device, electronic equipment and storage medium
JP7061594B2 (en) Sentence conversion system, sentence conversion method, and program
US20230205994A1 (en) Performing machine learning tasks using instruction-tuned neural networks
CN112084769A (en) Dependency syntax model optimization method, device, equipment and readable storage medium
Liu et al. Cross-domain slot filling as machine reading comprehension: A new perspective
CN111611791A (en) Text processing method and related device
KR102608867B1 (en) Method for industry text increment, apparatus thereof, and computer program stored in medium
CN111291565A (en) Method and device for named entity recognition
JP2022145623A (en) Method and device for presenting hint information and computer program
US20220147719A1 (en) Dialogue management
US20190129948A1 (en) Generating method, generating device, and recording medium
WO2023088278A1 (en) Method and apparatus for verifying authenticity of expression, and device and medium
CN114548115B (en) Method and device for explaining compound nouns and electronic equipment
CN115906854A (en) Multi-level confrontation-based cross-language named entity recognition model training method
CN112818688B (en) Text processing method, device, equipment and storage medium
US20230029196A1 (en) Method and apparatus related to sentence generation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant