CN115906818A - Grammar knowledge prediction method, grammar knowledge prediction device, electronic equipment and storage medium

Info

Publication number: CN115906818A
Application number: CN202211644221.3A
Authority: CN (China)
Legal status: Pending
Prior art keywords: word, target, training, natural language, text
Other languages: Chinese (zh)
Inventors: 王旭, 郭冬杰, 汪洋, 盛志超
Current assignee: iFlytek Co Ltd
Original assignee: iFlytek Co Ltd
Application filed by iFlytek Co Ltd; priority to CN202211644221.3A; publication of CN115906818A

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a grammar knowledge prediction method, a grammar knowledge prediction device, electronic equipment and a storage medium, and belongs to the technical field of natural language processing, wherein the grammar knowledge prediction method comprises the following steps: acquiring a natural language text and position information of a target word in the natural language text; vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text; performing dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text, and performing vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text; fusing the first feature vector and the dependency feature vector to obtain a fused vector; and carrying out grammar knowledge classification based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word.

Description

Grammar knowledge prediction method, grammar knowledge prediction device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular, to a method and an apparatus for predicting grammatical knowledge, an electronic device, and a storage medium.
Background
In intelligent education services, grammar prediction is an important application of text classification and can be used in business scenarios such as analyzing students' learning situations and recommending related knowledge points. Existing grammar prediction schemes use a classification method based on text features: a natural language sentence is obtained, a word in the sentence is designated, the sentence is segmented, encoded and vectorized, and the resulting feature vector is classified into a grammar class by a classification neural network. However, because of compound sentence patterns such as clauses, inverted sentences and nested sentences, the grammar of natural language with complex sentence patterns is often difficult to predict.
Disclosure of Invention
The invention provides a grammar knowledge prediction method, a grammar knowledge prediction device, electronic equipment and a storage medium, which are used for solving the problem that the grammar of a complex sentence pattern natural language is difficult to predict.
The invention provides a grammar knowledge prediction method, which comprises the following steps:
acquiring a natural language text and position information of a target word in the natural language text;
vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text;
performing dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text, and performing vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text;
and fusing the first feature vector and the dependency feature vector to obtain a fused vector, and carrying out grammar knowledge classification based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word.
In some embodiments, the vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text includes:
and sequentially carrying out vectorization processing on each sentence in the natural language text to obtain a word vector and a position coding vector corresponding to each word in the natural language text.
In some embodiments, the dependency syntax information includes: position information of another word in the sentence on which the word depends syntactically, a relationship between the word and another word on which the word depends syntactically;
the vectorizing the dependency syntax information to obtain the dependency feature vector corresponding to each word in the natural language text includes:
vectorizing the position information in the dependency syntax information to obtain a dependency position vector corresponding to each word in the natural language text;
vectorizing the relationship in the dependency syntax information to obtain a dependency relationship vector corresponding to each word in the natural language text.
In some embodiments, the performing grammar knowledge classification based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word includes:
extracting the features of the fused vector to obtain a second feature vector;
obtaining a feature vector corresponding to the target word based on the second feature vector and the position information of the target word;
and carrying out grammar knowledge classification on the feature vectors corresponding to the target words to obtain grammar knowledge corresponding to the target words.
In some embodiments, the performing grammar knowledge classification on the feature vector corresponding to the target word to obtain the grammar knowledge corresponding to the target word includes:
inputting the feature vectors corresponding to the target words into a multi-target classification layer for grammar knowledge classification to obtain grammar knowledge corresponding to the target words;
the multi-target classification layer is obtained by training by taking a feature vector corresponding to a target training word in a natural language training text as training data and taking grammatical knowledge corresponding to the target training word in the natural language training text as a training label;
or the multi-target classification layer is obtained by training with feature vectors corresponding to target training words in a natural language training text as training data and grammar knowledge and part-of-speech information corresponding to the target training words in the natural language training text as training labels, wherein the part-of-speech information is obtained by performing dependency syntax analysis on the natural language training text.
In some embodiments, the multi-objective classification layer determination process comprises:
acquiring a natural language training text and position information of a target training word in the natural language training text;
extracting the features of the natural language training text to obtain a feature vector corresponding to the natural language training text;
obtaining a feature vector corresponding to the target training word based on the feature vector corresponding to the natural language training text and the position information of the target training word;
determining grammar knowledge corresponding to the target training words;
training an initial multi-target classification layer by taking the feature vector corresponding to the target training word as training data and grammar knowledge corresponding to the target training word as a training label, wherein the initial multi-target classification layer comprises: at least one of a syntax classification layer, an error type classification layer, and a phrase classification layer;
and obtaining the multi-target classification layer after the training of the initial multi-target classification layer is finished.
In some embodiments, the determining of the multi-objective classification layer includes:
acquiring a natural language training text and position information of a target training word in the natural language training text;
extracting the features of the natural language training text to obtain a feature vector corresponding to the natural language training text;
obtaining a feature vector corresponding to the target training word based on the feature vector corresponding to the natural language training text and the position information of the target training word;
determining grammar knowledge and part-of-speech information corresponding to the target training word, wherein the part-of-speech information corresponding to the target training word is obtained according to dependency syntax analysis;
training an initial multi-target classification layer by taking the feature vector corresponding to the target training word as training data and taking grammar knowledge and part of speech information corresponding to the target training word as training labels, wherein the initial multi-target classification layer comprises: at least one of a grammar classification layer, an error type classification layer and a phrase classification layer, and a part-of-speech classification layer;
after the training of the initial multi-target classification layer is completed, at least one of a grammar classification layer, an error type classification layer and a phrase classification layer in the trained initial multi-target classification layer is reserved, and the multi-target classification layer is obtained.
In some embodiments, the training of the initial multi-objective classification layer comprises:
inputting the feature vector corresponding to the target training word into at least one of the grammar classification layer, the error type classification layer and the phrase classification layer to obtain a grammar knowledge prediction result, wherein the grammar knowledge prediction result comprises at least one of a grammar prediction result, an error type prediction result and a phrase prediction result;
calculating a first loss function value by using a multi-label cross entropy loss function based on the grammar knowledge prediction result and grammar knowledge corresponding to the target training word;
updating parameters of the initial multi-objective classification layer based on the first loss function values.
The present invention also provides a syntax knowledge prediction apparatus, comprising:
an acquisition unit configured to acquire a natural language text and position information of a target word in the natural language text;
the text feature extraction unit is used for vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text;
the dependency syntax analysis unit is used for carrying out dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text, and carrying out vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text;
and the classification unit is used for fusing the first feature vector and the dependency feature vector to obtain a fused vector, and performing grammar knowledge classification based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word.
The present invention also provides an electronic device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the syntax knowledge prediction method as described in any one of the above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of grammar knowledge prediction as described in any one of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements a method of grammar knowledge prediction as in any one of the above.
According to the grammar knowledge prediction method, the grammar knowledge prediction device, the electronic equipment and the storage medium, vectorization processing is carried out on each sentence in the natural language text to obtain the first feature vector corresponding to each word, dependency syntactic analysis and vectorization processing are carried out on each sentence to obtain the dependency feature vector corresponding to each word, the dependency feature vector is introduced to be fused with the first feature vector, so that the model can understand the relation between the sentence form structure and the word, grammar knowledge classification is carried out on the basis of the fused vector and the position information of the target word, and grammar knowledge corresponding to the target word in the complex sentence form natural language text can be predicted.
Drawings
In order to more clearly illustrate the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of a prior art BERT text classification flow;
FIG. 2 is a flow chart of a grammar knowledge prediction method according to an embodiment of the present invention;
FIG. 3 is an exemplary diagram of dependency syntax information provided by an embodiment of the present invention;
FIG. 4 is a flow chart illustrating the syntax knowledge prediction classification based on the fused vector according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a model structure of a multi-objective classification layer according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating a method for determining multi-objective classification layers according to an embodiment of the present invention;
FIG. 7 is a second schematic diagram of a model structure of a multi-objective classification layer according to an embodiment of the present invention;
FIG. 8 is a second flowchart illustrating a method for determining multi-objective classification layers according to an embodiment of the present invention;
FIG. 9 is a diagram illustrating a grammar knowledge classification model using mask loss to control loss computation according to an embodiment of the present invention;
FIG. 10 is a block diagram of an apparatus for performing a grammar knowledge prediction method according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of a syntax knowledge prediction apparatus according to the present invention;
fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," and the like in the description of the invention are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in other sequences than those illustrated or otherwise described herein, and that the terms "first" and "second" used herein generally do not limit the number of objects, e.g., the first object can be one or more. In addition, "and/or" in the specification means at least one of connected objects, and a character "/" generally means that a front and rear related object is in an "or" relationship.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. NLP techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.
Text classification is a basic task in NLP, is also an important module in text processing, and often appears as an upstream task of a specific service; text classification has a wide range of applications, such as grammar prediction, emotion analysis, intent recognition, spam classification, and the like.
A text classification model generally includes a plurality of encoder layers, which extract feature representations of the text and output them in the form of representation vectors. In operation, after a text is input into the classification model, it passes through the encoder layers in sequence, the output of each encoder layer serving as the input of the next; the output of the last encoder layer is taken as the feature information of the text, and the classification layer then performs text classification based on this feature information. The encoder layers closer to the input side are lower layers, and those closer to the classification layer are higher layers. A lower encoder layer has higher resolution and contains more detail information, and tends to reflect lexical features; a higher encoder layer carries stronger semantic information but has limited resolution, and tends to reflect grammatical features.
BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained language model. The BERT model aims to obtain semantic representations of text by training on large-scale corpora; the semantic representations are then fine-tuned on a specific NLP task and finally applied to that task.
The existing grammar prediction scheme generally uses a text classification scheme based on a pre-trained language model: the text to be predicted is input into the pre-trained model for feature extraction, and the extracted text features are used for classification. Generally, the pre-trained model is a BERT model, and fig. 1 is a schematic diagram of a prior-art BERT text classification flow. As shown in fig. 1, the input text is "The way referred to to solve the problem sounds reasonable", and the second "to" in the text is designated as the target word. A special character [CLS] is spliced at the beginning of the input text, and a special character [SEP] is spliced at the end. The English text with the special characters is first divided into words or subwords by a tokenizer; after encoding, each word, subword or special character is represented by a number, so that the English text input is converted into a string of numbers: 101 762 1029 3825 12752 92 92 5271 2314 823 15278 1019 6524 1021 102. This string of numbers is then input into the BERT model, which converts each number into a vector. Finally, the vector corresponding to [CLS] at the beginning of the sentence is used as the feature of the whole English text and passed through a classification neural network, which classifies it into a grammar class and outputs that the English grammar corresponding to the target word "to" is ""to do" infinitive used as an attributive, a non-predicate verb".
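The flow of fig. 1 can be illustrated with a minimal sketch, assuming the open-source HuggingFace transformers library and the bert-base-uncased checkpoint (illustrative assumptions; the token IDs it prints will differ from the illustrative numbers above):

```python
# Minimal sketch of the fig. 1 flow; assumes the HuggingFace
# "transformers" package and the bert-base-uncased checkpoint.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

text = "The way referred to to solve the problem sounds reasonable"
inputs = tokenizer(text, return_tensors="pt")   # adds [CLS] and [SEP]
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))

with torch.no_grad():
    outputs = model(**inputs)

# The vector at position 0 corresponds to [CLS] and serves as the
# sentence-level feature fed to the classification network.
cls_feature = outputs.last_hidden_state[:, 0, :]  # shape (1, 768)
```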
However, the conventional grammar prediction scheme performs grammar classification only on a single text feature and classifies complex sentence patterns poorly. In a complex sentence pattern, the word to be examined may lie in a subordinate clause that has its own subject-predicate-object structure, so the existing scheme easily mistakes the structure inside the clause for the main structure of the sentence. From the text surface alone, the existing scheme also has difficulty distinguishing long, difficult sentences from other sentences; for some grammar points, the relationship between words must be inferred to obtain a correct prediction result, and an existing model can hardly learn this inference ability from text data.
Predicting multiple tasks with one model is also an important development direction for current general-purpose language models. At present, multi-task training generally requires manual annotation to obtain data carrying labels for all tasks simultaneously, which is difficult, whereas obtaining single-label data for any one subtask is simple. In addition, existing multi-task training suffers from unbalanced weighting across tasks.
Therefore, the invention provides a grammar knowledge prediction method, a grammar knowledge prediction device, electronic equipment and a storage medium, wherein a first feature vector of each word in a text is obtained by vectorizing the acquired natural language text, dependency syntax analysis is carried out on the text to obtain the dependency feature vector of each word in the text, the obtained first feature vector and the dependency feature vector are fused, and the text is simultaneously subjected to multi-task classification based on the fused vector and the acquired position information of the target word to obtain grammar knowledge corresponding to the target word; by introducing dependency syntax analysis, the model can understand the relationship between the sentence pattern structure and the word, and accurate grammar knowledge prediction can be realized for the natural language text with the complex sentence pattern, so that the problem that the grammar of the complex sentence pattern natural language is difficult to predict is solved.
The syntax knowledge prediction method provided by the present invention can be applied to a syntax knowledge prediction device, and the syntax knowledge prediction device can be embodied by hardware or software, and can be configured on a terminal device side or a server side, which is not limited by the present invention.
It should be noted that the grammar knowledge classification in the embodiment of the present invention has the same meaning expression as the grammar knowledge prediction.
Fig. 2 is a flowchart illustrating a syntax knowledge prediction method according to an embodiment of the present invention. As shown in fig. 2, a syntax knowledge prediction method is provided, described here by taking its application to a terminal as an example, and includes the following steps: step 210, step 220, step 230 and step 240. These method flow steps are only one possible implementation of the present invention.
Step 210, obtaining a natural language text and position information of a target word in the natural language text;
it is to be explained that natural language generally refers to a language that naturally evolves with culture. Such as chinese, english, japanese, etc. However, sometimes all languages used by humans (including the above-mentioned languages that naturally evolve culturally, as well as man-made languages) are considered "natural" languages, as opposed to "man-made" languages that are provided for computers, such as programming languages.
Corresponding natural language text can be obtained according to the grammar knowledge classification task. For example, if classification/prediction of knowledge of english grammar is to be achieved, english text is acquired.
In NLP, the finest granularity is words, which constitute sentences, which constitute paragraphs, chapters, documents, etc. Thus, in an embodiment of the present invention, the natural language text comprises at least one sentence. One sentence includes at least one word. It should be noted that, in the embodiments of the present invention, words, phrases, words and phrases have the same meaning and may be substituted for each other.
It should be noted that the target word is a word to be investigated specified in the natural language text, and may include at least one type of word among verb, noun, adjective, preposition, pronoun, and the like.
For example, one or more verbs in a sentence are designated as target words; as another example, a verb and a noun in a sentence are designated as target words; as another example, a verb, a noun, and a preposition in a sentence are designated as target words. The target word may be other words that the user specifies and needs to examine, or may be a word with a grammar error, which is not listed here.
The position information of the target word refers to a position where the target word is located in one sentence or natural language text.
Alternatively, the position information of the target word may be labeled manually or by machine.
In some embodiments, obtaining the natural language text and the position information of the target word in the natural language text comprises:
acquiring texts in a question library or a student composition database;
and processing the texts in the question library or the student composition database to obtain the natural language texts and the position information of the target words in the natural language texts.

The text can be obtained from a question library or a student composition database, and the obtained text is then processed to obtain the natural language text.
Processing the acquired text, including the following modes:
the method I comprises the following steps: under the condition that the obtained text is a question of single item selection, complete shape filling and the like, abnormal conditions such as wrong answers, unanswered questions and the like may exist, so that the wrong text or incomplete text of the question surface is caused, and the prediction of the grammatical knowledge is influenced.
Therefore, in some embodiments, questions such as multiple-choice and cloze questions are obtained, the correct answer is filled into the question text to form a complete sentence, and the sentence is used as the natural language text; the position information of the word the question examines is labeled and used as the position information of the target word.
When the natural language text is used as a training sample, grammar knowledge related to the investigation words in the subject can be given by a teaching and research expert as a grammar knowledge tag of the natural language text.
Mode 2: when the obtained text is a written answer or a student composition, the sentence written by the student is input into an error detection and correction engine, the grammar errors in the sentence are corrected to obtain a grammatically correct sentence, the sentence is used as the natural language text, and the position information of the word at each grammar error is labeled and used as the position information of the target word.
When the natural language text is used for training samples, grammar knowledge related to the investigation words in the questions can be given by teaching and research experts and used as grammar knowledge labels of the natural language text.
It should be noted that, in the case that the acquired natural language text is composed of multiple sentences, the acquired natural language text needs to be first divided into sentences, and then each sentence needs to be processed and labeled.
Step 220, vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text;
wherein the first feature vector is a vector characterizing text features of the natural language text.
Specifically, each sentence in the natural language text is segmented into words and encoded before vectorization processing is performed on it.
It should be noted that word segmentation means that the input natural language text is divided into words or subwords by a tokenizer. For example, for "The way referred to to solve the problem sounds reasonable", the tokenizer output is: "The", "way", "referred", "to", "to", "solve", "the", "problem", "sounds", "reasonable". Segmenting the natural language text distinguishes each word in the text.
Encoding means that each word, subword or special character is converted into a number by the encoder, so that the input natural language text is converted into a string of numbers; for example, the token sequence above, together with the special characters [CLS] and [SEP], is encoded as "101 762 1029 3825 12752 92 92 5271 2314 823 15278 1019 6524 1021 102".
Vectorization refers to converting a string of numbers formed after encoding into a feature vector sequence through a vectorization layer, and based on the feature vector sequence, grammar knowledge prediction can be subsequently performed on the acquired natural language text.
The word segmenter, the encoder and the vectorization layer may be in a BERT model or may be in a network structure implemented separately, which is not limited in the present invention.
In some embodiments, step 220 comprises:
and sequentially carrying out vectorization processing on each sentence in the natural language text to obtain a word vector and a position coding vector corresponding to each word in the natural language text.
It is understood that the first feature vector includes a word vector and a position-coding vector. That is, the text features of the natural language text can be characterized by the word vector and the position encoding vector corresponding to each word in the natural language text.
The word vector is vectorized representation of a word or a sub-word, and the position coding vector is used for representing the position of the word in a sentence.
The position coding vector can be used for text classification, and in the text classification process, the position information of the target word is matched with the position coding vector of each word in the acquired natural language text, so that the feature vector of the target word is conveniently extracted.
In the embodiment of the invention, word segmentation, coding and vectorization processing are sequentially carried out on each sentence in the natural language text to obtain a word vector and a position coding vector corresponding to each word in the natural language text, and the word vector and the position coding vector reflect the part-of-speech characteristics of the natural language text and the position characteristics of each word forming the text, so that the method can be used for carrying out grammar knowledge classification on target words subsequently.
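As a minimal sketch of this step (assuming PyTorch; the vocabulary size, maximum length and hidden size are illustrative assumptions), the word vector and position coding vector of each token can be produced by two embedding tables and summed:

```python
import torch
import torch.nn as nn

vocab_size, max_len, hidden = 30522, 512, 768     # illustrative sizes

word_emb = nn.Embedding(vocab_size, hidden)       # word vector table
pos_emb = nn.Embedding(max_len, hidden)           # position coding table

token_ids = torch.tensor([[101, 762, 1029, 3825, 102]])  # one encoded sentence
positions = torch.arange(token_ids.size(1)).unsqueeze(0)

# First feature vector per word: word vector + position coding vector.
first_feature = word_emb(token_ids) + pos_emb(positions)  # (1, 5, 768)
```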
Step 230, performing dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text, and performing vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text;
syntax knowledge prediction of sentences with complex sentence patterns is difficult to achieve only by a single text feature, and therefore, the embodiment of the present invention also performs dependency syntax analysis on each sentence in the natural language text.
Syntactic analysis is one of the key technologies in natural language processing. It mainly analyzes the syntactic structure of a sentence and the dependency relationships among words, where analyzing the syntactic structure of a sentence means identifying its subject, predicate and object as well as its attributives, adverbials and complements, and the dependency relationships among words include: parallel, dependent, progressive, etc.
Common syntactic analysis tasks include:
1) Syntactic structure analysis, also called phrase structure analysis or constituency parsing, is used to identify the phrase structures in a sentence and the hierarchical syntactic relations between phrases.
2) Dependency parsing (dependency parsing), also called dependency parsing for short, parses a sentence into a dependency syntax tree to describe the dependency relationship between words, i.e. to indicate the syntactic collocation relationship between words, which is associated with semantics and serves to identify the interdependency relationship between words in the sentence.
3) Deep grammar syntactic analysis, i.e., using deep grammars such as Lexicalized Tree-Adjoining Grammar (LTAG), Lexical Functional Grammar (LFG) and Combinatory Categorial Grammar (CCG) to perform deep syntactic and semantic analysis on sentences.
In an embodiment of the present invention, dependency parsing is performed on each sentence in the natural language text, and interdependencies between words of each sentence are identified to perform grammar knowledge classification using the interdependencies.
And performing dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text.
The dependency syntax information is the representation information of the interdependence relationship between the vocabularies in the sentence;
and coding and vectorizing the dependency syntax information to obtain a dependency feature vector, wherein the dependency feature vector is a feature vector for expressing the interdependency relationship between words in the sentence.
In dependency syntax, the verb in the predicate is considered the center of a sentence, and "dependency" refers to the relationship between a word and the word that dominates it; this relationship is not symmetric and has a direction. Specifically, the dominating component is called the head (or governor), and the dominated component is called the dependent.
In some embodiments, dependency syntax information includes:
1) Position information of another word on which a word depends syntactically in the sentence, for example, the position information may be represented by a position index;
2) The relationship between a word and another word on which the word depends syntactically, such as "clausal modifier of noun" or "nominal subject", which can be expressed by a deprel label.
In some embodiments, dependency syntax information includes part-of-speech information for words in addition to 1) and 2) above. The part of speech is used as the basic syntactic attribute of a word and is the key characteristic of the word and a sentence.
Further, in step 230, performing vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text, including:
vectorizing the position information in the dependency syntax information to obtain a dependency position vector corresponding to each word in the natural language text;
and vectorizing the relation in the dependency syntax information to obtain a dependency relation vector corresponding to each word in the natural language text.
Before vectorization processing is performed on the dependency syntax information, encoding the dependency syntax information is further included.
Fig. 3 is an exemplary diagram of dependency syntax information provided by an embodiment of the present invention. Dependency syntax analysis is performed on the English sentence "The way referred to to solve the problem sounds reasonable", and the obtained dependency syntax information is shown in fig. 3: the word "reasonable" is the center (root) of the sentence; the word "solve", on which the word "to" depends syntactically, is identified by its position index; the dependency relationship between the word "referred" and the word "way" is "clausal modifier of noun", the dependency relationship between the word "solve" and the word "referred" is "adverbial clause modifier", and the dependency relationship between the word "way" and the word "referred" is "nominal subject". The part of speech corresponding to the word "The" is "DT", where "DT" denotes "determiner"; the part of speech corresponding to the word "way" is "NN", where "NN" denotes "noun, singular or mass"; the part of speech corresponding to the word "referred" is "VBD", where "VBD" denotes "verb, past tense". The parts of speech of the other words are not repeated herein.
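The per-word dependency syntax information of fig. 3 (head position index, dependency relation, part-of-speech tag) can be obtained with an off-the-shelf parser. The following sketch assumes the spaCy library and its en_core_web_sm English model, whose labels may differ from those shown in the figure:

```python
# Sketch of dependency syntax analysis; assumes spaCy and the
# en_core_web_sm model (python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The way referred to to solve the problem sounds reasonable")

for token in doc:
    # token.head.i: position index of the word this word depends on;
    # token.dep_: dependency relation; token.tag_: part-of-speech tag.
    print(token.i, token.text, token.head.i, token.dep_, token.tag_)
```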
By introducing dependency syntax analysis and coding and vectorizing the obtained dependency syntax information, the dependency position vector and the dependency relationship vector corresponding to each word in the natural language text are obtained, and the richness of the grammar feature vector information is improved, so that the model can learn the dependency relationship among the words in a sentence more easily, understand the relationship between the sentence structure and the words, and effectively improve the accuracy of grammar knowledge prediction.
It should be noted that step 230 may also be performed before step 220.
And 240, fusing the first feature vector and the dependency feature vector to obtain a fused vector, and performing grammar knowledge classification based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word.
On the basis of a single feature vector, grammar knowledge prediction is difficult to perform on the text of the complex sentence pattern, therefore, the embodiment of the invention fuses the dependency feature vector on the basis of the first feature vector, and the obtained fused vector information can more abundantly represent the feature information of the natural language text, thereby improving the accuracy of grammar knowledge prediction on the text of the complex sentence pattern.
The step of fusing the first feature vector and the dependency feature vector refers to adding the first feature vector and the dependency feature vector to obtain a fused vector, wherein the fused vector is a feature vector set corresponding to each word in each sentence in the natural language text.
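A minimal sketch of this fusion step follows, assuming PyTorch and assuming, for illustration, that the dependency position vector and dependency relationship vector are produced by two further embedding tables (all sizes illustrative):

```python
import torch
import torch.nn as nn

hidden, max_len, n_rels = 768, 512, 64        # illustrative sizes

dep_pos_emb = nn.Embedding(max_len, hidden)   # head position index -> vector
dep_rel_emb = nn.Embedding(n_rels, hidden)    # dependency relation id -> vector

first_feature = torch.randn(1, 5, hidden)     # from the text branch
head_index = torch.tensor([[1, 8, 1, 5, 2]])  # per-word head positions
rel_id = torch.tensor([[3, 7, 12, 4, 0]])     # per-word relation ids

# Fusion by element-wise addition of the vectors for each word.
fused = first_feature + dep_pos_emb(head_index) + dep_rel_emb(rel_id)
```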
Optionally, the fused vector includes a word vector, a position-coding vector, a dependency position vector, and a dependency relationship vector corresponding to each word in the natural language text.

Optionally, the grammar knowledge classification includes: at least one of a grammar classification, an error type classification, and a phrase classification.
It should be noted that in intelligent education services, grammar classification is an important application of text classification: the task requires inputting any sentence and designating a word in it, and the model must predict the grammar related to that word. Error type classification is another application of text classification in intelligent education services: in scenarios such as correcting student compositions, the task requires inputting a sentence containing an error together with the error position and predicting which error type the error belongs to. Phrase classification predicts which key phrase is contained in the input English sentence.
Accordingly, knowledge of the grammar includes: at least one of syntax information, error type information, and phrase information.
It should be noted that the grammar knowledge and the grammar knowledge classification are correspondingly consistent. That is, the grammar knowledge includes grammar information if the grammar knowledge classification includes the grammar classification, and the grammar knowledge includes the grammar information and the error type information if the grammar knowledge classification includes the grammar classification and the error type classification.
In the case of single task classification, the grammar knowledge classification may be a grammar classification, an error type classification, or a phrase classification; the corresponding grammar knowledge may be grammar information, error type information, or phrase information.
In the case of multitask classification, the grammar knowledge classification may be at least two of a grammar classification, an error type classification, and a phrase classification. For example, the grammar knowledge classification may be a grammar classification and an error type classification, and the corresponding grammar knowledge may be grammar information and error type information; the grammar knowledge classification can also be grammar classification and phrase classification, and the corresponding grammar knowledge can be grammar information and phrase information. The grammar knowledge classification can also be grammar classification, error type classification and phrase classification, and the corresponding grammar knowledge is grammar information, error type information and phrase information.
Because error types, grammars and phrases are correlated to some extent, a single model can perform multi-task learning to predict at least two of grammar information, error type information and phrase information simultaneously; the prediction effects promote each other and improve the accuracy of grammar knowledge prediction.
Grammar prediction, error type prediction and phrase prediction can be used in business scenarios such as analyzing students' learning situations and recommending related knowledge points. The syntactic knowledge prediction method provided by the embodiment of the invention obtains the first feature vector of each word in the text by performing word segmentation, encoding and vectorization on the acquired natural language text, obtains the dependency feature vector of each word by performing dependency syntactic analysis on the text, fuses the first feature vector and the dependency feature vector, and performs multi-task classification on the text based on the fused vector and the acquired position information of the target word to obtain the grammar knowledge corresponding to the target word. By introducing dependency syntactic analysis, the model can understand the relationship between the sentence pattern structure and the words, thereby realizing accurate grammar knowledge prediction for natural language text with complex sentence patterns.
It should be noted that each embodiment of the present invention can be freely combined, exchanged in sequence, or executed independently, and does not need to rely on or rely on a fixed execution sequence.
Fig. 4 is a schematic flow diagram of performing grammar knowledge classification based on the fused vector according to an embodiment of the present invention, that is, performing grammar knowledge classification based on the fused vector and the position information of the target word to obtain the grammar knowledge corresponding to the target word. The process includes:
step 410, extracting features of the fused vector to obtain a second feature vector;
the fused vector is a set of feature vectors corresponding to each word in the natural language text, and comprises a word vector, a position encoding vector, a dependency position vector and a dependency relationship vector.
Optionally, the fused vector is input into a BERT model for feature extraction, so as to obtain a second feature vector.
The second feature vector is a vector representing text features and dependency features of the natural language text, and contains richer feature information, not only text feature information, but also corresponding dependency feature information, compared with the text features corresponding to the first feature vector.
It is understood that the second feature vector can reflect a relationship between part-of-speech information corresponding to each word in the natural language text, position-coding information, position information of another word that is syntactically dependent in the sentence, and another word that is syntactically dependent.
And extracting the features of the fused vector to obtain a second feature vector, so that the syntactic knowledge of the natural language text with complex sentence patterns can be predicted conveniently.
Step 420, obtaining a feature vector corresponding to the target word based on the second feature vector and the position information of the target word;
based on the position information of the target word, a feature vector corresponding to the target word may be extracted from the second feature vector.
And 430, carrying out grammar knowledge classification on the feature vectors corresponding to the target words to obtain grammar knowledge corresponding to the target words.
In the embodiment of the invention, the feature vector corresponding to the target word is obtained by extracting the features of the fused feature vector and based on the position information corresponding to the target word, and the grammar knowledge corresponding to the target word is obtained by carrying out grammar knowledge classification on the feature vector corresponding to the target word, so that the grammar knowledge prediction is realized.
In some embodiments, step 430 comprises:
and inputting the feature vectors corresponding to the target words into the multi-target classification layer for grammar knowledge classification to obtain grammar knowledge corresponding to the target words.
Wherein the multi-target classification layer includes: at least one of a grammar classification layer, an error type classification layer, and a phrase classification layer. The multi-target classification layer may be any one or two of these layers, or the combination of all three; the combinations are not enumerated one by one here. The grammar knowledge classification corresponds one-to-one with the grammar knowledge obtained, so either single-task or multi-task prediction can be performed, giving the method a wide application range.
Fig. 5 is a schematic diagram of a model structure of a multi-objective classification layer according to an embodiment of the present invention, in which the multi-objective classification layer is obtained by training using feature vectors corresponding to target training words in a natural language training text as training data and using grammar knowledge corresponding to the target training words in the natural language training text as training labels.
Fig. 6 is a flowchart illustrating a method for determining a multi-target classification layer according to an embodiment of the present invention, and as shown in fig. 6, the process for determining the multi-target classification layer includes the following steps: step 610, step 620, step 630, step 640 and step 650.
Step 610, acquiring a natural language training text and position information of a target training word in the natural language training text;
the natural language training text and the manner of obtaining the position information of the target training word in the natural language training text may refer to the description in step 210, which is not repeated herein.
Step 620, extracting features of the natural language training text to obtain feature vectors corresponding to the natural language training text;
the feature extraction of the natural language training text may refer to the descriptions in the foregoing steps 220 to 240.
Optionally, the performing feature extraction on the natural language training text to obtain a feature vector corresponding to the natural language training text includes:
performing word segmentation, coding and vectorization processing on each sentence in the natural language training text to obtain a word vector and a position coding vector corresponding to each word in the natural language training text;
performing dependency syntax analysis on each sentence in the natural language training text to obtain dependency syntax information corresponding to each word in the natural language training text, wherein the dependency syntax information comprises: position information of another word on which the word depends in grammar in the sentence, a relation between the word and the other word on which the word depends in grammar, and part-of-speech information of the word;
encoding and vectorizing the position information in the dependency syntax information to obtain a dependency position vector corresponding to each word in the natural language training text, and encoding and vectorizing the relation in the dependency syntax information to obtain a dependency relation vector corresponding to each word in the natural language training text;
fusing a word vector and a position coding vector corresponding to each word in the natural language training text and a dependency position vector and a dependency relationship vector corresponding to each word in the natural language training text to obtain fused vectors;
and extracting the features of the fused vector to obtain a feature vector corresponding to the natural language training text.
Step 630, obtaining a feature vector corresponding to the target training word based on the feature vector corresponding to the natural language training text and the position information of the target training word;
and extracting the characteristic vector corresponding to the target training word from the characteristic vector corresponding to the natural language training text based on the position information of the target training word.
Step 640, determining grammar knowledge corresponding to the target training words;
the grammar knowledge corresponding to the target training word given by the teaching and research expert can be obtained, or the grammar knowledge corresponding to the target training word can be obtained from a preset grammar knowledge base.
The grammatical knowledge of the target training word may include at least one of grammatical information, error type information, and phrase information.
It should be explained that the grammar knowledge corresponding to the determined target training word is used as a training label, i.e. real data, which is convenient for model training.
Step 650, training an initial multi-target classification layer by taking the feature vectors corresponding to the target training words as training data and grammar knowledge corresponding to the target training words as training labels; and obtaining the multi-target classification layer after the training of the initial multi-target classification layer is finished.
Optionally, the initial multi-target classification layer includes: at least one of a syntax classification layer, an error type classification layer, and a phrase classification layer.
In some embodiments, the initial multi-objective classification layer comprises: a syntax classification layer, an error type classification layer, and a phrase classification layer.
The feature vector corresponding to the target training word is used as training data, and the grammatical information, error type information and phrase information corresponding to the target training word are used as training labels; the label lists of the multiple subtasks are merged into one label list, and the classification layers corresponding to the multiple subtasks are merged into one multi-target classification layer whose dimension equals the sum of the dimensions of the subtask classification layers. Multiple classification tasks are thereby converted into a single multi-label classification task, which facilitates alternate training on multi-label text data, simplifies the training process, and improves model training efficiency.
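A minimal sketch of such a merged multi-target classification layer (PyTorch; the label-list sizes are illustrative assumptions), whose output dimension is the sum of the subtask dimensions:

```python
import torch
import torch.nn as nn

hidden = 768
n_grammar, n_error, n_phrase = 120, 30, 50    # illustrative label counts

# One linear layer covering all three label lists turns the three
# classification tasks into a single multi-label classification task.
multi_target = nn.Linear(hidden, n_grammar + n_error + n_phrase)

target_vec = torch.randn(4, hidden)           # features of 4 target words
logits = multi_target(target_vec)             # (4, 200)
grammar_logits = logits[:, :n_grammar]
error_logits = logits[:, n_grammar:n_grammar + n_error]
phrase_logits = logits[:, n_grammar + n_error:]
```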
In some embodiments, training the initial multi-objective classification layer in step 650 includes:
inputting the feature vector corresponding to the target training word into at least one of a grammar classification layer, an error type classification layer and a phrase classification layer to obtain a grammar knowledge prediction result;
calculating a first loss function value by using a multi-label cross entropy loss function based on a grammar knowledge prediction result and grammar knowledge corresponding to a target training word;
and updating the parameters of the initial multi-objective classification layer based on the first loss function value.
And the grammar knowledge prediction result comprises at least one of a grammar prediction result, an error type prediction result and a phrase prediction result.
In the embodiment of the invention, a grammar knowledge prediction result is used as prediction data, grammar knowledge corresponding to a target training word is used as real data, a multi-label cross entropy loss function is used for calculating a loss function value, parameters of an initial multi-target classification layer are updated based on the loss function value, and after training is finished, the trained initial multi-target classification layer is obtained, namely the multi-target classification layer which can be used for carrying out grammar knowledge classification is obtained.
When the initial multi-target classification layer comprises at least two classification layers, the multi-label cross-entropy loss function enables alternate training on data from different sources and dynamically balances the loss of the classification layer corresponding to each subtask.
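One plausible reading of this mask-controlled loss (see also fig. 9) is sketched below, assuming PyTorch, treating the multi-label cross entropy as sigmoid binary cross entropy, and reusing the illustrative label-list sizes above; a sample from a single-label data source contributes loss only on its own subtask's slice of the label list:

```python
import torch
import torch.nn.functional as F

def masked_multilabel_loss(logits, labels, task_mask):
    """Multi-label cross entropy with a mask that zeroes the label
    positions a sample's data source does not annotate, enabling
    alternate training on single-label datasets."""
    loss = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    loss = loss * task_mask                         # drop unsupervised slices
    return loss.sum() / task_mask.sum().clamp(min=1)

# Example: a batch that only carries grammar labels (first 120 positions).
logits = torch.randn(2, 200)
labels = torch.zeros(2, 200)
labels[:, 5] = 1.0                                  # gold grammar class
task_mask = torch.zeros(2, 200)
task_mask[:, :120] = 1.0                            # only the grammar slice
print(masked_multilabel_loss(logits, labels, task_mask))
```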
Fig. 7 is a second schematic diagram of a model structure of a multi-target classification layer according to an embodiment of the present invention, in which the multi-target classification layer is obtained by training using feature vectors corresponding to target training words in a natural language training text as training data and using grammar knowledge and part-of-speech information corresponding to the target training words in the natural language training text as training labels, where the part-of-speech information is obtained by performing dependency parsing on the natural language training text.
Fig. 8 is a second flowchart of the method for determining a multi-target classification layer according to the embodiment of the present invention. As shown in fig. 8, the process of determining the multi-target classification layer includes the following steps: step 810, step 820, step 830, step 840, step 850 and step 860.
Step 810, acquiring a natural language training text and position information of a target training word in the natural language training text;
step 820, extracting the features of the natural language training text to obtain a feature vector corresponding to the natural language training text;
step 830, obtaining a feature vector corresponding to the target training word based on the feature vector corresponding to the natural language training text and the position information of the target training word;
step 840, determining grammar knowledge and part-of-speech information corresponding to the target training word; the part-of-speech information corresponding to the target training word is obtained according to dependency syntax analysis;
step 850, training an initial multi-target classification layer by taking the feature vector corresponding to the target training word as training data and grammar knowledge and part of speech information corresponding to the target training word as training labels; the initial multi-objective classification layer comprises: at least one of a grammar classification layer, an error type classification layer and a phrase classification layer, and a part-of-speech classification layer;
Step 860, after the training of the initial multi-target classification layer is completed, at least one of the grammar classification layer, the error type classification layer and the phrase classification layer in the trained initial multi-target classification layer is retained to obtain the multi-target classification layer.
Steps 810 to 850 may refer to steps 610 to 650 and are not described here again. The difference is that the present embodiment introduces part-of-speech prediction as a subtask on the basis of grammar knowledge prediction: in step 840, the grammar knowledge and part-of-speech information corresponding to the target training word are determined; in step 850, the grammar knowledge and part-of-speech information corresponding to the target training word are used as training labels, where the part-of-speech information is obtained in the process of performing dependency parsing on the natural language training text; and in step 860, after the training of the initial multi-target classification layer is completed, at least one of the grammar classification layer, the error type classification layer and the phrase classification layer in the trained initial multi-target classification layer is retained to obtain the multi-target classification layer, i.e., the part-of-speech classification layer is not needed in the application stage of the multi-target classification layer.
Because part-of-speech information is correlated with grammar knowledge, this embodiment additionally introduces part-of-speech prediction as a subtask when training the multi-target classification layer, which can effectively improve the grammar knowledge prediction effect and thus alleviate the difficulty of predicting the grammar of complex sentence patterns.
In the embodiment of the invention, the feature vector corresponding to the target training word in the natural language training text is used as training data, meanwhile, grammar knowledge and part of speech information corresponding to the target training word are used as training labels for training, part of speech prediction is introduced as a subtask to train the multi-target classification layer on the basis of grammar knowledge prediction, so that the parameters of the multi-target classification layer are optimized conveniently, and the accuracy of the multi-target classification layer on grammar knowledge prediction is improved.
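A minimal sketch of this training-time arrangement (PyTorch assumed; the part-of-speech label count and hidden size are illustrative): the multi-target head is the part kept for deployment, while the part-of-speech head exists only to provide the auxiliary training signal:

```python
import torch.nn as nn

class TrainingHeads(nn.Module):
    """Multi-target head plus an auxiliary part-of-speech head."""
    def __init__(self, hidden=768, num_multi=430, num_pos=45):  # hypothetical sizes
        super().__init__()
        self.multi_target = nn.Linear(hidden, num_multi)  # retained after training
        self.pos = nn.Linear(hidden, num_pos)             # discarded after training

    def forward(self, feat):
        # Both heads share the same target-word feature vector during training.
        return self.multi_target(feat), self.pos(feat)
```

After training, only `multi_target` is kept, so the application stage matches step 860 and needs no part-of-speech classification layer.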
In some embodiments, training the initial multi-objective classification layer in step 850 includes:
inputting the feature vector corresponding to the target training word into at least one of a grammar classification layer, an error type classification layer and a phrase classification layer to obtain a grammar knowledge prediction result;
inputting the feature vector corresponding to the target training word into a part-of-speech classification layer to obtain a part-of-speech prediction result;
calculating a first loss function value by using a multi-label cross entropy loss function based on a grammar knowledge prediction result and grammar knowledge corresponding to a target training word;
calculating a second loss function value based on the part of speech prediction result and part of speech information corresponding to the target training word;
and updating the parameters of the initial multi-objective classification layer based on the first loss function value and the second loss function value.
The grammar knowledge prediction result includes at least one of a grammar prediction result, an error type prediction result and a phrase prediction result.
In the embodiment of the invention, the grammar knowledge prediction result and the part-of-speech prediction result are used as prediction data, and the grammar knowledge and part-of-speech information corresponding to the target training word are used as real data. A multi-label cross entropy loss function is used to calculate the loss between the grammar knowledge prediction result and the grammar knowledge corresponding to the target training word, yielding the first loss function value; a single-label or multi-label cross entropy loss function is used to calculate the loss between the part-of-speech prediction result and the part-of-speech information corresponding to the target training word, yielding the second loss function value. The parameters of the initial multi-target classification layer are updated based on the first and second loss function values. After training is finished, the trained initial multi-target classification layer is obtained; at least one of the grammar classification layer, the error type classification layer and the phrase classification layer is retained (i.e., the part-of-speech classification layer is removed), yielding the multi-target classification layer that can perform grammar knowledge classification.
In the embodiment of the invention, the parameters of the initial multi-target classification layer are updated based on the first loss function value and the second loss function value, so that the multi-target classification layer is optimized conveniently, and the prediction effect of the multi-target classification layer on grammar knowledge is further improved.
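One parameter update combining the two loss values might look like the fragment below; the unweighted sum, the single-label cross entropy for part of speech, and the helper `multilabel_cross_entropy` (a sketch of which follows the loss formula below) are assumptions rather than the filing's code:

```python
import torch.nn.functional as F

def training_step(model, optimizer, feat, multi_labels, pos_labels):
    """One update of the initial multi-target classification layer."""
    multi_logits, pos_logits = model(feat)
    # First loss value: multi-label cross entropy over the merged label space.
    loss1 = multilabel_cross_entropy(multi_logits, multi_labels).mean()
    # Second loss value: ordinary cross entropy for the part-of-speech subtask.
    loss2 = F.cross_entropy(pos_logits, pos_labels)
    optimizer.zero_grad()
    (loss1 + loss2).backward()  # update on both loss values jointly
    optimizer.step()
    return loss1.item(), loss2.item()
```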
In some embodiments, the multi-label cross entropy loss function is:
Loss = log(1 + Σ_{i∈neg} e^{s_i}) + log(1 + Σ_{j∈pos} e^{-s_j})

where s_i represents the score corresponding to the i-th category, s_j represents the score corresponding to the j-th category, i indexes the i-th category, j indexes the j-th category, neg represents the non-target category set, and pos represents the target category set.
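Assuming the formula above is the standard multi-label cross entropy over raw category scores, a PyTorch sketch is:

```python
import torch

def multilabel_cross_entropy(scores, labels):
    """Multi-label cross entropy per the formula above (sketch).

    scores: (batch, num_labels) raw scores s
    labels: (batch, num_labels) multi-hot 0/1 targets
    Returns a per-sample loss of shape (batch,).
    """
    labels = labels.float()
    # Negate the target-class scores so both sums become log-sum-exp terms.
    scores = (1 - 2 * labels) * scores          # s_i for neg, -s_j for pos
    neg = scores - labels * 1e12                # mask target classes out of neg sum
    pos = scores - (1 - labels) * 1e12          # mask non-targets out of pos sum
    zero = torch.zeros_like(scores[..., :1])    # the "1 +" term, as e^0
    loss_neg = torch.logsumexp(torch.cat([neg, zero], dim=-1), dim=-1)
    loss_pos = torch.logsumexp(torch.cat([pos, zero], dim=-1), dim=-1)
    return loss_neg + loss_pos
```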
Because the input data come from different data sources of different subtasks, the labels carried by the data may be inconsistent: for example, the labels of some natural language training texts include only grammar information, the labels of others include only error type information, and the labels of still others include grammar information, error type information and phrase information. Therefore, during training, a loss mask is applied to the loss function to control the range of the loss calculation.
Meanwhile, because the numbers of labels of the different subtasks differ greatly (for example, there are hundreds of grammar knowledge points and phrase categories but only dozens of error types), calculating the loss on data from different sources can lead to imbalanced losses.
To solve this problem, the loss mask does not simply take 0 and 1 as its values; instead, the number of labels of each task involved in a sample is taken into account when setting the mask values, and the loss that finally undergoes gradient descent is the product of the original loss and the loss mask, which keeps the loss values stable.
Wherein a loss mask is used on the multi-label cross-entropy loss function to control a range of loss calculations, the loss mask being determined according to the following formula:
mask'_t = mask_t · (1/N_t) / Σ_k (1/N_k)

where t represents the t-th task (a grammar classification task, an error type classification task or a phrase classification task), mask_t represents the 0-1 mask of the t-th task, N_t represents the number of labels of the t-th task (the summation over k running over all tasks), and mask'_t is the dynamic mask with updated weight values.
And finally, performing gradient descent on the dynamic loss function value, wherein the dynamic loss function value is the product of the original loss function value and the loss mask, and the calculation formula of the dynamic loss function value is as follows:
Loss' = Loss × mask'_t

where Loss is the original loss function value and mask'_t is the dynamic mask with updated weight values.
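Because the mask formula is reproduced only as an image in the source, the sketch below reconstructs it from the worked values in Table 1 (1/280 and 1/30 normalize to 0.0967 and 0.9032), so the inverse-label-count weighting is an inference, not a quotation:

```python
def dynamic_loss_masks(task_label_counts, sample_has_task):
    """Reconstructed dynamic loss mask: inverse-label-count weighting assumed.

    task_label_counts: {task: number of labels}, e.g. {"grammar": 280, ...}
    sample_has_task:   {task: 0 or 1}, the original 0-1 mask for this sample
    """
    inverse = {t: 1.0 / n for t, n in task_label_counts.items()}
    total = sum(inverse.values())
    return {t: sample_has_task[t] * inverse[t] / total for t in task_label_counts}

masks = dynamic_loss_masks({"grammar": 280, "error_type": 30},
                           {"grammar": 1, "error_type": 1})
# masks ≈ {"grammar": 0.0967, "error_type": 0.9032}; the loss that actually
# undergoes gradient descent per task is Loss' = Loss * mask'_t, as above.
```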
FIG. 9 is a diagram illustrating a grammar knowledge classification model using a loss mask to control the loss computation according to an embodiment of the present invention. As shown in fig. 9, the grammar knowledge classification model includes a word segmentation and coding module, a dependency parsing module, a BERT model, a multi-target classification layer and a part-of-speech classification layer, and is configured to perform steps 810 to 860. A loss mask is applied to the multi-label cross entropy loss function to control the range of the loss computation, the dynamic mask values of the different classification tasks in the multi-target classification layer are obtained, and gradient descent is finally performed on the dynamic loss function value Loss1, which is the product of the original loss function value and the loss mask, and on Loss2, which is the loss function value of the part-of-speech classification layer.
Table 1 shows an example of the dynamic mask loss provided by the embodiment of the present invention. As shown in table 1, the input is "[CLS] The way transferred to The solution of The solution [SEP]", in which "to" and "solution" are designated as target words. After processing by the grammar knowledge classification model, the output grammar knowledge is "grammar label: infinitive used as attributive", "error type label: tense error" and "phrase label: none". The original mask value of the grammar classification task is [0,1,…,0], the original mask value of the error type classification task is [0,1,0,…,0], and the original mask value of the phrase classification task is [0,1,…,0]. A loss mask is applied to the multi-label cross entropy loss function to control the range of the loss calculation; the resulting dynamic mask values of the grammar classification task are [0.0967,…,0.0967], those of the error type classification task are [0.9032,…,0.9032], and those of the phrase classification task remain [0,1,…,0].
Table 1 Dynamic mask loss example
In the embodiment of the invention, the loss mask does not simply take 0 and 1 as its values; instead, the number of labels of each task involved in a sample is taken into account when setting the mask values, so that the loss functions of the subtasks are automatically balanced and the problem of imbalance among data from different sources when calculating the loss function value is solved.
FIG. 10 is a block diagram of an apparatus for performing the grammar knowledge prediction method according to an embodiment of the present invention. In this embodiment, natural language training texts are obtained from a database of exam questions and student compositions and processed by a data generation module to obtain the training texts. Dependency parsing is performed on the training texts by a syntax analysis module to obtain the dependency syntax information of each sentence; the training texts are then input as training data into the model for training, with the dependency syntax information obtained by the syntax analysis module introduced during the training of the application model. In use, a sentence composed by a student or an English exam question is input, and the trained application model performs inference on the input text to obtain the related knowledge points and error types of the text and the phrases contained in the text.
In the embodiment of the invention, the three tasks of grammar classification, error type classification and phrase classification are performed simultaneously by a single model, so that student compositions can be corrected, error types determined, and the relevant grammar knowledge points and phrases in student compositions and exam questions predicted, facilitating grammar learning and learning situation analysis.
The following describes the syntax knowledge prediction apparatus provided in the embodiment of the present invention, and the syntax knowledge prediction apparatus described below and the syntax knowledge prediction method described above can be referred to correspondingly.
Fig. 11 is a schematic structural diagram of a syntax knowledge prediction apparatus according to the present invention, and as shown in fig. 11, the syntax knowledge prediction apparatus 1100 includes:
an acquisition unit 1110, configured to acquire a natural language text and position information of a target word in the natural language text;
the text feature extraction unit 1120 is configured to perform vectorization processing on each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text;
a dependency syntax analysis unit 1130, configured to perform dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text, and perform vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text;
and a classifying unit 1140, configured to fuse the first feature vector and the dependency feature vector to obtain a fused vector, and perform grammar knowledge classification based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word.
Optionally, performing vectorization processing on each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text, including:
and sequentially carrying out vectorization processing on each sentence in the natural language text to obtain a word vector and a position coding vector corresponding to each word in the natural language text.
Optionally, the dependency syntax information includes: position information of another word on which the word depends syntactically in the sentence, a relationship between the word and another word on which the word depends syntactically;
vectorizing the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text, including:
vectorizing the position information in the dependency syntax information to obtain a dependency position vector corresponding to each word in the natural language text;
and vectorizing the relation in the dependency syntax information to obtain a dependency relation vector corresponding to each word in the natural language text.
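As a unit-level sketch (PyTorch assumed; the embedding sizes, relation inventory and additive fusion are illustrative choices, since the apparatus description does not fix them), the two dependency vectors can be produced by embedding lookups and fused with the first feature vectors:

```python
import torch
import torch.nn as nn

NUM_RELATIONS, MAX_LEN, DIM = 50, 512, 768     # hypothetical sizes

dep_pos_emb = nn.Embedding(MAX_LEN, DIM)       # position of the depended-on word
dep_rel_emb = nn.Embedding(NUM_RELATIONS, DIM) # relation to the depended-on word

def fuse(first_vecs, head_positions, relation_ids):
    """first_vecs: (seq_len, DIM); head_positions, relation_ids: (seq_len,)."""
    dep_position_vec = dep_pos_emb(head_positions)
    dep_relation_vec = dep_rel_emb(relation_ids)
    # Additive fusion of the first feature vector with both dependency vectors.
    return first_vecs + dep_position_vec + dep_relation_vec

fused = fuse(torch.randn(10, DIM),
             torch.randint(0, 10, (10,)),
             torch.randint(0, NUM_RELATIONS, (10,)))
```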
Optionally, the classifying grammar knowledge based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word includes:
extracting the features of the fused vector to obtain a second feature vector;
obtaining a feature vector corresponding to the target word based on the second feature vector and the position information of the target word;
and carrying out grammar knowledge classification on the feature vectors corresponding to the target words to obtain grammar knowledge corresponding to the target words.
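A sketch of the classification unit's remaining steps under the same assumptions; `encoder` stands for the feature-extraction component that yields the second feature vector, and `multi_target_head` for the multi-target classification layer, both hypothetical names:

```python
def classify_target(encoder, multi_target_head, fused, target_position):
    """fused: (seq_len, dim); target_position: index of the target word."""
    # Feature extraction over the fused vectors gives the second feature vector.
    second_feature = encoder(fused.unsqueeze(0)).squeeze(0)  # (seq_len, dim)
    # Select the target word's feature vector by its position information.
    target_vec = second_feature[target_position]
    # Grammar knowledge classification on the target word's feature vector.
    return multi_target_head(target_vec)
```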
Optionally, performing grammar knowledge classification on the feature vectors corresponding to the target word to obtain grammar knowledge corresponding to the target word, including:
inputting the feature vectors corresponding to the target words into a multi-target classification layer to classify grammar knowledge to obtain grammar knowledge corresponding to the target words;
the multi-target classification layer is obtained by training by taking a feature vector corresponding to a target training word in a natural language training text as training data and taking grammatical knowledge corresponding to the target training word in the natural language training text as a training label;
or the multi-target classification layer is obtained by training by taking the feature vector corresponding to the target training word in the natural language training text as training data and taking grammatical knowledge and part-of-speech information corresponding to the target training word in the natural language training text as training labels, wherein the part-of-speech information is obtained by performing dependency syntactic analysis on the natural language training text.
Optionally, the determining of the multi-target classification layer includes:
acquiring a natural language training text and position information of a target training word in the natural language training text;
extracting features of the natural language training text to obtain feature vectors corresponding to the natural language training text;
obtaining a feature vector corresponding to a target training word based on the feature vector corresponding to the natural language training text and the position information of the target training word;
determining grammar knowledge corresponding to the target training word;
training an initial multi-target classification layer by taking the feature vectors corresponding to the target training words as training data and grammar knowledge corresponding to the target training words as training labels, wherein the initial multi-target classification layer comprises: at least one of a syntax classification layer, an error type classification layer, and a phrase classification layer;
and obtaining the multi-target classification layer after the training of the initial multi-target classification layer is completed.
Optionally, the determining of the multi-target classification layer includes:
acquiring a natural language training text and position information of a target training word in the natural language training text;
extracting the features of the natural language training text to obtain a feature vector corresponding to the natural language training text;
obtaining a feature vector corresponding to a target training word based on the feature vector corresponding to the natural language training text and the position information of the target training word;
determining grammar knowledge and part-of-speech information corresponding to the target training word, wherein the part-of-speech information corresponding to the target training word is obtained according to dependency syntax analysis;
training an initial multi-target classification layer by taking the feature vectors corresponding to the target training words as training data and taking grammatical knowledge and part-of-speech information corresponding to the target training words as training labels, wherein the initial multi-target classification layer comprises: at least one of a grammar classification layer, an error type classification layer and a phrase classification layer, and a part-of-speech classification layer;
after the training of the initial multi-target classification layer is completed, at least one of a grammar classification layer, an error type classification layer and a phrase classification layer in the initial multi-target classification layer after the training is completed is reserved, and the multi-target classification layer is obtained.
Optionally, training an initial multi-objective classification layer comprises:
inputting the feature vector corresponding to the target training word into at least one of a grammar classification layer, an error type classification layer and a phrase classification layer to obtain a grammar knowledge prediction result, wherein the grammar knowledge prediction result comprises at least one of a grammar prediction result, an error type prediction result and a phrase prediction result;
calculating a first loss function value by using a multi-label cross entropy loss function based on a grammar knowledge prediction result and grammar knowledge corresponding to a target training word;
and updating the parameters of the initial multi-objective classification layer based on the first loss function value.
It should be noted that the syntax knowledge prediction apparatus provided in the embodiment of the present invention can implement all the method steps implemented in the syntax knowledge prediction method embodiment, and can achieve the same technical effects, and detailed descriptions of the same parts and beneficial effects as those in the method embodiment in this embodiment are not repeated herein.
Fig. 12 illustrates a physical structure diagram of an electronic device, which may include, as shown in fig. 12: a processor (processor) 1210, a communication Interface (Communications Interface) 1220, a memory (memory) 1230, and a communication bus 1240, wherein the processor 1210, the communication Interface 1220, and the memory 1230 communicate with each other via the communication bus 1240. Processor 1210 may invoke logic instructions in memory 1230 to perform a grammar knowledge prediction method comprising: acquiring a natural language text and position information of a target word in the natural language text; vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text; performing dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text, and performing vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text; and fusing the first characteristic vector and the dependency characteristic vector to obtain a fused vector, and classifying grammar knowledge based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word.
In addition, the logic instructions in the memory 1230 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on such understanding, the part of the technical solution of the present invention that substantially contributes to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program codes, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The electronic device provided by the embodiment of the present invention can implement all the method steps implemented by the above syntax knowledge prediction method embodiment, and can achieve the same technical effects, and detailed descriptions of the same parts and beneficial effects as the method embodiment in this embodiment are not repeated here.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer-readable storage medium, the computer program, when executed by a processor, being capable of executing the method for predicting grammatical knowledge provided by the above-mentioned method embodiments, the method comprising: acquiring a natural language text and position information of a target word in the natural language text; vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text; performing dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text, and performing vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text; and fusing the first characteristic vector and the dependency characteristic vector to obtain a fused vector, and carrying out grammar knowledge classification based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word.
The computer program product provided by the embodiment of the present invention can implement all the method steps implemented by the above syntax knowledge prediction method embodiment, and can achieve the same technical effects, and detailed descriptions of the same parts and beneficial effects as those of the method embodiment in this embodiment are not repeated herein.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor is implemented to perform the method for predicting grammar knowledge provided by the above method embodiments, the method comprising: acquiring a natural language text and position information of a target word in the natural language text; vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text; performing dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text, and performing vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text; and fusing the first characteristic vector and the dependency characteristic vector to obtain a fused vector, and classifying grammar knowledge based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (11)

1. A method for predicting grammar knowledge, comprising:
acquiring a natural language text and position information of a target word in the natural language text;
vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text;
performing dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text, and performing vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text;
and fusing the first feature vector and the dependency feature vector to obtain a fused vector, and carrying out grammar knowledge classification based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word.
2. The method of claim 1, wherein the vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text comprises:
and sequentially carrying out vectorization processing on each sentence in the natural language text to obtain a word vector and a position coding vector corresponding to each word in the natural language text.
3. The syntax knowledge prediction method of claim 1, wherein the dependency syntax information comprises: position information of another word in the sentence on which the word is grammatically dependent, a relationship between the word and another word on which the word is grammatically dependent;
the vectorizing the dependency syntax information to obtain the dependency feature vector corresponding to each word in the natural language text includes:
vectorizing the position information in the dependency syntax information to obtain a dependency position vector corresponding to each word in the natural language text;
vectorizing the relationship in the dependency syntax information to obtain a dependency relationship vector corresponding to each word in the natural language text.
4. The method according to any one of claims 1 to 3, wherein the performing grammar knowledge classification based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word comprises:
extracting the features of the fused vector to obtain a second feature vector;
obtaining a feature vector corresponding to the target word based on the second feature vector and the position information of the target word;
and carrying out grammar knowledge classification on the feature vectors corresponding to the target words to obtain grammar knowledge corresponding to the target words.
5. The method of claim 4, wherein the classifying the feature vectors corresponding to the target words into grammar knowledge to obtain the grammar knowledge corresponding to the target words comprises:
inputting the feature vectors corresponding to the target words into a multi-target classification layer for grammar knowledge classification to obtain grammar knowledge corresponding to the target words;
the multi-target classification layer is obtained by training by taking a feature vector corresponding to a target training word in a natural language training text as training data and taking grammatical knowledge corresponding to the target training word in the natural language training text as a training label;
or the multi-target classification layer is obtained by training by taking the feature vector corresponding to the target training word in the natural language training text as training data and taking grammatical knowledge and part-of-speech information corresponding to the target training word in the natural language training text as training labels, wherein the part-of-speech information is obtained by performing dependency syntax analysis on the natural language training text.
6. The method of predicting grammar knowledge according to claim 5, wherein the determining process of the multi-objective classification layer includes:
acquiring a natural language training text and position information of a target training word in the natural language training text;
extracting the features of the natural language training text to obtain a feature vector corresponding to the natural language training text;
obtaining a feature vector corresponding to the target training word based on the feature vector corresponding to the natural language training text and the position information of the target training word;
determining grammar knowledge corresponding to the target training words;
training an initial multi-target classification layer by taking the feature vector corresponding to the target training word as training data and grammar knowledge corresponding to the target training word as a training label, wherein the initial multi-target classification layer comprises: at least one of a syntax classification layer, an error type classification layer, and a phrase classification layer;
and obtaining the multi-target classification layer after the training of the initial multi-target classification layer is finished.
7. The method of predicting grammar knowledge according to claim 5, wherein the determining process of the multi-objective classification layer includes:
acquiring a natural language training text and position information of a target training word in the natural language training text;
extracting the features of the natural language training text to obtain a feature vector corresponding to the natural language training text;
obtaining a feature vector corresponding to the target training word based on the feature vector corresponding to the natural language training text and the position information of the target training word;
determining grammar knowledge and part-of-speech information corresponding to the target training word, wherein the part-of-speech information corresponding to the target training word is obtained according to dependency syntax analysis;
training an initial multi-target classification layer by taking the feature vector corresponding to the target training word as training data and taking grammar knowledge and part of speech information corresponding to the target training word as training labels, wherein the initial multi-target classification layer comprises: at least one of a grammar classification layer, an error type classification layer and a phrase classification layer, and a part-of-speech classification layer;
after the training of the initial multi-target classification layer is completed, at least one of a grammar classification layer, an error type classification layer and a phrase classification layer in the trained initial multi-target classification layer is reserved, and the multi-target classification layer is obtained.
8. The method of predicting grammatical knowledge of claim 6, wherein training the initial multi-objective classification layer comprises:
inputting the feature vector corresponding to the target training word into at least one of the grammar classification layer, the error type classification layer and the phrase classification layer to obtain a grammar knowledge prediction result, wherein the grammar knowledge prediction result comprises at least one of a grammar prediction result, an error type prediction result and a phrase prediction result;
calculating a first loss function value by using a multi-label cross entropy loss function based on the grammar knowledge prediction result and grammar knowledge corresponding to the target training word;
and updating the parameters of the initial multi-objective classification layer based on the first loss function value.
9. A grammar knowledge prediction apparatus, comprising:
an acquisition unit configured to acquire a natural language text and position information of a target word in the natural language text;
the text feature extraction unit is used for vectorizing each sentence in the natural language text to obtain a first feature vector corresponding to each word in the natural language text;
the dependency syntax analysis unit is used for carrying out dependency syntax analysis on each sentence in the natural language text to obtain dependency syntax information corresponding to each word in the natural language text, and carrying out vectorization processing on the dependency syntax information to obtain a dependency feature vector corresponding to each word in the natural language text;
and the classification unit is used for fusing the first feature vector and the dependency feature vector to obtain a fused vector, and performing grammar knowledge classification based on the fused vector and the position information of the target word to obtain grammar knowledge corresponding to the target word.
10. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the syntax knowledge prediction method of any one of claims 1 to 8 when executing the program.
11. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the syntax knowledge prediction method according to any one of claims 1 to 8.
CN202211644221.3A 2022-12-20 2022-12-20 Grammar knowledge prediction method, grammar knowledge prediction device, electronic equipment and storage medium Pending CN115906818A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211644221.3A CN115906818A (en) 2022-12-20 2022-12-20 Grammar knowledge prediction method, grammar knowledge prediction device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211644221.3A CN115906818A (en) 2022-12-20 2022-12-20 Grammar knowledge prediction method, grammar knowledge prediction device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115906818A true CN115906818A (en) 2023-04-04

Family

ID=86496252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211644221.3A Pending CN115906818A (en) 2022-12-20 2022-12-20 Grammar knowledge prediction method, grammar knowledge prediction device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115906818A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117610562A (en) * 2024-01-23 2024-02-27 中国科学技术大学 Relation extraction method combining combined category grammar and multi-task learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination