CN116681061A - English grammar correction technology based on multitask learning and attention mechanism - Google Patents

Info

Publication number
CN116681061A
CN116681061A
Authority
CN
China
Prior art keywords
model
english
training
grammar
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310630375.5A
Other languages
Chinese (zh)
Inventor
赵铁军 (Zhao Tiejun)
朱聪慧 (Zhu Conghui)
曹海龙 (Cao Hailong)
刘梓航 (Liu Zihang)
徐冰 (Xu Bing)
杨沐昀 (Yang Muyun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN202310630375.5A
Publication of CN116681061A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/232 - Orthographic correction, e.g. spell checking or vowelisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

An English grammar correction technique based on multitask learning and an attention mechanism, relating to English grammar correction technology. The invention aims to solve the problems that existing English grammar correction technology adapts poorly and corrects the grammar of some complex sentences inaccurately. The method comprises the following steps: for an input sentence, reading an English subword vocabulary and an edit-tag vocabulary from a database; feeding the sentence into a pre-trained encoding model to obtain a contextual representation of the whole sentence; passing the resulting context feature vectors through a self-attention layer; judging whether each input subword needs an editing operation, and classifying the edit labels of the input subwords with a vocabulary-sized classifier; post-processing the words of the input sentence according to the meaning of the correction labels predicted by the model, and feeding the post-processed result back into the model for multiple iterations to obtain the final result. The invention belongs to the technical field of natural language processing.

Description

English grammar correction technology based on multitask learning and attention mechanism
Technical Field
The invention relates to an English grammar correction technology, and belongs to the technical field of natural language processing.
Background
With the spread of technologies such as the internet and mobile devices, English is used ever more widely, and English grammar correction technology is receiving growing attention. Correct English grammar is critical for effective communication. In written or spoken communication, grammatical errors can obscure the intended meaning and lead to confusion or misinterpretation. By correcting grammatical errors, we can ensure that the information we convey is clear and accurate. Good grammar also improves the readability of written text: when readers encounter grammatical errors, they may need more effort to understand the text, which is tiring and distracting. Correcting grammatical errors makes text easier to read and more engaging. Grammar correction is also very important for language learning. By identifying and correcting errors, learners can better understand grammar rules and improve their own writing and speaking abilities. In addition, automatic grammar correction tools can give learners immediate feedback, allowing them to correct their own mistakes faster and more effectively. However, current English grammar correction technology still has room for improvement, and the English grammar correction technique based on multitask learning and an attention mechanism proposed by the invention focuses on solving the following problems:
1. The grammatical errors defined in the field of English grammar correction are complex. To match this training difficulty, the invention proposes a training scheme jointly constrained by multitask learning: grammar checking is the precursor task of grammar correction, so the first training task is to detect whether the words in a sentence contain errors; the second training task is to predict the edit labels corresponding to the words in the sentence; the third training task is a margin loss based on contrastive learning, which encourages the model to raise its classification confidence and give correct classifications more decisively;
2. English grammar correction depends heavily on modeling the syntactic and semantic information of English sentences. To further encode the syntactic and semantic information of an input sentence, the invention proposes attention modeling over the encoder's hidden-layer outputs: first, the outputs of the last three layers of the pre-trained encoder are averaged to obtain a more complete semantic representation; then, through an attention layer, each subword of the whole sentence attends to the parts carrying syntactic relations, yielding a better contextual representation that fuses syntactic and semantic information.
Disclosure of Invention
The invention aims to solve the problems that existing English grammar correction technology adapts poorly and corrects the grammar of some complex sentences inaccurately, and accordingly proposes an English grammar correction technique based on multitask learning and an attention mechanism.
The technical solution adopted by the invention to solve these problems comprises the following steps (a code sketch of one correction pass follows the list):
step 1, for an input sentence, reading an English subword vocabulary and an edit-tag vocabulary from a database;
step 2, feeding the sentence into a pre-trained encoding model to obtain a contextual representation of the whole sentence;
step 3, passing the resulting context feature vectors through a self-attention layer, so that the semantic vectors of all words in the sentence interact further via the self-attention mechanism;
step 4, using a binary classifier to judge whether each input subword needs an editing operation, classifying each input subword with a vocabulary-sized classifier, and selecting the highest-scoring class as the edit label of the corresponding subword;
step 5, post-processing the words of the input sentence according to the meaning of the correction labels predicted by the model, and feeding the post-processed result back into the model for multiple iterations to obtain the final result.
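To make the flow concrete, the following is a minimal Python sketch of one correction pass. Every name in it (`tokenizer`, `model`, `edit_vocab`, `apply_edits`) is an illustrative placeholder for the components described below, not a name taken from the patent:

```python
# A sketch of a single correction pass over steps 1-5.
# All component names are illustrative placeholders.
def correct_once(sentence, model, tokenizer, edit_vocab):
    subwords = tokenizer.tokenize(sentence)        # step 1: subword segmentation
    context = model.encode(subwords)               # step 2: pre-trained encoder
    context = model.attend(context)                # step 3: self-attention layer
    needs_edit = model.detect(context)             # step 4: binary edit/no-edit decision
    tags = model.label(context).argmax(-1)         # step 4: highest-scoring edit label
    return apply_edits(subwords, tags, needs_edit, edit_vocab)  # step 5: post-process
```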
Furthermore, the pre-trained English encoding model adopted in step 2 is one of RoBERTa, XLNet and DeBERTa. All three pre-trained models are improved versions of BERT: they use more pre-training corpora and more reasonable pre-training tasks and modeling mechanisms, and they perform well across many English semantic modeling tasks. The specific process comprises the following steps:
step 2.1, loading an English subword tokenizer and segmenting each word of the input English sentence into subword form;
step 2.2, mapping the subword sequence of the English sentence into 768-dimensional vectors through the word-embedding layer of the pre-trained English encoding model;
step 2.3, passing the embedded vectors through the 12-layer pre-trained English encoding model, and stacking and averaging the hidden vectors output by the last three layers of the model, thereby obtaining hidden vectors containing richer semantic information (a code sketch of steps 2.1 to 2.3 follows).
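A minimal sketch of steps 2.1 to 2.3 using Hugging Face Transformers. The `roberta-base` checkpoint is an assumption chosen to match the text's description of a 12-layer, 768-dimensional BERT-style encoder; the averaging of the last three hidden layers follows the text:

```python
# Sketch of subword segmentation, embedding, encoding, and last-three-layer averaging.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base", output_hidden_states=True)

sentence = "Go to home now"
inputs = tokenizer(sentence, return_tensors="pt")   # step 2.1: subword segmentation
with torch.no_grad():
    outputs = encoder(**inputs)                     # steps 2.2-2.3: embed and encode

# hidden_states holds the embedding layer plus all 12 encoder layers;
# stack the last three layers and average them to get the context representation
last_three = torch.stack(outputs.hidden_states[-3:], dim=0)  # (3, B, T, 768)
context = last_three.mean(dim=0)                             # (B, T, 768)
```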
Further, the self-attention layer in step 3 performs a self-attention operation on the semantically informed encoded representation output by step 2, so that the semantic representations across the whole sentence interact further:

Attn(x) = (W_2 · tanh(W_1 · x + b_1) + b_2) · x (1),

in formula (1), x is the semantic representation of the sentence obtained in step 2; W_1, W_2, b_1 and b_2 are trainable parameters; h is the size of the last dimension of x; tanh is the hyperbolic tangent function, serving as the activation function that gives the attention layer nonlinear capacity. The self-attention layer lets the representation vectors within the sentence interact further and assigns higher attention scores to the components carrying syntactic relations, so that the model can further model the syntactic relations of the sentence. A minimal implementation sketch follows.
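A sketch of the attention layer in equation (1). The formula leaves the reading of the final "· x" implicit; here it is interpreted as token-to-token attention weights followed by a softmax normalization, which is an assumption rather than a detail stated in the text:

```python
import torch
import torch.nn as nn

class SentenceAttention(nn.Module):
    """Attn(x) = (W2 * tanh(W1 * x + b1) + b2) . x, per equation (1)."""
    def __init__(self, h: int = 768):
        super().__init__()
        self.w1 = nn.Linear(h, h)   # W1 and b1
        self.w2 = nn.Linear(h, h)   # W2 and b2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.w2(torch.tanh(self.w1(x)))                      # (B, T, h)
        # token-to-token interaction; the softmax is an assumption,
        # equation (1) leaves the normalization implicit
        weights = torch.softmax(scores @ x.transpose(1, 2), dim=-1)   # (B, T, T)
        return weights @ x                                            # (B, T, h)
```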
Further, in step 4 the optimization direction of the model is additionally constrained during training by multitask learning; three loss constraints help the model achieve better results:

Loss = Loss_contrast + Loss_detect + Loss_label

Equation (2) is the first loss constraint (the detection loss Loss_detect), where P_d(f_i|X) denotes the probability, predicted by the model from the input sequence X, that the word at position i is erroneous.

Equation (3) is the second loss constraint (the edit-label loss Loss_label), where P_l(y_i|X) denotes the probability of the edit label predicted for the word at position i from the input sequence X.

Equation (4) is the third loss constraint (the margin loss Loss_contrast), where y is an all-ones vector that determines the optimization direction; P(y_i|x_i) denotes the probability the model assigns, once the subword x_i is input, to the true edit label y_i; P_top5 denotes the five highest label probabilities output by the model; mask_l is a Boolean vector that handles the case where the model's top-5 outputs contain the true label: if the probability of the true label y_i ranks within the top 5, the mask at that position is 0 and the position does not take part in the loss, while the remaining four positions are 1; mean is an averaging operation along the last dimension of the vector; margin is the minimum acceptable gap between the two probabilities, with a reference value between 0.1 and 0.3. The goal of the margin loss is that, for each input, the probability of the true label output by the model exceeds the average of the model's top-5 output probabilities; a suitable margin value tunes the model's confidence in correct classifications. Finally, through the multitask learning of these three constraints, the model achieves better results with higher classification confidence. An implementation sketch of the three losses follows.
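A PyTorch sketch of the three constraints. Because equations (2) to (4) appear only as images in the original publication, the cross-entropy forms of the detection and label losses are assumptions; the margin loss follows the prose description (true-label probability versus the mean of the top-5 output probabilities, with the true label masked out when it ranks in the top 5). All names are illustrative:

```python
import torch
import torch.nn.functional as F

def multitask_loss(detect_logits, label_logits, detect_gold, label_gold,
                   margin: float = 0.2):
    """detect_logits: (B, T, 2)  binary error-detection scores
    label_logits:  (B, T, V)  edit-label scores over the tag vocabulary
    detect_gold:   (B, T)     long tensor, 1 where the word is erroneous
    label_gold:    (B, T)     long tensor of gold edit-label ids
    """
    # Loss_detect: cross-entropy on the error/no-error decision (eq. 2, assumed form)
    loss_detect = F.cross_entropy(detect_logits.reshape(-1, 2),
                                  detect_gold.reshape(-1))

    # Loss_label: cross-entropy on the edit label (eq. 3, assumed form)
    loss_label = F.cross_entropy(label_logits.reshape(-1, label_logits.size(-1)),
                                 label_gold.reshape(-1))

    # Loss_contrast: the gold-label probability should exceed the mean of the
    # top-5 probabilities; positions where the gold label ranks in the top 5
    # are masked out, so only the remaining slots are averaged
    probs = label_logits.softmax(dim=-1)                       # (B, T, V)
    gold_prob = probs.gather(-1, label_gold.unsqueeze(-1))     # (B, T, 1)
    top5_prob, top5_idx = probs.topk(5, dim=-1)                # (B, T, 5)
    mask = (top5_idx != label_gold.unsqueeze(-1)).float()      # 0 where gold is in top 5
    top5_mean = (top5_prob * mask).sum(-1) / mask.sum(-1).clamp(min=1)
    y = torch.ones_like(top5_mean)                             # all-ones direction vector
    loss_contrast = F.margin_ranking_loss(gold_prob.squeeze(-1), top5_mean, y,
                                          margin=margin)

    return loss_contrast + loss_detect + loss_label
```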
Further, multi-stage training is adopted in the training process of step 4 to raise the training difficulty gradually.

In the field of grammatical error correction, only a small portion of sentences contain grammatical errors; that is, most components must remain unchanged, which requires the model to see error-free sentences during training. The invention adopts multi-stage training so that the training difficulty rises gradually while correction accuracy is preserved. Training of the model is divided into three stages (an illustrative stage schedule sketch follows this list):

Stage one: training on artificially generated pseudo data, built by applying a verb-tense table and randomly deleting and inserting English words, 9 million sentences in total. All of this data contains errors; using a sufficient amount of it as pre-training gives the model an initial ability to correct grammatically erroneous sentences.

Stage two: training on erroneous data from university papers and the writing of non-native English speakers. Compared with the pseudo data, this data is closer to real usage and of relatively higher quality; this stage helps the model fit real-world English sentences.

Stage three: training on erroneous data from English proficiency certificate examinations and from papers by native English speakers. At this stage the training data also includes sentences without grammatical errors, which further raises the difficulty: the model must distinguish whether a real sentence contains grammatical errors at all, preventing over-correction at test time from hurting the correction results. Native-speaker data is harder still, so it further raises the training difficulty and improves the English grammar correction ability.
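Purely for illustration, the curriculum can be expressed as a sequential schedule; the file names and the `train_one_stage` callback are placeholders, not artifacts of the patent:

```python
def run_curriculum(model, train_one_stage):
    """Sequentially fine-tune through the three stages described above."""
    stages = [
        ("stage1_pseudo", "pseudo_errors_9M.jsonl"),     # synthetic errors only
        ("stage2_learner", "non_native_writing.jsonl"),  # real non-native learner errors
        ("stage3_native", "native_exams_papers.jsonl"),  # native data, incl. error-free sentences
    ]
    for name, data_path in stages:
        train_one_stage(model, data_path)  # difficulty rises stage by stage
```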
Further, in step 5 an iterative correction scheme is used to find as many of the grammatical errors occurring in the sentence as possible.
The beneficial effects of the invention are as follows: the English grammar correction technique based on multitask learning and an attention mechanism uses a pre-trained model with stronger modeling of English semantic information, introduces an attention mechanism over that semantic information to obtain a contextual representation that better fuses syntactic and semantic information, and introduces three tasks that constrain model training from different angles, yielding an English grammatical error correction system with higher accuracy and confidence. Because English grammatical errors are of many types and modifications can span long ranges, the method lets users quickly find the types of grammatical errors appearing in a text and gives revision suggestions toward a correct sentence, improving correction efficiency and interpretability. The invention can also be applied in search technology, correcting the query entered by the user to help the search system better identify user intent and the query target, improving search quality and user experience; in English learning for non-native speakers, offering targeted improvement for various grammatical errors to help users raise their English grammar level quickly; or in proofreading systems for English articles, giving appropriate sentence revision suggestions to improve the accuracy and readability of articles.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of a model structure of an English grammar correction model based on a multitask learning and attention mechanism in the present invention;
FIG. 3 is a diagram illustrating the text correction effect in an embodiment of the present invention.
Detailed Description
Specific Embodiment 1: this embodiment is described with reference to FIGS. 1 to 3. The English grammar correction technique based on multitask learning and an attention mechanism of this embodiment is implemented through the following steps:
step 1, for an input sentence, reading an English subword vocabulary and an edit-tag vocabulary from a database;
step 2, feeding the sentence into a pre-trained encoding model to obtain a contextual representation of the whole sentence;
step 3, passing the resulting context feature vectors through a self-attention layer, so that the semantic vectors of all words in the sentence interact further via the self-attention mechanism;
step 4, using a binary classifier to judge whether each input subword needs an editing operation, classifying each input subword with a vocabulary-sized classifier, and selecting the highest-scoring class as the edit label of the corresponding subword;
step 5, post-processing the words of the input sentence according to the meaning of the correction labels predicted by the model, and feeding the post-processed result back into the model for multiple iterations to obtain the final result.
In this embodiment, step 1 is a preparatory step. In the field of English grammar correction, error types include subject-verb disagreement, verb tense errors, missing articles, preposition misuse, and so on. The common approach in this field is to give a correction suggestion for each word in the English sentence and then post-process according to those suggestions to form the final correction result. Unlike word segmentation in Chinese correction technology, English subword tokenization splits some words into several partial subwords; for example, the English word "transformers" is segmented into the two parts "transform" and "ers". Therefore the English subword vocabulary is loaded first to guide the segmentation of the input sentence. After the sentence is segmented, the English grammar correction technique predicts the corresponding editing operation for each subword; for example, the edit tag corresponding to the subword "to" in the sentence "Go to home now" is "$DELETE", meaning the current subword is deleted. Thus, by predicting over all subwords of the whole sentence, an edit-tag sequence is obtained, and after the edit tags are processed, the grammar-corrected sentence is obtained. A small illustration follows.
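The segmentation and tag-application steps can be illustrated in a few lines. The RoBERTa tokenizer and the uppercase tag names are assumptions for illustration (the text only names the "$delete"-style tag), and the exact subword split depends on the learned vocabulary:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
print(tokenizer.tokenize("transformers"))  # e.g. ['transform', 'ers'], vocabulary-dependent

# Applying a predicted edit-tag sequence to the example sentence
tokens = ["Go", "to", "home", "now"]
tags = ["$KEEP", "$DELETE", "$KEEP", "$KEEP"]    # one edit label per token
corrected = [t for t, tag in zip(tokens, tags) if tag != "$DELETE"]
print(" ".join(corrected))                       # "Go home now"
```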
Step 2 converts the input English sentence into vectors of numbers: through the encoded representation of BERT, a feature vector representing the contextual and semantic information of the input sentence is obtained.
Step 3 passes the context feature vectors output by the pre-trained encoding model in step 2 through a self-attention layer, letting the representation vectors of different words interact through a word-level attention mechanism. Words linked by syntactic dependencies receive higher attention than other words in the sentence, letting the model understand which words depend on each other syntactically and further fusing the syntactic and semantic information in the input sentence.
Step 4 applies a binary classifier to the representation obtained in step 3 to judge whether each subword needs an editing operation, and a vocabulary-sized classifier to predict its edit label, taking the highest-scoring class as the edit label of the corresponding subword.
Step 5 post-processes the edit-tag sequence output by step 4 together with the input token sequence, applying the meaning of each edit tag to each word of the sentence to form the grammar correction result. Since sentences with English grammatical errors usually contain several overlapping errors, iterative error correction must be applied so that as many errors as possible are corrected, forming the final correction result.
Specific Embodiment 2: this embodiment is described with reference to FIGS. 1 to 3. In step 2 of the English grammar correction technique based on multitask learning and an attention mechanism of this embodiment, the pre-trained English encoding model adopted is one of RoBERTa, XLNet and DeBERTa. All three pre-trained models are improved versions of BERT, using more pre-training corpora and more reasonable pre-training tasks and modeling mechanisms, and performing well across many English semantic modeling tasks. The specific process comprises the following steps:
step 2.1, loading an English subword tokenizer and segmenting each word of the input English sentence into subword form;
step 2.2, mapping the subword sequence of the English sentence into 768-dimensional vectors through the word-embedding layer of the pre-trained English encoding model;
step 2.3, passing the embedded vectors through the 12-layer pre-trained English encoding model, and stacking and averaging the hidden vectors output by the last three layers of the model, thereby obtaining hidden vectors containing richer semantic information.
Specific Embodiment 3: this embodiment is described with reference to FIGS. 1 to 3. The self-attention layer in step 3 of the English grammar correction technique based on multitask learning and an attention mechanism of this embodiment performs a self-attention operation on the semantically informed encoded representation output by step 2, so that the semantic representations across the whole sentence interact further:

Attn(x) = (W_2 · tanh(W_1 · x + b_1) + b_2) · x (1),

in formula (1), x is the semantic representation of the sentence obtained in step 2; W_1, W_2, b_1 and b_2 are trainable parameters; h is the size of the last dimension of x; tanh is the hyperbolic tangent function, serving as the activation function that gives the attention layer nonlinear capacity. The self-attention layer lets the representation vectors within the sentence interact further and assigns higher attention scores to the components carrying syntactic relations, so that the model can further model the syntactic relations of the sentence.
Specific Embodiment 4: this embodiment is described with reference to FIGS. 1 to 3. In step 4 of the English grammar correction technique based on multitask learning and an attention mechanism of this embodiment, the optimization direction of the model is additionally constrained during training by multitask learning; three loss constraints help the model achieve better results:

Loss = Loss_contrast + Loss_detect + Loss_label

Equation (2) is the first loss constraint (the detection loss Loss_detect), where P_d(f_i|X) denotes the probability, predicted by the model from the input sequence X, that the word at position i is erroneous.

Equation (3) is the second loss constraint (the edit-label loss Loss_label), where P_l(y_i|X) denotes the probability of the edit label predicted for the word at position i from the input sequence X.

Equation (4) is the third loss constraint (the margin loss Loss_contrast), where y is an all-ones vector that determines the optimization direction; P(y_i|x_i) denotes the probability the model assigns, once the subword x_i is input, to the true edit label y_i; P_top5 denotes the five highest label probabilities output by the model; mask_l is a Boolean vector that handles the case where the model's top-5 outputs contain the true label: if the probability of the true label y_i ranks within the top 5, the mask at that position is 0 and the position does not take part in the loss, while the remaining four positions are 1; mean is an averaging operation along the last dimension of the vector; margin is the minimum acceptable gap between the two probabilities, with a reference value between 0.1 and 0.3. The goal of the margin loss is that, for each input, the probability of the true label output by the model exceeds the average of the model's top-5 output probabilities; a suitable margin value tunes the model's confidence in correct classifications. Finally, through the multitask learning of these three constraints, the model achieves better results with higher classification confidence.
Specific Embodiment 5: this embodiment is described with reference to FIGS. 1 to 3. In the training process of step 4 of the English grammar correction technique based on multitask learning and an attention mechanism of this embodiment, multi-stage training is adopted to raise the training difficulty gradually.

In the field of grammatical error correction, only a small portion of sentences contain grammatical errors; that is, most components must remain unchanged, which requires the model to see error-free sentences during training. The invention adopts multi-stage training so that the training difficulty rises gradually while correction accuracy is preserved. Training of the model is divided into three stages:

Stage one: training on artificially generated pseudo data, built by applying a verb-tense table and randomly deleting and inserting English words, 9 million sentences in total. All of this data contains errors; using a sufficient amount of it as pre-training gives the model an initial ability to correct grammatically erroneous sentences.

Stage two: training on erroneous data from university papers and the writing of non-native English speakers. Compared with the pseudo data, this data is closer to real usage and of relatively higher quality; this stage helps the model fit real-world English sentences.

Stage three: training on erroneous data from English proficiency certificate examinations and from papers by native English speakers. At this stage the training data also includes sentences without grammatical errors, which further raises the difficulty: the model must distinguish whether a real sentence contains grammatical errors at all, preventing over-correction at test time from hurting the correction results. Native-speaker data is harder still, so it further raises the training difficulty and improves the English grammar correction ability.
Specific Embodiment 6: this embodiment is described with reference to FIGS. 1 to 3. In step 5 of the English grammar correction technique based on multitask learning and an attention mechanism of this embodiment, an iterative correction scheme is used to find as many of the grammatical errors occurring in sentences as possible.

Grammatical errors in some English sentences are nested; that is, the model can only find some errors after the preceding errors have been corrected. Under the edit-tag prediction scheme proposed by the invention, for example, the number of a noun may have to be corrected first and the words then joined with a hyphen to complete the final correction, which requires multiple correction steps. In addition, the model's semantic modeling can drift on sentences containing many grammatical errors, which also calls for an iterative approach that lets the model understand the sentence step by step. Multiple rounds of iterative error correction enable the model to find the deeper, implicit grammatical errors in a sentence; a minimal sketch of this loop follows.
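The loop below reuses the single-pass `correct_once` placeholder from the earlier sketch; the fixed-point stopping rule and the iteration cap are assumptions consistent with the description:

```python
def iterative_correct(sentence, correct_pass, max_rounds: int = 5):
    """Repeatedly apply a single correction pass until no further edit is made."""
    for _ in range(max_rounds):
        corrected = correct_pass(sentence)
        if corrected == sentence:   # fixed point: no more edits proposed
            break
        sentence = corrected
    return sentence
```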
Examples
Following the steps above, the invention can be implemented as a simple automatic English grammar correction module that can be embedded into any existing system for a plug-and-play effect. The specific verification results of the invention are as follows:
the embodiment is carried out according to the flow shown in fig. 1, and a Chinese spelling correction system based on multi-mode pre-training fusion is built. After the system is started, the input English text to be corrected is firstly taken out from the database and divided into sub words, and then the sub word sequences are input into a pre-training English coding model. The context characteristics output by the model pass through the self-attention layer to finish further interaction, and the final grammar correction result is obtained by using the representation containing the syntax information and the semantic information.
A passage from the writing of a non-native English speaker was selected, and the correction results of the system built by the invention are shown in FIG. 3. According to the corrections and revision suggestions given in the figure, the English grammar correction technique based on multitask learning and an attention mechanism can correct missing-article errors: "with certain" → "with a certain"; it can also correct preposition misuse: "diagnosed out" → "diagnosed"; and it can likewise correct the verb tense errors that learners easily make. In this way, users can see the English grammatical errors in an article at a glance and correct them quickly.
The present invention is not limited to the preferred embodiments described above; various modifications and equivalent variations in detail, and other embodiments of the kinds described above, fall within the spirit and scope of the present invention.

Claims (6)

1. An English grammar correction technique based on multitask learning and an attention mechanism, characterized in that it is implemented through the following steps:
step 1, for an input sentence, reading an English subword vocabulary and an edit-tag vocabulary from a database;
step 2, feeding the sentence into a pre-trained encoding model to obtain a contextual representation of the whole sentence;
step 3, passing the resulting context feature vectors through a self-attention layer, so that the semantic vectors of all words in the sentence interact further via the self-attention mechanism;
step 4, using a binary classifier to judge whether each input subword needs an editing operation, classifying each input subword with a vocabulary-sized classifier, and selecting the highest-scoring class as the edit label of the corresponding subword;
step 5, post-processing the words of the input sentence according to the meaning of the correction labels predicted by the model, and feeding the post-processed result back into the model for multiple iterations to obtain the final result.
2. The English grammar correction technique based on multitask learning and an attention mechanism of claim 1, characterized in that: the pre-trained English encoding model adopted in step 2 is one of RoBERTa, XLNet and DeBERTa; all three pre-trained models are improved versions of BERT, using more pre-training corpora and more reasonable pre-training tasks and modeling mechanisms, and performing well across many English semantic modeling tasks; the specific process comprises the following steps:
step 2.1, loading an English subword tokenizer and segmenting each word of the input English sentence into subword form;
step 2.2, mapping the subword sequence of the English sentence into 768-dimensional vectors through the word-embedding layer of the pre-trained English encoding model;
step 2.3, passing the embedded vectors through the 12-layer pre-trained English encoding model, and stacking and averaging the hidden vectors output by the last three layers of the model, thereby obtaining hidden vectors containing richer semantic information.
3. The English grammar correction technique based on multitask learning and an attention mechanism of claim 1, characterized in that: the self-attention layer in step 3 performs a self-attention operation on the semantically informed encoded representation output by step 2, so that the semantic representations across the whole sentence interact further:

Attn(x) = (W_2 · tanh(W_1 · x + b_1) + b_2) · x (1),

in formula (1), x is the semantic representation of the sentence obtained in step 2; W_1, W_2, b_1 and b_2 are trainable parameters; h is the size of the last dimension of x; tanh is the hyperbolic tangent function, serving as the activation function that gives the attention layer nonlinear capacity; the self-attention layer lets the representation vectors within the sentence interact further and assigns higher attention scores to the components carrying syntactic relations, so that the model can further model the syntactic relations of the sentence.
4. The English grammar correction technique based on multitask learning and an attention mechanism of claim 1, characterized in that: in step 4 the optimization direction of the model is additionally constrained during training by multitask learning; three loss constraints help the model achieve better results:

Loss = Loss_contrast + Loss_detect + Loss_label

equation (2) is the first loss constraint (the detection loss Loss_detect), where P_d(f_i|X) denotes the probability, predicted by the model from the input sequence X, that the word at position i is erroneous; equation (3) is the second loss constraint (the edit-label loss Loss_label), where P_l(y_i|X) denotes the probability of the edit label predicted for the word at position i from the input sequence X; equation (4) is the third loss constraint (the margin loss Loss_contrast), where y is an all-ones vector that determines the optimization direction; P(y_i|x_i) denotes the probability the model assigns, once the subword x_i is input, to the true edit label y_i; P_top5 denotes the five highest label probabilities output by the model; mask_l is a Boolean vector that handles the case where the model's top-5 outputs contain the true label: if the probability of the true label y_i ranks within the top 5, the mask at that position is 0 and the position does not take part in the loss, while the remaining four positions are 1; mean is an averaging operation along the last dimension of the vector; margin is the minimum acceptable gap between the two probabilities, with a reference value between 0.1 and 0.3; the goal of the margin loss is that, for each input, the probability of the true label output by the model exceeds the average of the model's top-5 output probabilities, and a suitable margin value tunes the model's confidence in correct classifications; finally, through the multitask learning of these three constraints, the model achieves better results with higher classification confidence.
5. The English grammar correction technique based on multitask learning and an attention mechanism of claim 1 or 4, characterized in that: multi-stage training is adopted in the training process of step 4 to raise the training difficulty gradually;

in the field of grammatical error correction, only a small portion of sentences contain grammatical errors, i.e. most components must remain unchanged, which requires the model to see error-free sentences during training; the invention adopts multi-stage training so that the training difficulty rises gradually while correction accuracy is preserved; training of the model is divided into three stages:

stage one: training on artificially generated pseudo data, built by applying a verb-tense table and randomly deleting and inserting English words, 9 million sentences in total; all of this data contains errors, and using a sufficient amount of it as pre-training gives the model an initial ability to correct grammatically erroneous sentences;

stage two: training on erroneous data from university papers and the writing of non-native English speakers; compared with the pseudo data, this data is closer to real usage and of relatively higher quality, and this stage helps the model fit real-world English sentences;

stage three: training on erroneous data from English proficiency certificate examinations and from papers by native English speakers; at this stage the training data also includes sentences without grammatical errors, which further raises the difficulty, since the model must distinguish whether a real sentence contains grammatical errors at all, preventing over-correction at test time from hurting the correction results; native-speaker data is harder still, so it further raises the training difficulty and improves the English grammar correction ability.
6. The English grammar correction technique based on multitask learning and an attention mechanism of claim 1, characterized in that: in step 5, an iterative correction scheme is used to find as many of the grammatical errors occurring in the sentence as possible.
CN202310630375.5A 2023-05-31 2023-05-31 English grammar correction technology based on multitask learning and attention mechanism Pending CN116681061A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310630375.5A CN116681061A (en) 2023-05-31 2023-05-31 English grammar correction technology based on multitask learning and attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310630375.5A CN116681061A (en) 2023-05-31 2023-05-31 English grammar correction technology based on multitask learning and attention mechanism

Publications (1)

Publication Number Publication Date
CN116681061A true CN116681061A (en) 2023-09-01

Family

ID=87784761

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310630375.5A Pending CN116681061A (en) 2023-05-31 2023-05-31 English grammar correction technology based on multitask learning and attention mechanism

Country Status (1)

Country Link
CN (1) CN116681061A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117744635A (en) * 2024-02-12 2024-03-22 长春职业技术学院 English text automatic correction system and method based on intelligent AI
CN117744635B (en) * 2024-02-12 2024-04-30 长春职业技术学院 English text automatic correction system and method based on intelligent AI

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination