CN117151084B - Chinese spelling and grammar error correction method, storage medium and equipment - Google Patents
Chinese spelling and grammar error correction method, storage medium and equipment
- Publication number
- CN117151084B CN202311425616.9A
- Authority
- CN
- China
- Prior art keywords
- loss
- sequence
- spelling
- length
- error correction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/232—Orthographic correction, e.g. spell checking or vowelisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Machine Translation (AREA)
Abstract
The invention belongs to the technical field of language processing, and particularly relates to a Chinese spelling and grammar error correction method, a storage medium and equipment, which can detect and correct spelling errors and grammar errors in an input text. Whereas the original RoBERTa model can only handle spelling error correction tasks, the improved model adds a generator so that text spelling errors and text grammar errors are corrected at the same time, which significantly improves error correction efficiency.
Description
Technical Field
The invention belongs to the technical field of language processing, and particularly relates to a Chinese spelling and grammar error correction method, a storage medium, and equipment.
Background
In conventional natural language processing, spelling correction and grammar correction are typically handled as two separate tasks. Spelling error correction focuses on detecting and correcting spelling errors, while grammar error correction aims at repairing grammatical errors. However, processing them separately can result in loss of information and accumulation of errors. Moreover, the native RoBERTa model can only handle text spelling error correction and cannot detect grammar errors at the same time.
Disclosure of Invention
In order to handle spelling and grammar error correction simultaneously, the present application provides a new RoBERTa-based method that realizes unified detection and correction of text spelling errors and grammar errors. By feeding text containing errors into the new model, more comprehensive context information can be obtained and compared against the correct text to discover and correct spelling and grammar errors at the same time. This integrated approach better handles complex error cases in text and achieves higher accuracy and robustness. The technical solution is as follows.
A Chinese spelling and grammar error correction method comprises the following steps:
S1. Using a RoBERTa encoder model, encode the input sequence X = (x_1, x_2, ..., x_n) to obtain the output sequence H = (h_1, h_2, ..., h_n), where x_n is the token at the n-th position of the input sequence X and h_n is the token representation at the n-th position of the output sequence H;
S2. Add a CNN convolutional layer after the RoBERTa encoder output sequence H, and extract the local features C of the encoder output through convolution kernels to obtain a local feature tensor; fuse the local feature tensor with the encoder output sequence H through a residual connection to obtain the fused semantic representation sequence H';
S3. Perform a max pooling operation on the fused semantic representation sequence H' to obtain a fixed-length representation vector V;
S4. Pass the representation vector V into a fully connected layer to obtain the prediction distribution of the target sequence length;
S5. Input the encoder output and the target word into a decoder module; by combining an attention mechanism and a pointer network, the decoder simultaneously corrects spelling errors and repairs grammar errors.
Preferably, in step S2, the output local features C are extracted by the following formula:
C=Conv1D(H);
wherein Conv1D is a 1-dimensional convolution function.
Preferably, in step S2, the extracted local feature tensor and the output sequence H of the RoBERTa model are combined and fused through a residual connection to obtain the fused semantic representation sequence H' = (h'_1, h'_2, ..., h'_n), and the fused semantic representation sequence H' is taken as the input of step S3.
Preferably, in step S4, target sequence length prediction is performed on the pooled representation vector V through the fully connected layer to obtain the prediction distribution p_len:

p_len = WV + b;

where W is the fully connected layer weight and b is the bias term.
Preferably, in step S5:

S51. Calculate the attention weight a_t:

a_t = softmax(e_t), e_ti = v^T tanh(W_h h_i + W_s s_{t-1} + b_attn)

where h_i is the encoder output at the i-th position, b_attn is a bias parameter, v is a learnable weight vector for mapping the context information in the attention mechanism to the appropriate dimension, and W_h and W_s are learnable weight matrices that map h_i and the decoder state s_{t-1} of the previous time step to the appropriate dimension;
S52. Based on a_t, generate the context vector c_t:

c_t = Σ_i a_ti * h_i

where a_ti denotes the attention weight on the i-th position of the input sequence at time t, and h_i denotes the RoBERTa semantic features of the i-th position;
S53. Taking c_t as input, update the decoder state s_t:

s_t = RNN([s_{t-1}, c_t])
S54. From s_t and c_t, calculate the probability distribution p_vocab of generated words, and based on H, generate the copy probability distribution p_copy over the input sequence:

p_vocab = softmax(Linear([c_t, s_t]))

p_copy = sigmoid(Linear(H))
S55. Generate the final distribution p using the pointer mechanism:

p = p_copy * a_t + (1 - p_copy) * p_vocab.
Preferably, in step S5, a loss function Loss is calculated, comprising a generation loss, a pointer network loss, and a length prediction loss; the calculation process is as follows:

Calculate the generation loss, i.e., the cross entropy between the decoder's predicted distribution over the vocabulary and the target word:

loss1 = CrossEntropyLoss(p, y);

where y is the target word;

Calculate the pointer network loss, i.e., the loss of directly copying original words with the pointer network, computed as the cross entropy between the attention/copy distribution and the target word:

loss2 = CrossEntropyLoss(p_copy, y);

Calculate the length prediction loss, i.e., the loss between the predicted length and the target length:

loss3 = L1Loss(p_len, l_tag)

where p_len is the predicted length and l_tag is the target length;
Synthesize the loss function Loss:

Loss = w_1 * loss1 + w_2 * loss2 + w_3 * loss3

where w_1, w_2 and w_3 are loss weights that are trained together with the model.
A computer readable storage medium contains stored program instructions which, when executed, perform the above RoBERTa-based spelling and grammar error correction method.
An electronic device comprises a memory and a processor, the memory storing computer readable instructions which, when executed by the processor, perform the steps of the above RoBERTa-based spelling and grammar error correction method.
Compared with the prior art, the beneficial effects of the application are as follows:
the invention realizes the unification of text spelling check and grammar check by adding the modules such as the convolution layer, residual error connection, pointer network and the like on the RoBERTa model, optimizes the error correction performance and has obvious technical progress.
Drawings
Fig. 1 is a flow chart of the present application.
Detailed Description
The following detailed description is exemplary and is intended to provide further explanation of the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the present application.
RoBERTa was proposed in the paper "RoBERTa: A Robustly Optimized BERT Pretraining Approach". It is an enhanced version of BERT: the model architecture is unchanged, but the pretraining procedure is more carefully tuned, making RoBERTa a robustly optimized variant of the BERT model.
A Chinese spelling and grammar error correction method comprises the following steps:
S1. Using a RoBERTa encoder model, take the Chinese sentence to be corrected as the input sequence X = (x_1, x_2, ..., x_n) and encode it to obtain the output sequence H = (h_1, h_2, ..., h_n);
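By way of illustration, a minimal Python sketch of step S1 follows; it assumes the Hugging Face transformers library and the public hfl/chinese-roberta-wwm-ext checkpoint (a BERT-architecture Chinese RoBERTa, loaded with BertTokenizer/BertModel) standing in for the patent's unspecified encoder:

```python
# Sketch of step S1: encode a Chinese sentence with a RoBERTa encoder.
# Assumption: hfl/chinese-roberta-wwm-ext stands in for the patent's
# RoBERTa encoder model; the patent names no concrete checkpoint.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
encoder = BertModel.from_pretrained("hfl/chinese-roberta-wwm-ext")

sentence = "我门去公园玩。"  # contains a spelling error: 我门 should be 我们
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    H = encoder(**inputs).last_hidden_state  # output sequence H: (1, n, 768)
print(H.shape)
```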
S2. Add a CNN convolutional layer after the RoBERTa encoder output sequence H, and extract the local features C of the encoder output through convolution kernels to obtain a local feature tensor; fuse the local feature tensor with the encoder output sequence H through a residual connection to obtain the fused semantic representation sequence H';
The output local features C are extracted by the following formula:
C=Conv1D(H);
wherein Conv1D is a 1-dimensional convolution function.
The extracted local feature tensor and the output sequence H of the RoBERTa model are combined and fused through a residual connection to obtain the fused semantic representation sequence H' = (h'_1, h'_2, ..., h'_n), and the fused semantic representation sequence H' is taken as the input of step S3.
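A minimal PyTorch sketch of step S2 follows; the kernel size, the same-length padding, and the module name ConvResidualFusion are illustrative assumptions, none of which the patent fixes:

```python
import torch
import torch.nn as nn

class ConvResidualFusion(nn.Module):
    """Step S2 sketch: 1-D convolution over the encoder output H,
    fused with H through a residual connection to give H'."""
    def __init__(self, hidden_size: int, kernel_size: int = 3):
        super().__init__()
        # same-length padding so C and H can be added element-wise
        self.conv = nn.Conv1d(hidden_size, hidden_size,
                              kernel_size, padding=kernel_size // 2)

    def forward(self, H: torch.Tensor) -> torch.Tensor:
        # H: (batch, n, hidden); Conv1d expects (batch, hidden, n)
        C = self.conv(H.transpose(1, 2)).transpose(1, 2)  # local features C
        return H + C                                      # H' = H + Conv1D(H)

H = torch.randn(2, 20, 768)           # RoBERTa-base hidden size assumed
H_fused = ConvResidualFusion(768)(H)  # fused sequence H': (2, 20, 768)
```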
S3. Perform max pooling on the input H' to obtain a fixed-length representation vector V.
S4. Pass the representation vector V into a fully connected layer to obtain the prediction distribution of the target sequence length;
The length of the target sequence is predicted from the pooled representation vector V through the fully connected layer, giving the prediction distribution p_len:

p_len = WV + b;

where W is the fully connected layer weight and b is the bias term.
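A minimal sketch of steps S3-S4 (PyTorch assumed; note that although the patent calls p_len a "prediction distribution", loss3 below compares it to the target length with an L1 loss, so this sketch predicts a scalar length per example):

```python
import torch
import torch.nn as nn

class LengthPredictor(nn.Module):
    """Steps S3-S4 sketch: max-pool H' into a fixed-length vector V,
    then predict the target sequence length via p_len = W V + b."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.fc = nn.Linear(hidden_size, 1)  # scalar length head (assumption)

    def forward(self, H_fused: torch.Tensor) -> torch.Tensor:
        V = H_fused.max(dim=1).values   # S3: max pooling over positions
        return self.fc(V).squeeze(-1)   # S4: predicted length p_len, (batch,)

p_len = LengthPredictor(768)(torch.randn(2, 20, 768))  # two predicted lengths
```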
S5. Input the encoder output and the target word into a decoder module; by combining an attention mechanism and a pointer network, the decoder simultaneously corrects spelling errors and repairs grammar errors.
S51. Calculate the attention weight a_t:

a_t = softmax(e_t), e_ti = v^T tanh(W_h h_i + W_s s_{t-1} + b_attn)

where h_i is the encoder output at the i-th position, b_attn is a bias parameter, v is a learnable weight vector for mapping the context information in the attention mechanism to the appropriate dimension, and W_h and W_s are learnable weight matrices that map h_i and the decoder state s_{t-1} of the previous time step to the appropriate dimension;
S52. Based on a_t, generate the context vector c_t:

c_t = Σ_i a_ti * h_i

where a_ti denotes the attention weight on the i-th position of the input sequence at time t, and h_i denotes the RoBERTa semantic features of the i-th position.
S53. Taking c_t as input, update the decoder state s_t:

s_t = RNN([s_{t-1}, c_t])
S54. From s_t and c_t, calculate the probability distribution p_vocab of generated words, and based on H, generate the copy probability distribution p_copy over the input sequence:

p_vocab = softmax(Linear([c_t, s_t]))

p_copy = sigmoid(Linear(H))
S55. Generate the final distribution p using the pointer mechanism:

p = p_copy * a_t + (1 - p_copy) * p_vocab.
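A minimal PyTorch sketch of one decoding step follows. A GRU cell stands in for the RNN; the scoring vector v, computing p_copy from the context vector c_t rather than per-position from H, and scattering the attention weights onto the source token ids are simplifying assumptions made so that the pointer mixture p = p_copy * a_t + (1 - p_copy) * p_vocab is well-defined over a single vocabulary axis:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointerDecoderStep(nn.Module):
    """One S5 decoding step: additive attention over H (S51-S52),
    GRU state update (S53), and pointer mixing (S54-S55)."""
    def __init__(self, hidden: int, vocab: int):
        super().__init__()
        self.W_h = nn.Linear(hidden, hidden, bias=False)
        self.W_s = nn.Linear(hidden, hidden)        # its bias plays b_attn
        self.v = nn.Linear(hidden, 1, bias=False)   # scoring vector v
        self.rnn = nn.GRUCell(hidden, hidden)
        self.gen = nn.Linear(2 * hidden, vocab)     # p_vocab head
        self.copy_gate = nn.Linear(hidden, 1)       # p_copy head

    def forward(self, H, s_prev, src_ids):
        # S51: e_ti = v^T tanh(W_h h_i + W_s s_{t-1} + b_attn); a_t = softmax(e_t)
        scores = self.v(torch.tanh(self.W_h(H) + self.W_s(s_prev).unsqueeze(1)))
        a_t = F.softmax(scores.squeeze(-1), dim=-1)          # (batch, n)
        # S52: c_t = sum_i a_ti * h_i
        c_t = torch.bmm(a_t.unsqueeze(1), H).squeeze(1)
        # S53: s_t = RNN([s_{t-1}, c_t])
        s_t = self.rnn(c_t, s_prev)
        # S54: generation distribution and copy gate
        p_vocab = F.softmax(self.gen(torch.cat([c_t, s_t], dim=-1)), dim=-1)
        p_copy = torch.sigmoid(self.copy_gate(c_t))          # (batch, 1)
        # S55: scatter a_t onto the source token ids, then mix
        copy_dist = torch.zeros_like(p_vocab).scatter_add(1, src_ids, a_t)
        p = p_copy * copy_dist + (1 - p_copy) * p_vocab
        return p, copy_dist, s_t
```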
A loss function Loss is calculated, comprising a generation loss, a pointer network loss, and a length prediction loss; the calculation process is as follows:

Calculate the generation loss, i.e., the cross entropy between the decoder's predicted distribution over the vocabulary and the target word:

loss1 = CrossEntropyLoss(p, y);

where y is the target word;

Calculate the pointer network loss, i.e., the loss of directly copying original words with the pointer network, computed as the cross entropy between the attention/copy distribution and the target word:

loss2 = CrossEntropyLoss(p_copy, y);

Calculate the length prediction loss, i.e., the loss between the predicted length and the target length:

loss3 = L1Loss(p_len, l_tag)

where p_len is the predicted length and l_tag is the target length;
Synthesize the loss function Loss:

Loss = w_1 * loss1 + w_2 * loss2 + w_3 * loss3

where w_1, w_2 and w_3 are loss weights that are trained together with the model.
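A minimal sketch of the combined loss (PyTorch assumed; the weights w_1, w_2, w_3 are modeled as learnable parameters, following the statement that they are trained with the model, and NLL of log-probabilities replaces CrossEntropyLoss because p and the copy distribution are already post-softmax probabilities):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CombinedLoss(nn.Module):
    """Loss = w1*loss1 + w2*loss2 + w3*loss3 with learnable weights."""
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.ones(3))  # w1, w2, w3, trained with the model

    def forward(self, p, copy_dist, y, p_len, l_tag):
        eps = 1e-12  # numerical floor; p and copy_dist are probabilities
        loss1 = F.nll_loss(torch.log(p + eps), y)          # generation loss
        # in practice loss2 would be masked to targets present in the source
        loss2 = F.nll_loss(torch.log(copy_dist + eps), y)  # pointer loss
        loss3 = F.l1_loss(p_len, l_tag)                    # length loss
        return self.w[0] * loss1 + self.w[1] * loss2 + self.w[2] * loss3
```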
A computer readable storage medium contains stored program instructions which, when executed, perform the above RoBERTa-based spelling and grammar error correction method.
An electronic device comprises a memory and a processor, the memory storing computer readable instructions which, when executed by the processor, perform the steps of the above RoBERTa-based spelling and grammar error correction method.
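Tying the sketches above together, a hypothetical forward pass for one training example could be wired as follows. All module names are the illustrative ones introduced above, not names from the patent; following the patent's update s_t = RNN([s_{t-1}, c_t]), the decoder state depends only on the context vector, so decoding simply iterates over target positions:

```python
import torch

def forward_pass(encoder, fusion, length_head, decoder_step, criterion,
                 inputs, src_ids, y, l_tag):
    """Hypothetical end-to-end wiring of the S1-S5 sketches above."""
    H = encoder(**inputs).last_hidden_state             # S1: RoBERTa encoding
    H_fused = fusion(H)                                 # S2: conv + residual fusion
    p_len = length_head(H_fused)                        # S3-S4: length prediction
    s = torch.zeros(H_fused.size(0), H_fused.size(-1))  # initial decoder state
    total = 0.0
    for t in range(y.size(1)):                          # S5: step-by-step decoding
        p, copy_dist, s = decoder_step(H_fused, s, src_ids)
        total = total + criterion(p, copy_dist, y[:, t], p_len, l_tag)
    return total / y.size(1)                            # averaged combined Loss
```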
Experimental data 1: spelling error correction performance
Model | Precision | Recall | F1 |
---|---|---|---|
BERT | 0.8107 | 0.6390 | 0.7147 |
RoBERTa | 0.825 | 0.7293 | 0.7742 |
Present invention | 0.8713 | 0.7634 | 0.8138 |
The experimental results show that the spelling error correction Precision, Recall, and F1 scores of the method are significantly better than those of the BERT and original RoBERTa models, greatly improving the spelling error correction effect.
Experimental data 2: grammar error correction performance
Model | Precision | Recall | F0.5 |
---|---|---|---|
ConvSeq2Seq | 0.362 | 0.354 | 0.360 |
T5 | 0.506 | 0.496 | 0.504 |
Present invention | 0.576 | 0.567 | 0.574 |
As can be seen from the experimental results, spelling and grammar error correction are greatly improved in both Precision and Recall compared with ConvSeq2Seq and T5.
Experimental data 3: execution efficiency
Model | QPS |
---|---|
BERT | 3 |
RoBERTa | 3 |
ConvSeq2Seq | 5 |
T5 | 94 |
Present invention | 51 |
The T5 model is a pre-trained language model based on the Transformer architecture; its advantages include high training efficiency, strong generalization ability, and adaptability to a variety of natural language processing tasks.
Most natural language generation tasks are implemented on the basis of the Seq2Seq model; ConvSeq2Seq is a relatively recent CNN-based approach.
In terms of computational efficiency, because it combines spelling and grammar error correction, the invention is slower than BERT and RoBERTa, which can only perform spelling correction, but shows a clear speed improvement over the grammar error correction models.
Claims (6)
1. A Chinese spelling and grammar error correction method is characterized by comprising the following steps:
S1. Using a RoBERTa encoder model, encode the input sequence X = (x_1, x_2, ..., x_n) to obtain the output sequence H = (h_1, h_2, ..., h_n), where x_n is the token at the n-th position of the input sequence X and h_n is the token representation at the n-th position of the output sequence H;
S2. Add a CNN convolutional layer after the RoBERTa encoder output sequence H, and extract the local features C of the encoder output through convolution kernels to obtain a local feature tensor; fuse the local feature tensor with the encoder output sequence H through a residual connection to obtain the fused semantic representation sequence H';
S3. Perform a max pooling operation on the fused semantic representation sequence H' to obtain a fixed-length representation vector V;

S4. Pass the representation vector V into a fully connected layer to obtain the prediction distribution of the target sequence length;

S5. Input the encoder output and the target word y into a decoder module; by combining an attention mechanism and a pointer network, the decoder simultaneously corrects spelling errors and repairs grammar errors;
S51. Calculate the attention weight a_t:

a_t = softmax(e_t), e_ti = v^T tanh(W_h h_i + W_s s_{t-1} + b_attn)

where h_i is the encoder output at the i-th position, b_attn is a bias parameter, v is a learnable weight vector for mapping the context information in the attention mechanism to the appropriate dimension, and W_h and W_s are learnable weight matrices that map h_i and the decoder state s_{t-1} of the previous time step to the appropriate dimension;
S52. Based on a_t, generate the context vector c_t:

c_t = Σ_i a_ti * h_i

where a_ti denotes the attention weight on the i-th position of the input sequence at time t, and h_i denotes the RoBERTa semantic features of the i-th position;
S53. Taking c_t as input, update the decoder state s_t:

s_t = RNN([s_{t-1}, c_t])
S54. From s_t and c_t, calculate the probability distribution p_vocab of generated words, and based on H, generate the copy probability distribution p_copy over the input sequence:

p_vocab = softmax(Linear([c_t, s_t]))

p_copy = sigmoid(Linear(H))
S55. Generate the final distribution p using the pointer mechanism:

p = p_copy * a_t + (1 - p_copy) * p_vocab;
Calculate a loss function Loss, comprising a generation loss, a pointer network loss, and a length prediction loss; the calculation process is as follows:

Calculate the generation loss, i.e., the cross entropy between the decoder's predicted distribution over the vocabulary and the target word:

loss1 = CrossEntropyLoss(p, y);

where y is the target word;

Calculate the pointer network loss, i.e., the loss of directly copying original words with the pointer network, computed as the cross entropy between the attention/copy distribution and the target word:

loss2 = CrossEntropyLoss(p_copy, y);

Calculate the length prediction loss, i.e., the loss between the predicted length and the target length:

loss3 = L1Loss(p_len, l_tag)

where p_len is the predicted length and l_tag is the target length;
Synthesize the loss function Loss:

Loss = w_1 * loss1 + w_2 * loss2 + w_3 * loss3

where w_1, w_2 and w_3 are loss weights that are trained together with the model.
2. The Chinese spelling and grammar error correction method according to claim 1, wherein in step S2, the output local features C are extracted by the following formula:
C=Conv1D(H);
wherein Conv1D is a 1-dimensional convolution function.
3. The method of claim 2, wherein in step S2, the extracted local feature tensor and the output sequence H of the RoBERTa model are combined and fused through a residual connection to obtain the fused semantic representation sequence H' = (h'_1, h'_2, ..., h'_n), and the fused semantic representation sequence H' is taken as the input of step S3.
4. The Chinese spelling and grammar error correction method according to claim 1, wherein in step S4, target sequence length prediction is performed on the pooled representation vector V through the fully connected layer to obtain the prediction distribution p_len,

p_len = WV + b;

where W is the fully connected layer weight and b is the bias term.
5. A computer readable storage medium containing program instructions stored thereon which, when executed, perform the Chinese spelling and grammar error correction method according to any one of claims 1-4.
6. An electronic device, comprising: a memory and a processor, the memory storing computer readable instructions that, when executed by the processor, perform the steps in the method of any of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311425616.9A CN117151084B (en) | 2023-10-31 | 2023-10-31 | Chinese spelling and grammar error correction method, storage medium and equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311425616.9A CN117151084B (en) | 2023-10-31 | 2023-10-31 | Chinese spelling and grammar error correction method, storage medium and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117151084A CN117151084A (en) | 2023-12-01 |
CN117151084B (en) | 2024-02-23
Family
ID=88910495
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311425616.9A Active CN117151084B (en) | 2023-10-31 | 2023-10-31 | Chinese spelling and grammar error correction method, storage medium and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117151084B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117454906B (en) * | 2023-12-22 | 2024-05-24 | 创云融达信息技术(天津)股份有限公司 | Text proofreading method and system based on natural language processing and machine learning |
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021164310A1 (en) * | 2020-02-21 | 2021-08-26 | 华为技术有限公司 | Text error correction method and apparatus, and terminal device and computer storage medium |
WO2022095563A1 (en) * | 2020-11-06 | 2022-05-12 | 北京世纪好未来教育科技有限公司 | Text error correction adaptation method and apparatus, and electronic device, and storage medium |
CN112861517A (en) * | 2020-12-24 | 2021-05-28 | 杭州电子科技大学 | Chinese spelling error correction model |
CN115809655A (en) * | 2021-09-14 | 2023-03-17 | 华东师范大学 | Chinese character symbol correction method and system based on attribution network and BERT |
CN114417839A (en) * | 2022-01-19 | 2022-04-29 | 北京工业大学 | Entity relation joint extraction method based on global pointer network |
WO2023184633A1 (en) * | 2022-03-31 | 2023-10-05 | 上海蜜度信息技术有限公司 | Chinese spelling error correction method and system, storage medium, and terminal |
CN114912419A (en) * | 2022-04-19 | 2022-08-16 | 中国人民解放军国防科技大学 | Unified machine reading understanding method based on reorganization confrontation |
CN115080715A (en) * | 2022-05-30 | 2022-09-20 | 重庆理工大学 | Span extraction reading understanding method based on residual error structure and bidirectional fusion attention |
CN115438154A (en) * | 2022-09-19 | 2022-12-06 | 上海大学 | Chinese automatic speech recognition text restoration method and system based on representation learning |
CN115690002A (en) * | 2022-10-11 | 2023-02-03 | 河海大学 | Remote sensing image change detection method and system based on Transformer and dense feature fusion |
CN116127952A (en) * | 2023-01-16 | 2023-05-16 | 之江实验室 | Multi-granularity Chinese text error correction method and device |
CN116187334A (en) * | 2023-04-20 | 2023-05-30 | 山东齐鲁壹点传媒有限公司 | Comment generation method based on mt5 model fusion ner entity identification |
CN116757164A (en) * | 2023-06-21 | 2023-09-15 | 张丽莉 | GPT generation language recognition and detection system |
CN116822464A (en) * | 2023-07-03 | 2023-09-29 | 成都数之联科技股份有限公司 | Text error correction method, system, equipment and storage medium |
Non-Patent Citations (3)
Title |
---|
Guo, W.D.; Chen, W.B.; Chang, C.H. Prediction of hourly inflow for reservoirs at mountain catchments using residual error data and multiple-ahead correction technique. Hydrology Research, 2023, pp. 1072-1093. *
Wang Quanbin; Tan Ying. A Chinese grammatical error correction method based on data augmentation and copying. CAAI Transactions on Intelligent Systems, (01), pp. 105-112. *
Sun Qiujie; Liang Jinggui; Li Si. A Chinese grammatical error correction model based on a BART noiser. Journal of Computer Applications, 2022, pp. 860-866. *
Also Published As
Publication number | Publication date |
---|---|
CN117151084A (en) | 2023-12-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11113479B2 (en) | Utilizing a gated self-attention memory network model for predicting a candidate answer match to a query | |
CN112270379B (en) | Training method of classification model, sample classification method, device and equipment | |
US11210306B2 (en) | Dialogue system, a method of obtaining a response from a dialogue system, and a method of training a dialogue system | |
US11741109B2 (en) | Dialogue system, a method of obtaining a response from a dialogue system, and a method of training a dialogue system | |
CN107836000B (en) | Improved artificial neural network method and electronic device for language modeling and prediction | |
CN109887484B (en) | Dual learning-based voice recognition and voice synthesis method and device | |
WO2022142041A1 (en) | Training method and apparatus for intent recognition model, computer device, and storage medium | |
CN110046248B (en) | Model training method for text analysis, text classification method and device | |
CN117151084B (en) | Chinese spelling and grammar error correction method, storage medium and equipment | |
CN111859978A (en) | Emotion text generation method based on deep learning | |
CN110737764A (en) | personalized dialogue content generating method | |
WO2023197613A1 (en) | Small sample fine-turning method and system and related apparatus | |
CN111354333B (en) | Self-attention-based Chinese prosody level prediction method and system | |
KR20190061488A (en) | A program coding system based on artificial intelligence through voice recognition and a method thereof | |
Pramanik et al. | Text normalization using memory augmented neural networks | |
CN116308754B (en) | Bank credit risk early warning system and method thereof | |
CN114417872A (en) | Contract text named entity recognition method and system | |
CN115293139A (en) | Training method of voice transcription text error correction model and computer equipment | |
CN115293138A (en) | Text error correction method and computer equipment | |
CN114281982B (en) | Book propaganda abstract generation method and system adopting multi-mode fusion technology | |
CN117151121B (en) | Multi-intention spoken language understanding method based on fluctuation threshold and segmentation | |
Huang et al. | Fast Neural Network Language Model Lookups at N-Gram Speeds. | |
CN117668157A (en) | Retrieval enhancement method, device, equipment and medium based on knowledge graph | |
CN110543566B (en) | Intention classification method based on self-attention neighbor relation coding | |
CN115129826B (en) | Electric power field model pre-training method, fine tuning method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | ||
CB03 | Change of inventor or designer information |
Inventor after: Song Yao; Wei Chuanqiang; Si Junbo; Li Zhe; Liu Peng

Inventor before: Song Yao; Wei Chuanqiang; Si Junbo; Li Zhe; Liu Peng
GR01 | Patent grant | ||
GR01 | Patent grant |