CN117151084B - Chinese spelling and grammar error correction method, storage medium and equipment - Google Patents

Chinese spelling and grammar error correction method, storage medium and equipment Download PDF

Info

Publication number
CN117151084B
CN117151084B CN202311425616.9A
Authority
CN
China
Prior art keywords
loss
sequence
spelling
length
error correction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311425616.9A
Other languages
Chinese (zh)
Other versions
CN117151084A (en)
Inventor
宋耀
魏传强
司君波
李喆
刘鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Qilu Yidian Media Co ltd
Original Assignee
Shandong Qilu Yidian Media Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Qilu Yidian Media Co ltd filed Critical Shandong Qilu Yidian Media Co ltd
Priority to CN202311425616.9A priority Critical patent/CN117151084B/en
Publication of CN117151084A publication Critical patent/CN117151084A/en
Application granted granted Critical
Publication of CN117151084B publication Critical patent/CN117151084B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/232Orthographic correction, e.g. spell checking or vowelisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/253Grammatical analysis; Style critique
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Machine Translation (AREA)

Abstract

The invention belongs to the technical field of language processing, and particularly relates to a Chinese spelling and grammar error correction method, a storage medium and equipment, which can detect and correct spelling errors and grammar errors in an input text. On the basis that the original RoBERTa model can only handle spelling error correction tasks, the method corrects text spelling errors and text grammar errors at the same time by adding an improved generator, so that error correction efficiency is significantly improved.

Description

Chinese spelling and grammar error correction method, storage medium and equipment
Technical Field
The invention belongs to language processing technology, and particularly relates to a Chinese spelling and grammar error correction method, a storage medium and equipment.
Background
In conventional natural language processing, spelling correction and grammar correction are typically two tasks that are handled separately. Spelling error correction is focused on detecting and correcting spelling errors, while grammar error correction aims at repairing grammar errors. However, this method of separate processing may result in loss of information and accumulation of errors. The native RoBERTa model can only handle text spelling error correction, and cannot detect grammar errors at the same time.
Disclosure of Invention
In order to handle spelling and grammar error correction simultaneously, the application provides a new RoBERTa-based method that realizes unified detection and correction of text spelling errors and grammar errors. By inputting text containing errors into the new model, more comprehensive context information can be obtained and compared with the correct text, so that spelling and grammar errors are discovered and corrected at the same time. This integrated approach can better handle complex error cases in text and has higher accuracy and robustness. The technical solution is as follows:
a Chinese spelling and grammar error correction method comprises the following steps:
S1, using a RoBERTa encoder model, encode the input sequence X = (x_1, x_2, ..., x_n) to obtain an output sequence H = (h_1, h_2, ..., h_n), where x_n is the token at the n-th position of the input sequence X and h_n is the representation at the n-th position of the output sequence H;
s2, adding a CNN convolution layer after the RoBERTa encoder model outputs a sequence H, and extracting a local feature C output by the encoder through a convolution kernel to obtain a local feature tensor; fusing the local feature tensor and the encoder output sequence H through residual connection to obtain a fused semantic representation sequence H';
s3, carrying out maximum pooling operation on the fused semantic representation sequence H' to obtain a representation vector V with a fixed length;
s4, transmitting the representation vector V into a full connection layer to obtain the prediction distribution of the length of the target sequence;
s5, inputting the output of the encoder and the target word into a decoder module, and enabling the decoder to simultaneously correct spelling errors and repair grammar errors by combining an attention mechanism and a pointer network.
Preferably, in step S2, the local feature C of the output is extracted, and the specific formula is as follows:
C=Conv1D(H);
wherein Conv1D is a 1-dimensional convolution function.
Preferably, in step S2, the extracted local feature tensor and the output sequence H of the RoBERTa model are combined and fused through residual connection to obtain a fused semantic representation sequence H' = (h'_1, h'_2, ..., h'_n), and the fused semantic representation sequence H' is taken as the input of step S3.
Preferably, in step S4, target sequence length prediction is performed on the pooled representation vector V through the fully connected layer to obtain the prediction distribution p_len:
p_len = WV + b;
where W is the fully connected layer weight and b is the bias term.
Preferably, in step S5,
S51, calculating the attention weight a_t:
a_t = softmax(v·tanh(W_h·h_i + W_s·s_(t-1) + b_attn))
where h_i is the encoder output at the i-th position, b_attn is a bias parameter, v is a learnable weight matrix used to map the context information in the attention mechanism to the appropriate dimension, and W_h and W_s are learnable weight matrices used to map h_i and the decoder state s_(t-1) of the previous time step to the appropriate dimension;
S52, based on a_t, generating the context vector c_t:
c_t = Σ_i a_ti · h_i
where a_ti represents the attention weight on the i-th position of the input sequence at time t, and h_i represents the RoBERTa semantic feature of the i-th position;
S53, taking c_t as input, updating the decoder state s_t:
s_t = RNN([s_(t-1), c_t])
S54, from s_t and c_t, calculating the probability distribution p_vocab of generated words, and generating the copy probability distribution p_copy over the input sequence based on H:
p_vocab = softmax(Linear([c_t, s_t]))
p_copy = sigmoid(Linear(H))
S55, generating the final distribution p using the pointer mechanism:
p = p_copy * a_t + (1 - p_copy) * p_vocab
Preferably, in step S5, a loss function Loss is calculated, where the loss function includes a generation loss, a pointer network loss and a length prediction loss, and the calculation process is as follows:
calculating the generation loss: the cross entropy loss between the decoder's prediction distribution over the vocabulary and the target word,
loss1 = CrossEntropyLoss(p, y);
where y is the target word;
calculating the pointer network loss: the loss of directly copying original words with the pointer network, computed as the cross entropy between the attention (copy) distribution and the target word,
loss2 = CrossEntropyLoss(p_copy, y);
calculating the length prediction loss: the loss between the predicted length and the target length,
loss3 = L1Loss(p_len, l_tag)
where p_len is the predicted length and l_tag is the target length;
synthesizing the loss function Loss:
Loss = w_1·loss1 + w_2·loss2 + w_3·loss3
where w_1, w_2 and w_3 are loss weights learned during model training.
A computer-readable storage medium containing program instructions stored thereon which, when executed, perform the RoBERTa-based spelling and grammar error correction method.
An electronic device comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, perform the steps of the RoBERTa-based spelling and grammar error correction method.
Compared with the prior art, the beneficial effects of the application are as follows:
the invention realizes the unification of text spelling check and grammar check by adding the modules such as the convolution layer, residual error connection, pointer network and the like on the RoBERTa model, optimizes the error correction performance and has obvious technical progress.
Drawings
Fig. 1 is a flow chart of the present application.
Detailed Description
The following detailed description is exemplary and is intended to provide further explanation of the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the present application.
RoBERTa was proposed in the paper "RoBERTa: A Robustly Optimized BERT Pretraining Approach". It is an enhanced version of BERT, obtained by more carefully tuning the BERT pretraining procedure; as its name suggests, RoBERTa is a robustly (brute-force) optimized variant of BERT.
A Chinese spelling and grammar error correction method comprises the following steps:
S1, using a RoBERTa encoder model, take the Chinese sentence to be corrected as the input sequence X = (x_1, x_2, ..., x_n) and encode it to obtain the output sequence H = (h_1, h_2, ..., h_n);
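For illustration, the following is a minimal Python (PyTorch / Hugging Face transformers) sketch of step S1; the specific RoBERTa checkpoint and the example sentence are assumptions for demonstration and are not specified by the patent.

    import torch
    from transformers import BertTokenizerFast, BertModel

    # Chinese RoBERTa checkpoint chosen for illustration only (assumption)
    name = "hfl/chinese-roberta-wwm-ext"
    tokenizer = BertTokenizerFast.from_pretrained(name)
    encoder = BertModel.from_pretrained(name)

    sentence = "今天天汽真好"  # hypothetical sentence containing a spelling error
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # output sequence H = (h_1, ..., h_n), shape (1, n, 768)
        H = encoder(**inputs).last_hidden_state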
S2, adding a CNN convolution layer after the RoBERTa encoder model outputs a sequence H, and extracting a local feature C output by the encoder through a convolution kernel to obtain a local feature tensor; fusing the local feature tensor and the encoder output sequence H through residual connection to obtain a fused semantic representation sequence H';
the local feature C of the output is extracted, and the specific formula is as follows:
C=Conv1D(H);
wherein Conv1D is a 1-dimensional convolution function.
The extracted local feature tensor and the output sequence H of the RoBERTa model are combined and fused through residual connection to obtain a fused semantic representation sequence H' = (h'_1, h'_2, ..., h'_n), and the fused semantic representation sequence H' is taken as the input of step S3.
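A minimal PyTorch sketch of step S2 follows; the kernel size and padding are assumptions, since the patent only specifies a 1-dimensional convolution followed by a residual connection.

    import torch
    import torch.nn as nn

    d_model = 768                      # hidden size of the RoBERTa encoder
    conv1d = nn.Conv1d(in_channels=d_model, out_channels=d_model,
                       kernel_size=3, padding=1)   # kernel size is an assumption

    def fuse_local_features(H):
        # H: (batch, n, d_model); Conv1d expects (batch, channels, length)
        C = conv1d(H.transpose(1, 2)).transpose(1, 2)   # local feature tensor C = Conv1D(H)
        return H + C                                    # residual connection yields the fused H'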
S3, carrying out maximum pooling on the input H' to obtain a fixed-length representation vector V.
S4, transmitting the representation vector V into a full connection layer to obtain the prediction distribution of the length of the target sequence;
The length of the target sequence is predicted by passing the pooled representation vector V through the fully connected layer to obtain the prediction distribution p_len:
p_len = WV + b;
where W is the fully connected layer weight and b is the bias term.
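A minimal PyTorch sketch of steps S3-S4 follows. The patent trains the length head with an L1 loss against the target length (see loss3 below), so a scalar length output is one plausible reading of p_len; the exact head shape is treated as an assumption here.

    import torch
    import torch.nn as nn

    length_head = nn.Linear(768, 1)            # p_len = W V + b

    def predict_length(H_fused):
        # H_fused: (batch, n, 768), the fused semantic representation H'
        V = H_fused.max(dim=1).values          # max pooling -> fixed-length representation vector V
        p_len = length_head(V).squeeze(-1)     # predicted target-sequence length
        return p_len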
S5, inputting the output of the encoder and the target word into a decoder module, and enabling the decoder to simultaneously correct spelling errors and repair grammar errors by combining an attention mechanism and a pointer network.
S51, calculating the attention weight a_t:
a_t = softmax(v·tanh(W_h·h_i + W_s·s_(t-1) + b_attn))
where h_i is the encoder output at the i-th position, b_attn is a bias parameter, v is a learnable weight matrix used to map the context information in the attention mechanism to the appropriate dimension, and W_h and W_s are learnable weight matrices used to map h_i and the decoder state s_(t-1) of the previous time step to the appropriate dimension;
S52, based on a_t, generating the context vector c_t:
c_t = Σ_i a_ti · h_i
where a_ti represents the attention weight on the i-th position of the input sequence at time t, and h_i represents the RoBERTa semantic feature of the i-th position.
S53, taking c_t as input, updating the decoder state s_t:
s_t = RNN([s_(t-1), c_t])
S54, from s_t and c_t, calculating the probability distribution p_vocab of generated words, and generating the copy probability distribution p_copy over the input sequence based on H:
p_vocab = softmax(Linear([c_t, s_t]))
p_copy = sigmoid(Linear(H))
S55, generating the final distribution p using the pointer mechanism:
p = p_copy * a_t + (1 - p_copy) * p_vocab
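A minimal PyTorch sketch of one decoding step (S51-S55) follows. The RNN cell type, the hidden size, the reduction of sigmoid(Linear(H)) to a scalar copy gate, and the scattering of a_t onto the vocabulary ids of the source tokens (standard pointer-generator practice) are assumptions beyond what the patent states.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    d, vocab_size = 768, 21128                  # hidden size and vocabulary size (assumed values)
    W_h = nn.Linear(d, d, bias=False)
    W_s = nn.Linear(d, d, bias=False)
    v = nn.Linear(d, 1)                         # its bias term plays the role of b_attn
    rnn = nn.GRUCell(d, d)                      # the patent only says "RNN"
    gen_head = nn.Linear(2 * d, vocab_size)
    copy_gate = nn.Linear(d, 1)

    def decode_step(H, src_ids, s_prev):
        # S51: attention weights a_t over the encoder outputs h_i
        e = v(torch.tanh(W_h(H) + W_s(s_prev).unsqueeze(1))).squeeze(-1)
        a_t = F.softmax(e, dim=-1)                                    # (batch, n)
        # S52: context vector c_t = sum_i a_ti * h_i
        c_t = torch.bmm(a_t.unsqueeze(1), H).squeeze(1)
        # S53: update the decoder state s_t
        s_t = rnn(c_t, s_prev)
        # S54: generation distribution and copy probability
        p_vocab = F.softmax(gen_head(torch.cat([c_t, s_t], dim=-1)), dim=-1)
        p_copy = torch.sigmoid(copy_gate(H)).mean(dim=1)              # scalar gate per example (assumption)
        # S55: final distribution p = p_copy * a_t + (1 - p_copy) * p_vocab,
        # with a_t scattered onto the source-token vocabulary ids
        p = (1 - p_copy) * p_vocab
        p = p.scatter_add(1, src_ids, p_copy * a_t)
        return p, a_t, s_t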
Calculating a loss function Loss, where the loss function includes a generation loss, a pointer network loss and a length prediction loss, the calculation process is as follows:
calculating the generation loss: the cross entropy loss between the decoder's prediction distribution over the vocabulary and the target word,
loss1 = CrossEntropyLoss(p, y);
where y is the target word;
calculating the pointer network loss: the loss of directly copying original words with the pointer network, computed as the cross entropy between the attention (copy) distribution and the target word,
loss2 = CrossEntropyLoss(p_copy, y);
calculating the length prediction loss: the loss between the predicted length and the target length,
loss3 = L1Loss(p_len, l_tag)
where p_len is the predicted length and l_tag is the target length;
synthesizing the loss function Loss:
Loss = w_1·loss1 + w_2·loss2 + w_3·loss3
where w_1, w_2 and w_3 are loss weights learned during model training.
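A minimal PyTorch sketch of the combined loss follows. Treating w_1, w_2, w_3 as learnable parameters is one reading of "learned during model training"; in PyTorch, cross entropy over an already-normalized distribution p is implemented as the negative log-likelihood of log p.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    w = nn.Parameter(torch.ones(3))   # loss weights w_1, w_2, w_3 (registered in the model in practice)

    def total_loss(p, copy_dist, y, p_len, l_tag):
        # p, copy_dist: (batch, vocab_size) probability distributions; y: (batch,) target token ids
        loss1 = F.nll_loss(torch.log(p + 1e-12), y)          # generation loss
        loss2 = F.nll_loss(torch.log(copy_dist + 1e-12), y)  # pointer network (copy) loss
        loss3 = F.l1_loss(p_len, l_tag)                      # length prediction loss
        return w[0] * loss1 + w[1] * loss2 + w[2] * loss3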
A computer-readable storage medium containing program instructions stored thereon which, when executed, perform the RoBERTa-based spelling and grammar error correction method.
An electronic device comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, perform the steps of the RoBERTa-based spelling and grammar error correction method.
Experimental data 1: spelling error correction effect
Model Precision Recall F1
BERT 0.8107 0.6390 0.7147
RoBERTa 0.825 0.7293 0.7742
The invention 0.8713 0.7634 0.8138
The experimental results show that the spelling error correction Precision, Recall and F1 scores of this method are significantly better than those of the BERT and original RoBERTa models, and the spelling error correction effect is greatly improved.
Experimental data 2: spelling and grammar error correction effect
Model Precision Recall F0.5
Convseq2seq 0.362 0.354 0.360
T5 0.506 0.496 0.504
The invention 0.576 0.567 0.574
As can be seen from the experimental results, combined spelling and grammar error correction is greatly improved in both Precision and Recall compared with Convseq2seq.
Experimental data 3: execution efficiency
Model QPS
BERT 3
RoBERTa 3
Convseq2seq 5
T5 94
The invention 51
The T5 model is a pretrained language model based on the Transformer architecture, with advantages such as high training efficiency, strong generalization ability, and adaptability to a variety of natural language processing tasks.
Most natural language generation tasks are implemented on the basis of Seq2Seq models; ConvSeq2Seq is a relatively new CNN-based approach.
In terms of computational efficiency, because the invention combines spelling and grammar error correction, it is slower than models such as BERT and RoBERTa that can only perform spelling error correction, but it is significantly faster than grammar error correction models.

Claims (6)

1. A Chinese spelling and grammar error correction method is characterized by comprising the following steps:
S1, using a RoBERTa encoder model, encode the input sequence X = (x_1, x_2, ..., x_n) to obtain an output sequence H = (h_1, h_2, ..., h_n), where x_n is the token at the n-th position of the input sequence X and h_n is the representation at the n-th position of the output sequence H;
s2, adding a CNN convolution layer after the RoBERTa encoder model outputs a sequence H, and extracting a local feature C output by the encoder through a convolution kernel to obtain a local feature tensor; fusing the local feature tensor and the encoder output sequence H through residual connection to obtain a fused semantic representation sequence H';
s3, carrying out maximum pooling operation on the fused semantic representation sequence H' to obtain a representation vector V with a fixed length;
s4, transmitting the representation vector V into a full connection layer to obtain the prediction distribution of the length of the target sequence;
s5, inputting the output of the encoder and the target word y into a decoder module, and enabling the decoder to simultaneously correct spelling errors and repair grammar errors by combining an attention mechanism and a pointer network;
S51, calculating the attention weight a_t:
a_t = softmax(v·tanh(W_h·h_i + W_s·s_(t-1) + b_attn))
where h_i is the encoder output at the i-th position, b_attn is a bias parameter, v is a learnable weight matrix used to map the context information in the attention mechanism to the appropriate dimension, and W_h and W_s are learnable weight matrices used to map h_i and the decoder state s_(t-1) of the previous time step to the appropriate dimension;
S52, based on a_t, generating the context vector c_t:
c_t = Σ_i a_ti · h_i
where a_ti represents the attention weight on the i-th position of the input sequence at time t, and h_i represents the RoBERTa semantic feature of the i-th position;
S53, taking c_t as input, updating the decoder state s_t:
s_t = RNN([s_(t-1), c_t])
S54, from s_t and c_t, calculating the probability distribution p_vocab of generated words, and generating the copy probability distribution p_copy over the input sequence based on H:
p_vocab = softmax(Linear([c_t, s_t]))
p_copy = sigmoid(Linear(H))
S55, generating the final distribution p using the pointer mechanism:
p = p_copy * a_t + (1 - p_copy) * p_vocab
Calculating a loss function Loss, where the loss function includes a generation loss, a pointer network loss and a length prediction loss, the calculation process is as follows:
calculating the generation loss: the cross entropy loss between the decoder's prediction distribution over the vocabulary and the target word,
loss1 = CrossEntropyLoss(p, y);
where y is the target word;
calculating the pointer network loss: the loss of directly copying original words with the pointer network, computed as the cross entropy between the attention (copy) distribution and the target word,
loss2 = CrossEntropyLoss(p_copy, y);
calculating the length prediction loss: the loss between the predicted length and the target length,
loss3 = L1Loss(p_len, l_tag)
where p_len is the predicted length and l_tag is the target length;
synthesizing the loss function Loss:
Loss = w_1·loss1 + w_2·loss2 + w_3·loss3
where w_1, w_2 and w_3 are loss weights learned during model training.
2. The Chinese spelling and grammar error correction method according to claim 1, wherein in step S2, the output local feature C is extracted by the following formula:
C=Conv1D(H);
wherein Conv1D is a 1-dimensional convolution function.
3. The method of claim 2, wherein in step S2, the extracted local feature tensor and the output sequence H of the RoBERTa model are combined and fused by residual connection to obtain a fused semantic representation sequence H' = (h'_1, h'_2, ..., h'_n), and the fused semantic representation sequence H' is taken as the input of step S3.
4. The Chinese spelling and grammar error correction method according to claim 1, wherein in step S4, target sequence length prediction is performed on the representation vector V through a fully connected layer to obtain a prediction distribution p_len:
p_len = WV + b;
where W is the fully connected layer weight and b is the bias term.
5. A computer-readable storage medium containing program instructions stored thereon which, when executed, perform the Chinese spelling and grammar error correction method according to any one of claims 1-4.
6. An electronic device, comprising: a memory and a processor, the memory storing computer readable instructions that, when executed by the processor, perform the steps in the method of any of claims 1-4.
CN202311425616.9A 2023-10-31 2023-10-31 Chinese spelling and grammar error correction method, storage medium and equipment Active CN117151084B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311425616.9A CN117151084B (en) 2023-10-31 2023-10-31 Chinese spelling and grammar error correction method, storage medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311425616.9A CN117151084B (en) 2023-10-31 2023-10-31 Chinese spelling and grammar error correction method, storage medium and equipment

Publications (2)

Publication Number Publication Date
CN117151084A CN117151084A (en) 2023-12-01
CN117151084B true CN117151084B (en) 2024-02-23

Family

ID=88910495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311425616.9A Active CN117151084B (en) 2023-10-31 2023-10-31 Chinese spelling and grammar error correction method, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN117151084B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117454906B (en) * 2023-12-22 2024-05-24 创云融达信息技术(天津)股份有限公司 Text proofreading method and system based on natural language processing and machine learning

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861517A (en) * 2020-12-24 2021-05-28 杭州电子科技大学 Chinese spelling error correction model
WO2021164310A1 (en) * 2020-02-21 2021-08-26 华为技术有限公司 Text error correction method and apparatus, and terminal device and computer storage medium
CN114417839A (en) * 2022-01-19 2022-04-29 北京工业大学 Entity relation joint extraction method based on global pointer network
WO2022095563A1 (en) * 2020-11-06 2022-05-12 北京世纪好未来教育科技有限公司 Text error correction adaptation method and apparatus, and electronic device, and storage medium
CN114912419A (en) * 2022-04-19 2022-08-16 中国人民解放军国防科技大学 Unified machine reading understanding method based on reorganization confrontation
CN115080715A (en) * 2022-05-30 2022-09-20 重庆理工大学 Span extraction reading understanding method based on residual error structure and bidirectional fusion attention
CN115438154A (en) * 2022-09-19 2022-12-06 上海大学 Chinese automatic speech recognition text restoration method and system based on representation learning
CN115690002A (en) * 2022-10-11 2023-02-03 河海大学 Remote sensing image change detection method and system based on Transformer and dense feature fusion
CN115809655A (en) * 2021-09-14 2023-03-17 华东师范大学 Chinese character symbol correction method and system based on attribution network and BERT
CN116127952A (en) * 2023-01-16 2023-05-16 之江实验室 Multi-granularity Chinese text error correction method and device
CN116187334A (en) * 2023-04-20 2023-05-30 山东齐鲁壹点传媒有限公司 Comment generation method based on mt5 model fusion ner entity identification
CN116757164A (en) * 2023-06-21 2023-09-15 张丽莉 GPT generation language recognition and detection system
CN116822464A (en) * 2023-07-03 2023-09-29 成都数之联科技股份有限公司 Text error correction method, system, equipment and storage medium
WO2023184633A1 (en) * 2022-03-31 2023-10-05 上海蜜度信息技术有限公司 Chinese spelling error correction method and system, storage medium, and terminal

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021164310A1 (en) * 2020-02-21 2021-08-26 华为技术有限公司 Text error correction method and apparatus, and terminal device and computer storage medium
WO2022095563A1 (en) * 2020-11-06 2022-05-12 北京世纪好未来教育科技有限公司 Text error correction adaptation method and apparatus, and electronic device, and storage medium
CN112861517A (en) * 2020-12-24 2021-05-28 杭州电子科技大学 Chinese spelling error correction model
CN115809655A (en) * 2021-09-14 2023-03-17 华东师范大学 Chinese character symbol correction method and system based on attribution network and BERT
CN114417839A (en) * 2022-01-19 2022-04-29 北京工业大学 Entity relation joint extraction method based on global pointer network
WO2023184633A1 (en) * 2022-03-31 2023-10-05 上海蜜度信息技术有限公司 Chinese spelling error correction method and system, storage medium, and terminal
CN114912419A (en) * 2022-04-19 2022-08-16 中国人民解放军国防科技大学 Unified machine reading understanding method based on reorganization confrontation
CN115080715A (en) * 2022-05-30 2022-09-20 重庆理工大学 Span extraction reading understanding method based on residual error structure and bidirectional fusion attention
CN115438154A (en) * 2022-09-19 2022-12-06 上海大学 Chinese automatic speech recognition text restoration method and system based on representation learning
CN115690002A (en) * 2022-10-11 2023-02-03 河海大学 Remote sensing image change detection method and system based on Transformer and dense feature fusion
CN116127952A (en) * 2023-01-16 2023-05-16 之江实验室 Multi-granularity Chinese text error correction method and device
CN116187334A (en) * 2023-04-20 2023-05-30 山东齐鲁壹点传媒有限公司 Comment generation method based on mt5 model fusion ner entity identification
CN116757164A (en) * 2023-06-21 2023-09-15 张丽莉 GPT generation language recognition and detection system
CN116822464A (en) * 2023-07-03 2023-09-29 成都数之联科技股份有限公司 Text error correction method, system, equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Guo, WD; Chen, WB; Chang, CH. Prediction of hourly inflow for reservoirs at mountain catchments using residual error data and multiple-ahead correction technique. HYDROLOGY RESEARCH. 2023, pp. 1072-1093. *
Chinese grammatical error correction method based on data augmentation and copying; Wang Quanbin, Tan Ying; CAAI Transactions on Intelligent Systems (01); pp. 105-112 *
Sun Qiujie; Liang Jinggui; Li Si. Chinese grammatical error correction model based on a BART noiser. Journal of Computer Applications. 2022, pp. 860-866. *

Also Published As

Publication number Publication date
CN117151084A (en) 2023-12-01

Similar Documents

Publication Publication Date Title
US11113479B2 (en) Utilizing a gated self-attention memory network model for predicting a candidate answer match to a query
CN112270379B (en) Training method of classification model, sample classification method, device and equipment
US11210306B2 (en) Dialogue system, a method of obtaining a response from a dialogue system, and a method of training a dialogue system
US11741109B2 (en) Dialogue system, a method of obtaining a response from a dialogue system, and a method of training a dialogue system
CN107836000B (en) Improved artificial neural network method and electronic device for language modeling and prediction
CN109887484B (en) Dual learning-based voice recognition and voice synthesis method and device
WO2022142041A1 (en) Training method and apparatus for intent recognition model, computer device, and storage medium
CN110046248B (en) Model training method for text analysis, text classification method and device
CN117151084B (en) Chinese spelling and grammar error correction method, storage medium and equipment
CN111859978A (en) Emotion text generation method based on deep learning
CN110737764A (en) personalized dialogue content generating method
WO2023197613A1 (en) Small sample fine-turning method and system and related apparatus
CN111354333B (en) Self-attention-based Chinese prosody level prediction method and system
KR20190061488A (en) A program coding system based on artificial intelligence through voice recognition and a method thereof
Pramanik et al. Text normalization using memory augmented neural networks
CN116308754B (en) Bank credit risk early warning system and method thereof
CN114417872A (en) Contract text named entity recognition method and system
CN115293139A (en) Training method of voice transcription text error correction model and computer equipment
CN115293138A (en) Text error correction method and computer equipment
CN114281982B (en) Book propaganda abstract generation method and system adopting multi-mode fusion technology
CN117151121B (en) Multi-intention spoken language understanding method based on fluctuation threshold and segmentation
Huang et al. Fast Neural Network Language Model Lookups at N-Gram Speeds.
CN117668157A (en) Retrieval enhancement method, device, equipment and medium based on knowledge graph
CN110543566B (en) Intention classification method based on self-attention neighbor relation coding
CN115129826B (en) Electric power field model pre-training method, fine tuning method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
CB03 Change of inventor or designer information

Inventor after: Song Yao

Inventor after: Wei Chuanqiang

Inventor after: Si Junbo

Inventor after: Li Zhe

Inventor after: Liu Peng

Inventor before: Song Yao

Inventor before: Wei Chuanqiang

Inventor before: Si Junbo

Inventor before: Li Zhe

Inventor before: Liu Peng

GR01 Patent grant
GR01 Patent grant