CN117787266A - Large language model text error correction method and device based on pre-training knowledge embedding - Google Patents

Large language model text error correction method and device based on pre-training knowledge embedding

Info

Publication number
CN117787266A
CN117787266A
Authority
CN
China
Prior art keywords
text
knowledge
model
error correction
knowledge base
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311810975.6A
Other languages
Chinese (zh)
Inventor
裴唯一
靳国庆
李宏亮
余栋
李君�
张勇东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Konami Sports Club Co Ltd
Original Assignee
People Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by People Co Ltd
Priority to CN202311810975.6A
Publication of CN117787266A
Legal status: Pending

Landscapes

  • Machine Translation (AREA)

Abstract

The embodiment of the invention discloses a large language model text error correction method and device based on pre-training knowledge embedding, wherein the method comprises the following steps: inputting a text to be corrected into a knowledge base model obtained by pre-training, inputting text error correction task information into a task encoder, and connecting the output of the knowledge base model with the output of the task encoder through a task adapter to obtain a knowledge-embedded feature vector; and inputting the text to be corrected and the knowledge-embedded feature vector into a preset error correction large model obtained by pre-training to obtain the corrected text. Professional knowledge in the knowledge base is integrated into the error correction large model through the task adapter and the knowledge base model, so that the error correction large model can handle text error correction tasks more accurately, improving both error correction precision and efficiency.

Description

Large language model text error correction method and device based on pre-training knowledge embedding
Technical Field
The embodiment of the invention relates to the technical field of natural language processing, in particular to a large language model text error correction method and device based on pre-training knowledge embedding.
Background
Text error correction is an important task in natural language processing whose goal is to correct errors in the original text, such as wrongly written characters, extra characters, missing characters, and disordered characters. It is widely used in fields such as automatic document proofreading, archiving, and security inspection.
Existing text error correction mainly adopts either a two-stage method that combines detection and correction, or an end-to-end method based on the Transformer architecture. The two-stage method must first detect the text portions that may be erroneous and then correct each error. This approach can use different models for the detection and correction tasks, offers flexibility, and can use lighter models; however, it may fragment the contextual information of the text, leading to inaccurate judgments by the error correction model. The Transformer-based end-to-end method suffers from input length limitations, the model hallucination problem, and the like, so the accuracy of error correction cannot be guaranteed.
Disclosure of Invention
In view of the foregoing, embodiments of the present invention have been developed to provide a large language model text error correction method and device based on pre-training knowledge embedding that overcome, or at least partially solve, the foregoing problems.
According to an aspect of an embodiment of the present invention, there is provided a large language model text correction method based on pre-training knowledge embedding, including:
inputting a text to be corrected into a knowledge base model obtained by pre-training, inputting text error correction task information into a task encoder, and connecting the output of the knowledge base model with the output of the task encoder through a task adapter to obtain a knowledge-embedded feature vector;
and inputting the text to be corrected and the knowledge-embedded feature vector into a preset error correction large model obtained by pre-training to obtain the corrected text.
According to another aspect of the embodiment of the present invention, there is provided a large language model text correction device based on pre-training knowledge embedding, the device comprising:
the adapter module is adapted to input the text to be corrected into the pre-trained knowledge base model, input the text error correction task information into the task encoder, and connect the output of the knowledge base model with the output of the task encoder via the task adapter to obtain the knowledge-embedded feature vector;
the error correction module is adapted to input the text to be corrected and the knowledge-embedded feature vector into the preset error correction large model obtained by pre-training to obtain the corrected text.
According to yet another aspect of an embodiment of the present invention, there is provided a computing device comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform the operations corresponding to the above large language model text error correction method based on pre-training knowledge embedding.
According to still another aspect of the embodiments of the present invention, there is provided a computer storage medium having stored therein at least one executable instruction for causing a processor to perform the operations corresponding to the large language model text error correction method based on pre-training knowledge embedding as described above.
According to the large language model text error correction method and device based on pre-training knowledge embedding provided by the embodiments of the invention, professional knowledge in the knowledge base is integrated into the error correction large model through the task adapter and the knowledge base model, so that the error correction large model can handle text error correction tasks more accurately, improving both error correction precision and efficiency.
The foregoing is merely an overview of the technical solutions of the embodiments of the present invention. In order that the technical means of the embodiments of the present invention may be understood more clearly and implemented according to the contents of the specification, specific implementations of the embodiments of the present invention are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 illustrates a flow diagram of a large language model text correction method based on pre-training knowledge embedding, in accordance with an embodiment of the invention;
FIG. 2 shows a text error correction overall architecture diagram;
FIG. 3 shows a schematic structural diagram of a large language model text correction device based on pre-training knowledge embedding in accordance with an embodiment of the invention;
FIG. 4 illustrates a schematic diagram of a computing device, according to one embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
FIG. 1 shows a flow chart of a large language model text error correction method based on pre-training knowledge embedding according to one embodiment of the invention. As shown in FIG. 1, the method comprises the following steps:
step S101, inputting a text to be corrected into a pre-trained knowledge base model, inputting text correction task information into a task encoder, and connecting the output of the knowledge base model with the output of the task encoder through a task adapter to obtain a knowledge embedded feature vector.
When the text to be corrected is being corrected, the correction result may contain untrue information due to the model hallucination problem. In practical applications, the text to be corrected may also contain knowledge from professional fields, so this proprietary knowledge needs to be incorporated into the language model to improve correction accuracy. The text to be corrected may be text in a document or any other piece of text, which is not limited herein.
A knowledge base model is introduced to provide the preset error correction large model with knowledge from professional fields, helping it improve error correction accuracy. Specifically, the text to be corrected is input into the knowledge base model obtained by pre-training; the knowledge base model applies positional encoding to the text and, through a multi-head attention mechanism and the like, obtains knowledge information containing the professional knowledge. The knowledge base model can be built on a structure such as the BERT model and can be trained independently. Training can follow an unsupervised MLM (Masked Language Modeling) scheme: some characters in the corpus are randomly replaced with [MASK] tokens or wrong characters, and the original corpus is taken as the model output, thereby constructing an unsupervised training process. The knowledge base model can provide error correction tasks at different levels according to requirements, such as character-level, phrase-level, and sentence-level error correction; for example, single characters can be modeled with random masks and long text can be modeled with the [gMASK] token, thereby realizing error correction at different levels. Error correction covers different tasks such as correcting wrongly written characters, correcting disordered text, and maintaining correct references, which are set according to the implementation and are not limited herein.
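Purely as an illustration of the MLM-style corruption described above (the masking rates, the [MASK] token convention, and the toy confusion vocabulary are assumptions, not values fixed by this disclosure), the corruption step could be sketched in Python as:

import random

MASK = "[MASK]"

def corrupt_for_mlm(chars, vocab, mask_prob=0.15, wrong_char_prob=0.3):
    """Return (corrupted_input, labels); the labels keep the original characters."""
    corrupted = list(chars)
    for i in range(len(corrupted)):
        if random.random() < mask_prob:
            if random.random() < wrong_char_prob:
                corrupted[i] = random.choice(vocab)  # replace with a wrong character
            else:
                corrupted[i] = MASK                  # replace with the mask token
    return corrupted, list(chars)

# toy confusable-character vocabulary, illustrative only
vocab = list("的地得在再做作他她它")
x, y = corrupt_for_mlm(list("今天天气很好"), vocab)
print(x, y)

Training then asks the model to recover y from x, which constructs the unsupervised process described above.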
The knowledge base model can be obtained through independent pre-training, and the specific training process comprises the following steps. First, knowledge base sample data is collected, for example by gathering the proprietary knowledge to which the text error correction task belongs, and the knowledge base sample data is preprocessed, for example by deduplication, removal of private data, and filtering of low-quality text. Then, knowledge enhancement (for example, expanding the number of training samples through synonym replacement, random insertion, random swapping, random deletion, and the like, thereby increasing the diversity of the training samples) and labeling (setting positive and negative samples, etc.) are applied to the preprocessed knowledge base sample data to obtain a knowledge base sample set. The knowledge base sample set is input into a preset knowledge base model for training to obtain the trained knowledge base model. According to the characteristics of the different professional fields to which the knowledge base sample set belongs, the knowledge base model performs substitution prediction such as MLM on the different knowledge base sample data, compares the substitution predictions with the knowledge base sample data to calculate a loss, and is optimized according to the loss to obtain the trained knowledge base model. The trained model has strong domain-knowledge encoding capability and can encode the text to be corrected to obtain knowledge information containing domain-specific knowledge. Because the knowledge base model is trained independently, it is more flexible and highly extensible.
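The knowledge-enhancement step (synonym replacement, random insertion, random swapping, random deletion) can be sketched as follows; the operations mirror common EDA-style augmentation, and the synonym table and operation count are illustrative assumptions:

import random

def augment(tokens, synonyms, n_ops=1):
    """Apply n_ops random edit operations to expand the training sample set."""
    tokens = list(tokens)
    for _ in range(n_ops):
        op = random.choice(["synonym", "insert", "swap", "delete"])
        i = random.randrange(len(tokens))
        if op == "synonym" and tokens[i] in synonyms:
            tokens[i] = random.choice(synonyms[tokens[i]])   # synonym replacement
        elif op == "insert":
            tokens.insert(i, random.choice(tokens))          # random insertion
        elif op == "swap" and len(tokens) > 1:
            j = random.randrange(len(tokens))
            tokens[i], tokens[j] = tokens[j], tokens[i]      # random swap
        elif op == "delete" and len(tokens) > 1:
            tokens.pop(i)                                    # random deletion
    return tokens

Each original sample can be augmented several times to increase the diversity of the knowledge base sample set.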
The text error correction task information indicates the direction of the error correction, for example "The year is XX; please correct the possible job-title errors in the document below", "Please correct the possibly erroneous expressions in the document below according to the contents of report XX", or "Please correct the possibly erroneous information in the document below". The text error correction task information is input into the task encoder, which encodes it to obtain task information; this task information can subsequently and explicitly inform the preset error correction large model so that it performs targeted error correction. The task encoder may, for example, use an MLP (Multi-Layer Perceptron) or any other suitable encoding scheme, which is not limited herein.
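One possible concrete form of the task encoder, assuming an MLP over embedded task-prompt tokens (the vocabulary size and dimensions are illustrative assumptions, e.g. a BERT-style Chinese vocabulary):

import torch
import torch.nn as nn

class TaskEncoder(nn.Module):
    def __init__(self, vocab_size=21128, d_model=768, d_hidden=768):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)   # embedding layer
        self.mlp = nn.Sequential(                        # MLP over the embeddings
            nn.Linear(d_model, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_hidden),
        )

    def forward(self, task_ids):               # task_ids: (batch, m) token IDs
        return self.mlp(self.embed(task_ids))  # E_T: (batch, m, d_hidden)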
The task adapter connects the knowledge information output by the knowledge base model with the task information output by the task encoder to obtain the knowledge-embedded feature vector, specifically as follows:
Y = Adapter(KnowledgeEncoder(D), TaskEncoder(T))
Adapter(E_K, E_T) = Linear(E_K, d_hidden) + Linear(E_T, d_hidden)
wherein Adapter denotes the algorithm of the task adapter, D is the text to be corrected, T is the text error correction task information, KnowledgeEncoder denotes the algorithm of the knowledge base model, and TaskEncoder denotes the algorithm of the task encoder. Y ∈ R^{n×d} is the knowledge-embedded feature vector, where n is the length of the text to be corrected and d is the hyperparameter hidden size. E_K is the knowledge information and E_T is the task information; when they are connected, each may first be processed by, for example, a Linear transform and the results combined, with d_hidden being the hidden dimension. The above is exemplary and determined according to the implementation; other schemes may be adopted, which is not limited herein.
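A direct, hedged rendering of the adapter formula: each encoder output is projected to the shared hidden size and the projections are summed. Mean-pooling the task information so the sum broadcasts over the n text positions is an assumption made here for shape compatibility, not something the formula above pins down:

import torch
import torch.nn as nn

class TaskAdapter(nn.Module):
    """Adapter(E_K, E_T) = Linear(E_K, d_hidden) + Linear(E_T, d_hidden)."""
    def __init__(self, d_knowledge, d_task, d_hidden):
        super().__init__()
        self.proj_k = nn.Linear(d_knowledge, d_hidden)
        self.proj_t = nn.Linear(d_task, d_hidden)

    def forward(self, e_k, e_t):
        # e_k: (batch, n, d_knowledge) knowledge information for n text tokens
        # e_t: (batch, m, d_task) task information, mean-pooled so it broadcasts
        t = self.proj_t(e_t).mean(dim=1, keepdim=True)   # (batch, 1, d_hidden)
        return self.proj_k(e_k) + t                      # Y: (batch, n, d_hidden)

adapter = TaskAdapter(d_knowledge=768, d_task=768, d_hidden=768)
y = adapter(torch.randn(2, 32, 768), torch.randn(2, 8, 768))
print(y.shape)   # torch.Size([2, 32, 768])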
Step S102, inputting the text to be corrected and the knowledge-embedded feature vector into the preset error correction large model obtained by pre-training to obtain the corrected text.
The preset error correction large model may employ a Transformer-based pre-trained model such as BERT or GPT, preferably adopting a seq2seq (sequence-to-sequence) model architecture for error correction. Specifically, the prediction process of the preset error correction large model may be defined as follows: the model continually predicts the next character from the known text until the prediction is an end symbol or the predicted length reaches the maximum length. The character prediction formula is:
x_{i+1} = BaseModel(D), D = {x_0, ..., x_i} (1)
wherein BaseModel is composed of nLayer Transformer layers (Layer), each of which contains components such as multi-head attention (MHAttn), a feed-forward network (FF), and layer normalization (LayerNorm). Formula (1) can also be expressed as:
x_{i+1} = Layer(D)^{nLayer} (2)
Layer(X) = Norm(Add(X, X, X)) (3)
Add(Q, K, V) = LayerNorm(Q + MHAttn(Q, K, V)) (4)
Norm(H) = LayerNorm(H + FF(H)) (5)
wherein, in the formulas, D is the sample text set used for training, x_i is the i-th token in D, x_{i+1} is the next character to be predicted, and X denotes the feature vector; the feature vectors in the formulas are used to illustrate each formula, and the corresponding actual parameters are set according to the implementation, which is not limited herein. Layer, Norm, LayerNorm, Add, MHAttn, and FF denote the algorithms of the corresponding components.
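The prediction loop of formula (1) amounts to standard greedy autoregressive decoding; a minimal sketch, where base_model is any callable mapping token IDs to next-token logits, and EOS_ID / MAX_LEN are assumed values:

import torch

EOS_ID, MAX_LEN = 2, 512   # assumed end symbol and maximum length

@torch.no_grad()
def generate(base_model, ids):
    # ids: (1, i) tensor holding the known text D = {x_0, ..., x_i}
    while ids.size(1) < MAX_LEN:
        logits = base_model(ids)                           # (1, i, vocab)
        next_id = logits[:, -1].argmax(-1, keepdim=True)   # x_{i+1}
        ids = torch.cat([ids, next_id], dim=1)             # extend the known text
        if next_id.item() == EOS_ID:                       # stop at the end symbol
            break
    return ids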
The preset error correction large model can be trained independently based on the above formulas, and the training process comprises the following steps. Sample texts are collected in advance and preprocessed; they can be collected from professional documents in various fields, and after preprocessing such as deduplication, removal of private data, and filtering of low-quality documents, the sample texts are labeled to obtain a sample text set. The sample text set is input into the preset error correction large model for training; based on formulas (1)-(5) and using NSP (Next Sentence Prediction) during pre-training, the trained preset error correction large model is obtained. The model predicts subsequent text based on known text, compares the predicted text with the sample text set to determine the loss, and is obtained through repeated iterative training, giving it good language understanding and generation capability.
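The "predict the subsequent text and compare with the sample text set" objective is, in sketch form, a shifted next-token cross-entropy under teacher forcing; this is an assumed rendering, since the disclosure does not spell out the loss:

import torch.nn.functional as F

def next_token_loss(base_model, ids):
    # ids: (batch, seq_len) token IDs drawn from the sample text set
    logits = base_model(ids[:, :-1])          # predict each next token from its prefix
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # (batch * (seq_len - 1), vocab)
        ids[:, 1:].reshape(-1),               # targets shifted by one position
    )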
After the preset error correction large model is obtained through pre-training, the text to be corrected is input into it; the model applies positional encoding and the like to the text to be corrected, while the knowledge-embedded feature vector obtained through the task adapter is input into the model as query key values, and the model produces the corrected text through its encoder and decoder. After the knowledge-embedded feature vector is added to the preset error correction large model, the prediction process becomes:
x_{i+1} = BaseModel(D | Y) (6)
x_{i+1} = Layer({x_0, ..., x_i} | Y)^{nLayer} (7)
Layer(X | Y; ω) = Norm(Add(X, X + ωY, X + ωY)) (8)
Y = Adapter(KnowledgeEncoder(D), TaskEncoder(T)) (9)
wherein Y ∈ R^{n×d} is the knowledge-embedded feature vector obtained from the task adapter, D is the text to be corrected, n is the length of the text to be corrected, and d is the hyperparameter hidden size. Formula (1) used in training becomes formula (6) with the added input Y, formula (2) becomes formula (7), and formula (3) becomes formula (8); ω is a preset weight determined according to the implementation, which is not limited herein. The preset error correction large model splices the knowledge-embedded feature vector with the feature vector of the text to be corrected based on the preset weight, processes the spliced representation, and corrects errors that may occur in the text to be corrected, such as wrongly written characters, extra characters, missing characters, and disordered text, to obtain the corrected text.
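Formulas (6)-(8) correspond to a Transformer layer whose keys and values mix in the knowledge embedding with the weight ω. A hedged sketch (the placement of layer normalization follows formulas (3)-(5); the head count, dimensions, and ω value are assumptions):

import torch
import torch.nn as nn

class KnowledgeInjectedLayer(nn.Module):
    def __init__(self, d_model=768, n_heads=12, omega=0.1):
        super().__init__()
        self.omega = omega   # the preset weight ω
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x, y):
        # x: (batch, n, d_model) text features; y: (batch, n, d_model) knowledge embedding
        kv = x + self.omega * y                        # X + ωY used as keys and values
        h = self.norm1(x + self.attn(x, kv, kv)[0])    # Add(X, X+ωY, X+ωY), formula (4)
        return self.norm2(h + self.ff(h))              # Norm(H) = LayerNorm(H + FF(H))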
Further, regarding the task adapter: after the preset error correction large model and the knowledge base model have been obtained through pre-training, the task adapter can be trained according to the sample text set, the knowledge base sample set, and the text error correction task information. Here, the loss is calculated by comparing the sample texts in the sample text set with the corrected texts produced during training, and back-propagation is performed, thereby fine-tuning the task adapter so that it can achieve accurate text correction based on the domain-specific knowledge.
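The fine-tuning stage can be sketched as freezing the two pre-trained models and back-propagating only into the adapter; all module and callback names below are assumptions carried over from the earlier sketches:

def finetune_adapter_step(adapter, knowledge_model, corrector, optimizer, loss_fn, batch):
    for p in knowledge_model.parameters():
        p.requires_grad_(False)     # pre-trained knowledge base model stays fixed
    for p in corrector.parameters():
        p.requires_grad_(False)     # pre-trained error correction model stays fixed
    optimizer.zero_grad()           # optimizer is assumed to hold only adapter params
    loss = loss_fn(adapter, knowledge_model, corrector, batch)
    loss.backward()                 # gradients flow only into the task adapter
    optimizer.step()
    return loss.item()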
The overall architecture is shown in FIG. 2. The text to be corrected is input into the knowledge base model, which outputs knowledge information to the adapter (i.e., the task adapter) via positional encoding, a multi-head attention mechanism, a feed-forward network, and the like; the text error correction task information is input into the task encoder, which outputs task information to the adapter via an embedding layer and linear layers. After the adapter processes the knowledge information and the task information separately through feed-forward networks, it splices them in a fusion layer and outputs the knowledge embedding (i.e., the knowledge-embedded feature vector) to the encoder of the error correction large model. After the text to be corrected is input, the error correction large model applies positional encoding to it and then generates the answer (i.e., the corrected text) through its encoder and decoder together with the knowledge embedding. The error correction large model and the knowledge base model are each pre-trained independently, and the adapter is fine-tuned with respect to the pre-trained error correction large model and knowledge base model. The above is exemplary, and the layers of each model may be configured according to the implementation, which is not limited herein.
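Wiring the sketched pieces together in the shape of FIG. 2 (the module interfaces are the illustrative ones assumed above, not the exact networks of this disclosure):

def correct_text(text_ids, task_ids, knowledge_model, task_encoder, adapter, corrector):
    e_k = knowledge_model(text_ids)    # knowledge information from the knowledge base model
    e_t = task_encoder(task_ids)       # task information from the task encoder
    y = adapter(e_k, e_t)              # knowledge-embedded feature vector
    return corrector(text_ids, y)      # corrected text from the error correction large model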
According to the large language model text error correction method based on pre-training knowledge embedding provided by the embodiment of the invention, professional knowledge in the knowledge base is integrated into the error correction large model through the task adapter and the knowledge base model, so that the error correction large model can handle text error correction tasks more accurately, improving both error correction precision and efficiency.
Fig. 3 shows a schematic structural diagram of a text error correction device of a large language model based on pre-training knowledge embedding according to an embodiment of the present invention. As shown in fig. 3, the apparatus includes:
the adapter module 310 is adapted to input a text to be corrected into the knowledge base model obtained by pre-training, input text error correction task information into the task encoder, and connect the output of the knowledge base model with the output of the task encoder via the task adapter to obtain a knowledge-embedded feature vector;
the error correction module 320 is adapted to input the text to be corrected and the knowledge-embedded feature vector into the preset error correction large model obtained by pre-training to obtain the corrected text.
Optionally, the apparatus further comprises: a first training module 330 adapted to collect sample texts and preprocess them; label the preprocessed sample texts to obtain a sample text set; and input the sample text set into the preset error correction large model for training to obtain the trained preset error correction large model, wherein the preset error correction large model predicts subsequent text based on known text, compares the predicted text with the sample text set to determine the loss, and is trained iteratively.
Optionally, the apparatus further comprises: a second training module 340 adapted to collect knowledge base sample data and preprocess the knowledge base sample data; carry out knowledge enhancement and labeling on the preprocessed knowledge base sample data to obtain a knowledge base sample set; and input the knowledge base sample set into a preset knowledge base model for training to obtain the trained knowledge base model, wherein the knowledge base model carries out substitution prediction on the knowledge base sample data according to the knowledge base sample set, compares the substitution predictions with the knowledge base sample data to calculate the loss, and is trained thereby.
Optionally, the knowledge base model includes character-level error correction, phrase-level error correction, and/or sentence-level error correction.
Optionally, the apparatus further comprises: a fine-tuning module 350 adapted to train the task adapter according to the sample text set, the knowledge base sample set, and the text error correction task information, computing the loss by comparing the sample texts in the sample text set with the corrected texts obtained in training and back-propagating to perform fine-tuning.
Optionally, the error correction module 320 is further adapted to:
inputting the text to be corrected and the knowledge-embedded feature vector into the preset error correction large model obtained by pre-training, splicing the knowledge-embedded feature vector with the feature vector of the text to be corrected based on preset weights in the preset error correction large model, and processing the spliced representation to obtain the corrected text.
Optionally, the pre-set error correction large model comprises a sequence-to-sequence model.
The above descriptions of the modules refer to the corresponding descriptions in the method embodiments, and are not repeated herein.
The embodiment of the invention also provides a nonvolatile computer storage medium storing at least one executable instruction capable of executing the large language model text error correction method based on pre-training knowledge embedding in any of the above method embodiments.
FIG. 4 illustrates a schematic diagram of a computing device according to an embodiment of the invention; the specific embodiments of the invention do not limit the specific implementation of the computing device.
As shown in fig. 4, the computing device may include: a processor 402, a communication interface (Communications Interface) 404, a memory 406, and a communication bus 408.
Wherein:
processor 402, communication interface 404, and memory 406 communicate with each other via communication bus 408.
A communication interface 404 for communicating with network elements of other devices, such as clients or other servers.
Processor 402 is configured to execute program 410 and may specifically perform the relevant steps of the foregoing embodiment of the large language model text correction method based on pre-training knowledge embedding.
In particular, program 410 may include program code including computer-operating instructions.
The processor 402 may be a central processing unit CPU, or a specific integrated circuit ASIC (Application Specific Integrated Circuit), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included by the computing device may be the same type of processor, such as one or more CPUs; but may also be different types of processors such as one or more CPUs and one or more ASICs.
Memory 406 is used for storing program 410. Memory 406 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
Program 410 may be specifically configured to cause processor 402 to perform the large language model text error correction method based on pre-training knowledge embedding in any of the method embodiments described above. For the specific implementation of each step in program 410, reference may be made to the corresponding descriptions of the corresponding steps and units in the foregoing embodiments of the large language model text error correction method based on pre-training knowledge embedding, which are not repeated herein. It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the apparatus and modules described above may refer to the corresponding process descriptions in the foregoing method embodiments, which are not repeated herein.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It should be appreciated that the teachings of embodiments of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of preferred embodiments of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., an embodiment of the invention that is claimed, requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functionality of some or all of the components according to embodiments of the present invention may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). Embodiments of the present invention may also be implemented as a device or apparatus program (e.g., a computer program and a computer program product) for performing a portion or all of the methods described herein. Such a program embodying the embodiments of the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. Embodiments of the invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. do not denote any order. These words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specifically stated.

Claims (10)

1. A large language model text error correction method based on pre-training knowledge embedding, which is characterized by comprising the following steps:
inputting a text to be corrected into a knowledge base model obtained by pre-training, inputting text error correction task information into a task encoder, and connecting the output of the knowledge base model with the output of the task encoder through a task adapter to obtain a knowledge-embedded feature vector;
and inputting the text to be corrected and the knowledge-embedded feature vector into a preset error correction large model obtained by pre-training to obtain the corrected text.
2. The method according to claim 1, wherein the method further comprises:
collecting sample texts and preprocessing the sample texts;
marking the preprocessed sample text to obtain a sample text set;
inputting the sample text set into a preset error correction large model for training to obtain a trained preset error correction large model; and predicting a subsequent text by the preset error correction large model based on the known text, comparing the predicted text with the sample text set to determine loss, and performing iterative training to obtain the error correction large model.
3. The method according to claim 1, wherein the method further comprises:
collecting knowledge base sample data, and preprocessing the knowledge base sample data;
carrying out knowledge enhancement and marking on the preprocessed knowledge base sample data to obtain a knowledge base sample set;
inputting the knowledge base sample set into a preset knowledge base model for training to obtain a trained knowledge base model; wherein the knowledge base model carries out substitution prediction on the knowledge base sample data according to the knowledge base sample set, compares the substitution predictions with the knowledge base sample data to calculate the loss, and is trained thereby to obtain the knowledge base model.
4. A method according to claim 3, wherein the knowledge base model comprises character-level error correction, phrase-level error correction and/or sentence-level error correction.
5. The method according to any one of claims 2-4, further comprising:
training the task adapter according to the sample text set, the knowledge base sample set, and the text error correction task information, wherein the loss is calculated by comparing the sample texts in the sample text set with the corrected texts obtained in training, and back-propagation is performed to fine-tune the task adapter.
6. The method according to claim 1, wherein inputting the text to be corrected and the knowledge-embedded feature vector into the preset error correction large model obtained by pre-training to obtain the corrected text further comprises:
inputting the text to be corrected and the knowledge-embedded feature vector into the preset error correction large model obtained by pre-training, splicing, by the preset error correction large model, the knowledge-embedded feature vector with the feature vector of the text to be corrected based on preset weights, and processing the spliced representation to obtain the corrected text.
7. The method of claim 1, wherein the pre-set error correction large model comprises a sequence-to-sequence model.
8. A large language model text error correction device based on pre-training knowledge embedding, the device comprising:
an adapter module adapted to input a text to be corrected into a knowledge base model obtained by pre-training, input text error correction task information into a task encoder, and connect the output of the knowledge base model with the output of the task encoder through a task adapter to obtain a knowledge-embedded feature vector;
and an error correction module adapted to input the text to be corrected and the knowledge-embedded feature vector into a preset error correction large model obtained by pre-training to obtain the corrected text.
9. A computing device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform the operations corresponding to the large language model text error correction method based on pre-training knowledge embedding according to any one of claims 1 to 7.
10. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform the operations corresponding to the large language model text error correction method based on pre-training knowledge embedding according to any one of claims 1 to 7.
CN202311810975.6A 2023-12-26 2023-12-26 Large language model text error correction method and device based on pre-training knowledge embedding Pending CN117787266A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311810975.6A CN117787266A (en) 2023-12-26 2023-12-26 Large language model text error correction method and device based on pre-training knowledge embedding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311810975.6A CN117787266A (en) 2023-12-26 2023-12-26 Large language model text error correction method and device based on pre-training knowledge embedding

Publications (1)

Publication Number Publication Date
CN117787266A 2024-03-29

Family

ID=90397680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311810975.6A Pending CN117787266A (en) 2023-12-26 2023-12-26 Large language model text error correction method and device based on pre-training knowledge embedding

Country Status (1)

Country Link
CN (1) CN117787266A (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112036162A (en) * 2020-11-06 2020-12-04 北京世纪好未来教育科技有限公司 Text error correction adaptation method and device, electronic equipment and storage medium
CN112580324A (en) * 2020-12-24 2021-03-30 北京百度网讯科技有限公司 Text error correction method and device, electronic equipment and storage medium
US20220215170A1 (en) * 2021-01-06 2022-07-07 Tencent America LLC Framework for chinese text error identification and correction
CN113051894A (en) * 2021-03-16 2021-06-29 京东数字科技控股股份有限公司 Text error correction method and device
CN114444479A (en) * 2022-04-11 2022-05-06 南京云问网络技术有限公司 End-to-end Chinese speech text error correction method, device and storage medium
CN115759052A (en) * 2022-11-24 2023-03-07 华润数字科技有限公司 Text error correction method and device, electronic equipment and storage medium
CN117217207A (en) * 2023-08-22 2023-12-12 腾讯科技(深圳)有限公司 Text error correction method, device, equipment and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
汪权彬; 谭营: "Chinese grammatical error correction method based on data augmentation and copying" (基于数据增广和复制的中文语法错误纠正方法), CAAI Transactions on Intelligent Systems (智能系统学报), vol. 15, no. 01, 31 January 2020 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination