CN112002311A - Text error correction method and device, computer readable storage medium and terminal equipment

Info

Publication number
CN112002311A
Authority
CN
China
Prior art keywords
text data
text
candidate
usage scenario
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910387845.3A
Other languages
Chinese (zh)
Inventor
毛俊峰
李靖阳
郭泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TCL Corp
TCL Research America Inc
Original Assignee
TCL Research America Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TCL Research America Inc filed Critical TCL Research America Inc
Priority to CN201910387845.3A priority Critical patent/CN112002311A/en
Publication of CN112002311A publication Critical patent/CN112002311A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F 18/00 Pattern recognition
            • G06F 18/20 Analysing
              • G06F 18/24 Classification techniques
      • G10 MUSICAL INSTRUMENTS; ACOUSTICS
        • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
          • G10L 15/00 Speech recognition
            • G10L 15/08 Speech classification or search
              • G10L 15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
              • G10L 15/18 Speech classification or search using natural language modelling
                • G10L 15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
            • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
            • G10L 15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention belongs to the technical field of voice recognition, and particularly relates to a text error correction method and device, a computer-readable storage medium, and terminal equipment. The method comprises: acquiring first text data, where the first text data is the text data output after a preset terminal device performs voice recognition on input voice; determining a usage scenario of the first text data according to the current state of the terminal device, the first text data, and context information of the first text data; determining a confusion set of the first text data according to the usage scenario; performing error correction on the first text data using a preset deep learning model and an iterative learning model to obtain second text data; and optimizing the second text data according to the confusion set to obtain third text data. Based on deep learning, iterative learning, and related methods, the method applies different corrections to erroneous text in different scenes, greatly improving error correction accuracy.

Description

Text error correction method and device, computer readable storage medium and terminal equipment
Technical Field
The invention belongs to the technical field of voice recognition, and particularly relates to a text error correction method and device, a computer readable storage medium and terminal equipment.
Background
Artificial intelligence (AI) technology is developing rapidly, and voice interaction has appeared on all kinds of intelligent terminal devices, yet it still cannot serve user needs well. The root cause is that Automatic Speech Recognition (ASR) is imperfect, which is why ASR text error correction methods have emerged. Current ASR text error correction methods can perform fixed corrections in some simple scenarios, for example correcting "I want to order music" into "I want to listen to music", or "I need auxiliary eyes" into "I need auxiliary glasses". However, in complex interactive scenarios they do not consider that the same sentence requires different corrections in different contexts, so correct results often cannot be obtained and the error correction accuracy is low.
Disclosure of Invention
In view of this, embodiments of the present invention provide a text error correction method, a text error correction device, a computer-readable storage medium, and a terminal device, so as to solve the problem that the error correction accuracy of the existing ASR text error correction method is low.
A first aspect of an embodiment of the present invention provides a text error correction method, which may include:
acquiring first text data, wherein the first text data is output text data after voice recognition is carried out on input voice by preset terminal equipment;
determining a usage scenario of the first text data according to the current state of the terminal device, the first text data and context information of the first text data;
determining a confusion set of the first text data according to the usage scenario;
performing error correction processing on the first text data by using a preset deep learning model and an iterative learning model to obtain second text data;
and optimizing the second text data according to the confusion set to obtain third text data.
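In outline, these five steps chain directly into a pipeline. A minimal runnable sketch is given below; every function name and body is a hypothetical stub standing in for a module described later in this specification, not an API defined by the patent.

```python
# Hypothetical stubs for the five steps; names and rules are illustrative.
def determine_scenario(device_state, text, context):        # step S102
    return "movie" if "video_app" in device_state else "region"

def build_confusion_set(text, scenario):                     # step S103
    if scenario == "movie":
        return {"wolf-tooth club": ["Langya Bang"]}
    return {}

def error_correction_model(text, context):                   # step S104 (stub)
    return text

def optimize_with_confusion_set(text, confusion):            # step S105 (stub)
    for wrong, candidates in confusion.items():
        if wrong in text and candidates:
            text = text.replace(wrong, candidates[0])
    return text

def correct_text(first_text, device_state, context):         # steps S101-S105
    scenario = determine_scenario(device_state, first_text, context)
    confusion = build_confusion_set(first_text, scenario)
    second_text = error_correction_model(first_text, context)
    return optimize_with_confusion_set(second_text, confusion)

print(correct_text("I want to find the wolf-tooth club", "video_app", ""))
# -> "I want to find Langya Bang"
```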
Further, the determining the usage scenario of the first text data according to the current state of the terminal device, the first text data and the context information of the first text data includes:
determining a first candidate use scene of the first text data according to the current state of the terminal equipment;
determining a second candidate use scene of the first text data according to the first text data and the context information of the first text data;
determining a usage scenario of the first text data according to the first candidate usage scenario and the second candidate usage scenario.
Further, the training process of the deep learning model comprises the following steps:
acquiring a training data set, wherein each piece of training data in the training data set comprises input data and a label, the input data is error text data and context information output by voice recognition, and the label is a correct result obtained after error correction is performed on the error text data;
and training the deep learning model by using the training data set to obtain the trained deep learning model.
Further, the training process of the iterative learning model comprises:
judging whether the iterative learning model is activated or not through a preset identifier, wherein the activated iterative learning model receives feedback information;
and training the iterative model by using the training data set to obtain the trained iterative model.
Further, the optimizing the second text data according to the confusion set to obtain third text data includes:
constructing a candidate set of the second text data according to the confusion set, wherein the candidate set comprises each candidate text data of the second text data;
respectively calculating scores of the second text data and each candidate text data by using a preset score function;
selecting candidate text data with the highest score as preferred text data;
if the difference between the scores of the preferred text data and the second text data is larger than a preset threshold value, determining the preferred text data as the third text data;
and if the difference between the scores of the preferred text data and the second text data is less than or equal to the threshold value, determining the second text data as the third text data.
A second aspect of an embodiment of the present invention provides a text error correction apparatus, which may include:
the text data acquisition module is used for acquiring first text data, wherein the first text data is output text data after voice recognition is carried out on input voice by preset terminal equipment;
the usage scenario determining module is used for determining a usage scenario of the first text data according to the current state of the terminal device, the first text data and the context information of the first text data;
a confusion set determining module for determining a confusion set of the first text data according to the usage scenario;
the error correction module is used for carrying out error correction processing on the first text data by using a preset deep learning model and an iterative learning model to obtain second text data;
and the optimization module is used for optimizing the second text data according to the confusion set to obtain third text data.
Further, the usage scenario determination module may include:
a first candidate usage scenario determining unit, configured to determine a first candidate usage scenario of the first text data according to a current state of the terminal device;
a second candidate usage scenario determination unit, configured to determine a second candidate usage scenario of the first text data according to the first text data and context information of the first text data;
a usage scenario determination unit configured to determine a usage scenario of the first text data according to the first candidate usage scenario and the second candidate usage scenario.
Further, the text correction apparatus may further include:
a training data set obtaining module, configured to obtain a training data set, where each piece of training data in the training data set includes input data and a label, the input data is error text data and context information output by speech recognition, and the label is a correct result obtained after error correction is performed on the error text data;
and the first model training module is used for training the deep learning model by using the training data set to obtain the trained deep learning model.
Further, the text correction apparatus may further include:
the feedback information receiving module is used for judging whether the iterative learning model is activated or not through a preset identifier, and the activated iterative learning model receives feedback information;
and the second model training module is used for training the iterative model by using the training data set to obtain the trained iterative model.
Further, the optimization module may include:
a candidate set constructing unit, configured to construct a candidate set of the second text data according to the confusion set, where the candidate set includes candidate text data of the second text data;
the score calculating unit is used for calculating scores of the second text data and each candidate text data by using a preset score function;
the preferred text data selecting unit is used for selecting the candidate text data with the highest score as the preferred text data;
a first determining unit, configured to determine the preferred text data as the third text data if a difference between scores of the preferred text data and the second text data is greater than a preset threshold;
a second determining unit, configured to determine the second text data as the third text data if a difference between the scores of the preferred text data and the second text data is less than or equal to the threshold.
A third aspect of embodiments of the present invention provides a computer-readable storage medium storing computer-readable instructions, which, when executed by a processor, implement the steps of any one of the above-mentioned text error correction methods.
A fourth aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, where the processor implements the steps of any one of the text error correction methods when executing the computer readable instructions.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects: first text data is acquired, where the first text data is the text data output after a preset terminal device performs voice recognition on input voice; a usage scenario of the first text data is determined according to the current state of the terminal device, the first text data, and context information of the first text data; a confusion set of the first text data is determined according to the usage scenario; error correction is performed on the first text data using a preset deep learning model and an iterative learning model to obtain second text data; and the second text data is optimized according to the confusion set to obtain third text data. Based on deep learning, iterative learning, and related methods, the embodiments design a complete optimization pipeline for ASR-recognized text in complex scenarios and apply different corrections to erroneous text according to the scene, greatly improving error correction accuracy.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
FIG. 1 is a flowchart of an embodiment of a text error correction method according to an embodiment of the present invention;
FIG. 2 is a schematic flow diagram of a usage scenario for determining first text data;
FIG. 3 is a schematic flow chart of an optimization process performed on second text data;
FIG. 4 is a block diagram of an embodiment of a text correction apparatus according to an embodiment of the present invention;
fig. 5 is a schematic block diagram of a terminal device in an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, an embodiment of a text error correction method according to an embodiment of the present invention may include:
and step S101, acquiring first text data.
The first text data is output text data after voice recognition is carried out on input voice by preset terminal equipment.
When a user needs to interact with the terminal device by voice, the user speaks the content to be expressed; the terminal device captures the user's input voice through a voice acquisition device, performs voice recognition on it, and outputs the recognition result, i.e., the first text data. It should be noted that the first text data may not match what the user actually meant: for example, the user may say "I want to watch Langya Bang" (a TV series) while the voice recognition outputs the homophone "I want to watch the wolf-tooth club", so the first text data needs further error correction in the subsequent steps.
Step S102, determining a usage scenario of the first text data according to the current state of the terminal device, the first text data and the context information of the first text data.
In this embodiment, specific usage scenarios such as a movie scene, a shopping scene, a region scene, and a news scene can be defined according to actual conditions and preset in advance, so that the meaning of the text content can be judged according to the scene, since the same text may have different meanings in different scenes.
As shown in fig. 2, step S102 may specifically include the following processes:
step S1021, determining a first candidate use scene of the first text data according to the current state of the terminal equipment.
The current state of the terminal device specifically refers to which application (app) is currently in use. After the current state of the terminal device is determined, it can be converted into an α value by a preset rule, where the α value represents the usage scenario determined from the current state of the terminal device, i.e., the first candidate usage scenario. For example, when the current state of the terminal device is a map app, the α value is determined as the region scene class; when the current state is a shopping app (for example, Taobao), the α value is determined as the shopping scene class; and so on. If no determinate α value can be obtained from the current state of the terminal device, the α value is set to 0.
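As a concrete illustration, the rule-based conversion from the current app to the α value might look like the following sketch; the app names and scene labels are assumptions for illustration, not fixed by the patent.

```python
# Illustrative rule table from current app to the alpha value; the app
# names and scene classes here are assumptions, not from the patent.
APP_TO_SCENE = {
    "map_app": "region",
    "shopping_app": "shopping",
    "video_app": "movie",
}

def alpha_from_state(current_app: str):
    # Return the scene class, or 0 when no determinate value exists.
    return APP_TO_SCENE.get(current_app, 0)
```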
Step S1022, determining a second candidate usage scenario of the first text data according to the first text data and the context information of the first text data.
In this embodiment, a neural network classifier may be constructed based on deep learning techniques. The number of training-data classes is determined by the actual situation of the terminal device, with labels set to 1, 2, 3, ..., n. When training the classifier, preprocessing operations are required on the training data, including but not limited to generating a dictionary with a text segmentation technique, processing the text with a stop-word technique, obtaining an embedding matrix with a text feature extraction technique, and vectorizing the text to generate input data. The specific neural network structure may be a DNN, CNN, RNN, or fastText model, which is not limited here. It should be noted that the model must apply the same preprocessing to the context information of the current sentence to be corrected and merge it into the input training data; if there is no context information, it can be filled with meaningless characters (e.g., <pad>). The training data are fed into the model for training, yielding a trained scene classification model.
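A minimal sketch of this preprocessing, under the assumption of whitespace tokenization and a toy stop-word list (the patent fixes neither), is shown below.

```python
# Toy preprocessing: tokenize, drop stop words, map tokens to ids, and
# fill missing context with <pad>, as described above. The tokenizer
# and stop-word list are illustrative assumptions.
PAD = "<pad>"
STOP_WORDS = {"the", "a", "to"}

def preprocess(sentence: str, context: str, vocab: dict) -> list:
    tokens = (sentence + " " + (context or PAD)).split()
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # setdefault grows the dictionary on first sight of a token.
    return [vocab.setdefault(t, len(vocab)) for t in tokens]

vocab = {PAD: 0}
ids = preprocess("I want to find the wolf-tooth club", "", vocab)
```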
The first text data and its context information are input into the scene classification model to obtain an output β value, where the β value represents the usage scenario determined by the scene classification model, i.e., the second candidate usage scenario.
Step S1023, determining the usage scenario of the first text data according to the first candidate usage scenario and the second candidate usage scenario.
Specifically, the final classification result r can be computed from the classification result α determined from the current state of the terminal device and the classification result β obtained from the scene classification model, according to the formula r = λα + (1 - λ)β, where λ is a preset empirical value in the interval [0, 1]; the specific value can be set according to practical situations, for example 0.2, 0.3, 0.5, or another value. The r value represents the finally determined usage scenario.
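A small sketch of this fusion follows, under the assumption that α and β are probability vectors over the n scene classes (the patent states only the scalar formula).

```python
# Fusion r = lam*alpha + (1-lam)*beta over n scene classes; treating
# alpha and beta as probability vectors is an assumption for illustration.
def fuse(alpha: list, beta: list, lam: float = 0.3) -> int:
    scores = [lam * a + (1 - lam) * b for a, b in zip(alpha, beta)]
    return scores.index(max(scores))   # index of the winning scene class

# Example with classes [movie, region, shopping]: the rule module says
# "region" (one-hot), the classifier strongly says "movie".
alpha = [0.0, 1.0, 0.0]                # one-hot from the rule module
beta = [0.7, 0.2, 0.1]                 # softmax output of the classifier
r = fuse(alpha, beta)                  # r = 0 (movie), since lam < 0.5
```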
Example one: when a user opens movie and television apps such as an Aichi art app and a Youke app, the voice is input into the 'wo xiang zhao lang ya bang', and the 'I wants to find the wolf teeth stick' is obtained after the ASR recognition. Through the method, the classification r as a movie scene can be obtained by processing the wolf tooth stick and the context information thereof.
Example two: when a user opens map apps such as a Baidu map app and a Gaode map app, the voice is input to 'wo yao zhao lang ya bang', and 'I want to find a wolf tooth stick' is obtained after ASR recognition. By processing the wolf tooth stick to be found by me and the context information thereof through the method, the scene of the classification r as the region can be obtained.
Step S103, determining a confusion set of the first text data according to the usage scenario.
In this embodiment, a confusion set, i.e., a set of possible candidate values for the words in the first text data, can be constructed using conventional machine-learning language-model methods. The work divides into two parts: building a language model and building the confusion sets. The constructed language model identifies and extracts the wrong words in the ASR recognition result, and n confusion sets are built, one for each of the n classes of the scene classification module; each confusion set has a different emphasis in its classification scenario, and the output is jointly constrained by the scene classification result and the words obtained by the language model.
The language model is built with methods such as a character-based bidirectional n-gram LM and error detection via maximum-entropy classification; the confusion sets are likewise built with traditional machine-learning methods. The data-set format is {key: value} pairs, and the covered error categories include common errors such as word-usage errors, pronunciation (homophone) confusion, and character-shape confusion. Finally, the n confusion sets are constructed, and the one corresponding to the result of the scene classification module is selected for output.
The language model and confusion sets built in this embodiment differ from conventional approaches in that n confusion sets are constructed, each with its own emphasis on the common errors of its class; moreover, for different terminal devices the emphasis of confusion sets in the same class differs, depending on whether the device can realize the related functions. In all traditional error correction methods the quality of the confusion set strongly affects the final recognition result; constructing n confusion sets after classification keeps each individual set relatively simple to build and improves the accuracy of the final result.
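The per-scene {key: value} structure and scene-keyed selection can be sketched as follows; the entries mirror the examples below and are purely illustrative.

```python
# Illustrative {key: value} confusion sets, one per scene class; the
# entries are assumptions modeled on the examples in this specification.
CONFUSION_SETS = {
    "movie": {"wolf-tooth club": ["Langya Bang", "Langya Mountain"]},
    "region": {"wolf-tooth club": ["Langya district", "Langya Mountain"]},
}

def select_confusion_set(scene: str) -> dict:
    # One of the n per-scene sets, keyed by the classification result r.
    return CONFUSION_SETS.get(scene, {})
```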
Example one: the classification result r of the wolf teeth stick is obtained by the scene classification module as the movie scene, and the confusion set C corresponding to the classification result r is processed by the language modelr ═ film and television scenesThe outputs are as follows: cr ═ film and television scenesThe Chinese characters are that { "wolf teeth stick" - "Lanya", "wolf teeth stick" - "wolf teeth side", "wolf teeth stick" - "wolf teeth mountain" -, was.
Example two: the classification result r of the wolf teeth stick is obtained by the scene classification module as the region scene, and the confusion set C corresponding to the classification result r is collected through the language modelr is regional sceneThe outputs are as follows: cr is regional sceneThe two types of the wolf teeth are arranged in the order of priority from high to low.
Step S104, performing error correction processing on the first text data by using a preset deep learning model and an iterative learning model to obtain second text data.
This step performs error correction on the first text data with an error correction module composed of a deep learning model and an iterative learning model, yielding the ASR error correction result, i.e., the second text data. Deep learning models for this purpose exist in the prior art and are not described again here; however, this embodiment improves the deep learning model and adds an iterative learning model, so that the deep learning model can continue to be iteratively optimized after training is complete. The goal of iterative training is to fine-tune and optimize the deep learning model: the longer a user uses the terminal device, the richer the human-computer interaction data, the better the iteratively trained model performs, the fewer the errors remaining in the corrected sentence, and the smaller the candidate set that subsequent correction needs to build, so efficiency and accuracy improve accordingly.
The deep learning model is implemented with deep learning techniques: this embodiment builds a neural network model that corrects ASR results, and the deep learning model and the iterative learning model are trained jointly to optimize the result. In the deep learning model, a hierarchical model improves the encoder structure so that it can process context; the overall framework is a BiGRU-based Encoder-Decoder. The training process of the deep learning model comprises: acquiring a training data set, where each piece of training data contains input data and a label, the input data being the erroneous text output by voice recognition together with its context information, and the label being the correct result after correcting the erroneous text; and training the deep learning model with the training data set to obtain the trained model. In the usage stage, the iterative learning model can be used to optimize the deep learning model.
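For concreteness, a heavily simplified BiGRU encoder-decoder of the kind described above might be sketched in PyTorch as follows; the architecture details, sizes, and training loop are illustrative assumptions, not the patent's actual network.

```python
import torch
import torch.nn as nn

class BiGRUCorrector(nn.Module):
    """Toy BiGRU encoder-decoder; structure and sizes are illustrative."""
    def __init__(self, vocab_size, emb_dim=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Bidirectional GRU encoder reads the erroneous sentence plus context.
        self.encoder = nn.GRU(emb_dim, hidden, batch_first=True,
                              bidirectional=True)
        # Unidirectional GRU decoder generates the corrected sentence.
        self.decoder = nn.GRU(emb_dim, 2 * hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, vocab_size)

    def forward(self, src_ids, tgt_ids):
        _, h = self.encoder(self.embed(src_ids))           # h: (2, B, hidden)
        h0 = torch.cat([h[0], h[1]], dim=-1).unsqueeze(0)  # merge directions
        dec_out, _ = self.decoder(self.embed(tgt_ids), h0)
        return self.out(dec_out)                           # (B, T, vocab)

# One training pair: input = erroneous ASR text plus context, label = the
# corrected sentence, matching the training data set described above.
model = BiGRUCorrector(vocab_size=5000)
src = torch.randint(1, 5000, (2, 12))       # toy batch: error text + context
tgt = torch.randint(1, 5000, (2, 10))       # toy batch: corrected labels
logits = model(src, tgt[:, :-1])            # teacher forcing
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 5000), tgt[:, 1:].reshape(-1))
loss.backward()
```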
The iterative learning model is implemented with reinforcement learning methods, the aim being to build an iterative learning model that is trained jointly with the deep learning model. Its internal structure mainly draws on the ideas of Actor-Critic (AC), DQN, and NAF models; the deep learning model is responsible for data preprocessing, and the iterative learning model only needs to use the data already processed by the deep learning model (i.e., the training data set). The training process of the iterative learning model comprises: judging through a preset identifier whether the iterative learning model is activated, where an activated iterative learning model receives feedback information; and training the iterative model with the training data set to obtain the trained iterative model.
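Since the patent does not specify the update rule, the following sketch only illustrates the activation gate and feedback collection; the AC/DQN/NAF-style learner itself is stubbed out.

```python
from collections import deque

class IterativeLearner:
    """Activation gate plus feedback buffer; the update rule is a stub."""
    def __init__(self, activated: bool):
        self.activated = activated          # the "preset identifier"
        self.replay = deque(maxlen=10000)   # stored (correction, reward) pairs

    def receive_feedback(self, correction: str, positive: bool) -> None:
        if not self.activated:              # an inactive model ignores feedback
            return
        self.replay.append((correction, 1.0 if positive else -1.0))

    def fine_tune_step(self) -> None:
        # Placeholder for the AC/DQN/NAF-style update: e.g., replay stored
        # feedback and scale the deep model's loss per sample by `reward`.
        for correction, reward in self.replay:
            ...
```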
Example one: inputting the wolf teeth stick wanted to be found and the context information thereof into a model, inputting the preprocessed wolf teeth stick and the context information into a deep learning model to obtain an error correction result S, namely the Langya bang wanted to be found, and performing iterative learning by using an iterative learning model to wait for feedback information.
Example two: inputting the wolf teeth stick to be found and the context information thereof into a model, inputting the preprocessed wolf teeth stick to a deep learning model to obtain an error correction result S, namely the Langya post to be found, and performing iterative learning by using an iterative learning model for waiting for feedback information.
Step S105, optimizing the second text data according to the confusion set to obtain third text data.
The third text data is the final recognition result obtained after text error correction.
As shown in fig. 3, step S105 may specifically include the following processes:
and S1051, constructing a candidate set of the second text data according to the confusion set.
In this embodiment, the candidate set may be specifically constructed by using a graph model, an HMM, and the like, where the candidate set includes each candidate text data of the second text data.
Step S1052, respectively calculating scores of the second text data and each candidate text data using a preset scoring function.
The scoring function may include, but is not limited to, edit-distance and language-model (LM) scoring functions.
Step S1053, selecting the candidate text data with the highest score as the preferred text data.
Step S1054, determining whether the difference between the scores of the preferred text data and the second text data is greater than a preset threshold.
The threshold may be set according to actual conditions, and this embodiment does not specifically limit the threshold.
If so, step S1055 is executed, and if not, step S1056 is executed.
Step S1055, determining the preferred text data as the third text data.
Step S1056, determining the second text data as the third text data.
That is, if no candidate sentence scores higher than the original sentence, or no candidate exceeds the original sentence's score by more than the threshold, the original sentence is considered error-free; otherwise, the candidate sentence with the highest score is output.
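Steps S1051 to S1056 condense into a short sketch; the scoring function is passed in as a parameter because the embodiment allows edit-distance, LM, or other scorers.

```python
# Minimal sketch of steps S1051-S1056; `score` is a stand-in for the
# preset scoring function, not the patent's actual scorer.
from typing import Callable, List

def optimize(second_text: str,
             candidates: List[str],
             score: Callable[[str], float],
             threshold: float) -> str:
    base = score(second_text)
    best = max(candidates, key=score) if candidates else second_text
    # Keep the corrected sentence unless a candidate beats it by more
    # than the preset threshold (steps S1054-S1056).
    return best if score(best) - base > threshold else second_text
```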
After the final recognition result is selected, feedback information may also be sent to the iterative learning model according to the following rules. If the final recognition result is not the output of the error correction module, give negative feedback. If the final recognition result is the output of the error correction module, judge the similarity between the result and its context information (the similarity can be computed with methods such as cosine distance, TF-IDF, and Word2Vec); if the similarity exceeds a threshold, the user is repeating a sentence with the same meaning, which indicates insufficient correction capability for this kind of sentence, so give negative feedback. In the remaining cases, the final recognition result is the output of the error correction module and the user has not repeated the same meaning, so the iterative learning model meets the user's needs and positive feedback is given. In particular, if several consecutive pieces of feedback are all positive, candidate-set construction can be interrupted once a preset count is reached and the second text data output directly.
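These feedback rules reduce to a small decision function; the similarity measure is assumed here to return a value in [0, 1], e.g., cosine similarity over TF-IDF vectors.

```python
# Sketch of the feedback rules above; `similarity` and its threshold
# are assumptions (e.g., cosine similarity of TF-IDF vectors).
def feedback(final_text: str, corrected_text: str, context: str,
             similarity, sim_threshold: float = 0.8) -> bool:
    """Return True for positive feedback, False for negative."""
    if final_text != corrected_text:
        return False                      # correction was overridden
    if similarity(final_text, context) > sim_threshold:
        return False                      # user repeated the same meaning
    return True                           # correction accepted, not repeated
```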
Example one: according to the error correction result S, i.e. "i want to find Langya" and the confusion set Cr ═ film and television scenesConstructing a candidate set Lr ═ film and television scenesThat is, the user needs to find the Lanya area.]. For candidate set Lr ═ film and television scenesAnd scoring, namely finding that the original sentence 'i want to find Langya board' is the highest in score, outputting a result and sending forward feedback to the iterative learning model according to rules.
Example two: according to the error correction result S, i.e. "I want to find Langya" and the confusion set Cr is regional sceneConstructing a candidate set Lr is regional sceneThat is, i want to find the langa area, i want to find the wolf ridge.]. For candidate set Lr is regional sceneAnd scoring, namely finding that the scoring of the area where I want to find Langya is the highest, outputting a result and sending negative feedback to the iterative learning model according to a rule.
The method in this embodiment can be used after the ASR method on a terminal device; the input data is text data and the output data is also text data. Because deep learning technology is not yet fully mature and suffers from problems such as poor generalization, traditional machine learning methods are used to guarantee a baseline for the deep learning model, and reinforcement learning is used to iteratively optimize and fine-tune it. When the method in this embodiment is applied to different terminal devices such as mobile phones, smart TVs, and smart-home devices, the training data, the number of classes, and the confusion sets can be adjusted accordingly to suit the terminal device in use.
Through this embodiment, when the "wolf-tooth club" in "I want to find the wolf-tooth club" is corrected, it can be corrected to "Langya Bang" in a movie scene; in a region scene it can be corrected to a suitable place name such as "Langya district" or "Langya Mountain"; and in a news scene it can be corrected to "Langya Net". Likewise, when the homophone "xing xing" in "I want to see xing xing" is corrected, it becomes "stars" in an astronomical scene, "orangutan" in an animal scene, and "comedy star" in an entertainment scene.
In summary, in the embodiments of the present invention, first text data is acquired, where the first text data is the text data output after a preset terminal device performs voice recognition on input voice; a usage scenario of the first text data is determined according to the current state of the terminal device, the first text data, and context information of the first text data; a confusion set of the first text data is determined according to the usage scenario; error correction is performed on the first text data using a preset deep learning model and an iterative learning model to obtain second text data; and the second text data is optimized according to the confusion set to obtain third text data. Based on deep learning, iterative learning, and related methods, the embodiments design a complete optimization pipeline for ASR-recognized text in complex scenarios and apply different corrections to erroneous text according to the scene, greatly improving error correction accuracy.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Fig. 4 is a structural diagram of an embodiment of a text error correction apparatus according to an embodiment of the present invention, which corresponds to the text error correction method described in the foregoing embodiment.
In this embodiment, a text error correction apparatus may include:
the text data acquisition module 401 is configured to acquire first text data, where the first text data is text data that is output after a preset terminal device performs voice recognition on input voice;
a usage scenario determining module 402, configured to determine a usage scenario of the first text data according to a current state of the terminal device, the first text data, and context information of the first text data;
a confusion set determining module 403, configured to determine a confusion set of the first text data according to the usage scenario;
an error correction module 404, configured to perform error correction processing on the first text data by using a preset deep learning model and an iterative learning model to obtain second text data;
and an optimizing module 405, configured to perform optimization processing on the second text data according to the confusion set, so as to obtain third text data.
Further, the usage scenario determination module may include:
a first candidate usage scenario determining unit, configured to determine a first candidate usage scenario of the first text data according to a current state of the terminal device;
a second candidate usage scenario determination unit, configured to determine a second candidate usage scenario of the first text data according to the first text data and context information of the first text data;
a usage scenario determination unit configured to determine a usage scenario of the first text data according to the first candidate usage scenario and the second candidate usage scenario.
Further, the text correction apparatus may further include:
a training data set obtaining module, configured to obtain a training data set, where each piece of training data in the training data set includes input data and a label, the input data is error text data and context information output by speech recognition, and the label is a correct result obtained after error correction is performed on the error text data;
and the first model training module is used for training the deep learning model by using the training data set to obtain the trained deep learning model.
Further, the text correction apparatus may further include:
the feedback information receiving module is used for judging whether the iterative learning model is activated or not through a preset identifier, and the activated iterative learning model receives feedback information;
and the second model training module is used for training the iterative model by using the training data set to obtain the trained iterative model.
Further, the optimization module may include:
a candidate set constructing unit, configured to construct a candidate set of the second text data according to the confusion set, where the candidate set includes candidate text data of the second text data;
the score calculating unit is used for calculating scores of the second text data and each candidate text data by using a preset score function;
the preferred text data selecting unit is used for selecting the candidate text data with the highest score as the preferred text data;
a first determining unit, configured to determine the preferred text data as the third text data if a difference between scores of the preferred text data and the second text data is greater than a preset threshold;
a second determining unit, configured to determine the second text data as the third text data if a difference between the scores of the preferred text data and the second text data is less than or equal to the threshold.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, modules and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Fig. 5 shows a schematic block diagram of a terminal device according to an embodiment of the present invention, and for convenience of description, only the relevant parts related to the embodiment of the present invention are shown.
As shown in fig. 5, the terminal device 5 of this embodiment includes: a processor 50, a memory 51 and a computer program 52 stored in said memory 51 and executable on said processor 50. The processor 50, when executing the computer program 52, implements the steps in the various text error correction method embodiments described above, such as the steps S101 to S105 shown in fig. 1. Alternatively, the processor 50, when executing the computer program 52, implements the functions of each module/unit in the above-mentioned device embodiments, for example, the functions of the modules 401 to 405 shown in fig. 4.
Illustratively, the computer program 52 may be partitioned into one or more modules/units that are stored in the memory 51 and executed by the processor 50 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 52 in the terminal device 5.
The terminal device 5 may be a mobile phone, a tablet computer, a desktop computer, a notebook computer, a palm computer, a cloud server, or other computing devices. It will be understood by those skilled in the art that fig. 5 is only an example of the terminal device 5, and does not constitute a limitation to the terminal device 5, and may include more or less components than those shown, or combine some components, or different components, for example, the terminal device 5 may further include an input-output device, a network access device, a bus, etc.
The Processor 50 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 51 may be an internal storage unit of the terminal device 5, such as a hard disk or a memory of the terminal device 5. The memory 51 may also be an external storage device of the terminal device 5, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 5. Further, the memory 51 may also include both an internal storage unit and an external storage device of the terminal device 5. The memory 51 is used for storing the computer programs and other programs and data required by the terminal device 5. The memory 51 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. A text error correction method, comprising:
acquiring first text data, wherein the first text data is output text data after voice recognition is carried out on input voice by preset terminal equipment;
determining a usage scenario of the first text data according to the current state of the terminal device, the first text data and context information of the first text data;
determining a confusion set of the first text data according to the usage scenario;
performing error correction processing on the first text data by using a preset deep learning model and an iterative learning model to obtain second text data;
and optimizing the second text data according to the confusion set to obtain third text data.
2. The text error correction method according to claim 1, wherein the determining the usage scenario of the first text data according to the current state of the terminal device, the first text data, and the context information of the first text data comprises:
determining a first candidate use scene of the first text data according to the current state of the terminal equipment;
determining a second candidate use scene of the first text data according to the first text data and the context information of the first text data;
determining a usage scenario of the first text data according to the first candidate usage scenario and the second candidate usage scenario.
3. The text correction method of claim 1, wherein the training process of the deep learning model comprises:
acquiring a training data set, wherein each piece of training data in the training data set comprises input data and a label, the input data is error text data and context information output by voice recognition, and the label is a correct result obtained after error correction is performed on the error text data;
and training the deep learning model by using the training data set to obtain the trained deep learning model.
4. The text correction method of claim 3, wherein the training process of the iterative learning model comprises:
judging whether the iterative learning model is activated or not through a preset identifier, wherein the activated iterative learning model receives feedback information;
and training the iterative model by using the training data set to obtain the trained iterative model.
5. The text error correction method according to any one of claims 1 to 4, wherein the optimizing the second text data according to the confusion set to obtain third text data comprises:
constructing a candidate set of the second text data according to the confusion set, wherein the candidate set comprises each candidate text data of the second text data;
respectively calculating scores of the second text data and each candidate text data by using a preset score function;
selecting candidate text data with the highest score as preferred text data;
if the difference between the scores of the preferred text data and the second text data is larger than a preset threshold value, determining the preferred text data as the third text data;
and if the difference between the scores of the preferred text data and the second text data is less than or equal to the threshold value, determining the second text data as the third text data.
6. A text correction apparatus, comprising:
the text data acquisition module is used for acquiring first text data, wherein the first text data is output text data after voice recognition is carried out on input voice by preset terminal equipment;
the usage scenario determining module is used for determining a usage scenario of the first text data according to the current state of the terminal device, the first text data and the context information of the first text data;
a confusion set determining module for determining a confusion set of the first text data according to the usage scenario;
the error correction module is used for carrying out error correction processing on the first text data by using a preset deep learning model and an iterative learning model to obtain second text data;
and the optimization module is used for optimizing the second text data according to the confusion set to obtain third text data.
7. The text correction apparatus of claim 6, wherein the usage scenario determination module comprises:
a first candidate usage scenario determining unit, configured to determine a first candidate usage scenario of the first text data according to a current state of the terminal device;
a second candidate usage scenario determination unit, configured to determine a second candidate usage scenario of the first text data according to the first text data and context information of the first text data;
a usage scenario determination unit configured to determine a usage scenario of the first text data according to the first candidate usage scenario and the second candidate usage scenario.
8. The text correction apparatus according to claim 6 or 7, wherein the optimization module comprises:
a candidate set constructing unit, configured to construct a candidate set of the second text data according to the confusion set, where the candidate set includes candidate text data of the second text data;
the score calculating unit is used for calculating scores of the second text data and each candidate text data by using a preset score function;
the preferred text data selecting unit is used for selecting the candidate text data with the highest score as the preferred text data;
a first determining unit, configured to determine the preferred text data as the third text data if a difference between scores of the preferred text data and the second text data is greater than a preset threshold;
a second determining unit, configured to determine the second text data as the third text data if a difference between the scores of the preferred text data and the second text data is less than or equal to the threshold.
9. A computer readable storage medium storing computer readable instructions, wherein the computer readable instructions, when executed by a processor, implement the steps of the text correction method according to any one of claims 1 to 5.
10. A terminal device comprising a memory, a processor and computer readable instructions stored in the memory and executable on the processor, characterized in that the processor implements the steps of the text correction method according to any one of claims 1 to 5 when executing the computer readable instructions.
CN201910387845.3A 2019-05-10 2019-05-10 Text error correction method and device, computer readable storage medium and terminal equipment Pending CN112002311A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910387845.3A CN112002311A (en) 2019-05-10 2019-05-10 Text error correction method and device, computer readable storage medium and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910387845.3A CN112002311A (en) 2019-05-10 2019-05-10 Text error correction method and device, computer readable storage medium and terminal equipment

Publications (1)

Publication Number Publication Date
CN112002311A (en) 2020-11-27

Family

ID=73461185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910387845.3A Pending CN112002311A (en) 2019-05-10 2019-05-10 Text error correction method and device, computer readable storage medium and terminal equipment

Country Status (1)

Country Link
CN (1) CN112002311A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103413549A (en) * 2013-07-31 2013-11-27 深圳创维-Rgb电子有限公司 Voice interaction method and system and interaction terminal
CN107357775A (en) * 2017-06-05 2017-11-17 百度在线网络技术(北京)有限公司 The text error correction method and device of Recognition with Recurrent Neural Network based on artificial intelligence
US20180349327A1 (en) * 2017-06-05 2018-12-06 Baidu Online Network Technology (Beijing)Co., Ltd. Text error correction method and apparatus based on recurrent neural network of artificial intelligence
CN107741928A (en) * 2017-10-13 2018-02-27 四川长虹电器股份有限公司 A kind of method to text error correction after speech recognition based on field identification
CN108091328A (en) * 2017-11-20 2018-05-29 北京百度网讯科技有限公司 Speech recognition error correction method, device and readable medium based on artificial intelligence
CN108646580A (en) * 2018-05-14 2018-10-12 中兴通讯股份有限公司 The determination method and device of control object, storage medium, electronic device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541342A (en) * 2020-12-08 2021-03-23 北京百度网讯科技有限公司 Text error correction method and device, electronic equipment and storage medium
JP2022091121A (en) * 2020-12-08 2022-06-20 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Text error correction method, apparatus, electronic device, storage medium, and program
JP7286737B2 (en) 2020-12-08 2023-06-05 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Text error correction method, device, electronic device, storage medium and program
WO2022135206A1 (en) * 2020-12-25 2022-06-30 华为技术有限公司 Text error correction method and electronic device
CN113192497A (en) * 2021-04-28 2021-07-30 平安科技(深圳)有限公司 Speech recognition method, apparatus, device and medium based on natural language processing
CN113192497B (en) * 2021-04-28 2024-03-01 平安科技(深圳)有限公司 Speech recognition method, device, equipment and medium based on natural language processing
CN114120972A (en) * 2022-01-28 2022-03-01 科大讯飞华南有限公司 Intelligent voice recognition method and system based on scene
CN114120972B (en) * 2022-01-28 2022-04-12 科大讯飞华南有限公司 Intelligent voice recognition method and system based on scene

Similar Documents

Publication Publication Date Title
CN112002311A (en) Text error correction method and device, computer readable storage medium and terminal equipment
WO2018040899A1 (en) Error correction method and device for search term
WO2020119496A1 (en) Communication method, device and equipment based on artificial intelligence and readable storage medium
CN110163181B (en) Sign language identification method and device
CN111145732B (en) Processing method and system after multi-task voice recognition
CN110930980B (en) Acoustic recognition method and system for Chinese and English mixed voice
WO2024098533A1 (en) Image-text bidirectional search method, apparatus and device, and non-volatile readable storage medium
WO2020143320A1 (en) Method and apparatus for acquiring word vectors of text, computer device, and storage medium
CN114022882B (en) Text recognition model training method, text recognition device, text recognition equipment and medium
WO2024098623A1 (en) Cross-media retrieval method and apparatus, cross-media retrieval model training method and apparatus, device, and recipe retrieval system
CN114445831A (en) Image-text pre-training method, device, equipment and storage medium
WO2024098524A1 (en) Text and video cross-searching method and apparatus, model training method and apparatus, device, and medium
US11615247B1 (en) Labeling method and apparatus for named entity recognition of legal instrument
CN112687266A (en) Speech recognition method, speech recognition device, computer equipment and storage medium
WO2023029354A1 (en) Text information extraction method and apparatus, and storage medium and computer device
CN114861758A (en) Multi-modal data processing method and device, electronic equipment and readable storage medium
WO2024098763A1 (en) Text operation diagram mutual-retrieval method and apparatus, text operation diagram mutual-retrieval model training method and apparatus, and device and medium
CN113220828A (en) Intention recognition model processing method and device, computer equipment and storage medium
CN115688868B (en) Model training method and computing equipment
CN110516125A (en) Identify method, apparatus, equipment and the readable storage medium storing program for executing of unusual character string
CN110717022A (en) Robot dialogue generation method and device, readable storage medium and robot
WO2021082570A1 (en) Artificial intelligence-based semantic identification method, device, and semantic identification apparatus
CN113342981A (en) Demand document classification method and device based on machine learning
WO2022141855A1 (en) Text regularization method and apparatus, and electronic device and storage medium
CN113961701A (en) Message text clustering method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination