CN113779970A - Text error correction method and related equipment thereof - Google Patents

Info

Publication number
CN113779970A
Authority
CN
China
Prior art keywords
error correction
result
text
correction
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111122968.8A
Other languages
Chinese (zh)
Other versions
CN113779970B (en
Inventor
李�浩
龚笠
杨晶生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202111122968.8A priority Critical patent/CN113779970B/en
Publication of CN113779970A publication Critical patent/CN113779970A/en
Priority to PCT/CN2022/119636 priority patent/WO2023045868A1/en
Application granted granted Critical
Publication of CN113779970B publication Critical patent/CN113779970B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/226Validation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/232Orthographic correction, e.g. spell checking or vowelisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a text error correction method and related equipment. The method comprises: after a text to be processed is obtained, first determining an error correction result of the text to be processed and correction reference information of the text to be processed; then performing preset correction processing on the error correction result by using the correction reference information to obtain an error correction result to be used, so that the modification suggestions in the error correction result to be used are more accurate; and finally determining text error correction information of the text to be processed according to the error correction result to be used. The text error correction information can thus more accurately represent the modification suggestions for at least one erroneous character in the text to be processed, and the probability of an erroneous correction is reduced as far as possible when the text to be processed is modified by using the text error correction information, which improves the text error correction effect and the user's text input experience.

Description

Text error correction method and related equipment thereof
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a text error correction method and a related device.
Background
In some text input scenarios (e.g., document editing, search engines, etc.), text input errors (e.g., misspellings, wrong selections among similar pronunciations, wrong inputs of similar glyphs, etc.) are prone to occur. To improve the user experience, these text input errors need to be corrected (e.g., by the modification suggestions shown in FIG. 1) so that they do not adversely affect subsequent text processing procedures (e.g., search recommendation procedures, etc.).
However, existing text error correction technologies have defects that make their error correction effect poor, which in turn degrades the user's text input experience.
Disclosure of Invention
In order to solve the technical problem, the application provides a text error correction method and related equipment thereof, which can improve the text error correction effect, thereby improving the text input experience of a user.
In order to achieve the above purpose, the technical solutions provided in the embodiments of the present application are as follows:
An embodiment of the application provides a text error correction method, which comprises the following steps: after a text to be processed is obtained, determining an error correction result of the text to be processed and correction reference information of the text to be processed; performing preset correction processing on the error correction result by using the correction reference information to obtain an error correction result to be used; and determining text error correction information of the text to be processed according to the error correction result to be used.
In a possible embodiment, the correction reference information includes an error detection result and/or a protection character recognition result; the error detection result is used for representing the position of at least one error character in the text to be processed; the protection character recognition result is used for representing the position of at least one protected character in the text to be processed.
In a possible embodiment, the correction reference information comprises an error detection result; the using the correction reference information to perform preset correction processing on the error correction result to obtain an error correction result to be used includes: and performing first correction processing on the error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain the error correction result to be used.
In one possible embodiment, the correction reference information includes a protection character recognition result; the using the correction reference information to perform preset correction processing on the error correction result to obtain an error correction result to be used includes: and performing second correction processing on the error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain the error correction result to be used.
In one possible embodiment, the correction reference information includes an error detection result and a protection character recognition result; the determination process of the error correction result to be used comprises the following steps: performing first correction processing on the error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain a first error correction result; and performing second correction processing on the first error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain the error correction result to be used.
In one possible embodiment, the correction reference information includes an error detection result and a protection character recognition result; the determination process of the error correction result to be used comprises the following steps: performing first correction processing on the error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain a first error correction result; performing second correction processing on the error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain a second error correction result; and determining the error correction result to be used according to the first error correction result and the second error correction result.
In one possible embodiment, the correction reference information includes an error detection result and a protection character recognition result; the determination process of the error correction result to be used comprises the following steps: performing second correction processing on the error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain a second error correction result; and performing first correction processing on the second error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain the error correction result to be used.
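The three orderings described above (first then second, both applied to the original result and then merged, and second then first) can be sketched as function composition. This is only a minimal illustration: the tuple layout of a suggestion, the function names, and the use of intersection as the merge rule are assumptions made here, since the application leaves the merge rule open.

```python
from typing import Callable, List, Tuple

# Hypothetical layout of one modification suggestion: (position, original, replacement).
Suggestion = Tuple[int, str, str]
Correction = Callable[[List[Suggestion]], List[Suggestion]]


def first_then_second(result: List[Suggestion], first: Correction,
                      second: Correction) -> List[Suggestion]:
    """Apply the first correction, then the second correction to its output."""
    return second(first(result))


def second_then_first(result: List[Suggestion], first: Correction,
                      second: Correction) -> List[Suggestion]:
    """Apply the second correction, then the first correction to its output."""
    return first(second(result))


def parallel_then_merge(result: List[Suggestion], first: Correction,
                        second: Correction) -> List[Suggestion]:
    """Apply both corrections to the original result and merge the outputs."""
    first_result = first(result)
    second_result = second(result)
    # Assumed merge rule: keep only suggestions that survive both corrections.
    return [s for s in first_result if s in second_result]
```

With corrections that only delete suggestions, all three orderings may yield the same output; they can differ once a correction depends on which other suggestions are still present.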
In a possible embodiment, the number of the error detection results is N, and the number of the first objects to be corrected is M, where N and M are positive integers; the first correction processing includes: voting on the k-th modification suggestion in the m-th first object to be corrected by using the N error detection results and at least one other first object to be corrected, other than the m-th one, among the M first objects to be corrected, to obtain a reserved voting result of the k-th modification suggestion; and if the reserved voting result of the k-th modification suggestion does not meet a first condition, deleting the k-th modification suggestion from the m-th first object to be corrected; where m is a positive integer, m ≤ M, k is a positive integer, k ≤ K_m, and K_m is a positive integer representing the number of modification suggestions in the m-th first object to be corrected.
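The voting step can be illustrated with a short sketch. The application does not fix how votes are cast or what the first condition is, so everything below is an assumption: each error detection result is modelled as a set of flagged positions and casts one vote when it flags the suggestion's position, each other object to be corrected casts one vote when it contains the same suggestion, and the first condition is taken to be a minimum vote count (`min_votes`, a hypothetical parameter).

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Suggestion:
    position: int      # index of the character (or span start) to change
    original: str      # text currently at that position
    replacement: str   # proposed text


def reserved_votes(s: Suggestion, detections: List[Set[int]],
                   others: List[List[Suggestion]]) -> int:
    """Count votes in favour of keeping suggestion s (assumed voting scheme)."""
    votes = sum(1 for flagged in detections if s.position in flagged)
    votes += sum(1 for other in others if s in other)
    return votes


def first_correction(objects: List[List[Suggestion]],
                     detections: List[Set[int]],
                     min_votes: int = 1) -> List[List[Suggestion]]:
    """Delete every suggestion whose vote count misses the assumed first condition."""
    corrected = []
    for m, obj in enumerate(objects):
        others = objects[:m] + objects[m + 1:]
        corrected.append(
            [s for s in obj if reserved_votes(s, detections, others) >= min_votes]
        )
    return corrected
```

Votes are always counted against the original objects rather than the partially corrected ones, so the outcome does not depend on the order in which the M objects are visited.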
In a possible implementation manner, the number of the protection character recognition results is Q, where Q is a positive integer; the second correction processing includes: voting on the r-th modification suggestion in the second object to be corrected by using the Q protection character recognition results to obtain a deletion voting result of the r-th modification suggestion, where r is a positive integer, r ≤ R, and R is a positive integer representing the number of modification suggestions in the second object to be corrected; and if the deletion voting result of the r-th modification suggestion meets a second condition, deleting the r-th modification suggestion from the second object to be corrected.
In one possible embodiment, the method further comprises: if the deletion voting result of the r-th modification suggestion meets the second condition, determining a screening condition to be used according to the r-th modification suggestion; searching the second object to be corrected for target modification suggestions meeting the screening condition to be used, to obtain a search result; and if the search result shows that at least one target modification suggestion exists in the second object to be corrected, deleting the at least one target modification suggestion from the second object to be corrected.
In one possible embodiment, the r-th modification suggestion includes modifying the first character information into the second character information; the screening condition to be used is determined according to the first character information.
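The deletion vote and the follow-up screening can be sketched together. The second condition and the exact screening condition are not specified, so this sketch assumes that each protection character recognition result is a set of protected positions, that the second condition is a minimum number of deletion votes (`min_delete_votes`, hypothetical), and that the screening condition matches any suggestion whose original text equals the first character information of a deleted suggestion.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Suggestion:
    position: int      # index of the character (or span start) to change
    original: str      # "first character information"
    replacement: str   # "second character information"


def second_correction(obj: List[Suggestion], protected: List[Set[int]],
                      min_delete_votes: int = 1) -> List[Suggestion]:
    """Drop suggestions that touch protected characters, plus look-alikes."""
    blocked_originals = set()
    for s in obj:
        # One deletion vote per recognition result that protects this position.
        votes = sum(1 for positions in protected if s.position in positions)
        if votes >= min_delete_votes:
            blocked_originals.add(s.original)
    # Assumed screening condition: same original text as a deleted suggestion.
    return [s for s in obj if s.original not in blocked_originals]
```

For example, if a name at position 0 is protected and the same name appears again at position 20, the suggestion at position 20 is removed as well, even though no recognition result flagged it directly.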
In a possible implementation, the determining of the error detection result includes: carrying out error detection processing on the text to be processed by utilizing at least one pre-constructed error detection model and/or at least one error detection rule to obtain at least one error detection result of the text to be processed; the determination process of the protection character recognition result comprises the following steps: and performing protection character recognition processing on the text to be processed by utilizing at least one pre-constructed protection character recognition model and/or at least one protection character recognition rule to obtain at least one protection character recognition result of the text to be processed.
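As a small illustration of a rule-based recognizer, the sketch below marks dates as protected characters. The regular expression, the date format, and the function name are all assumptions chosen for the example; the application only states that at least one recognition model and/or rule may be used.

```python
import re
from typing import Set

# Hypothetical rule: treat ISO-style dates (YYYY-MM-DD) as protected characters.
DATE_RULE = re.compile(r"\d{4}-\d{2}-\d{2}")


def protected_positions(text: str) -> Set[int]:
    """Return the character positions covered by the date rule."""
    positions: Set[int] = set()
    for match in DATE_RULE.finditer(text):
        positions.update(range(match.start(), match.end()))
    return positions
```

Running several such rules (or models) on the same text would give several protection character recognition results, one set of positions per rule.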
In one possible embodiment, the determining of the error correction result includes: and carrying out error correction processing on the text to be processed by utilizing at least one pre-constructed error correction model and/or at least one error correction rule to obtain at least one error correction result of the text to be processed.
In a possible implementation manner, the determining text correction information of the text to be processed according to the result of the correction to be used includes: carrying out preset suggestion screening processing on the error correction result to be used to obtain a third error correction result; and determining text error correction information of the text to be processed according to the third error correction result.
In a possible implementation, the determining of the third error correction result includes: determining rewriting probabilities of various modification suggestions in the error correction result to be used; judging whether the rewriting probability of each modification suggestion in the error correction result to be used meets a third condition or not to obtain a judgment result of each modification suggestion; and according to the judgment result of each modification suggestion, carrying out rewriting suggestion deletion processing on the error correction result to be used to obtain the third error correction result.
In one possible embodiment, the to-be-used error correction result includes a to-be-used suggestion, and the to-be-used suggestion includes: modifying the third character information into fourth character information; the determination process of the rewriting probability of the to-be-used suggestion includes: determining a feature difference degree between the third character information and the fourth character information according to the character feature information of the third character information and the character feature information of the fourth character information; and determining the rewriting probability of the to-be-used suggestion according to the feature difference degree between the third character information and the fourth character information.
In one possible implementation, the character feature information includes: at least one of the input operation information, the pronunciation characterization information, and the character shape information.
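A rough sketch of this screening step follows. It assumes, purely for illustration, that the raw character sequence stands in for the input operation information, that the feature difference degree is one minus a `difflib` similarity ratio, that the rewriting probability equals the feature difference degree, and that a suggestion is deleted when its rewriting probability exceeds a threshold (`max_probability`, hypothetical). Pronunciation or glyph features would need their own lookup tables, which are not shown.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical layout of one modification suggestion: (position, original, replacement).
Suggestion = Tuple[int, str, str]


def feature_difference(original: str, replacement: str) -> float:
    """Assumed feature difference degree: 1 - similarity of the two strings."""
    return 1.0 - SequenceMatcher(None, original, replacement).ratio()


def rewriting_probability(suggestion: Suggestion) -> float:
    """Assumed mapping: the rewriting probability equals the feature difference."""
    _, original, replacement = suggestion
    return feature_difference(original, replacement)


def drop_rewrites(result: List[Suggestion],
                  max_probability: float = 0.5) -> List[Suggestion]:
    """Keep suggestions that look like corrections rather than rewrites."""
    return [s for s in result if rewriting_probability(s) <= max_probability]
```

The intuition is that a genuine input error usually stays close to the intended text, whereas a suggestion that replaces it with something very different is more likely a rewrite.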
In a possible implementation manner, the determining process of the text error correction information of the text to be processed includes: performing text modification processing on the text to be processed by using the t-th error correction result to obtain a t-th candidate error correction text, where t is a positive integer, t ≤ T, and T is a positive integer representing the number of the error correction results to be processed; determining the smoothness score of the t-th candidate error correction text; screening the corrected text meeting a fourth condition from the T candidate error correction texts according to the smoothness scores of the T candidate error correction texts; and determining text error correction information of the text to be processed according to the to-be-processed error correction result corresponding to the corrected text.
In a possible implementation manner, the screening, according to the smoothness scores of the T candidate error correction texts, the corrected text meeting a fourth condition from the T candidate error correction texts includes: determining the highest smoothness score among the smoothness scores of the T candidate error correction texts; and if it is determined that the highest smoothness score and the smoothness score of the text to be processed meet a fifth condition, determining the candidate error correction text with the highest smoothness score as the corrected text.
In a possible implementation manner, the screening, according to the smoothness scores of the T candidate corrected texts, the corrected texts meeting a fourth condition from the T candidate corrected texts includes: screening at least one target error correction text meeting a sixth condition from the T candidate error correction texts according to the smoothness scores of the T candidate error correction texts; and screening the corrected text meeting a seventh condition from the at least one target corrected text.
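The smoothness-based selection can be sketched as follows. The scoring function is passed in as a parameter because the application does not name one (a language model score would be a natural choice). The edit layout `(start, end, replacement)` and the reading of the fifth condition as "the best candidate beats the original text by at least `margin`" are assumptions of this sketch.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical edit layout: replace text[start:end] with replacement.
Edit = Tuple[int, int, str]


def apply_edits(text: str, edits: List[Edit]) -> str:
    """Apply non-overlapping edits right to left so earlier offsets stay valid."""
    for start, end, replacement in sorted(edits, reverse=True):
        text = text[:start] + replacement + text[end:]
    return text


def select_result(text: str, results: List[List[Edit]],
                  smoothness: Callable[[str], float],
                  margin: float = 0.1) -> Optional[List[Edit]]:
    """Return the error correction result whose candidate text reads best."""
    if not results:
        return None
    candidates = [apply_edits(text, edits) for edits in results]
    scores = [smoothness(candidate) for candidate in candidates]
    best = max(range(len(results)), key=scores.__getitem__)
    # Assumed fifth condition: the best candidate must beat the original text.
    if scores[best] - smoothness(text) >= margin:
        return results[best]
    return None
```

Returning `None` when no candidate clearly improves on the original text means that no modification is suggested in that case.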
An embodiment of the present application further provides a text error correction apparatus, including: the result determining unit is used for determining an error correction result of the text to be processed and correction reference information of the text to be processed after the text to be processed is obtained; the result correction unit is used for carrying out preset correction processing on the error correction result by utilizing the correction reference information to obtain an error correction result to be used; and the information determining unit is used for determining the text error correction information of the text to be processed according to the error correction result to be used.
An embodiment of the present application further provides an apparatus, where the apparatus includes a processor and a memory: the memory is used for storing a computer program; the processor is configured to perform the method of any one of claims 1-17 in accordance with the computer program.
The embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium is used for storing a computer program, and the computer program is used for executing any implementation manner of the text error correction method provided in the embodiment of the present application.
The embodiment of the present application further provides a computer program product, which when running on a terminal device, enables the terminal device to execute any implementation manner of the text error correction method provided by the embodiment of the present application.
Compared with the prior art, the embodiment of the application has at least the following advantages:
According to the technical solution provided by the embodiment of the application, after the text to be processed is obtained, an error correction result of the text to be processed and correction reference information of the text to be processed are first determined; then preset correction processing is performed on the error correction result by using the correction reference information to obtain an error correction result to be used, so that the modification suggestions in the error correction result to be used are more accurate; finally, the text error correction information of the text to be processed is determined according to the error correction result to be used. The text error correction information can thus more accurately represent the modification suggestions for at least one erroneous character in the text to be processed, and the probability of an erroneous correction is reduced as far as possible when the text to be processed is modified by using the text error correction information, which improves the text error correction effect and the user's text input experience.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of a modification suggestion provided by an embodiment of the present application;
FIG. 2 is a flowchart of a text error correction method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a text error correction process according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a first correction process according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a text error correction apparatus according to an embodiment of the present application.
Detailed Description
In researching text error correction technologies, the inventors found that some of these technologies have defects that make the modification suggestions they produce prone to errors. For example, they tend to modify characters that are actually correct (e.g., correct characters in entity names such as person names, place names, organization names, and object names, or correct characters in dates). As a result, the text error correction effect of these technologies is poor, and so is the user's text input experience.
Based on the above findings, in order to overcome the technical problems described in the Background section, an embodiment of the present application provides a text error correction method, including: after the text to be processed is obtained, first determining an error correction result of the text to be processed and correction reference information of the text to be processed; then performing preset correction processing on the error correction result by using the correction reference information to obtain an error correction result to be used, so that the modification suggestions in the error correction result to be used are more accurate; and finally determining the text error correction information of the text to be processed according to the error correction result to be used. The text error correction information can thus more accurately represent the modification suggestions for at least one erroneous character in the text to be processed, and the probability of an erroneous correction is reduced as far as possible when the text to be processed is modified by using the text error correction information, which improves the text error correction effect and the user's text input experience.
In addition, the embodiment of the present application does not limit the execution subject of the text error correction method, and for example, the text error correction method provided by the embodiment of the present application may be applied to a data processing device such as a terminal device or a server. The terminal device may be a voice processing terminal, a smart phone, a computer, a Personal Digital Assistant (PDA), a tablet computer, or the like. The server may be a stand-alone server, a cluster server, or a cloud server.
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In order to facilitate understanding of the present application, a text error correction method provided by the embodiments of the present application is described below with reference to the accompanying drawings.
FIG. 2 is a flowchart of a text error correction method provided in an embodiment of the present application.
The text error correction method provided by the embodiment of the application comprises steps S1 to S3:
S1: After the text to be processed is obtained, determining an error correction result of the text to be processed and correction reference information of the text to be processed.
The text to be processed refers to text data needing text error correction processing; the embodiment of the present application does not limit the manner of obtaining the text to be processed, and may be implemented by any character input device (e.g., a keyboard, a stylus, a handwriting touch pad, etc.), for example. As another example, multimedia data (e.g., voice data, image data, video data, etc.) may be collected via any multimedia input device (e.g., microphone, camera, etc.); and then, carrying out character recognition processing on the multimedia data to obtain a text to be processed, so that the text to be processed is used for representing character information carried by the multimedia data.
The above "error correction result of the text to be processed" is used to indicate a modification suggestion for at least one error character in the text to be processed; moreover, the embodiment of the present application does not limit the "error correction result of the text to be processed", for example, it may include at least one modification suggestion.
In addition, the embodiment of the present application does not limit the determination process of the "error correction result of the text to be processed"; for example, any existing or future method capable of performing error correction processing on text data may be used. For another example, in order to further improve the text error correction effect, any embodiment of the determination process of the error correction result shown below may be adopted.
The "correction reference information of the text to be processed" is used to indicate reference information (for example, position information of an error character in the text to be processed, and/or position information of a protected character in the text to be processed, and the like) that is required to be used when performing preset correction processing on the "error correction result of the text to be processed"; the determination process of the "correction reference information of the text to be processed" is not limited in the embodiments of the present application, and may be implemented by means of a pre-constructed correction reference information recognition model, for example. Here, the "correction reference information recognition model" is used to perform correction reference information recognition processing on input data of the correction reference information recognition model.
It should be noted that the "correction reference information recognition model" may be constructed according to the first sample text and the actual correction reference information of the first sample text. The "actual correction reference information of the first sample text" is used for representing reference information which is actually required to be used for carrying out preset correction processing on the "error correction result of the text to be processed"; the embodiment of the present application does not limit the manner of acquiring the "actual correction reference information of the first sample text", and it may be obtained, for example, by manual labeling. In addition, the embodiment of the present application also does not limit the building process of the "correction reference information recognition model", which may be implemented by any existing or future model building method.
Based on the related content of S1, after the text to be processed is obtained, performing text error correction processing on the text to be processed to obtain an error correction result of the text to be processed, so that the error correction result can indicate a modification suggestion for at least one error character in the text to be processed; and performing correction reference information identification processing on the text to be processed to obtain correction reference information of the text to be processed, so that the correction reference information can show reference information required by performing preset correction processing on the error correction result of the text to be processed, and the error correction result of the text to be processed can be subjected to preset correction processing on the basis of the correction reference information subsequently, so that modification suggestions with errors can be eliminated from the error correction result of the text to be processed as much as possible.
S2: Performing preset correction processing on the error correction result of the text to be processed by using the correction reference information of the text to be processed, to obtain an error correction result to be used.
The preset correction processing is used to correct a text error correction result.
In addition, the embodiment of the present application does not limit the preset correction processing; for example, it may specifically include: correcting the error correction result of the text to be processed according to a preset correction rule by using the correction reference information of the text to be processed, to obtain the error correction result to be used. The "correction rule" is a rule, set according to the application scenario, that is required for correcting a text error correction result.
The "error correction result to be used" refers to the corrected text error correction result obtained after the preset correction processing is performed on the "error correction result of the text to be processed", so that the error correction result to be used contains far fewer erroneous modification suggestions than the "error correction result of the text to be processed" (possibly almost none).
Based on the related content of S2, after the error correction result and the correction reference information of the text to be processed are obtained, the preset correction processing may be performed on the error correction result according to the correction reference information to obtain the error correction result to be used. The error correction result to be used then records far fewer erroneous modification suggestions than the "error correction result of the text to be processed" (possibly almost none), so it can more accurately represent the modification suggestions for at least one erroneous character in the text to be processed.
S3: determining the text error correction information of the text to be processed according to the error correction result to be used.
Here, the text error correction information of the text to be processed is used to represent modification suggestions for at least one error character in the text to be processed.
In addition, the embodiments of the present application do not limit the implementation of S3. For example, it may specifically include: determining the error correction result to be used as the text error correction information of the text to be processed. As another example, in order to further improve the text error correction effect, S3 may be implemented using any one of the implementations of S3 shown below.
Based on the related content of S1 to S3, in the text error correction method provided in the embodiments of the present application, after the text to be processed is acquired, the error correction result and the correction reference information of the text to be processed are determined first. Then, preset correction processing is performed on the error correction result using the correction reference information to obtain the error correction result to be used, so that the modification suggestions in the error correction result to be used are more accurate. Finally, the text error correction information of the text to be processed is determined according to the error correction result to be used, so that the text error correction information can more accurately represent the modification suggestions for at least one error character in the text to be processed. When the text to be processed is modified using this text error correction information, the probability of erroneous corrections is reduced as far as possible, which improves the text error correction effect and the user's text input experience.
In order to further improve the text error correction effect, the embodiments of the present application further provide a possible implementation of determining the "error correction result", which may specifically include: performing error correction processing on the text to be processed using at least one pre-constructed error correction model and/or at least one error correction rule, to obtain at least one error correction result of the text to be processed.
The "error correction model" is used to perform error correction processing on input data of the error correction model; moreover, the embodiments of the present application do not limit the "error correction model," and may be implemented using any machine learning model (e.g., a language model), for example. In addition, the embodiment of the present application does not limit the construction process of the "error correction model", and may be implemented by using any existing or future model construction method.
It should be noted that, the network structure of different "error correction models" in the "at least one pre-constructed error correction model" may be different, and/or the construction process of different "error correction models" may also be different, so that there is also a difference in error correction processing performance between different "error correction models".
The above-mentioned "error correction rule" refers to a rule to be followed when performing error correction processing (for example, regular expression matching) on a piece of text data. The embodiments of the present application do not limit the manner of acquiring the "error correction rule"; for example, it may be preset according to the application scenario.
To facilitate understanding of the above-described determination of "at least one error correction result", the following description is made in conjunction with fig. 3.
As an example, as shown in fig. 3, the determination process of the "at least one error correction result" may include: performing error correction processing on the text to be processed using error correction module d to obtain the dth error correction result of the text to be processed. Error correction module d is used to perform error correction processing on a piece of text data; the embodiments of the present application do not limit the working principle of "error correction module d". For example, it may perform error correction processing using a pre-constructed error correction model, or using an error correction rule. Here, d is a positive integer, d ≤ D, D is a positive integer, and D represents the number of error correction modules in fig. 3 (i.e., the number of error correction results in the above "at least one error correction result").
Based on the relevant content of the determination process of the error correction result, the error correction process can be respectively performed on the text to be processed by means of a plurality of pre-constructed error correction models and/or a plurality of error correction rules to obtain a plurality of error correction results of the text to be processed, so that the error correction results can better represent the modification suggestion for at least one error character in the text to be processed, and the text error correction effect is favorably improved.
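For illustration only, the following Python sketch shows one way the D error correction modules of fig. 3 might be run independently over the same text. The suggestion format (start, end, replacement), the helper names make_rule_module and run_correction_modules, and the two toy regular-expression rules are assumptions introduced here and are not part of the present application; a model-based module would simply be another callable with the same signature.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical suggestion format: (start index, end index, replacement text).
Suggestion = Tuple[int, int, str]
CorrectionModule = Callable[[str], List[Suggestion]]


def make_rule_module(pattern: str, replacement: str) -> CorrectionModule:
    """Build a rule-based module that suggests `replacement` for every match of `pattern`."""
    compiled = re.compile(pattern)

    def module(text: str) -> List[Suggestion]:
        return [(m.start(), m.end(), replacement) for m in compiled.finditer(text)]

    return module


def run_correction_modules(text: str, modules: List[CorrectionModule]) -> List[List[Suggestion]]:
    """Run each module on the same text; the d-th entry is the d-th error correction result."""
    return [module(text) for module in modules]


# Two toy rules standing in for error correction modules 1 and 2 (D = 2).
modules = [make_rule_module(r"teh", "the"), make_rule_module(r"recieve", "receive")]
results = run_correction_modules("teh cat will recieve teh toy", modules)
print(results)
```

Keeping every module behind the same callable signature means models and rules can be mixed freely, which matches the "and/or" wording above.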
In fact, in order to improve the text error correction effect, the position of the error character in one text data can be referred to, and the correction reference information of the text data can be determined. Based on this, in a possible implementation, the "correction reference information of the text to be processed" may include an error detection result of the text to be processed.
The "error detection result of the text to be processed" is used to indicate a position of at least one error character in the text to be processed; moreover, the embodiment of the present application does not limit the "error detection result of the text to be processed", for example, it may include at least one error character position.
In addition, the determination process of the "error detection result of the text to be processed" is not limited in the embodiments of the present application, and for example, any method that can perform error detection processing on one piece of text data, which is currently available or will appear in the future, may be used.
In addition, in order to improve the accuracy of the error detection result, the embodiments of the present application further provide another possible implementation of determining the "error detection result", which may specifically include: performing error detection processing on the text to be processed using at least one pre-constructed error detection model and/or at least one error detection rule, to obtain at least one error detection result of the text to be processed.
The "error detection model" is used to perform error detection processing on input data of the error detection model; moreover, the embodiments of the present application do not limit the "error detection model," and may be implemented using any machine learning model (e.g., a language model), for example. In addition, the embodiment of the present application does not limit the construction process of the "error detection model", and may be implemented by any existing or future model construction method.
It should be noted that the network structures of different "error detection models" in the "at least one pre-constructed error detection model" may be different (for example, the network structures of the "at least one pre-constructed error detection model" may include Bidirectional Encoder Representations from Transformers (BERT), Neural Machine Translation (NMT), and the like), and/or the construction processes of different "error detection models" may also be different, so that there are also differences in error detection performance between different "error detection models".
The "error detection rule" refers to a rule to be followed when performing error detection processing (for example, regular expression matching) on a piece of text data. The embodiments of the present application do not limit the manner of acquiring the "error detection rule"; for example, it may be preset according to the application scenario.
In order to facilitate understanding of the above determination process of "at least one error detection result", the following description is made in conjunction with fig. 3.
As an example, as shown in fig. 3, the determination process of the "at least one error detection result" may include: performing error detection processing on the text to be processed using error detection module n to obtain the nth error detection result of the text to be processed. Error detection module n is used to perform error detection processing on a piece of text data; the embodiments of the present application do not limit the working principle of error detection module n. For example, it may perform error detection processing using a pre-constructed error detection model, or using an error detection rule. Here, n is a positive integer, n ≤ N, N is a positive integer, and N represents the number of error detection modules in fig. 3 (i.e., the number of error detection results in the above "at least one error detection result").
Based on the relevant content of the determination process of the error detection result, the error detection processing can be performed on the text to be processed respectively by means of a plurality of error detection models and/or a plurality of error detection rules which are constructed in advance, so that a plurality of error detection results of the text to be processed can be obtained, the error detection results can better show the position of at least one error character in the text to be processed, the accuracy of correcting the reference information is improved, and the text error correction effect is improved.
In fact, in order to improve the text error correction effect, the correction reference information of one text data can be determined according to the position of the protected character in the text data. Based on this, in one possible implementation, the above "correction reference information of the text to be processed" may include a protection character recognition result of the text to be processed.
The "recognition result of the protected characters of the text to be processed" is used to indicate the position of at least one protected character in the text to be processed; moreover, the embodiment of the present application does not limit the "protected character recognition result of the text to be processed", for example, it may include at least one protected character position.
In addition, the determination process of the "protected character recognition result of the text to be processed" is not limited in the embodiment of the present application, and for example, any existing or future method that can perform protected character recognition processing on one text data may be used.
In addition, in order to improve the accuracy of the protected character recognition result, the embodiments of the present application further provide another possible implementation of determining the "protected character recognition result", which may specifically include: performing protected character recognition processing on the text to be processed using at least one pre-constructed protection character recognition model and/or at least one protection character recognition rule, to obtain at least one protection character recognition result of the text to be processed.
The "protection character recognition model" is used to perform protected character recognition processing on the input data of the protection character recognition model (for example, named entities such as names, place names, organization names, and the like in the input data of the protection character recognition model can be located and recognized); moreover, the embodiment of the present application does not limit the "protected character recognition model", and for example, it may be implemented by using any machine learning model (e.g., a language model based on named entity recognition). In addition, the embodiment of the present application does not limit the construction process of the "protected character recognition model", and may be implemented by any existing or future model construction method.
It should be noted that, the network structures of different "protection character recognition models" in the "at least one pre-constructed protection character recognition model" may be different, and/or the construction processes of different "protection character recognition models" may also be different, so that there is a difference in the protected character recognition processing performance between different "protection character recognition models".
The "protection character recognition rule" refers to a rule to be followed when performing protected character recognition processing (for example, regular expression matching) on a piece of text data, so that the "protection character recognition rule" can protect certain special character information (for example, dates, comment content, user-defined special character information, and the like). In addition, the embodiments of the present application do not limit the manner of acquiring the "protection character recognition rule"; for example, it may be preset according to the application scenario.
In order to facilitate understanding of the determination process of the "at least one protected character recognition result" described above, the following description is made in conjunction with fig. 3.
As an example, as shown in fig. 3, the determination process of the "at least one protection character recognition result" may include: performing protected character recognition processing on the text to be processed using protection module q to obtain the qth protection character recognition result of the text to be processed. Protection module q is used to perform protected character recognition processing on a piece of text data; the embodiments of the present application do not limit the working principle of "protection module q". For example, it may perform protected character recognition processing using a pre-constructed protection character recognition model, or using a protection character recognition rule. Here, q is a positive integer, q ≤ Q, Q is a positive integer, and Q represents the number of protection modules in fig. 3 (i.e., the number of protection character recognition results in the above "at least one protection character recognition result").
Based on the relevant content of the determination process of the "protection character recognition result", protected character recognition processing can be respectively performed on the text to be processed by means of a plurality of pre-constructed protection character recognition models and/or a plurality of protection character recognition rules to obtain a plurality of protection character recognition results of the text to be processed, so that the protection character recognition results can better show the position of at least one protected character in the text to be processed, and therefore, the accuracy of correcting reference information is improved, and the text error correction effect is improved.
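As a hedged illustration of a rule-based protection module, the Python sketch below marks every character of a date as protected by regular expression matching. The ISO-style date pattern, the function name protected_positions, and the use of zero-based character indices are assumptions made for this example; the present application does not prescribe any particular rule.

```python
import re
from typing import Pattern, Set

# Hypothetical protection rule: dates written as YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def protected_positions(text: str, pattern: Pattern[str] = DATE_PATTERN) -> Set[int]:
    """Return the zero-based positions of all characters covered by a rule match."""
    positions: Set[int] = set()
    for match in pattern.finditer(text):
        positions.update(range(match.start(), match.end()))
    return positions


print(sorted(protected_positions("Due 2021-09-24 ok")))
```

A model-based protection module (for example, one built on named entity recognition) could return positions in the same form, so that the results of all Q protection modules are directly comparable.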
In fact, in order to better improve the text error correction effect, the position of the error character and the position of the protected character in one text data can be synthesized, and the correction reference information of the text data can be determined. Based on this, in one possible implementation, the "correction reference information of the text to be processed" may include an error detection result of the text to be processed and a protection character recognition result of the text to be processed.
It should be noted that, for the relevant content of the "error detection result", refer to the description of the "error detection result" above; for the relevant content of the "protected character recognition result", refer to the description of the "protected character recognition result" above.
Therefore, under some conditions, the position of the error character of one text data and the position of the protected character can be referred to at the same time, and the correction reference information of the text data is determined, so that the correction reference information can more accurately describe the reference information required to be used for carrying out the preset correction processing on the error correction result of the text to be processed, and the correction effect of the preset correction processing is improved, and the text error correction effect is improved.
In fact, to further improve the text error correction effect, different correction processing procedures may be employed when correcting the error correction result using different correction reference information. Based on this, the present application also provides five possible implementations of S2, which are described below separately.
In a first possible implementation, when the "correction reference information of the text to be processed" includes the error detection result of the text to be processed, S2 may specifically include: performing first correction processing on the error correction result using the error detection result and a first correction rule corresponding to the error detection result, to obtain the error correction result to be used.
Here, the "first correction rule" refers to the correction rule to be followed when the error correction result of a piece of text data is corrected using the error detection result of that text data (that is, when the first correction processing is performed). The embodiments of the present application do not limit the "first correction rule"; for example, it may be set in advance according to the application scenario.
The above-mentioned "first correction processing" is for performing correction processing on an error correction result of one text data in accordance with the above-mentioned "first correction rule".
In addition, the working principle of the "first correction processing" is not limited in the embodiment of the present application, and for the sake of easy understanding, a description will be given below taking a correction process of the first object to be corrected as an example. Wherein, the "first object to be corrected" refers to an object to be corrected involved in the above-mentioned "first correction processing", so that the "first object to be corrected" is used to indicate a text error correction result that needs to be subjected to the first correction processing; also, the embodiment of the present application is not limited to the "first object to be corrected", and may be, for example, the above "error correction result" or the "second error correction result" shown in step 72 below.
As an example, when the number of the above "error detection results of a text to be processed" is N and the number of first objects to be corrected is M, the above "first correction processing" may specifically include steps 11 to 13:
Step 11: voting on the kth modification suggestion in the mth first object to be corrected, using the N error detection results and at least one first object to be corrected other than the mth first object to be corrected among the M first objects to be corrected, to obtain a retained voting result of the kth modification suggestion. Here, m is a positive integer, m ≤ M; k is a positive integer, k ≤ K_m; K_m is a positive integer representing the number of modification suggestions in the mth first object to be corrected; N is a positive integer; and M is a positive integer.
Wherein, the "N error detection results" include: the 1st error detection result of the text to be processed, the 2nd error detection result of the text to be processed, ..., and the Nth error detection result of the text to be processed.
The "M first objects to be corrected" refer to M text error correction results that require first correction processing. The embodiments of the present application do not limit the "M first objects to be corrected"; for example, if steps 11 to 13 are used to perform the first correction processing on the D error correction results shown in fig. 3, the "M first objects to be corrected" may include the 1st error correction result of the text to be processed, the 2nd error correction result of the text to be processed, ..., and the Dth error correction result of the text to be processed.
In addition, the embodiment of the present application is not limited to the above-mentioned "at least one first object to be corrected other than the mth first object to be corrected" in the M first objects to be corrected, for example, it may specifically include: all the other first objects to be corrected except the mth first object to be corrected among the M first objects to be corrected.
The above-mentioned "retained voting result of the kth modification suggestion" is used to indicate the possibility that the kth modification suggestion in the mth first object to be corrected is retained: the larger the value of the retained voting result of the kth modification suggestion, the more likely the kth modification suggestion is to be retained; the smaller the value, the more likely the kth modification suggestion is to be deleted.
In addition, the embodiment of the present application does not limit the determination process of the "retained voting result of the kth modification suggestion", for example, when "at least one first object to be corrected other than the mth first object to be corrected" in the M first objects to be corrected includes F reference objects, the determination process of the "retained voting result of the kth modification suggestion" may specifically include steps 21 to 23:
Step 21: determining the number of votes cast by the nth error detection result for the kth modification suggestion according to the intersection between at least one error character position in the nth error detection result and at least one modified character position in the kth modification suggestion. Here, n is a positive integer, n ≤ N, and N is a positive integer.
As an example, when the above-mentioned "nth error detection result" is "error detection result 1" shown in fig. 4, the above-mentioned "mth first object to be corrected" is "first object to be corrected 1" shown in fig. 4, and the above-mentioned "kth modification suggestion in the mth first object to be corrected" includes "modify C_5 to C_10", the error character positions in "error detection result 1" shown in fig. 4 include the position of the character "C_5". The intersection between at least one error character position in the "nth error detection result" and at least one modified character position in the kth modification suggestion therefore contains one element, so it can be determined that the number of votes cast by the nth error detection result for the kth modification suggestion is 1.
In fig. 4, "C_x" is used to represent one piece of character information. The embodiments of the present application do not limit C_x; for example, C_x may be the smallest semantic unit in a certain language (e.g., a Chinese character or an English word). Here, x is a positive integer. In addition, any two of C_1 to C_8 in fig. 4 may be the same or different; C_9 is different from C_4; C_10 is different from C_5; and C_11 is different from C_8. In addition, the embodiments of the present application do not limit C_1 to C_11 in fig. 4; for example, C_1 is "this", C_2 is "yes", C_3 is "one", C_4 is "one", C_5 is "rolled", C_6 is "shown", C_7 is "sample", C_8 is "calendar", C_9 is "sheet", C_10 is "exhibition", and C_11 is "example".
Based on the above-mentioned related content of step 21, when the nth error detection result is used to vote on the kth modification suggestion, the number of votes cast by the nth error detection result for the kth modification suggestion may be determined based on the intersection between at least one error character position in the nth error detection result and at least one modified character position in the kth modification suggestion (for example, the number of elements in this intersection may be directly determined as the number of votes cast by the nth error detection result for the kth modification suggestion), so that the "number of votes cast by the nth error detection result for the kth modification suggestion" can represent the degree to which the nth error detection result agrees with the kth modification suggestion. Here, n is a positive integer, n ≤ N, and N is a positive integer.
Step 22: determining the number of votes cast by the fth reference object for the kth modification suggestion according to the intersection between at least one modified character position in the fth reference object and at least one modified character position in the kth modification suggestion. Here, f is a positive integer, f ≤ F, and F is a positive integer.
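As a minimal sketch of step 21, assuming that character positions are represented as zero-based integer indices held in Python sets (a representation chosen here purely for illustration), the number of votes can be taken directly as the size of the intersection. The function name votes_from_detection is hypothetical.

```python
from typing import Set


def votes_from_detection(error_positions: Set[int], modified_positions: Set[int]) -> int:
    """Votes cast by one error detection result for one modification suggestion:
    the number of positions that are both flagged as erroneous and modified."""
    return len(error_positions & modified_positions)


# Assumed toy data: the detection result flags index 4 (the position of C_5),
# and the suggestion "modify C_5 to C_10" modifies the same index.
print(votes_from_detection({4}, {4}))  # 1
print(votes_from_detection({7}, {4}))  # 0
```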
Here, the "reference object" is used to indicate any one of the M first objects to be corrected other than the mth first object to be corrected.
To facilitate understanding of step 22, the following description is made with reference to an example.
As an example, when the above-mentioned "fth reference object" is the "first object to be corrected 2" shown in fig. 4, the above-mentioned "mth first object to be corrected" is the "first object to be corrected 1" shown in fig. 4, and the above-mentioned "kth modification suggestion in the mth first object to be corrected" includes "modify C_5 to C_10", the modified character positions in the "first object to be corrected 2" shown in fig. 4 do not include the position of the character "C_5". The intersection between at least one modified character position in the "fth reference object" and at least one modified character position in the kth modification suggestion is therefore an empty set, so it can be determined that the number of votes cast by the fth reference object for the kth modification suggestion is 0.
Based on the above-mentioned related content of step 22, when the fth reference object is used to vote on the kth modification suggestion, the number of votes cast by the fth reference object for the kth modification suggestion may be determined according to the intersection between at least one modified character position in the fth reference object and at least one modified character position in the kth modification suggestion (for example, the number of elements in this intersection may be directly determined as the number of votes cast by the fth reference object for the kth modification suggestion), so that the "number of votes cast by the fth reference object for the kth modification suggestion" can indicate the degree to which the fth reference object agrees with the kth modification suggestion. Here, f is a positive integer, f ≤ F, and F is a positive integer.
Step 23: performing first statistical analysis processing on the numbers of votes cast by the N error detection results for the kth modification suggestion and the numbers of votes cast by the F reference objects for the kth modification suggestion, to obtain the retained voting result of the kth modification suggestion.
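Step 22 can be sketched in the same spirit. In the hedged Python example below, a reference object is assumed to be a list of modification suggestions, each reduced to the set of zero-based character positions it modifies; this data layout and the function name votes_from_reference are illustrative assumptions only.

```python
from typing import Iterable, Set


def votes_from_reference(reference_object: Iterable[Set[int]], suggestion_positions: Set[int]) -> int:
    """Votes cast by one reference object for one modification suggestion."""
    # Collect every character position modified by any suggestion in the reference object.
    reference_positions: Set[int] = set()
    for positions in reference_object:
        reference_positions |= positions
    # The vote count is the number of modified positions shared with the suggestion.
    return len(reference_positions & suggestion_positions)


# Assumed toy data: the reference object only modifies index 7, while the
# suggestion under vote modifies index 4, so the intersection is empty.
print(votes_from_reference([{7}], {4}))  # 0
```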
Wherein, the "first statistical analysis processing" may be preset; also, the embodiment of the present application does not limit the "first statistical analysis process", and may be, for example, an addition process. For another example, it may be an averaging process, a maximum value process, or the like.
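The choice of "first statistical analysis processing" can be illustrated with a short Python sketch that supports the three options named above (addition, averaging, and maximum). The function name retained_voting_result, the AGGREGATORS table, and the string keys are hypothetical, and the sketch assumes at least one vote count is supplied.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# Hypothetical lookup table for the statistical processing options named in the text.
AGGREGATORS: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": sum,
    "mean": mean,
    "max": max,
}


def retained_voting_result(detection_votes: Sequence[int],
                           reference_votes: Sequence[int],
                           how: str = "sum") -> float:
    """Combine the N detection vote counts and the F reference vote counts."""
    all_votes = list(detection_votes) + list(reference_votes)
    return AGGREGATORS[how](all_votes)


print(retained_voting_result([1], [0]))          # 1
print(retained_voting_result([1], [0], "mean"))  # 0.5
print(retained_voting_result([1], [0], "max"))   # 1
```

Note that the scale of the result depends on the aggregation chosen, so any threshold it is compared against would need to be chosen on the same scale.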
Based on the related content of the foregoing step 11, in the embodiments of the present application, the error character positions in the N error detection results and the modified character positions in at least one first object to be corrected other than the mth first object to be corrected among the M first objects to be corrected may be used to vote on the modified character positions in the kth modification suggestion in the mth first object to be corrected, to obtain the retained voting result of the kth modification suggestion. The "retained voting result of the kth modification suggestion" can thus indicate the degree to which the "N error detection results" and the "at least one first object to be corrected other than the mth first object to be corrected" agree with the kth modification suggestion, and can therefore more accurately represent the possibility that the kth modification suggestion is retained. Here, m is a positive integer, m ≤ M; k is a positive integer, k ≤ K_m; K_m is a positive integer representing the number of modification suggestions in the mth first object to be corrected; N is a positive integer; and M is a positive integer.
Step 12: if the retained voting result of the kth modification suggestion in the mth first object to be corrected does not meet the first condition, deleting the kth modification suggestion from the mth first object to be corrected. Here, m is a positive integer, m ≤ M; k is a positive integer, k ≤ K_m; and K_m is a positive integer.
Wherein, the "first condition" may be preset. In addition, the embodiments of the present application do not limit the "first condition"; for example, it may specifically be: reaching the retained vote count threshold (e.g., the retained vote count threshold used in fig. 4 is 1).
The "retained vote count threshold" may be set in advance. In addition, to make the "retained vote count threshold" more flexible, the embodiments of the present application further provide another possible way of determining it, which may specifically include: first, adding the number of error detection results (for example, N) and the number of first objects to be corrected in the "at least one first object to be corrected other than the mth first object to be corrected among the M first objects to be corrected" (for example, M-1) to obtain a number sum; then, determining the retained vote count threshold according to the product of the number sum and a first coefficient (for example, the product may be directly determined as the retained vote count threshold). Here, the "first coefficient" may be set in advance (for example, the "first coefficient" may be 0.5).
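The threshold computation described above amounts to a single expression. The following Python sketch assumes that all other M-1 first objects to be corrected serve as reference objects and that the product is used directly as the threshold; the function and parameter names are hypothetical.

```python
def retained_vote_threshold(num_detection_results: int,
                            num_objects: int,
                            first_coefficient: float = 0.5) -> float:
    """Threshold = first coefficient * (N + (M - 1)).

    Assumes every first object to be corrected other than the m-th one is used
    as a reference object, so the number of reference objects is M - 1.
    """
    number_sum = num_detection_results + (num_objects - 1)
    return first_coefficient * number_sum


# N = 1 error detection result, M = 2 first objects to be corrected.
print(retained_vote_threshold(1, 2))  # 1.0
```

Scaling the threshold with N and M means a suggestion needs proportionally more supporting votes as more detectors and correctors take part.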
Based on the related content in step 12, after the retained voting result of the kth modification suggestion in the mth first object to be corrected is obtained, it may be determined whether the retained voting result of the kth modification suggestion reaches the retained vote count threshold; if not, it may be determined that the retained voting result of the kth modification suggestion does not satisfy the first condition, so the kth modification suggestion can be directly deleted from the mth first object to be corrected, so that the kth modification suggestion no longer exists in the mth first object to be corrected (for example, one of the modification suggestions recorded in "first object to be corrected 1" of fig. 4). Wherein m is a positive integer, m is less than or equal to M, and M is a positive integer; k is a positive integer, k is less than or equal to K_m, K_m is a positive integer, and K_m represents the number of modification suggestions in the mth first object to be corrected; N is a positive integer.
Step 13: if the retained voting result of the kth modification suggestion in the mth first object to be corrected meets the first condition, retaining the kth modification suggestion in the mth first object to be corrected; wherein m is a positive integer, m is less than or equal to M, and M is a positive integer; k is a positive integer, k is less than or equal to K_m, and K_m is a positive integer.
In this embodiment of the present application, after the retained voting result of the kth modification suggestion in the mth first object to be corrected is obtained, it may be determined whether the retained voting result of the kth modification suggestion reaches the retained vote count threshold; if so, it may be determined that the retained voting result of the kth modification suggestion meets the first condition, so the kth modification suggestion in the mth first object to be corrected may be directly retained, so that the kth modification suggestion continues to exist in the mth first object to be corrected (for example, the modification suggestion "modify C_5 to C_10" recorded in "first object to be corrected 1" of fig. 4). Wherein m is a positive integer, m is less than or equal to M, and M is a positive integer; k is a positive integer, k is less than or equal to K_m, K_m is a positive integer, and K_m represents the number of modification suggestions in the mth first object to be corrected; N is a positive integer.
Based on the related contents of the above steps 11 to 13, for the first correction processing, the voting result of each error detection result on the kth modification suggestion in the mth first object to be corrected, and the voting result of each first object to be corrected other than the mth among the M first objects to be corrected on that kth modification suggestion, may first be integrated to determine the retained voting result of the kth modification suggestion, so that the "retained voting result of the kth modification suggestion" can indicate the possibility that the kth modification suggestion is retained; then, whether the kth modification suggestion in the mth first object to be corrected is retained is judged according to the relative size between the retained voting result of the kth modification suggestion and the retained vote count threshold, to obtain a retention judgment result of the kth modification suggestion; and finally, the corrected mth first object to be corrected is determined according to the retention judgment results of all the modification suggestions in the mth first object to be corrected, so that the retained voting results of all the modification suggestions in the corrected mth first object to be corrected reach the retained vote count threshold. Wherein m is a positive integer, m is less than or equal to M, and M is a positive integer; k is a positive integer, k is less than or equal to K_m, K_m is a positive integer, and K_m represents the number of modification suggestions in the mth first object to be corrected; N is a positive integer.
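For ease of understanding, steps 11 to 13 can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the implementation of the application: the `Suggestion` class and the function names are hypothetical, each voting source is assumed to cast one vote when its positions overlap the modified positions of the suggestion, and the votes are assumed to be combined by addition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    """One modification suggestion (hypothetical representation)."""
    positions: frozenset   # character positions the suggestion modifies
    replacement: str       # text proposed for those positions


def retained_votes(suggestion, error_detections, other_objects):
    """Step 11 (assumed form): one vote per source that overlaps the suggestion."""
    votes = sum(1 for flagged in error_detections
                if suggestion.positions & flagged)
    votes += sum(1 for obj in other_objects
                 if any(suggestion.positions & s.positions for s in obj))
    return votes


def first_correction(objects, error_detections, threshold):
    """Steps 12 and 13: keep a suggestion only if its votes reach the threshold."""
    corrected = []
    for m, obj in enumerate(objects):
        others = objects[:m] + objects[m + 1:]   # every object except the mth
        corrected.append([s for s in obj
                          if retained_votes(s, error_detections, others) >= threshold])
    return corrected


objects = [
    [Suggestion(frozenset({2}), "A"), Suggestion(frozenset({7}), "B")],
    [Suggestion(frozenset({2}), "A")],
]
error_detections = [{2}, {2, 5}]   # N = 2 sets of error character positions
result = first_correction(objects, error_detections, threshold=1)
print([[s.replacement for s in obj] for obj in result])  # [['A'], ['A']]
```

In this example the suggestion at position 7 receives no votes and is deleted, while the suggestions at position 2 are supported by both error detection results and by the other object, so they are retained.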
It should be noted that the first possible embodiment of S2 may be implemented by the first correction processing shown in steps 11 to 13, and it is only necessary to replace the "first object to be corrected" in the first correction processing shown in steps 11 to 13 with the "error correction result", replace the "m" with the "d", and replace the "M" with the "D".
Based on the related contents of the first possible implementation manner of S2, when the "reference information for correction of the text to be processed" includes the error detection result of the text to be processed, the first correction processing may be performed on the error correction result by using the error detection result and the first correction rule corresponding to the error detection result, so as to obtain a corrected error correction result, and the corrected error correction result may be determined as the error correction result to be used, so that the error correction result to be used can more accurately indicate the modification suggestion of at least one error character in the text to be processed.
In a second possible implementation, when the "correction reference information of the text to be processed" includes the protection character recognition result of the text to be processed, S2 may specifically include: performing second correction processing on the error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result, to obtain the error correction result to be used.
Wherein the "second correction rule" refers to the correction rule to be followed when the error correction result of one text data is corrected by using the protection character recognition result of that text data (that is, when the second correction processing is executed); the embodiment of the present application does not limit the "second correction rule", which may, for example, be set in advance according to the application scenario.
The "second correction processing" is for performing correction processing on an error correction result of one text data in accordance with the "second correction rule".
In addition, the working principle of the "second correction processing" is not limited in the embodiment of the present application, and for the sake of easy understanding, a correction process of the second object to be corrected is explained below as an example. The "second object to be corrected" refers to the object to be corrected related to the "second correction processing" so that the "second object to be corrected" is used to indicate a text error correction result that needs to be subjected to the second correction processing; further, the embodiment of the present application does not limit the "second object to be corrected", and for example, it may be the above "error correction result", or may be the "first error correction result" shown in step 52 below.
As an example, when the number of the "protected character recognition results of the text to be processed" is Q, the "second correction processing" may specifically include steps 31 to 33:
Step 31: voting on the r-th modification suggestion in the second object to be corrected by using the Q protection character recognition results, to obtain a deletion voting result of the r-th modification suggestion. Wherein r is a positive integer, r is less than or equal to R, R is a positive integer, and R represents the number of modification suggestions in the second object to be corrected.
Wherein the "Q protection character recognition results" include: the 1st protection character recognition result of the text to be processed, the 2nd protection character recognition result of the text to be processed, ……, and the Q-th protection character recognition result of the text to be processed.
The above-mentioned "deletion voting result of the r-th modification suggestion" is used to indicate the possibility that the r-th modification suggestion in the second object to be corrected is deleted; and if the numerical value of the deletion voting result of the r-th modification suggestion is larger, the probability that the r-th modification suggestion is deleted is higher; if the numerical value of "deletion voting result of the r-th modification suggestion" is smaller, it indicates that the r-th modification suggestion is more likely to be retained.
In addition, the embodiment of the present application does not limit the determination process of the "deletion voting result of the r-th modification suggestion", for example, it may specifically include steps 41 to 42:
Step 41: determining the number of votes of the q-th protection character recognition result for the r-th modification suggestion according to the intersection of at least one protected character position in the q-th protection character recognition result and at least one modified character position in the r-th modification suggestion. Wherein q is a positive integer, q is less than or equal to Q, and Q is a positive integer.
In the embodiment of the application, when the r-th modification suggestion is voted on by using the q-th protection character recognition result, the number of votes of the q-th protection character recognition result for the r-th modification suggestion may be determined based on the intersection between at least one protected character position in the q-th protection character recognition result and at least one modified character position in the r-th modification suggestion (e.g., the number of elements in that intersection may be directly determined as the number of votes of the q-th protection character recognition result for the r-th modification suggestion), so that the "number of votes of the q-th protection character recognition result for the r-th modification suggestion" can indicate the degree of objection of the q-th protection character recognition result to the r-th modification suggestion. Wherein q is a positive integer, q is less than or equal to Q, and Q is a positive integer.
Step 42: performing second statistical analysis processing on the numbers of votes of the Q protection character recognition results for the r-th modification suggestion, to obtain the deletion voting result of the r-th modification suggestion.
Wherein, the "second statistical analysis processing" may be set in advance; also, the embodiment of the present application does not limit the "second statistical analysis process", and may be, for example, an addition process. For another example, it may be an averaging process, a maximum value process, or the like.
Based on the related contents of the above steps 41 to 42, in the embodiment of the present application, the deletion voting result of the r-th modification suggestion may be determined by referring to the intersection between the protected character position in each protection character recognition result and the modified character position in the r-th modification suggestion, so that the "deletion voting result of the r-th modification suggestion" can indicate the degree of objection of the above-mentioned "Q protection character recognition results" to the r-th modification suggestion, and thus the "deletion voting result of the r-th modification suggestion" can indicate the possibility of deletion of the r-th modification suggestion.
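Steps 41 to 42 can be sketched as follows. The sketch assumes that protected and modified character positions are represented as sets of integer indices, that the size of the intersection is used directly as the number of votes, and that addition is the default "second statistical analysis processing"; the function names are hypothetical.

```python
def votes_from_result(protected_positions, modified_positions):
    """Step 41: count positions that are both protected and modified."""
    return len(set(protected_positions) & set(modified_positions))


def deletion_vote_result(protection_results, modified_positions, combine=sum):
    """Step 42: combine the Q per-result vote counts (addition by default)."""
    counts = [votes_from_result(p, modified_positions)
              for p in protection_results]
    return combine(counts)


protection_results = [{3, 4}, {4}, set()]   # Q = 3 recognition results
modified_positions = {4, 5}                 # positions touched by the r-th suggestion
print(deletion_vote_result(protection_results, modified_positions))       # 2
print(deletion_vote_result(protection_results, modified_positions, max))  # 1
```

Passing `max` instead of `sum` corresponds to the maximum value processing mentioned above as an alternative to addition.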
Based on the related content in step 31, if the number of the "second objects to be corrected" is J, for the j-th second object to be corrected, the Q protection character recognition results may be used to vote on the r-th modification suggestion in the j-th second object to be corrected, so as to obtain the deletion voting result of the r-th modification suggestion. Wherein r is a positive integer, r is less than or equal to R_j, R_j is a positive integer, and R_j represents the number of modification suggestions in the j-th second object to be corrected; j is a positive integer, j is less than or equal to J, and J is a positive integer.
It should be noted that, if steps 31 to 33 are used to perform the second correction processing on the D error correction results shown in fig. 3, the J second objects to be corrected may include the 1st error correction result of the text to be processed, the 2nd error correction result of the text to be processed, ……, and the D-th error correction result of the text to be processed.
Step 32: if the deletion voting result of the r-th modification suggestion in the second object to be corrected meets a second condition, deleting the r-th modification suggestion from the second object to be corrected. Wherein r is a positive integer, r is less than or equal to R, R is a positive integer, and R represents the number of modification suggestions in the second object to be corrected.
Wherein, the "second condition" may be preset; in addition, the embodiment of the present application does not limit the "second condition", and for example, it may specifically be: the threshold value of the number of deletion votes set in advance is reached (for example, 1).
Based on the related content in step 32, if the number of the "second objects to be corrected" is J, for the j-th second object to be corrected, after the deletion voting result of the r-th modification suggestion in the j-th second object to be corrected is obtained, it may be determined whether the deletion voting result of the r-th modification suggestion reaches the preset deletion vote count threshold; if so, it can be determined that the deletion voting result of the r-th modification suggestion meets the second condition, so that it can be determined that the modified position in the r-th modification suggestion should be protected. Therefore, the r-th modification suggestion can be directly deleted from the j-th second object to be corrected, so that the r-th modification suggestion no longer exists in the j-th second object to be corrected, and the purpose of protecting the modified character information related to the r-th modification suggestion can be achieved. Wherein r is a positive integer, r is less than or equal to R_j, R_j is a positive integer, and R_j represents the number of modification suggestions in the j-th second object to be corrected; j is a positive integer, j is less than or equal to J, and J is a positive integer.
Note that the embodiment of the present application does not limit the above "character information", and may be a word, a symbol, or the like.
Step 33: if the deletion voting result of the r-th modification suggestion in the second object to be corrected does not meet the second condition, retaining the r-th modification suggestion in the second object to be corrected. Wherein r is a positive integer, r is less than or equal to R, R is a positive integer, and R represents the number of modification suggestions in the second object to be corrected.
In this embodiment of the present application, if the number of the "second objects to be corrected" is J, for the j-th second object to be corrected, after the deletion voting result of the r-th modification suggestion in the j-th second object to be corrected is obtained, it may be determined whether the deletion voting result of the r-th modification suggestion reaches the preset deletion vote count threshold; if not, it can be determined that the deletion voting result of the r-th modification suggestion does not satisfy the second condition, so that it can be determined that the modified position in the r-th modification suggestion does not need to be protected, and the r-th modification suggestion can be retained in the j-th second object to be corrected, so that the r-th modification suggestion continues to exist in the j-th second object to be corrected. Wherein r is a positive integer, r is less than or equal to R_j, R_j is a positive integer, and R_j represents the number of modification suggestions in the j-th second object to be corrected; j is a positive integer, j is less than or equal to J, and J is a positive integer.
Based on the related contents in the above steps 31 to 33, if the number of the above "second objects to be corrected" is J, for the second correction processing, the votes of the respective protection character recognition results for the r-th modification suggestion in the j-th second object to be corrected may first be integrated to determine the deletion voting result of the r-th modification suggestion, so that the "deletion voting result of the r-th modification suggestion" can indicate the possibility that the r-th modification suggestion is deleted; then, whether to delete the r-th modification suggestion from the j-th second object to be corrected is determined according to the relative size between the deletion voting result of the r-th modification suggestion and the preset deletion vote count threshold, so as to obtain a deletion judgment result of the r-th modification suggestion; and finally, the corrected j-th second object to be corrected is determined according to the deletion judgment results of all the modification suggestions in the j-th second object to be corrected, so that the deletion voting result of each modification suggestion in the corrected j-th second object to be corrected is lower than the deletion vote count threshold. Wherein r is a positive integer, r is less than or equal to R_j, R_j is a positive integer, and R_j represents the number of modification suggestions in the j-th second object to be corrected; j is a positive integer, j is less than or equal to J, and J is a positive integer.
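As a rough illustration, steps 31 to 33 for a single second object to be corrected can be sketched in Python as follows. The representation of a modification suggestion as a `(modified_positions, replacement)` pair, the function name, and the use of addition with a deletion vote count threshold of 1 are assumptions made for this example.

```python
def second_correction(suggestions, protection_results, delete_threshold=1):
    """Drop suggestions whose deletion votes reach the threshold.

    `suggestions` is a list of (modified_positions, replacement) pairs and
    `protection_results` is a list of sets of protected character positions.
    """
    kept = []
    for positions, replacement in suggestions:
        # Step 31: add up the overlap with every protection result.
        votes = sum(len(set(positions) & set(p)) for p in protection_results)
        # Step 32 deletes when the threshold is reached; step 33 retains otherwise.
        if votes < delete_threshold:
            kept.append((positions, replacement))
    return kept


suggestions = [({1, 2}, "AB"), ({6}, "C")]
protection_results = [{2, 3}, {9}]          # Q = 2
print(second_correction(suggestions, protection_results))  # [({6}, 'C')]
```

The first suggestion touches protected position 2 and is therefore deleted, while the second suggestion touches no protected position and is retained.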
In fact, when it is determined that a certain modification character (e.g., "pass") in one modification suggestion needs to be protected, then words (e.g., "pass" etc.) that include the modification character in other modification suggestions also need to be protected. Based on this, in order to improve the correction effect (for example, correction efficiency and correction accuracy) of the second correction processing, the present embodiment provides another possible implementation manner of the above "second correction processing", and in this implementation manner, the "second correction processing" may further include, in addition to the above steps 31 to 33, steps 34 to 36:
Step 34: if the deletion voting result of the r-th modification suggestion in the second object to be corrected meets the second condition, determining a filtering condition to be used according to the r-th modification suggestion.
Wherein the "to-be-used filtering condition" is used to filter the modified character information related to the modified character information in the above "r-th modification suggestion".
In addition, the determination process of the "to-be-used filtering condition" is not limited in the embodiment of the present application, for example, when the r-th modification suggestion includes modifying the first character information into the second character information, the "to-be-used filtering condition" may be determined according to the first character information, so that the "to-be-used filtering condition" may be used to filter the modified character information related to the first character information, thereby enabling protection of the modified character information related to the first character information in any modification suggestion except for the "r-th modification suggestion" to be subsequently implemented based on the "to-be-used filtering condition".
Step 35: searching the second object to be corrected for target modification suggestions that meet the filtering condition to be used, to obtain a search result.
Wherein, the "target modification suggestion" refers to a modification suggestion satisfying the filtering condition to be used in the second object to be corrected; furthermore, the embodiment of the present application does not limit the search range of the above-mentioned "target modification suggestion", and for example, the search may be performed only from the second objects to be corrected including the above-mentioned "r-th modification suggestion", or may be performed from all the second objects to be corrected (for example, the above-mentioned "J second objects to be corrected").
Step 36: if the search result shows that at least one target modification suggestion exists in the second object to be corrected, deleting the at least one target modification suggestion from the second object to be corrected.
In the embodiment of the present application, after the search result is obtained, it may be determined whether the search result indicates that at least one target modification suggestion exists in the second object to be corrected. If so, it may be determined that the modified character information related to these target modification suggestions is related to the modified character in the "r-th modification suggestion", and therefore also needs to be protected. The target modification suggestions may thus be directly deleted from the second object to be corrected, so that they no longer exist in the second object to be corrected, and the purpose of protecting the modified character information related to the target modification suggestions may be achieved.
As can be seen from the related contents of the above steps 34 to 36, for the second correction processing, when it is determined that the modified character information related to a modification suggestion needs to be protected, a to-be-used filtering condition may be determined based on the modified character information, so that the to-be-used filtering condition is used for filtering the modified character information related to the modified character information; and then, other modification suggestions related to the modified character information are also protected by utilizing the screening condition to be used, so that the correction effect of the second correction processing is favorably improved.
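Steps 34 to 36 can be sketched as below. The application does not fix the form of the filtering condition, so this example assumes one possible choice: a modification suggestion is a target when the text it proposes to modify contains the first character information of the protected suggestion. The `(original_text, replacement_text)` representation and the function name are hypothetical.

```python
def delete_related_suggestions(suggestions, first_info):
    """Find and delete target suggestions related to `first_info`.

    `suggestions` is a list of (original_text, replacement_text) pairs.
    Assumed filtering condition (step 34): the original text contains
    `first_info`, the text that the protected suggestion wanted to modify.
    """
    # Step 35: search for the target modification suggestions.
    targets = [s for s in suggestions if first_info in s[0]]
    # Step 36: delete the targets, keeping every other suggestion.
    remaining = [s for s in suggestions if first_info not in s[0]]
    return remaining, targets


suggestions = [("cat", "cot"), ("cats", "cuts"), ("dog", "dig")]
remaining, targets = delete_related_suggestions(suggestions, "cat")
print(remaining)  # [('dog', 'dig')]
print(targets)    # [('cat', 'cot'), ('cats', 'cuts')]
```

Under this assumed condition the protected suggestion itself also matches, which is harmless because step 32 already deletes it.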
It should be noted that the second possible implementation manner of S2 may be implemented by the second correction processing shown in the above-mentioned step 31 to step 33 (or the second correction processing shown in the above-mentioned step 31 to step 36), and it is only necessary to replace the "second object to be corrected" in the second correction processing shown in the above-mentioned step 31 to step 33 (or the second correction processing shown in the above-mentioned step 31 to step 36) with the "error correction result".
Based on the related contents of the second possible implementation manner of S2, when the "correction reference information of the text to be processed" includes the result of recognizing the protection character of the text to be processed, the second correction processing may be performed on the result of error correction by using the result of recognition of the protection character and the second correction rule corresponding to the result of recognition of the protection character, so as to obtain a corrected result of error correction, and the corrected result of error correction may be determined as the result of error correction to be used, so that the result of error correction to be used can more accurately represent the suggestion of modification of at least one error character in the text to be processed.
In a third possible implementation manner, when the "correction reference information of the text to be processed" includes the error detection result of the text to be processed and the protection character recognition result of the text to be processed, S2 may specifically include steps 51 to 52:
Step 51: performing first correction processing on the error correction result of the text to be processed by using the error detection result of the text to be processed and a first correction rule corresponding to the error detection result, to obtain a first error correction result.
The "first error correction result" is a corrected text error correction result obtained by performing the first correction process on one of the "error correction results".
It should be noted that step 51 may be implemented by any one of the embodiments shown in the first possible implementation manner of S2, and only the "error correction result to be used" in any one of the embodiments shown in the first possible implementation manner of S2 needs to be replaced by the "first error correction result". For example, step 51 may be implemented by the first correction processing shown in steps 11 to 13, and it is sufficient to replace "the first object to be corrected" in the first correction processing shown in steps 11 to 13 with "the error correction result", "m" with "d", and "M" with "D".
Step 52: performing second correction processing on the first error correction result by using the protection character recognition result of the text to be processed and a second correction rule corresponding to the protection character recognition result, to obtain the error correction result to be used.
It should be noted that step 52 may be implemented by any of the embodiments shown in the second possible embodiment of S2, and only the "error correction result" in any of the embodiments shown in the second possible embodiment of S2 needs to be replaced with the "first error correction result". For example, the step 52 may be implemented by the second correction processing shown in the above steps 31 to 33 (or the second correction processing shown in the above steps 31 to 36), and it is sufficient to replace the "second object to be corrected" in the second correction processing shown in the above steps 31 to 33 (or the second correction processing shown in the above steps 31 to 36) with the "first error correction result".
Based on the related contents of the above steps 51 to 52, as shown in fig. 3, when the number of the above "error correction results" is D, the number of the above "error detection results" is N, and the number of the above "protected character recognition results" is Q, N error detection results and D error correction results may be synthesized first, and the D error correction results are subjected to the first correction processing to obtain B first error correction results, so that the modification suggestions in the B first error correction results are only partial modification suggestions in the D error correction results, so that the number of modification suggestions in which errors occur in the B first error correction results is less than the number of modification suggestions in which errors occur in the D error correction results; and then referring to the Q protective character recognition results, performing second correction processing on each first error correction result to obtain at least one error correction result to be used, so that the modification suggestions in the error correction results to be used are only partial modification suggestions in the B first error correction results, and the number of the modification suggestions with errors in the error correction results to be used is less than that of the modification suggestions with errors in the B first error correction results, so that the possibility that the error modification suggestions exist in the error correction results to be used can be effectively reduced, and the text error correction effect is favorably improved. Wherein B is a positive integer, and B is not more than D.
In a fourth possible implementation, when the "correction reference information of the text to be processed" includes the error detection result of the text to be processed and the protection character recognition result of the text to be processed, S2 may specifically include steps 61 to 63:
Step 61: performing first correction processing on the error correction result of the text to be processed by using the error detection result of the text to be processed and a first correction rule corresponding to the error detection result, to obtain a first error correction result.
It should be noted that the relevant content of step 61 refers to the relevant content of step 51 above.
Step 62: performing second correction processing on the error correction result of the text to be processed by using the protection character recognition result of the text to be processed and a second correction rule corresponding to the protection character recognition result, to obtain a second error correction result.
The "second error correction result" is a corrected text error correction result obtained by performing the second correction process on one of the "error correction results".
It should be noted that, step 62 may be implemented by any one of the embodiments shown in the second possible implementation manner of S2, and only the "error correction result to be used" in any one of the embodiments shown in the second possible implementation manner of S2 needs to be replaced by the "second error correction result". For example, the step 62 may be implemented by the second correction processing shown in the above steps 31 to 33 (or the second correction processing shown in the above steps 31 to 36), and it is only necessary to replace the "second object to be corrected" in the second correction processing shown in the above steps 31 to 33 (or the second correction processing shown in the above steps 31 to 36) with the "error correction result".
Step 63: determining the error correction result to be used according to the first error correction result and the second error correction result.
As an example, when the d-th first error correction result is obtained by performing the first correction processing on the d-th error correction result, and the d-th second error correction result is obtained by performing the second correction processing on the d-th error correction result, the intersection between the d-th first error correction result and the d-th second error correction result may be determined as the d-th error correction result to be used. Wherein d is a positive integer, d is less than or equal to D, D is a positive integer, and D represents the number of error correction results.
Based on the related contents of the above steps 61 to 63, when the number of the above "error correction results" is D, the number of the above "error detection results" is N, and the number of the above "protection character recognition results" is Q, the N error detection results and the D error correction results may first be combined to perform the first correction processing on the D error correction results, yielding B first error correction results. The modification suggestions in the B first error correction results are only a subset of those in the D error correction results, so the B first error correction results contain fewer erroneous modification suggestions than the D error correction results. Likewise, the second correction processing may be performed on each error correction result with reference to the Q protection character recognition results, yielding E second error correction results. The modification suggestions in the E second error correction results are also only a subset of those in the D error correction results, so the E second error correction results contain fewer erroneous modification suggestions than the D error correction results.
Then, at least one error correction result to be used is determined according to the B first error correction results and the E second error correction results. The modification suggestions in the error correction results to be used are only a subset of those in the B first error correction results (and in the E second error correction results), so the error correction results to be used contain fewer erroneous modification suggestions than the B first error correction results (and the E second error correction results). This effectively reduces the possibility that erroneous modification suggestions exist in the error correction results to be used, which helps improve the text error correction effect. Here, E is a positive integer and E is not more than D; B is a positive integer and B is not more than D.
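The intersection described in step 63 can be sketched as follows. This is a minimal illustration only: the `Suggestion` record (position, original, replacement) and the function name `intersect_results` are assumptions made for the example, not structures defined by the application.

```python
from typing import List, NamedTuple


class Suggestion(NamedTuple):
    """Hypothetical record for one modification suggestion."""
    position: int      # start index of the character information in the text
    original: str      # character information before modification
    replacement: str   # character information after modification


def intersect_results(first: List[Suggestion],
                      second: List[Suggestion]) -> List[Suggestion]:
    """Keep only suggestions that appear in both corrected results.

    The order of the first error correction result is preserved.
    """
    second_set = set(second)
    return [s for s in first if s in second_set]


# d-th first and second error correction results (illustrative data).
first_d = [Suggestion(2, "teh", "the"), Suggestion(7, "recieve", "receive")]
second_d = [Suggestion(7, "recieve", "receive"), Suggestion(11, "Zitiao", "Zitao")]

# d-th error correction result to be used.
to_use_d = intersect_results(first_d, second_d)
```

Only the suggestion that survived both correction passes remains in `to_use_d`.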
In a fifth possible implementation manner, when the "correction reference information of the text to be processed" includes the error detection result of the text to be processed and the protection character recognition result of the text to be processed, S2 may specifically include steps 71 to 72:
Step 71: perform second correction processing on the error correction result of the text to be processed by using the protection character recognition result of the text to be processed and a second correction rule corresponding to the protection character recognition result, to obtain a second error correction result.
It should be noted that, for the relevant content of step 71, refer to the relevant content of step 62 above.
Step 72: perform first correction processing on the second error correction result by using the error detection result of the text to be processed and a first correction rule corresponding to the error detection result, to obtain an error correction result to be used.
Step 72 may be implemented by any of the embodiments shown in the first possible embodiment of S2, provided that the "error correction result" in that embodiment is replaced with the "second error correction result". For example, when the number of the "second error correction results" is E, step 72 may be implemented by the first correction processing shown in steps 11 to 13, provided that the "first object to be corrected" in the first correction processing shown in steps 11 to 13 is replaced with the "second error correction result", "M" is replaced with "E", and "m" is replaced with "e". Here, E is a positive integer, and E is less than or equal to D.
Based on the related contents of the above steps 71 to 72, as shown in fig. 3, when the number of the above "error correction results" is D, the number of the above "error detection results" is N, and the number of the above "protection character recognition results" is Q, the second correction processing may first be performed on each error correction result with reference to the Q protection character recognition results, yielding E second error correction results. The modification suggestions in the E second error correction results are only a subset of those in the D error correction results, so the E second error correction results contain fewer erroneous modification suggestions than the D error correction results. Then, the N error detection results and the E second error correction results are combined to perform the first correction processing on the E second error correction results, yielding at least one error correction result to be used. The modification suggestions in the error correction results to be used are only a subset of those in the E second error correction results, so the error correction results to be used contain fewer erroneous modification suggestions than the E second error correction results. This effectively reduces the possibility that erroneous modification suggestions exist in the error correction results to be used, which helps improve the text error correction effect. Here, E is a positive integer, and E is less than or equal to D.
In fact, to further improve the text error correction effect, the modification suggestions in the error correction result to be used may be screened. Based on this, the present application provides another possible implementation manner of S3, which may specifically include S31 to S32:
S31: perform preset suggestion screening processing on the error correction result to be used to obtain a third error correction result.
The "preset suggestion screening processing" may be set in advance.
The "third error correction result" refers to the screening result obtained by performing the preset suggestion screening processing on one error correction result to be used.
In addition, the embodiment of the present application does not limit the determination process of the above "third error correction result"; for example, it may specifically include steps 81 to 83:
Step 81: determine the rewrite probability of each modification suggestion in the error correction result to be used.
The rewrite probability of the g-th modification suggestion in the error correction result to be used represents the possibility that the semantics of the text to be processed change after the text to be processed is modified by using the g-th modification suggestion. Here, g is a positive integer, g is less than or equal to G, G is a positive integer, and G represents the number of modification suggestions in the error correction result to be used.
In addition, the embodiment of the present application does not limit the determination process of the above "rewrite probability", which is described below with reference to an example.
As an example, when the above "error correction result to be used" includes a suggestion to be used, and the suggestion to be used is to modify third character information into fourth character information, the determination process of the rewrite probability of the suggestion to be used may specifically include steps 91 to 92:
Step 91: determine the feature difference degree between the third character information and the fourth character information according to the character feature information of the third character information and the character feature information of the fourth character information.
Wherein, the character characteristic information is used for representing the characteristic of one character information; moreover, the embodiment of the present application does not limit the "character feature information", for example, it may include: at least one of the input operation information, the pronunciation characterization information, and the character shape information.
The above-mentioned "input operation information" is used to describe input characteristics (for example, key positions of a keyboard, a movement trace of a stylus pen, a sliding trace of a writing pad, etc.) of character information when the character information is input by means of an input device.
The "pronunciation characterization information" is used to represent the pronunciation characteristics of one piece of character information; furthermore, the embodiment of the present application does not limit the "pronunciation characterization information". For example, for Chinese character information (e.g., a Chinese character, a Chinese word, etc.), the "pronunciation characterization information" may include pinyin information of the Chinese character information. As another example, for English character information (e.g., an English word, an English phrase, etc.), the "pronunciation characterization information" may include phonetic symbol information of the English character information.
The above-mentioned "character shape information" is used to represent the outline characteristics of one character information; furthermore, the embodiment of the present application is not limited to the "character shape information", and for example, it may be determined according to the shape of the character in the character information.
The "degree of difference in characteristics between the third character information and the fourth character information" is used to indicate a difference (i.e., a degree of dissimilarity) between the "character characteristic information of the third character information" and the "character characteristic information of the fourth character information"; the determination process of the "feature difference degree between the third character information and the fourth character information" is not limited in the embodiments of the present application, and may be implemented by means of a character difference model that is constructed in advance, for example. The "character difference model" is used for performing feature difference degree measurement processing on input data of the character difference model.
Step 92: determine the rewrite probability of the suggestion to be used according to the feature difference degree between the third character information and the fourth character information.
The embodiment of the present application does not limit the implementation manner of step 92. For example, step 92 may specifically include: determining the feature difference degree between the third character information and the fourth character information as the rewrite probability of the suggestion to be used. As another example, step 92 may also include: performing preset positive correlation processing on the feature difference degree between the third character information and the fourth character information to obtain the rewrite probability of the suggestion to be used, so that the "rewrite probability of the suggestion to be used" is positively correlated with the "feature difference degree between the third character information and the fourth character information". The "preset positive correlation processing" may be set in advance; the embodiment of the present application does not limit it, and it may be implemented by any existing or future method capable of performing positive correlation processing on numerical data.
Based on the related contents of the above steps 91 to 92, for a modification suggestion, the rewrite probability of the modification suggestion can be determined according to the difference in character feature information between the character information before modification and the character information after modification in the modification suggestion, so that the rewrite probability can indicate the possibility that the semantics of the text to be processed change after the text to be processed is modified by using the modification suggestion.
It should be noted that the determination process of the "rewrite probability of the g-th modification suggestion in the error correction result to be used" may be implemented by the rewrite probability determination process shown in the above steps 91 to 92, provided that the "suggestion to be used" in that process is replaced with the "g-th modification suggestion". Here, g is a positive integer, g is less than or equal to G, G is a positive integer, and G represents the number of modification suggestions in the error correction result to be used.
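Steps 91 to 92 can be illustrated with a small sketch. The application leaves the character difference model open, so the sketch below swaps in a plainly different and much simpler measure: a normalized edit distance over pronunciation strings (for example pinyin) that the caller is assumed to supply. The function names `edit_distance`, `feature_difference`, and `rewrite_probability` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two feature strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def feature_difference(feat_a: str, feat_b: str) -> float:
    """Step 91 (assumed measure): dissimilarity in [0, 1] of two feature strings."""
    longest = max(len(feat_a), len(feat_b))
    if longest == 0:
        return 0.0
    return edit_distance(feat_a, feat_b) / longest


def rewrite_probability(difference: float) -> float:
    """Step 92, first variant: use the difference itself, clamped to [0, 1]."""
    return min(1.0, max(0.0, difference))


# Pronunciation strings for the third and fourth character information
# (illustrative values assumed to come from an external pinyin lookup).
similar = rewrite_probability(feature_difference("zai", "zei"))
different = rewrite_probability(feature_difference("abc", "xyz"))
```

Because the mapping in `rewrite_probability` is non-decreasing, a larger feature difference never yields a smaller rewrite probability, which is the positive correlation that step 92 asks for.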
Step 82: judge whether the rewrite probability of each modification suggestion in the error correction result to be used satisfies a third condition, to obtain a judgment result of each modification suggestion.
The "third condition" may be set in advance; furthermore, the embodiment of the present application does not limit the "third condition", and for example, it may specifically be: reaching a preset rewrite threshold (e.g., 0.9).
Based on the related content of step 82, after the rewrite probability of the g-th modification suggestion in the error correction result to be used is obtained, it may be judged whether the rewrite probability of the g-th modification suggestion satisfies the third condition, to obtain a judgment result of the g-th modification suggestion, so that the judgment result of the g-th modification suggestion can indicate whether the rewrite probability of the g-th modification suggestion satisfies the third condition (e.g., whether the rewrite probability of the g-th modification suggestion reaches the preset rewrite threshold). Here, g is a positive integer, g is less than or equal to G, G is a positive integer, and G represents the number of modification suggestions in the error correction result to be used.
Step 83: perform rewriting suggestion deletion processing on the error correction result to be used according to the judgment result of each modification suggestion in the error correction result to be used, to obtain a third error correction result.
The "rewriting suggestion deletion processing" is used to delete at least one rewriting suggestion from one error correction result to be used.
The above "rewriting suggestion" refers to a modification suggestion whose rewrite probability satisfies the third condition; moreover, the embodiment of the present application does not limit the determination process of the "rewriting suggestion", and for example, it may specifically include: if the "judgment result of the g-th modification suggestion in the error correction result to be used" indicates that the rewrite probability of the g-th modification suggestion satisfies the third condition (for example, reaches the preset rewrite threshold), it may be determined that the semantics of the text to be processed are likely to change after the text to be processed is modified by using the g-th modification suggestion, so the g-th modification suggestion may be determined as a rewriting suggestion; however, if the "judgment result of the g-th modification suggestion in the error correction result to be used" indicates that the rewrite probability of the g-th modification suggestion does not satisfy the third condition (for example, is lower than the preset rewrite threshold), it may be determined that the semantics of the text to be processed do not substantially change after the text to be processed is modified by using the g-th modification suggestion, so it may be determined that the g-th modification suggestion is not a rewriting suggestion.
Based on the related content of the step 83, for one to-be-used error correction result, after the determination result of each modification suggestion in the to-be-used error correction result is obtained, at least one rewrite suggestion may be determined from the to-be-used error correction result according to the determination result of each modification suggestion in the to-be-used error correction result; and deleting the rewriting suggestions from the error correction result to be used to obtain a third error correction result, so that the rewriting probabilities of all the modification suggestions in the third error correction result do not meet a third condition, and the number of the modification suggestions with errors in the third error correction result is less than that of the modification suggestions with errors in the error correction result to be used.
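Steps 82 to 83 amount to a threshold filter, which can be sketched as below. The tuple layout of a suggestion and the function name `delete_rewriting_suggestions` are assumptions for illustration; the threshold 0.9 is the example value given for the third condition.

```python
from typing import List, Sequence, Tuple

# Hypothetical suggestion layout: (position, original, replacement).
Suggestion = Tuple[int, str, str]


def delete_rewriting_suggestions(suggestions: Sequence[Suggestion],
                                 rewrite_probs: Sequence[float],
                                 threshold: float = 0.9) -> List[Suggestion]:
    """Return the third error correction result.

    A suggestion whose rewrite probability reaches the threshold (third
    condition) is treated as a rewriting suggestion and dropped.
    """
    if len(suggestions) != len(rewrite_probs):
        raise ValueError("one rewrite probability is required per suggestion")
    third_result = []
    for suggestion, prob in zip(suggestions, rewrite_probs):
        is_rewriting_suggestion = prob >= threshold  # step 82
        if not is_rewriting_suggestion:              # step 83
            third_result.append(suggestion)
    return third_result


to_use = [(2, "teh", "the"), (9, "cat", "dog"), (15, "there", "their")]
probs = [0.10, 0.95, 0.40]
third = delete_rewriting_suggestions(to_use, probs)
```

In this toy input the second suggestion replaces a word with an unrelated one, so its assumed rewrite probability is high and it is removed.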
Based on the related content of S31, after the y-th error correction result to be used is obtained, the preset suggestion screening processing may be performed on the y-th error correction result to be used to obtain a third error correction result corresponding to the y-th error correction result to be used. The modification suggestions in this third error correction result are only a subset of those in the y-th error correction result to be used, so the third error correction result contains fewer erroneous modification suggestions than the y-th error correction result to be used. Here, y is a positive integer, y is less than or equal to Y, Y is a positive integer, Y represents the number of error correction results to be used, and Y is less than or equal to D.
S32: determine text error correction information of the text to be processed according to the third error correction result.
The embodiment of the present application does not limit the implementation of S32. For example, when the number of the "third error correction results" is Y, S32 may specifically be: first, collecting all modification suggestions in the Y third error correction results to obtain a modification suggestion set; then, performing redundancy removal processing on all modification suggestions in the modification suggestion set to obtain the text error correction information of the text to be processed.
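The collection and redundancy removal in this example of S32 can be sketched as an order-preserving merge. The suggestion layout and the function name `merge_third_results` are illustrative assumptions, and "redundant" is read here simply as "an exact duplicate of an earlier suggestion".

```python
from typing import Iterable, List, Tuple

# Hypothetical suggestion layout: (position, original, replacement).
Suggestion = Tuple[int, str, str]


def merge_third_results(third_results: Iterable[Iterable[Suggestion]]) -> List[Suggestion]:
    """Gather all suggestions from the Y third error correction results
    and drop exact duplicates, keeping first-seen order."""
    seen = set()
    merged = []
    for result in third_results:
        for suggestion in result:
            if suggestion not in seen:
                seen.add(suggestion)
                merged.append(suggestion)
    return merged


third_results = [
    [(2, "teh", "the"), (15, "there", "their")],
    [(2, "teh", "the")],
    [(30, "adress", "address")],
]
text_error_correction_info = merge_third_results(third_results)
```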
Based on the related contents of S31 to S32, after at least one error correction result to be used is obtained, the preset suggestion screening processing may be performed on each error correction result to be used to obtain each third error correction result; the text error correction information of the text to be processed is then extracted from the third error correction results, so that the text error correction information can more accurately represent the modification suggestions for at least one erroneous character in the text to be processed, which helps improve the text error correction effect.
In fact, to further improve the text error correction effect, whether an error correction result participates in the extraction of the "text error correction information" may be decided with reference to the semantic smoothness of the error-corrected text obtained by modifying the text to be processed with that error correction result. Based on this, the embodiment of the present application provides a possible implementation manner of determining the "text error correction information of the text to be processed", which may specifically include steps 101 to 104:
Step 101: perform text modification processing on the text to be processed by using the t-th to-be-processed error correction result to obtain the t-th candidate error correction text. Here, t is a positive integer, t is less than or equal to T, T is a positive integer, and T represents the number of to-be-processed error correction results.
The "to-be-processed error correction result" refers to an error correction result on which the extraction of the text error correction information of the text to be processed is based. The embodiment of the present application does not limit the "to-be-processed error correction result". For example, when the above S3 is implemented by using steps 101 to 104, the "to-be-processed error correction result" represents the above "error correction result to be used". As another example, when the above S32 is implemented by using steps 101 to 104, the "to-be-processed error correction result" represents the above "third error correction result".
The above "text modification processing" refers to performing character modification processing on text data according to a text error correction result, so that the character information to be modified that is involved in the text error correction result (for example, "C_5" shown in fig. 4) no longer exists in the modified text data, while the post-modification character information involved in the text error correction result (for example, "C_10" shown in fig. 4) does exist in the modified text data.
The "tth candidate error correction text" refers to modified text data obtained by performing character modification processing on a text to be processed according to the tth error correction result to be processed.
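A minimal sketch of the text modification processing in step 101 follows. It assumes each modification suggestion is a (position, original, replacement) tuple and that suggestions within one result do not overlap; both the layout and the function name `apply_suggestions` are hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical suggestion layout: (position, original, replacement).
Suggestion = Tuple[int, str, str]


def apply_suggestions(text: str, suggestions: Iterable[Suggestion]) -> str:
    """Produce a candidate error correction text from one error correction result.

    Suggestions are applied from right to left so that earlier positions stay
    valid even when a replacement changes the text length.
    """
    result = text
    for start, original, replacement in sorted(suggestions,
                                               key=lambda s: s[0],
                                               reverse=True):
        end = start + len(original)
        if result[start:end] != original:
            raise ValueError(f"text at position {start} is not {original!r}")
        result = result[:start] + replacement + result[end:]
    return result


text_to_process = "I recieve teh mail"
result_t = [(2, "recieve", "receive"), (10, "teh", "the")]
candidate_t = apply_suggestions(text_to_process, result_t)
```

After the call, the original character information no longer appears at the modified positions and the replacement character information does.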
Step 102: determine the smoothness score of the t-th candidate error correction text, where t is a positive integer, t is less than or equal to T, and T is a positive integer.
The "smoothness score of the t-th candidate error correction text" represents the degree of semantic smoothness of the t-th candidate error correction text. The embodiment of the present application does not limit the determination process of the "smoothness score of the t-th candidate error correction text"; for example, it may be implemented by any existing or future method (e.g., an N-gram language model) capable of measuring the semantic smoothness of text data.
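Since an N-gram language model is named as one possible scorer, a toy bigram version is sketched below. Everything specific here is an assumption made for illustration: whitespace tokenization, add-one smoothing, the tiny training corpus, and the choice to report the geometric mean of bigram probabilities so that the score is positive and higher means smoother. A production scorer would be trained on a large corpus.

```python
import math
from collections import Counter
from typing import Iterable


class BigramScorer:
    """Toy bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, corpus: Iterable[str]) -> None:
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in corpus:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def score(self, text: str) -> float:
        """Geometric mean of smoothed bigram probabilities, in (0, 1]."""
        tokens = ["<s>"] + text.split() + ["</s>"]
        pairs = list(zip(tokens, tokens[1:]))
        log_prob = 0.0
        for prev, curr in pairs:
            numerator = self.bigrams[(prev, curr)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            log_prob += math.log(numerator / denominator)
        return math.exp(log_prob / len(pairs))


scorer = BigramScorer(["the cat sat", "the dog sat"])
fluent = scorer.score("the cat sat")
scrambled = scorer.score("cat the sat")
```

A word order seen in training receives a higher score than a scrambled one, which is the property the later screening steps rely on.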
Step 103: screen out, from the T candidate error correction texts, the error-corrected text satisfying a fourth condition according to the smoothness scores of the T candidate error correction texts.
The "fourth condition" may be set in advance; in addition, the embodiment of the present application does not limit the "fourth condition". For example, it may specifically be: the candidate error correction text with the largest smoothness score. As another example, it may specifically be: a candidate error correction text whose smoothness score reaches a preset score threshold. As yet another example, it may specifically be: the candidate error correction text with the largest smoothness score, provided that this score reaches a preset score threshold.
It should be noted that the "preset score threshold" may be set in advance, or may be determined according to the smoothness score of the text to be processed (for example, the smoothness score of the text to be processed may be multiplied by a second coefficient to obtain the preset score threshold). Here, the "second coefficient" may be set in advance, and may be 110% for example.
The above "error-corrected text" refers to a candidate error correction text satisfying the fourth condition.
In addition, the embodiment of step 103 is not limited in the present application, and for the convenience of understanding, the following description is made with reference to two examples.
Example 1, step 103 may specifically include: first, determining the maximum smoothness score according to the smoothness scores of the T candidate error correction texts, where the "maximum smoothness score" is the largest of the smoothness scores of the T candidate error correction texts; then, when it is determined that the maximum smoothness score and the smoothness score of the text to be processed satisfy a fifth condition, determining the candidate error correction text having the maximum smoothness score as the error-corrected text.
The "fifth condition" may be set in advance; in addition, the embodiment of the present application does not limit the "fifth condition", and for example, it may specifically be: the maximum smoothness score is 10% higher than the smoothness score of the text to be processed (i.e., the maximum smoothness score is 110% of the smoothness score of the text to be processed).
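Example 1 can be sketched as "take the best candidate, then compare it with the original". The sketch assumes positive smoothness scores and reads the fifth condition as "at least 110% of the score of the text to be processed"; that reading, the `ratio` parameter, and the function name are assumptions.

```python
from typing import Optional, Sequence


def select_best_candidate(candidates: Sequence[str],
                          scores: Sequence[float],
                          original_score: float,
                          ratio: float = 1.10) -> Optional[str]:
    """Return the candidate with the maximum smoothness score if that score
    reaches ratio * original_score; otherwise return None (no correction)."""
    if not candidates:
        return None
    best_index = max(range(len(candidates)), key=lambda i: scores[i])
    if scores[best_index] >= original_score * ratio:
        return candidates[best_index]
    return None


candidates = ["candidate A", "candidate B", "candidate C"]
scores = [0.40, 0.62, 0.58]

accepted = select_best_candidate(candidates, scores, original_score=0.50)
rejected = select_best_candidate(candidates, scores, original_score=0.60)
```

With an original score of 0.50 the best candidate (0.62) clears the bar; with 0.60 it does not, so the text to be processed would be left unchanged.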
Example 2, step 103 may specifically include steps 111 to 112:
Step 111: screen out, from the T candidate error correction texts, at least one target error correction text satisfying a sixth condition according to the smoothness scores of the T candidate error correction texts.
The "sixth condition" may be set in advance; in addition, the embodiment of the present application does not limit the "sixth condition", and for example, it may specifically be: a candidate error correction text whose smoothness score is 10% higher than that of the text to be processed.
The above "target error correction text" refers to a candidate error correction text satisfying the sixth condition.
Step 112: screen out, from the at least one target error correction text, the error-corrected text satisfying a seventh condition.
The "seventh condition" may be set in advance; in addition, the embodiment of the present application does not limit the "seventh condition", and for example, it may specifically be: the P target error correction texts with the largest smoothness scores, where P is a positive integer.
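Steps 111 to 112 can be sketched as a filter followed by a top-P selection. As before, positive smoothness scores are assumed, the sixth condition is read as "at least 110% of the original score", and the function and parameter names are hypothetical.

```python
from typing import List, Sequence


def select_top_p(candidates: Sequence[str],
                 scores: Sequence[float],
                 original_score: float,
                 p: int = 1,
                 ratio: float = 1.10) -> List[str]:
    """Step 111: keep candidates whose score reaches ratio * original_score.
    Step 112: return the P kept candidates with the largest scores."""
    targets = [(score, candidate)
               for candidate, score in zip(candidates, scores)
               if score >= original_score * ratio]
    # Sort by score only; ties keep their original relative order.
    targets.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in targets[:p]]


candidates = ["a", "b", "c", "d"]
scores = [0.40, 0.62, 0.58, 0.70]
corrected_texts = select_top_p(candidates, scores, original_score=0.50, p=2)
```

Unlike Example 1, this variant can return several error-corrected texts, or none when no candidate passes the first filter.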
Based on the related content of step 103, after the smoothness scores of the T candidate error correction texts are obtained, the error-corrected text satisfying the fourth condition may be screened out from the candidate error correction texts with reference to their smoothness scores, so that the error-corrected text has better semantic smoothness and can subsequently be used to determine the text error correction information of the text to be processed.
Step 104: determine the text error correction information of the text to be processed according to the to-be-processed error correction result corresponding to the error-corrected text.
The "to-be-processed error correction result corresponding to the error-corrected text" refers to the to-be-processed error correction result used when the error-corrected text was generated. For example, if the "error-corrected text" is obtained by performing character modification processing on the text to be processed by using the t-th to-be-processed error correction result, the to-be-processed error correction result corresponding to the "error-corrected text" is the t-th to-be-processed error correction result.
In addition, the embodiment of the present application does not limit the implementation of step 104; for example, it may specifically include: determining the to-be-processed error correction result corresponding to the error-corrected text as the text error correction information of the text to be processed.
Based on the related contents of the above steps 101 to 104, after T to-be-processed error correction results are obtained, text modification processing may be performed on the text to be processed by using each to-be-processed error correction result to obtain each candidate error correction text; an error-corrected text with higher semantic smoothness is then determined according to the semantic smoothness of the candidate error correction texts; finally, the text error correction information of the text to be processed is determined according to the to-be-processed error correction result corresponding to the error-corrected text, so that the text error correction information can more accurately represent the modification suggestions for at least one erroneous character in the text to be processed, which helps improve the text error correction effect.
It should be noted that, the above S3 may be implemented by using the above steps 101 to 104, and it is only necessary to replace the "to-be-processed error correction result" in the above steps 101 to 104 with the "to-be-used error correction result". In addition, the above-mentioned step S32 may also be implemented by adopting the above-mentioned step 101 to step 104, and it is only necessary to replace the "to-be-processed error correction result" in the above-mentioned step 101 to step 104 with the "third error correction result".
Based on the text error correction method provided by the above method embodiment, the embodiment of the present application further provides a text error correction apparatus, which is described below with reference to the accompanying drawings.
Referring to fig. 5, which is a schematic structural diagram of a text error correction apparatus according to an embodiment of the present application.
The text error correction apparatus 500 provided in the embodiment of the present application includes:
a result determining unit 501, configured to determine an error correction result of the text to be processed and correction reference information of the text to be processed after the text to be processed is acquired;
a result correction unit 502, configured to perform preset correction processing on the error correction result by using the correction reference information, so as to obtain an error correction result to be used;
an information determining unit 503, configured to determine text error correction information of the text to be processed according to the error correction result to be used.
In a possible embodiment, the correction reference information includes an error detection result and/or a protection character recognition result; the error detection result is used for representing the position of at least one erroneous character in the text to be processed; the protection character recognition result is used for representing the position of at least one protected character in the text to be processed.
In a possible embodiment, the correction reference information comprises an error detection result; the result correction unit 502 is specifically configured to: and performing first correction processing on the error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain the error correction result to be used.
In one possible embodiment, the correction reference information includes a protection character recognition result; the result correction unit 502 is specifically configured to: and performing second correction processing on the error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain the error correction result to be used.
In one possible embodiment, the correction reference information includes an error detection result and a protection character recognition result; the result correction unit 502 is specifically configured to: perform first correction processing on the error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain a first error correction result; and perform second correction processing on the first error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain the error correction result to be used.
In one possible embodiment, the correction reference information includes an error detection result and a protection character recognition result; the result correction unit 502 is specifically configured to: perform first correction processing on the error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain a first error correction result; perform second correction processing on the error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain a second error correction result; and determine the error correction result to be used according to the first error correction result and the second error correction result.
In one possible embodiment, the correction reference information includes an error detection result and a protection character recognition result; the result correction unit 502 is specifically configured to: perform second correction processing on the error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain a second error correction result; and perform first correction processing on the second error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain the error correction result to be used.
In a possible embodiment, the number of the error detection results is N, and the number of the first objects to be corrected is M; wherein N is a positive integer and M is a positive integer;
the first correction processing includes: voting on the k-th modification suggestion in the m-th first object to be corrected by using the N error detection results and at least one other first object to be corrected, other than the m-th first object to be corrected, among the M first objects to be corrected, to obtain a reserved voting result of the k-th modification suggestion; and if the reserved voting result of the k-th modification suggestion does not meet a first condition, deleting the k-th modification suggestion from the m-th first object to be corrected; wherein m is a positive integer, m ≤ M, k is a positive integer, k ≤ K_m, K_m is a positive integer, and K_m represents the number of modification suggestions in the m-th first object to be corrected.
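As a non-limiting illustration, the first correction processing can be sketched as follows. The representation of an error detection result as a set of flagged positions, the rule that each detection result and each other object contributes one vote, and the first condition taken as "at least min_votes votes" are all assumptions of this sketch and are not fixed by the description.

```python
from typing import List, Set, Tuple

# Hypothetical data model: (position, original text, replacement text).
Suggestion = Tuple[int, str, str]


def first_correction(
    objects: List[List[Suggestion]],   # the M first objects to be corrected
    detections: List[Set[int]],        # the N error detection results
    min_votes: int,                    # assumed form of the "first condition"
) -> List[List[Suggestion]]:
    """Keep a suggestion only if enough voters support keeping it."""
    corrected = []
    for m, obj in enumerate(objects):
        # The other first objects to be corrected (all except the m-th).
        others = [set(o) for i, o in enumerate(objects) if i != m]
        kept = []
        for suggestion in obj:
            position = suggestion[0]
            # One vote per detection result that flags this position.
            votes = sum(position in d for d in detections)
            # One vote per other object that makes the same suggestion.
            votes += sum(suggestion in o for o in others)
            if votes >= min_votes:
                kept.append(suggestion)
        corrected.append(kept)
    return corrected


objects = [[(0, "teh", "the"), (5, "form", "from")], [(0, "teh", "the")]]
detections = [{0}, {0, 5}]
corrected = first_correction(objects, detections, min_votes=2)
```

Votes are counted against the original objects rather than the partially corrected ones, so the outcome does not depend on the order in which the objects are visited.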
In a possible implementation manner, the number of the protection character recognition results is Q; wherein Q is a positive integer;
the second correction processing includes: voting on the r-th modification suggestion in the second object to be corrected by using the Q protection character recognition results to obtain a deletion voting result of the r-th modification suggestion; wherein r is a positive integer, r ≤ R, R is a positive integer, and R represents the number of modification suggestions in the second object to be corrected; and if the deletion voting result of the r-th modification suggestion meets a second condition, deleting the r-th modification suggestion from the second object to be corrected.
In a possible implementation, the second correction processing further includes: if the deletion voting result of the r-th modification suggestion meets the second condition, determining a screening condition to be used according to the r-th modification suggestion; searching the second object to be corrected for target modification suggestions meeting the screening condition to be used to obtain a search result; and if the search result shows that at least one target modification suggestion exists in the second object to be corrected, deleting the at least one target modification suggestion from the second object to be corrected.
In one possible embodiment, the r-th modification suggestion includes modifying the first character information into the second character information; the screening condition to be used is determined according to the first character information.
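As a non-limiting illustration, the second correction processing, including the screening condition derived from the first character information, can be sketched as follows. The representation of a protection character recognition result as a set of protected positions, the second condition taken as "at least min_delete_votes votes", and the screening condition taken as "same original text" are assumptions of this sketch.

```python
from typing import List, Set, Tuple

# Hypothetical data model: (position, original text, replacement text).
Suggestion = Tuple[int, str, str]


def second_correction(
    obj: List[Suggestion],          # the second object to be corrected
    protections: List[Set[int]],    # the Q protection character recognition results
    min_delete_votes: int,          # assumed form of the "second condition"
) -> List[Suggestion]:
    """Delete suggestions voted as protected, plus suggestions that would
    modify the same original text elsewhere (assumed screening condition)."""
    banned_originals = set()
    for position, original, _replacement in obj:
        # One deletion vote per recognition result protecting this position.
        votes = sum(position in p for p in protections)
        if votes >= min_delete_votes:
            # Screening condition determined from the first character information.
            banned_originals.add(original)
    # The voted-out suggestion itself also matches the screening condition.
    return [s for s in obj if s[1] not in banned_originals]


obj = [(0, "Zitiao", "Zither"), (20, "Zitiao", "Zither"), (30, "teh", "the")]
protections = [{0}, {0}, set()]
remaining = second_correction(obj, protections, min_delete_votes=2)
```

In this example only the occurrence at position 0 is recognized as protected, yet the suggestion at position 20 is also removed because it modifies the same original text.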
In a possible implementation, the result determining unit 501 includes:
the error detection subunit is configured to perform error detection processing on the to-be-processed text by using at least one pre-constructed error detection model and/or at least one error detection rule, so as to obtain at least one error detection result of the to-be-processed text;
and the protection recognition subunit is used for performing protection character recognition processing on the text to be processed by utilizing at least one pre-constructed protection character recognition model and/or at least one protection character recognition rule to obtain at least one protection character recognition result of the text to be processed.
In a possible implementation, the result determining unit 501 includes:
and the error correction subunit is configured to perform error correction processing on the text to be processed by using at least one pre-constructed error correction model and/or at least one error correction rule, so as to obtain at least one error correction result of the text to be processed.
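As a non-limiting illustration, obtaining at least one error correction result from a mix of models and rules can be sketched by treating each model or rule as a callable that maps the text to be processed to a list of modification suggestions. The names rule_corrector, collect_results, and stub_model are hypothetical, and the stub merely stands in for a pre-constructed error correction model.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical data model: (position, original text, replacement text).
Suggestion = Tuple[int, str, str]
Corrector = Callable[[str], List[Suggestion]]


def rule_corrector(pattern: str, replacement: str) -> Corrector:
    """Build an error correction rule from a regular expression."""
    compiled = re.compile(pattern)

    def run(text: str) -> List[Suggestion]:
        return [(m.start(), m.group(0), replacement) for m in compiled.finditer(text)]

    return run


def collect_results(text: str, correctors: List[Corrector]) -> List[List[Suggestion]]:
    """Run every model or rule; each one yields one error correction result."""
    return [corrector(text) for corrector in correctors]


def stub_model(text: str) -> List[Suggestion]:
    """Stand-in for a trained model (assumption): only fixes a leading 'teh'."""
    return [(0, "teh", "the")] if text.startswith("teh") else []


results = collect_results(
    "teh cat and teh dog",
    [stub_model, rule_corrector(r"\bteh\b", "the")],
)
```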
In a possible implementation, the information determining unit 503 includes:
the suggestion screening subunit is used for carrying out preset suggestion screening processing on the error correction result to be used to obtain a third error correction result;
and the information determining subunit is used for determining the text error correction information of the text to be processed according to the third error correction result.
In a possible embodiment, the suggestion screening subunit is specifically configured to: determine the rewriting probability of each modification suggestion in the error correction result to be used; judge whether the rewriting probability of each modification suggestion in the error correction result to be used meets a third condition to obtain a judgment result of each modification suggestion; and, according to the judgment result of each modification suggestion, perform rewriting suggestion deletion processing on the error correction result to be used to obtain the third error correction result.
In one possible embodiment, the to-be-used error correction result includes a to-be-used suggestion, and the to-be-used suggestion includes: modifying the third character information into fourth character information;
the determination process of the rewriting probability of the suggestion to be used includes: determining a feature difference degree between the third character information and the fourth character information according to the character feature information of the third character information and the character feature information of the fourth character information; and determining the rewriting probability of the suggestion to be used according to the feature difference degree between the third character information and the fourth character information.
In one possible implementation, the character feature information includes: at least one of the input operation information, the pronunciation characterization information, and the character shape information.
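As a non-limiting illustration, the rewriting probability can be sketched using pronunciation characterization information only. The pinyin lookup table, the use of difflib.SequenceMatcher as the difference measure, the choice to use the feature difference degree directly as the rewriting probability, and the third condition taken as "probability above max_probability means delete" are all assumptions of this sketch.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

# Hypothetical data model: (position, original text, replacement text).
Suggestion = Tuple[int, str, str]

# Hypothetical pronunciation table (toneless pinyin) for the example characters.
PRONUNCIATION: Dict[str, str] = {"在": "zai", "再": "zai", "好": "hao"}


def feature_difference(a: str, b: str) -> float:
    """Difference degree in [0, 1]: 0 for identical feature strings."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def rewrite_probability(original: str, replacement: str) -> float:
    """Assumption: the more the features differ, the more likely the
    suggestion rewrites the text rather than corrects an input slip."""
    return feature_difference(PRONUNCIATION[original], PRONUNCIATION[replacement])


def drop_rewrites(suggestions: List[Suggestion], max_probability: float) -> List[Suggestion]:
    """Delete suggestions whose rewriting probability exceeds the threshold."""
    return [
        s for s in suggestions
        if rewrite_probability(s[1], s[2]) <= max_probability
    ]


suggestions = [(2, "在", "再"), (5, "在", "好")]
third_result = drop_rewrites(suggestions, max_probability=0.5)
```

Here the homophone substitution is kept, while the substitution between characters with dissimilar pronunciation is treated as a rewrite and removed. Input operation information or character shape information could be compared in the same way with a different feature table.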
In a possible implementation, the information determining unit 503 includes:
the text modification subunit is used for performing text modification processing on the text to be processed by utilizing the t-th error correction result to be processed to obtain a t-th candidate error correction text; wherein t is a positive integer, t ≤ T, T is a positive integer, and T represents the number of the error correction results to be processed;
a smoothness score subunit, configured to determine a smoothness score of the t-th candidate error correction text; wherein t is a positive integer, t ≤ T, and T is a positive integer;
the text screening subunit is used for screening the error-corrected text meeting a fourth condition from the T candidate error-corrected texts according to the smoothness scores of the T candidate error-corrected texts;
and the error correction determining subunit is used for determining the text error correction information of the text to be processed according to the to-be-processed error correction result corresponding to the text after error correction.
In a possible implementation manner, the text filtering subunit is specifically configured to: determine the highest smoothness score according to the smoothness scores of the T candidate error correction texts; and if it is determined that the highest smoothness score and the smoothness score of the text to be processed meet a fifth condition, determine the candidate error correction text with the highest smoothness score as the error-corrected text.
In a possible implementation manner, the text filtering subunit is specifically configured to: screening at least one target error correction text meeting a sixth condition from the T candidate error correction texts according to the smoothness scores of the T candidate error correction texts; and screening the corrected text meeting a seventh condition from the at least one target corrected text.
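As a non-limiting illustration, selecting the error-corrected text by smoothness score can be sketched as follows. The vocabulary-based scorer is a toy stand-in (a practical system might score fluency with a language model, which is an assumption here), and the fifth condition is assumed to be "the best candidate improves on the text to be processed by at least min_gain".

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical data model: (position, original text, replacement text).
Suggestion = Tuple[int, str, str]

VOCAB = {"the", "cat", "sat", "on", "mat"}  # toy vocabulary (assumption)


def apply_suggestions(text: str, suggestions: List[Suggestion]) -> str:
    """Apply suggestions right to left so earlier positions stay valid."""
    for position, original, replacement in sorted(suggestions, reverse=True):
        if text[position:position + len(original)] == original:
            text = text[:position] + replacement + text[position + len(original):]
    return text


def smoothness(text: str) -> float:
    """Toy smoothness score: fraction of words found in the vocabulary."""
    words = text.split()
    return sum(w in VOCAB for w in words) / len(words) if words else 0.0


def select(
    text: str,
    results: List[List[Suggestion]],      # the T error correction results to be processed
    scorer: Callable[[str], float],
    min_gain: float,                      # assumed form of the "fifth condition"
) -> Optional[Tuple[List[Suggestion], str]]:
    """Return the best result and its candidate text, or None if no
    candidate is sufficiently smoother than the original text."""
    if not results:
        return None
    candidates = [apply_suggestions(text, r) for r in results]
    scores = [scorer(c) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    if scores[best] - scorer(text) >= min_gain:
        return results[best], candidates[best]
    return None


text = "teh cat sat on teh mat"
results = [[(0, "teh", "the")], [(0, "teh", "the"), (15, "teh", "the")]]
chosen = select(text, results, smoothness, min_gain=0.1)
```

Returning None when the gain is too small corresponds to leaving the text to be processed unmodified.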
Based on the above description of the text error correction device 500, after obtaining the text to be processed, the text error correction device 500 determines the error correction result of the text to be processed and the correction reference information of the text to be processed. It then performs preset correction processing on the error correction result by using the correction reference information to obtain an error correction result to be used, so that the modification suggestions in the error correction result to be used are more accurate. Finally, it determines the text error correction information of the text to be processed according to the error correction result to be used, so that the text error correction information more accurately represents the modification suggestion for at least one erroneous character in the text to be processed. As a result, the probability of erroneous corrections is reduced as far as possible when the text to be processed is modified by using the text error correction information, which improves the text error correction effect and the text input experience of the user.
Further, an embodiment of the present application further provides an apparatus, where the apparatus includes a processor and a memory:
the memory is used for storing a computer program;
the processor is used for executing any implementation mode of the text error correction method provided by the embodiment of the application according to the computer program.
Further, the present application also provides a computer-readable storage medium for storing a computer program, where the computer program is used to execute any implementation manner of the text error correction method provided by the present application.
Further, an embodiment of the present application also provides a computer program product, which when running on a terminal device, causes the terminal device to execute any implementation of the text error correction method provided in the embodiment of the present application.
It should be understood that, in the present application, "at least one" means one or more, and "a plurality" means two or more. The term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: only A exists, only B exists, or both A and B exist, wherein A and B may each be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or a similar expression refers to any combination of the listed items, including a single item or any combination of plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, and c may each be single or plural.
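As a non-limiting illustration, the seven cases covered by "at least one of a, b, or c" are exactly the non-empty combinations of the three items, which can be enumerated as follows. The variable names are chosen for this sketch only.

```python
from itertools import combinations

items = ["a", "b", "c"]

# Every non-empty combination of the items, from single items up to all three.
selections = [
    combo
    for size in range(1, len(items) + 1)
    for combo in combinations(items, size)
]
```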
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the present invention in any form. Although the present invention has been disclosed above by way of preferred embodiments, these embodiments are not intended to limit it. Those skilled in the art may, without departing from the scope of the technical solution of the present invention, use the methods and technical content disclosed above to make many possible variations and modifications to the technical solution, or to modify it into equivalent embodiments. Therefore, any simple modification, equivalent change, or adaptation made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the protection scope of the technical solution of the present invention.

Claims (21)

1. A method for correcting text, the method comprising:
after a text to be processed is obtained, determining an error correction result of the text to be processed and correction reference information of the text to be processed;
performing preset correction processing on the error correction result by using the correction reference information to obtain an error correction result to be used;
and determining text error correction information of the text to be processed according to the error correction result to be used.
2. The method according to claim 1, wherein the correction reference information includes an error detection result and/or a protection character recognition result; the error detection result is used for representing the position of at least one error character in the text to be processed; the protection character recognition result is used for representing the position of at least one protected character in the text to be processed.
3. The method according to claim 2, wherein the correction reference information includes an error detection result;
the using the correction reference information to perform preset correction processing on the error correction result to obtain an error correction result to be used includes:
and performing first correction processing on the error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain the error correction result to be used.
4. The method according to claim 2, wherein the correction reference information includes a protection character recognition result;
the using the correction reference information to perform preset correction processing on the error correction result to obtain an error correction result to be used includes:
and performing second correction processing on the error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain the error correction result to be used.
5. The method according to claim 2, wherein the correction reference information includes an error detection result and a protection character recognition result;
the determination process of the error correction result to be used comprises the following steps:
performing first correction processing on the error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain a first error correction result; performing second correction processing on the first error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain the error correction result to be used;
or,
the determination process of the error correction result to be used comprises the following steps:
performing first correction processing on the error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain a first error correction result; performing second correction processing on the error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain a second error correction result; determining the error correction result to be used according to the first error correction result and the second error correction result;
or,
the determination process of the error correction result to be used comprises the following steps:
performing second correction processing on the error correction result by using the protection character recognition result and a second correction rule corresponding to the protection character recognition result to obtain a second error correction result; and performing first correction processing on the second error correction result by using the error detection result and a first correction rule corresponding to the error detection result to obtain the error correction result to be used.
6. The method according to claim 3 or 5, wherein the number of the error detection results is N, and the number of the first objects to be corrected is M; wherein N is a positive integer and M is a positive integer;
the first correction processing includes:
voting on the k-th modification suggestion in the m-th first object to be corrected by using the N error detection results and at least one other first object to be corrected, other than the m-th first object to be corrected, among the M first objects to be corrected, to obtain a reserved voting result of the k-th modification suggestion; wherein m is a positive integer, m ≤ M, k is a positive integer, k ≤ K_m, K_m is a positive integer, and K_m represents the number of modification suggestions in the m-th first object to be corrected;
if the reserved voting result of the k-th modification suggestion does not meet a first condition, deleting the k-th modification suggestion from the m-th first object to be corrected.
7. The method according to claim 4 or 5, wherein the number of the protection character recognition results is Q; wherein Q is a positive integer;
the second correction processing includes:
voting on the r-th modification suggestion in the second object to be corrected by using the Q protection character recognition results to obtain a deletion voting result of the r-th modification suggestion; wherein r is a positive integer, r ≤ R, R is a positive integer, and R represents the number of modification suggestions in the second object to be corrected;
and if the deletion voting result of the r-th modification suggestion meets a second condition, deleting the r-th modification suggestion from the second object to be corrected.
8. The method of claim 7, further comprising:
if the deletion voting result of the r-th modification suggestion meets the second condition, determining a screening condition to be used according to the r-th modification suggestion;
searching a target modification suggestion meeting the to-be-used screening condition from the second to-be-corrected object to obtain a search result;
and if the search result shows that at least one target modification suggestion exists in the second object to be corrected, deleting the at least one target modification suggestion from the second object to be corrected.
9. The method of claim 8, wherein the r-th modification suggestion includes modifying first character information into second character information; the screening condition to be used is determined according to the first character information.
10. The method of claim 2, wherein the determining of the error detection result comprises:
carrying out error detection processing on the text to be processed by utilizing at least one pre-constructed error detection model and/or at least one error detection rule to obtain at least one error detection result of the text to be processed;
the determination process of the protection character recognition result comprises the following steps:
and performing protection character recognition processing on the text to be processed by utilizing at least one pre-constructed protection character recognition model and/or at least one protection character recognition rule to obtain at least one protection character recognition result of the text to be processed.
11. The method of claim 1, wherein the determining of the error correction result comprises:
and carrying out error correction processing on the text to be processed by utilizing at least one pre-constructed error correction model and/or at least one error correction rule to obtain at least one error correction result of the text to be processed.
12. The method according to claim 1, wherein the determining the text correction information of the text to be processed according to the result of the correction to be used comprises:
carrying out preset suggestion screening processing on the error correction result to be used to obtain a third error correction result;
and determining text error correction information of the text to be processed according to the third error correction result.
13. The method according to claim 12, wherein the determining of the third error correction result comprises:
determining rewriting probabilities of various modification suggestions in the error correction result to be used;
judging whether the rewriting probability of each modification suggestion in the error correction result to be used meets a third condition or not to obtain a judgment result of each modification suggestion;
and according to the judgment result of each modification suggestion, carrying out rewriting suggestion deletion processing on the error correction result to be used to obtain the third error correction result.
14. The method of claim 13, wherein the to-be-used error correction result comprises a to-be-used suggestion, and wherein the to-be-used suggestion comprises: modifying the third character information into fourth character information;
the determination process of the probability of rewriting proposed to be used includes:
determining a feature difference degree between the third character information and the fourth character information according to the character feature information of the third character information and the character feature information of the fourth character information;
and determining the rewriting probability of the suggestion to be used according to the feature difference degree between the third character information and the fourth character information.
15. The method of claim 14, wherein the character feature information comprises: at least one of input operation information, pronunciation characterization information, and character shape information.
16. The method according to claim 1 or 12, wherein the determining process of the text error correction information of the text to be processed comprises:
carrying out text modification processing on the text to be processed by utilizing the t-th error correction result to be processed to obtain a t-th candidate error correction text; wherein t is a positive integer, t ≤ T, T is a positive integer, and T represents the number of the error correction results to be processed;
determining the smoothness score of the t-th candidate error correction text; wherein t is a positive integer, t ≤ T, and T is a positive integer;
screening the corrected texts meeting a fourth condition from the T candidate error correction texts according to the smoothness scores of the T candidate error correction texts;
and determining text error correction information of the text to be processed according to the to-be-processed error correction result corresponding to the text after error correction.
17. The method of claim 16, wherein the step of screening the corrected text meeting a fourth condition from the T candidate corrected texts according to the smoothness scores of the T candidate corrected texts comprises:
determining the highest smoothness score according to the smoothness scores of the T candidate error correction texts; if it is determined that the highest smoothness score and the smoothness score of the text to be processed meet a fifth condition, determining the candidate error correction text with the highest smoothness score as the text after error correction;
or,
the step of screening the corrected text meeting a fourth condition from the T candidate corrected texts according to the smoothness scores of the T candidate corrected texts comprises the following steps:
screening at least one target error correction text meeting a sixth condition from the T candidate error correction texts according to the smoothness scores of the T candidate error correction texts; and screening the corrected text meeting a seventh condition from the at least one target corrected text.
18. A text correction apparatus, comprising:
the result determining unit is used for determining an error correction result of the text to be processed and correction reference information of the text to be processed after the text to be processed is obtained;
the result correction unit is used for carrying out preset correction processing on the error correction result by utilizing the correction reference information to obtain an error correction result to be used;
and the information determining unit is used for determining the text error correction information of the text to be processed according to the error correction result to be used.
19. An apparatus, comprising a processor and a memory:
the memory is used for storing a computer program;
the processor is configured to perform the method of any one of claims 1-17 in accordance with the computer program.
20. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store a computer program for performing the method of any of claims 1-17.
21. A computer program product, characterized in that the computer program product, when run on a terminal device, causes the terminal device to perform the method of any of claims 1-17.
CN202111122968.8A 2021-09-24 2021-09-24 Text error correction method, device, equipment and computer readable storage medium Active CN113779970B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111122968.8A CN113779970B (en) 2021-09-24 2021-09-24 Text error correction method, device, equipment and computer readable storage medium
PCT/CN2022/119636 WO2023045868A1 (en) 2021-09-24 2022-09-19 Text error correction method and related device therefor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111122968.8A CN113779970B (en) 2021-09-24 2021-09-24 Text error correction method, device, equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN113779970A (en) 2021-12-10
CN113779970B (en) 2023-05-23

Family

ID=78853230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111122968.8A Active CN113779970B (en) 2021-09-24 2021-09-24 Text error correction method, device, equipment and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN113779970B (en)
WO (1) WO2023045868A1 (en)

Also Published As

Publication number Publication date
WO2023045868A1 (en) 2023-03-30
CN113779970B (en) 2023-05-23

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant