CN114254627A - Text error correction method, device, equipment and readable storage medium - Google Patents


Publication number
CN114254627A
Authority
CN
China
Prior art keywords
text
error correction
corrected
word
dictionary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111533003.8A
Other languages
Chinese (zh)
Inventor
王建辉
杜新凯
吕超
刘广鹏
郑志敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sunshine Insurance Group Co Ltd
Original Assignee
Sunshine Insurance Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Sunshine Insurance Group Co Ltd
Priority to CN202111533003.8A
Publication of CN114254627A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/232 Orthographic correction, e.g. spell checking or vowelisation
    • G06F40/205 Parsing
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G06F40/279 Recognition of textual entities

Abstract

The application provides a text error correction method, device, equipment and readable storage medium. The method comprises: obtaining description information of a text to be corrected, wherein the description information represents field information and identification information of the text to be corrected; determining an error correction dictionary according to the description information, wherein the error correction dictionary is a set of directed word pairs, each consisting of a source word and a target word, and different description information corresponds to different error correction dictionaries; and performing text error correction on the text to be corrected according to the error correction dictionary. The method improves the correctness and the certainty of the text error correction result.

Description

Text error correction method, device, equipment and readable storage medium
Technical Field
The present application relates to the field of computer information technology, and in particular, to a method, an apparatus, a device, and a readable storage medium for text error correction.
Background
With the rapid development of computer information technology, voice input, handwriting input, scanning input and the like are often combined with an automatic system, so as to complete more complex tasks. However, this also places higher demands on the accuracy of the inputs, and erroneous inputs will cause the downstream automation system to fail to operate properly.
Existing text error correction does not distinguish the field to which the edited content belongs. Because it corrects text blindly, its error correction effect is poor for machine input and in specific scenarios. In addition, existing schemes generally adopt a probabilistic model whose correction result depends on the context, so they cannot support the requirement that a specific word must be corrected into specific content.
Therefore, how to improve the correctness of the text error correction result is a technical problem which needs to be solved urgently.
Disclosure of Invention
The embodiment of the application aims to provide a text error correction method, and the effect of improving the correctness of a text error correction result can be achieved through the technical scheme of the embodiment of the application.
In a first aspect, the present application provides a method for correcting a text, which obtains description information of a text to be corrected, where the description information is used to represent field information and identification information of the text to be corrected; determining an error correction dictionary according to the description information, wherein the error correction dictionary is a directed word pair set consisting of source words and target words, and the error correction dictionaries corresponding to different description information are different; and performing text error correction on the text to be corrected according to the error correction dictionary.
In the above process, the corresponding error correction dictionary is determined according to the source and the identification information of the text, and the erroneous source words in the text are sequentially replaced with the target words in the dictionary to complete text error correction. Because error correction is driven by the directed word pairs in the error correction dictionary, special words can be accurately located and replaced with the correct words, which makes the text error correction accurate.
Optionally, obtaining description information of the text to be corrected includes:
and determining the description information according to the field information and the identification information of the text to be corrected.
In the process, the positions of the corresponding directed word pairs in the dictionary can be accurately searched through the text attribution, the use scene and the identification information of the text, so that the replacement of the target words and the source words is completed, and the error correction result is more accurate.
Optionally, the description information includes:
application information and content information, wherein the application information is used for representing the text attribution and the usage scenario of the text to be corrected, and the content information is used for representing the content of the text to be corrected obtained by dividing the input source by time and by space.
In the above process, the corresponding required directed word pairs in different dictionaries can be determined through different application scenarios and different identification information of the text.
Optionally, before obtaining the description information of the text to be corrected, the method further includes:
and constructing the manually input directed word pairs and/or new directed word pairs deduced from the existing directed word pairs into the error correction dictionary.
In the process, the error correction dictionary formed in advance can search the corresponding directed word pairs more quickly and accurately, and further more accurately finishes the error correction of the text.
Optionally, performing text error correction on the text to be corrected according to the error correction dictionary, including:
generating a text error correction method according to the error correction dictionary;
and performing text error correction on the text to be corrected according to the text error correction method.
In the above process, the dictionary corresponding to the erroneous word can be found through the text source and the recognition mode, and the source word is then replaced with the target word of the directed word pair in the dictionary, which completes the error correction of the text.
Optionally, performing text error correction on the text to be corrected according to the text error correction method includes:
and according to the text error correction method, sequentially replacing the corresponding source words in the text to be corrected with the target words in the directed word pairs in the error correction dictionary, and completing text error correction of the text to be corrected.
In the above process, the text error correction method selects, from the different error correction dictionaries, the one corresponding to the identification information in the specific scenario, and the erroneous words are found and replaced by traversing that dictionary, so that the text error correction is more accurate.
Optionally, after sequentially replacing the source word in the text to be corrected with the target word in the directed word pair in the error correction dictionary according to the text error correction method, the method further includes:
taking the target word as a second source word, and searching again for a second target word corresponding to the second source word;
and replacing the corresponding source word in the text to be corrected with the second target word to complete text error correction of the text to be corrected.
In the above process, the final desired error correction result may not be achieved by one-time error correction, and the error correction of the text can be performed again on the text after error correction according to the above process, so as to achieve more accurate text error correction.
In a second aspect, an embodiment of the present application provides an apparatus for text error correction, including:
an obtaining module, configured to obtain description information of a text to be corrected, wherein the description information is used for representing field information and identification information of the text to be corrected;
the determining module is used for determining an error correction dictionary according to the description information, wherein the error correction dictionary is a directed word pair set consisting of a source word and a target word, and the error correction dictionaries corresponding to different description information are different;
and the error correction module is used for performing text error correction on the text to be corrected according to the error correction dictionary.
Optionally, the obtaining module is specifically configured to:
and determining the description information according to the field information and the identification information of the text to be corrected.
Optionally, the description information includes:
application information and content information, wherein the application information is used for representing the text attribution and the usage scenario of the text to be corrected, and the content information is used for representing the content of the text to be corrected obtained by dividing the input source by time and by space.
Optionally, the apparatus further comprises:
and the forming module is used for forming an error correction dictionary by using the manually input directed word pairs and/or new directed word pairs derived from the existing directed word pairs before the description information of the text to be corrected is acquired.
Optionally, the error correction module is specifically configured to:
generating a text error correction method according to the error correction dictionary;
and performing text error correction on the text to be corrected according to the text error correction method.
Optionally, the error correction module is specifically configured to:
and according to the text error correction method, sequentially replacing the corresponding source words in the text to be corrected with the target words in the directed word pairs in the error correction dictionary, and completing text error correction of the text to be corrected.
Optionally, the apparatus further comprises:
a second error correction module, configured to, after the corresponding source words in the text to be corrected are sequentially replaced with the target words in the directed word pairs in the error correction dictionary according to the text error correction method, take the target word as a second source word and search again for a second target word corresponding to the second source word;
and replace the corresponding source word in the text to be corrected with the second target word to complete text error correction of the text to be corrected.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processor, the steps in the method as provided in the first aspect are executed.
In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, performs the steps in the method as provided in the first aspect.
Additional features and advantages of the present application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the present application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
Fig. 1 is a flowchart of a text error correction method according to an embodiment of the present application;
Fig. 2 is a flowchart of an embodiment of a text error correction method according to an embodiment of the present application;
Fig. 3 is a schematic block diagram of an apparatus for text error correction according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a text error correction apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
The method is applied to a text error correction scenario. Specifically, an input text is obtained, the text is labeled according to its system and its content, the corresponding error correction dictionary is looked up according to the labeled information, the corresponding directed word pair is found in the dictionary, and the incorrect word in the input text is replaced with the correct word in the directed word pair.
However, current text error correction methods do not distinguish the field to which the edited content belongs, and because they correct text blindly, the error correction effect is poor for machine input and in specific scenarios. Moreover, existing schemes generally adopt a probabilistic model whose correction result depends on the context, and cannot support the requirement that a specific word must be corrected into specific content. One example is spelling correction, also known as "spell checking": a technique in which the computer automatically identifies and corrects errors during text editing. Based on a model trained on a large amount of data, the computer presents a list of candidate correct words ranked by probability so that the user can correct a possible misspelling. Spelling correction comprises two subtasks: spelling error detection and spelling error correction. According to the type of error, the detection task is divided into non-word errors and real-word errors. The former are misspellings that yield a string that is not a legitimate word, for example a miswritten form of "threshold value"; the latter are misspellings that still yield a legitimate word, for example writing "sunshine insurance" as the similar-sounding "Miss Yang". Non-word spelling errors are typically recognized through dictionaries, with recognition accuracy depending on the size and quality of the dictionary.
Recognizing real-word spelling errors is more complex. Generally, every word is assumed to be a possible error; a candidate set that includes the word itself is generated for each word in the sentence; and the most probable candidate is then selected by methods such as the source-channel (noisy channel) model and compared with the current word to judge whether the two are the same. Natural language processing (NLP) studies the theories and methods that enable efficient communication between humans and computers in natural language.
Therefore, in the present application, the system source of the text to be corrected is determined first, the corresponding correct words or sentences are found in the dictionary that corresponds to that system source, and the wrong words or sentences in the text are replaced with the correct ones. This completes the error correction of the text and makes the error correction result accurate.
The method for text error correction according to the embodiment of the present application is described in detail below with reference to fig. 1.
Referring to fig. 1, fig. 1 is a flowchart of a text error correction method according to an embodiment of the present application, where the text error correction method shown in fig. 1 includes:
110: Acquire the description information of the text to be corrected.
By acquiring the description information of the text, the corresponding directed word pairs in the corresponding dictionary can be found, so that text error correction is completed accurately. For the subsequent steps, this significantly narrows the processing range, making the processing more convenient and the result more accurate.
Specifically, the description information includes the field information of the system and the identification information of the content, and may further include restrictions on the text, such as its length, its input time, and the relation between preceding and following words. These restrictions also narrow the search range, so that the final search result is more accurate. A directed word pair comprises a source word (key name) and a target word (key value), wherein the source word is an erroneous word in the text and the target word is the correct word in the dictionary that replaces the source word. The description information may be embodied as description tags. A tag may be extracted from the input system, i.e., obtained directly from the upstream system that supplies the text, for example a tag indicating that the text comes from property insurance or from life insurance. A tag may also be constructed automatically from the category of the input content, for example according to whether the content was obtained by voice recognition or by picture recognition.
The content of the above directed word pair is not limited to words; it may also be a regular expression, a wildcard, a sentence, a symbol, or content such as matching logic.
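As an illustration of a directed word pair whose source may be either a literal word or a regular expression, the following is a minimal sketch. The class name `DirectedPair`, the `is_regex` flag and the sample words are hypothetical and are not taken from the application.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectedPair:
    """One directed word pair: source (key name) -> target (key value)."""
    source: str             # erroneous form to look for
    target: str             # replacement text
    is_regex: bool = False  # hypothetical flag: treat source as a regular expression

    def apply(self, text: str) -> str:
        # The pair is directed: only source -> target is ever applied.
        if self.is_regex:
            return re.sub(self.source, self.target, text)
        return text.replace(self.source, self.target)


literal = DirectedPair("recieve", "receive")
pattern = DirectedPair(r"\bteh\b", "the", is_regex=True)
fixed = pattern.apply(literal.apply("Please recieve teh form"))
```

Because the pair is directed, text that already contains the target word is left unchanged.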
Optionally, obtaining description information of the text to be corrected includes:
and determining the description information according to the field information and the identification information of the text to be corrected.
By the aid of the field information and the identification information of the text, the positions of corresponding directed word pairs in the dictionary can be accurately searched, and accordingly, source words can be replaced by target words, and error correction results are more accurate.
The field information may represent the system source of the text, the knowledge field of the text, the usage scenario of the text, or another field source, and the present application is not limited thereto. The identification information may represent information obtained by segmenting the text by time and by space. For example, speech recognition results are segmented by time, and character recognition results are segmented by space or by punctuation; the resulting sequence of segments is then grouped by absolute value (e.g., 0 to 10 seconds) or by relative value (e.g., the first 10%) and labeled. Grouping may also start from the end, for example the text corresponding to the last 10 seconds or the last 2 lines of characters. The system can be simplified by designing a tag hierarchy, but the tags have no strict superior-subordinate relation, and the hierarchical order may vary. For example, the error type may be used as the first level of division or as the last.
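The grouping of time-segmented recognition results by absolute and relative position can be sketched as follows. This is only an illustration under assumptions: the `Segment` type, the tag names `head_abs`, `head_rel` and `tail_abs`, and the default thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Segment:
    text: str
    start: float  # seconds from the beginning of the recording
    end: float


def label_segments(segments: List[Segment],
                   head_seconds: float = 10.0,
                   tail_seconds: float = 10.0,
                   head_ratio: float = 0.10) -> List[Tuple[str, Set[str]]]:
    """Attach position tags to each segment of a speech recognition result."""
    total = max(s.end for s in segments)
    labelled = []
    for s in segments:
        tags = set()
        if s.start < head_seconds:        # absolute value, e.g. 0 to 10 seconds
            tags.add("head_abs")
        if s.start < total * head_ratio:  # relative value, e.g. the first 10%
            tags.add("head_rel")
        if s.end > total - tail_seconds:  # counted from the end
            tags.add("tail_abs")
        labelled.append((s.text, tags))
    return labelled


labelled = label_segments([
    Segment("hello", 0.0, 4.0),
    Segment("my job number is 123", 12.0, 50.0),
    Segment("goodbye", 95.0, 100.0),
])
```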
Optionally, the description information includes:
application information and content information, wherein the application information is used for representing the text attribution and the usage scenario of the text to be corrected, and the content information is used for representing the content of the text to be corrected obtained by dividing the input source by time and by space.
According to the different application scenarios and the different identification information of the text, the directed word pairs required from the different dictionaries can be determined.
The application information may represent the text attribution and the usage scenario of the text. The text attribution may be the source of the text in a hierarchy of systems, and the usage scenario may be the scenario in which the text is applied. For example, the field of a text may be divided into finance, law, biology, etc.; finance may be further divided into insurance, securities, banking, etc.; and insurance may be further divided into sales, claims, etc. The content information represents the method by which the text was recognized. For example, automatic speech recognition (ASR) aims to convert the lexical content of human speech into computer-readable input, such as keystrokes, binary codes or character sequences. Optical character recognition (OCR) refers to the process in which an electronic device (e.g., a scanner or a digital camera) examines characters printed on paper and then translates their shapes into computer characters by a character recognition method; that is, the text material is scanned and the image file is then analyzed to obtain the characters and the layout information.
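One possible way to represent such description information is a small record holding the application hierarchy and the recognition method, from which a set of tags is derived. The sketch below is an assumption for illustration; the class `Description`, the path-style tag format and the `input:` prefix are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Description:
    application: Tuple[str, ...]  # e.g. ("finance", "insurance", "claims")
    recognition: str              # e.g. "asr" or "ocr"

    def tags(self) -> FrozenSet[str]:
        # Every prefix of the application path becomes one tag, so a
        # dictionary may be attached at any level of the hierarchy.
        prefixes = {
            "/".join(self.application[:i])
            for i in range(1, len(self.application) + 1)
        }
        return frozenset(prefixes | {f"input:{self.recognition}"})


desc = Description(("finance", "insurance", "claims"), "asr")
desc_tags = desc.tags()
```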
Optionally, before obtaining the description information of the text to be corrected, the method shown in fig. 1 may further include:
and constructing the manually input directed word pairs and/or new directed word pairs deduced from the existing directed word pairs into the error correction dictionary.
The error correction dictionary formed in advance can search the corresponding directed word pairs more quickly and accurately, and further more accurately finishes the error correction of the text.
The error correction dictionary is formed in advance by sorting the directed word pairs according to a preset rule: the directed word pairs are first arranged in descending order of key name length, and pairs whose key names have the same length are arranged in ascending order of the sum of the occurrence counts of the characters in the key name. For example, for the five key names "sunshine", "yang guang", "sunshine insurance", "yang guang insurance" and "broad names", the sorting proceeds as follows. First, the key name lengths (2, 2, 4, 4, 4) are calculated, which gives the first ranking result: (sunshine insurance, yang guang insurance, broad names); (sunshine, yang guang). Next, the number of occurrences of each character over all key names is calculated: [yang: 4, light: 2, wide: 3, protecting: 2, risk: 2, large: 1, human: 1, name: 1]. Then the occurrence counts of the characters in each key name are summed: [sunshine: 6, yang guang: 7, sunshine insurance: 10, yang guang insurance: 11, broad names: 6]. The final result is [broad names, sunshine insurance, yang guang insurance, sunshine, yang guang]; that is, key names are sorted from long to short, and key names of the same length are sorted from the smallest to the largest sum of character frequencies. In this way, special words or sentences are corrected first, for example content that contains no obvious error but does not match reality. Such content, including near-sound errors, near-shape errors, near-meaning errors and random errors, is normally difficult to correct accurately, and correcting it first after sorting achieves accurate text error correction.
In addition, a key name is a word, a combination of words and punctuation, or a template that can match words; it serves as the source word and represents the input text, for example a key name in template form such as "(<)? \\ w + @ \ w + (" for text containing a mailbox address. A key value is a word, a combination of words and punctuation, or a word generation rule; it serves as the target word and represents the text after error correction. For example, the key value may be "this is the user mailbox" (word combination form), or a word generation rule such as "keep the first three and last three characters and replace each remaining character with an asterisk" or "fill 3 asterisks before the "@" and, after the "@", replace every character other than the last three with an asterisk". In this way, the influence of the correct words in the text to be corrected on the text error correction can be accurately eliminated.
The core of constructing the dictionary is obtaining the directed word pairs that need to be added or deleted. There are two methods: collecting directed word pairs manually and deriving them according to rules. Manual collection in turn includes two methods: actively analyzing the input text, identifying errors manually and recording them as word pairs; and collecting text errors reported in user feedback and recording them as word pairs. Rule derivation performs intersection, union and difference operations on the existing word pair sets to obtain a new word pair set.
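The sorting rule can be sketched as follows. The key names below are hypothetical stand-ins in which each letter represents one character; they are chosen so that the lengths and character counts match the worked example (A: 4, B: 2, C: 3, D: 2, E: 2, F, G, H: 1 each). The function name `order_keys` is likewise an assumption.

```python
from collections import Counter
from typing import List


def order_keys(keys: List[str]) -> List[str]:
    """Sort key names: longest first, then smallest character-frequency sum first."""
    # Count how often each character occurs over all key names.
    freq = Counter(ch for key in keys for ch in key)

    def sort_key(key: str):
        return (-len(key), sum(freq[ch] for ch in key))

    return sorted(keys, key=sort_key)


# Stand-ins for the five key names of lengths (2, 2, 4, 4, 4).
keys = ["AB", "AC", "ABDE", "ACDE", "CFGH"]
ordered = order_keys(keys)
# Frequency sums: AB=6, AC=7, ABDE=10, ACDE=11, CFGH=6
# ordered == ["CFGH", "ABDE", "ACDE", "AB", "AC"]
```

Placing longer key names first means that a long source word is replaced before any shorter source word contained in it can alter the text.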
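A key name in template form combined with a key value in the form of a generation rule can be sketched with a regular expression and a replacement function. The pattern `EMAIL` below is an assumed, simplified mailbox pattern and is not the pattern of the application; `keep_ends` implements the rule "keep the first three and last three characters and replace each remaining character with an asterisk".

```python
import re

# Assumed, simplified mailbox pattern used only for illustration.
EMAIL = re.compile(r"\w+@\w+(?:\.\w+)+")


def keep_ends(match: "re.Match") -> str:
    """Generation rule: keep the first and last three characters, mask the rest."""
    s = match.group(0)
    if len(s) <= 6:
        return s  # nothing left to mask
    return s[:3] + "*" * (len(s) - 6) + s[-3:]


masked = EMAIL.sub(keep_ends, "contact alice@example.com today")
```

Here the target text is computed from the matched source text, so one pair covers every mailbox address rather than one fixed word.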
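Rule derivation by set operations can be illustrated with ordinary Python sets of (source, target) tuples. The word pairs and set names below are hypothetical examples.

```python
# Each directed word pair is a (source, target) tuple; all pairs are hypothetical.
base = {("recieve", "receive"), ("adress", "address")}
asr_feedback = {("adress", "address"), ("wether", "whether")}
retired = {("recieve", "receive")}

merged = base | asr_feedback   # union: every pair known to either set
shared = base & asr_feedback   # intersection: pairs confirmed by both sets
active = merged - retired      # difference: drop pairs that should be deleted
```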
120: Determine an error correction dictionary according to the description information.
The description information determines which part of the dictionary should be searched, and the text can be corrected accurately through the corresponding dictionary.
The error correction dictionary is a set of directed word pairs, each consisting of a source word and a target word, and different description information corresponds to different error correction dictionaries.
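Step 120 can be pictured as a lookup from description information to a dictionary. The sketch below assumes that the description information has already been reduced to a set of tags; the registry contents, tag names and the function `select_dictionary` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical registry: one error correction dictionary per tag combination.
REGISTRY: Dict[FrozenSet[str], Dict[str, str]] = {
    frozenset({"life_insurance", "customer_service"}): {"polcy": "policy"},
    frozenset({"property_insurance", "telesales"}): {"ofer": "offer"},
}


def select_dictionary(tags: Iterable[str]) -> Dict[str, str]:
    """Return the dictionary registered for exactly this tag combination."""
    return REGISTRY.get(frozenset(tags), {})


life_dict = select_dictionary(["customer_service", "life_insurance"])
sales_dict = select_dictionary(["property_insurance", "telesales"])
```

Different description information yields a different dictionary, and an unknown combination yields an empty one, so no correction is applied blindly.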
130: Perform text error correction on the text to be corrected according to the error correction dictionary.
Error correction of the text is achieved by sequentially replacing the wrong words in the text with the target words in the dictionary.
Optionally, performing text error correction on the text to be corrected according to the error correction dictionary, including:
generating a text error correction method according to the error correction dictionary;
and performing text error correction on the text to be corrected according to the text error correction method.
The dictionary corresponding to the erroneous word is found through the text source and the recognition mode, and the source word is then replaced with the target word of the directed word pair in the dictionary, which completes the error correction of the text.
Optionally, performing text error correction on the text to be corrected according to the text error correction method includes:
and according to the text error correction method, sequentially replacing the corresponding source words in the text to be corrected with the target words in the directed word pairs in the error correction dictionary, and completing text error correction of the text to be corrected.
The text error correction method selects, from the different error correction dictionaries, the one corresponding to the identification information in the specific scenario, and the erroneous words are found and replaced by traversing that dictionary, so that the text error correction is more accurate.
Optionally, after sequentially replacing the source word in the text to be corrected with the target word in the directed word pair in the error correction dictionary according to the text error correction method, the method further includes:
taking the target word as a second source word, and searching again for a second target word corresponding to the second source word;
and replacing the corresponding source word in the text to be corrected with the second target word to complete text error correction of the text to be corrected.
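Sequential replacement by dictionary traversal can be sketched as a loop over an ordered list of pairs. The function `correct` and the sample pairs are hypothetical; the example also shows why the order of the pairs matters.

```python
from typing import List, Tuple


def correct(text: str, ordered_pairs: List[Tuple[str, str]]) -> str:
    """Traverse the ordered dictionary and replace each source word in turn."""
    for source, target in ordered_pairs:
        text = text.replace(source, target)
    return text


# Longest key name first: the longer source word wins.
longest_first = [("wi fi router", "wireless router"), ("wi fi", "Wi-Fi")]
result = correct("buy a wi fi router", longest_first)

# Same pairs in the opposite order give a different result.
shortest_first = list(reversed(longest_first))
other = correct("buy a wi fi router", shortest_first)
```

With the longer key name first the result is "buy a wireless router"; with the shorter one first, the longer source word no longer occurs after the first replacement and the result is "buy a Wi-Fi router".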
The final desired error correction result may not be achieved by one-time error correction, and the error correction of the text can be performed again on the text after error correction according to the above process, so as to achieve more accurate text error correction.
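The second lookup can be sketched as follows: when a source word is found, its target word is itself looked up as a second source word, and the second target word, if any, is what is written into the text. The function `correct_twice` and the sample dictionary are hypothetical.

```python
from typing import Dict


def correct_twice(text: str, dictionary: Dict[str, str]) -> str:
    """Replace each source word with the result of a second dictionary lookup."""
    for source, target in dictionary.items():
        if source in text:
            # Use the target word as a second source word; fall back to the
            # first target word when no second pair exists.
            second_target = dictionary.get(target, target)
            text = text.replace(source, second_target)
    return text


# Hypothetical chained pairs: "wether" -> "wheather" -> "whether".
chained = {"wether": "wheather", "wheather": "whether"}
twice = correct_twice("ask wether it rains", chained)
```

This sketch performs exactly one additional lookup per source word; chains longer than two pairs would need a loop with a guard against cycles.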
The text error correction method is described above by using fig. 1, and the text error correction method of the present application is specifically illustrated in the following by referring to a flowchart of an embodiment of the text error correction method of fig. 2.
Fig. 2 is a flowchart of an embodiment of a text error correction method provided in the embodiment of the present application.
210: Extract the system label.
The system label is extracted from the upstream system, for example a business system, enterprise WeChat, or an official WeChat account.
220: Generate a content tag.
The content labeled with the system label can be further labeled with a content tag indicating its form: voice or sound recording, photo or picture, or characters or symbols. Content tags can be further divided according to the specific content, for example content related to life insurance customer service or content of property insurance telesales.
230: Perform a set operation on the tags to obtain the ordered dictionary corresponding to the input text.
240: Execute the error correction method.
250: Output the result.
In steps 230 to 250, the ordered dictionary for the corresponding content is obtained by performing a set operation on the tags. For example, when the tags indicate life insurance and customer service, the corresponding dictionary (key1: value1, key2: value2, ...) is obtained. When the input text is "Hello, Sunshine Insurance! How may I help you?", the source word "Sunshine Insurance" is found in the dictionary by the above method and is automatically replaced with "Miss Yang", so the final output is "Hello, Miss Yang! How may I help you?".
Similarly, when the tags indicate property insurance and telesales, together with the corresponding qualifying condition "opening remarks", the corresponding dictionary (key1: value1, key3: value3, ...) is obtained. When the input text is "Hello, Sunshine Property Insurance! My job number is xxxxx.", the source word "Sunshine Property Insurance" is found in the dictionary by the above method and is automatically replaced with "Miss Yang", so the final output is "Hello, Miss Yang! My job number is xxxxx.".
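Steps 230 to 250 can be sketched as follows. The text does not say which set operation is used, so this sketch assumes a subset test: each directed word pair is registered with a set of required tags and is selected when that set is contained in the tags of the input text. The registry layout, the tag names, and every word pair other than the "Sunshine Insurance" to "Miss Yang" replacement from the example are hypothetical.

```python
# Hypothetical registry of directed word pairs: (required tags, source word, target word).
# An empty tag set means the pair applies to every input (like key1 in the example).
REGISTRY = [
    (frozenset(), "Sunshine Insurance", "Miss Yang"),
    (frozenset({"life_insurance", "customer_service"}), "polcy", "policy"),
    (frozenset({"property_insurance", "telesales"}), "job numbr", "job number"),
]


def build_ordered_dictionary(tags):
    """Step 230: select the pairs whose required tags are a subset of the input tags."""
    tags = frozenset(tags)
    # A dict keeps insertion order, so registry order becomes replacement order.
    return {source: target for required, source, target in REGISTRY if required <= tags}


def correct(text, tags):
    """Steps 240 and 250: apply the selected pairs in order and return the result."""
    for source, target in build_ordered_dictionary(tags).items():
        text = text.replace(source, target)
    return text


print(correct("Hello, Sunshine Insurance! How may I help you?",
              {"life_insurance", "customer_service"}))
# Hello, Miss Yang! How may I help you?
```

With this layout, the two scenarios share the first pair and differ in the second, which mirrors the (key1, key2) and (key1, key3) dictionaries in the example.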
The text error correction method has been described above with reference to Figs. 1 and 2; the text error correction apparatus is described below with reference to Figs. 3 and 4.
Referring to Fig. 3, a schematic block diagram of a text error correction apparatus 300 provided in an embodiment of the present application is shown. The apparatus 300 may be a module, a program segment, or code on an electronic device. The apparatus 300 corresponds to the method embodiment of Fig. 1 and can perform the steps of that embodiment. For the specific functions of the apparatus 300, refer to the description of the method; detailed descriptions are omitted here as appropriate to avoid repetition.
Optionally, the apparatus 300 for text error correction includes:
an obtaining module 310, configured to obtain description information of a text to be corrected, where the description information is used to represent field information and identification information of the text to be corrected;
a determining module 320, configured to determine, according to the description information, a directed word pair corresponding to the text to be corrected in an error correction dictionary, where the error correction dictionary is a directed word pair set formed by a source word and a target word, and the error correction dictionaries corresponding to different description information are different;
and an error correction module 330, configured to replace the corresponding source word in the text to be corrected with the target word in the directed word pair corresponding to the text to be corrected, so as to complete text error correction of the text to be corrected.
Optionally, the obtaining module is specifically configured to:
determine the description information according to the field information and the identification information of the text to be corrected.
Optionally, the description information includes:
application information and content information, where the application information represents the text attribution and usage scenario of the text to be corrected, and the content information represents the content of the text to be corrected obtained through time division and space division in an input source.
Optionally, the apparatus further comprises:
a forming module, configured to form the error correction dictionary, before the description information of the text to be corrected is acquired, from manually input directed word pairs and/or new directed word pairs derived from existing directed word pairs.
Optionally, the error correction module is specifically configured to:
generate a text error correction method according to the error correction dictionary;
and perform text error correction on the text to be corrected according to the text error correction method.
Optionally, the error correction module is specifically configured to:
sequentially replace, according to the text error correction method, the corresponding source words in the text to be corrected with the target words in the directed word pairs in the error correction dictionary, to complete text error correction of the text to be corrected.
Optionally, the apparatus further comprises:
a second error correction module, configured to, after the corresponding source words in the text to be corrected have been sequentially replaced with the target words in the directed word pairs in the error correction dictionary according to the text error correction method, take the target word as a second source word and search again for a second target word corresponding to the second source word;
and replace the corresponding source word in the text to be corrected with the second target word to complete text error correction of the text to be corrected.
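The division into obtaining, determining, and error correction modules can be sketched as three small classes. This is only one possible arrangement under stated assumptions: the class and method names, the `DescriptionInfo` type, and the sample dictionary are hypothetical, and the forming module and second error correction module are left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DescriptionInfo:
    """Description information built from field and identification information."""
    field_info: str
    identification_info: str


class ObtainingModule:
    """Counterpart of obtaining module 310."""

    def obtain(self, field_info: str, identification_info: str) -> DescriptionInfo:
        return DescriptionInfo(field_info, identification_info)


class DeterminingModule:
    """Counterpart of determining module 320: one dictionary per description."""

    def __init__(self, dictionaries: Dict[DescriptionInfo, Dict[str, str]]):
        self._dictionaries = dictionaries

    def determine(self, description: DescriptionInfo) -> Dict[str, str]:
        # Unknown description information yields no directed word pairs.
        return self._dictionaries.get(description, {})


class ErrorCorrectionModule:
    """Counterpart of error correction module 330."""

    def correct(self, text: str, pairs: Dict[str, str]) -> str:
        for source, target in pairs.items():
            text = text.replace(source, target)
        return text


class TextErrorCorrectionApparatus:
    """Wires the three modules together in the order described in the text."""

    def __init__(self, dictionaries: Dict[DescriptionInfo, Dict[str, str]]):
        self.obtaining = ObtainingModule()
        self.determining = DeterminingModule(dictionaries)
        self.correcting = ErrorCorrectionModule()

    def run(self, text: str, field_info: str, identification_info: str) -> str:
        description = self.obtaining.obtain(field_info, identification_info)
        pairs = self.determining.determine(description)
        return self.correcting.correct(text, pairs)


# Hypothetical dictionary for one description; the word pair follows the example in the text.
apparatus = TextErrorCorrectionApparatus({
    DescriptionInfo("life_insurance", "customer_service"): {"Sunshine Insurance": "Miss Yang"},
})
print(apparatus.run("Hello, Sunshine Insurance!", "life_insurance", "customer_service"))
# Hello, Miss Yang!
```

Making `DescriptionInfo` a frozen dataclass lets it serve directly as a dictionary key, so different description information selects different error correction dictionaries.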
Referring to Fig. 4, a schematic structural diagram of a text error correction apparatus 400 provided in an embodiment of the present application is shown. The apparatus may include a processor 410 and a memory 420. Optionally, the apparatus may further include a communication interface 430 and a communication bus 440. The apparatus corresponds to the method embodiment of Fig. 1 and can perform the steps of that embodiment; for its specific functions, refer to the description of the method.
Specifically, the memory 420 is used to store computer readable instructions.
The processor 410 is used to process the instructions stored in the memory 420 and can execute steps 110 to 130 of the method embodiment of Fig. 1.
The communication interface 430 is used for signaling or data communication with other node devices, for example with a server or a terminal; the embodiments of the present application are not limited thereto.
The communication bus 440 is used to realize direct connection communication among the above components.
The communication interface 430 of the device in the embodiment of the present application is used for signaling or data communication with other node devices. The memory 420 may be a high-speed RAM or a non-volatile memory, such as at least one disk memory. Optionally, the memory 420 may also be at least one storage device located remotely from the processor. The memory 420 stores computer readable instructions which, when executed by the processor 410, cause the electronic device to perform the method process described above with reference to Fig. 1. The processor 410 may be used in the text error correction apparatus 300 to perform the functions described in the present application. The processor 410 may be, for example, a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component; the embodiments of the present application are not limited thereto.
Embodiments of the present application further provide a readable storage medium storing a computer program which, when executed by a processor, performs the method process performed by the electronic device in the method embodiment shown in Fig. 1.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method, and will not be described in too much detail herein.
In summary, the embodiments of the present application provide a text error correction method, apparatus, electronic device, and readable storage medium. The method includes: obtaining description information of a text to be corrected, where the description information represents field information and identification information of the text to be corrected; determining an error correction dictionary according to the description information, where the error correction dictionary is a set of directed word pairs, each formed by a source word and a target word, and different description information corresponds to different error correction dictionaries; and performing text error correction on the text to be corrected according to the error correction dictionary. The method can improve the correctness and certainty of the text error correction result.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that substantially contributes to the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A method of text correction, comprising:
acquiring description information of a text to be corrected, wherein the description information is used for representing field information and identification information of the text to be corrected;
determining an error correction dictionary according to the description information, wherein the error correction dictionary is a directed word pair set formed by source words and target words, and the error correction dictionaries corresponding to different description information are different;
and performing text error correction on the text to be corrected according to the error correction dictionary.
2. The method according to claim 1, wherein the obtaining description information of the text to be corrected comprises:
determining the description information according to the field information and the identification information of the text to be corrected.
3. The method according to claim 1 or 2, wherein the description information comprises:
application information and content information, wherein the application information is used for representing the text attribution and usage scenario of the text to be corrected, and the content information is used for representing the content of the text to be corrected obtained through time division and space division in an input source.
4. The method according to claim 1 or 2, wherein before the obtaining of the description information of the text to be corrected, the method further comprises:
constructing the error correction dictionary from manually input directed word pairs and/or new directed word pairs derived from existing directed word pairs.
5. The method according to claim 1 or 2, wherein the text correction of the text to be corrected according to the correction dictionary comprises:
generating a text error correction method according to the error correction dictionary;
and performing text error correction on the text to be corrected according to the text error correction method.
6. The method of claim 5, wherein the text correction of the text to be corrected according to the text correction method comprises:
and according to the text error correction method, sequentially replacing the corresponding source words in the text to be corrected with the target words in the directed word pairs in the error correction dictionary, and completing text error correction of the text to be corrected.
7. The method according to claim 6, wherein after the corresponding source words in the text to be corrected are sequentially replaced, according to the text error correction method, with the target words in the directed word pairs in the error correction dictionary, the method further comprises:
taking the target word as a second source word, and searching again for a second target word corresponding to the second source word;
and replacing the corresponding source word in the text to be corrected with the second target word to complete text error correction of the text to be corrected.
8. An apparatus for correcting text, comprising:
an obtaining module, configured to obtain description information of a text to be corrected, wherein the description information is used for representing field information and identification information of the text to be corrected;
a determining module, configured to determine an error correction dictionary according to the description information, wherein the error correction dictionary is a directed word pair set formed by source words and target words, and the error correction dictionaries corresponding to different description information are different;
and an error correction module, configured to perform text error correction on the text to be corrected according to the error correction dictionary.
9. An electronic device for text correction, comprising:
a memory and a processor, the memory storing computer readable instructions which, when executed by the processor, perform the steps of the method of any one of claims 1 to 7.
10. A computer-readable storage medium, comprising:
a computer program which, when run on a computer, causes the computer to carry out the method according to any one of claims 1 to 7.
CN202111533003.8A 2021-12-15 2021-12-15 Text error correction method, device, equipment and readable storage medium Pending CN114254627A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111533003.8A CN114254627A (en) 2021-12-15 2021-12-15 Text error correction method, device, equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN114254627A true CN114254627A (en) 2022-03-29

Family

ID=80792352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111533003.8A Pending CN114254627A (en) 2021-12-15 2021-12-15 Text error correction method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN114254627A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination