CN116341525A - Text examination and correction system based on natural language processing - Google Patents

Text examination and correction system based on natural language processing

Info

Publication number
CN116341525A
Authority
CN
China
Prior art keywords
information
text
natural language
error correction
correction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310294883.0A
Other languages
Chinese (zh)
Inventor
洪创波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Chaoting Group Co ltd
Original Assignee
Guangdong Chaoting Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Chaoting Group Co ltd filed Critical Guangdong Chaoting Group Co ltd
Priority to CN202310294883.0A priority Critical patent/CN116341525A/en
Publication of CN116341525A publication Critical patent/CN116341525A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/232 Orthographic correction, e.g. spell checking or vowelisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the technical field of natural language processing, and in particular to a text examination and correction system based on natural language processing. The system takes natural language information as input, classifies it into voice information, picture information and text information, converts the voice information and the picture information into text, performs morphological analysis on all of the text, then judges its fluency, and examines and corrects the text with lower fluency. The invention has the following advantages: a second fluency judgment is performed on the analyzed semantic data, so erroneous text is screened further and the accuracy of the system's output is improved; the operator is prompted in time, so the operator can follow the progress and difficulty of natural language processing from the error-correction results; and the marks at the corrected positions in the prompt let the operator see the correction details directly and debug the system's error-correction program, which improves the accuracy and efficiency of text examination and correction.

Description

Text examination and correction system based on natural language processing
Technical Field
The invention relates to the technical field of natural language processing, in particular to a text examination and correction system based on natural language processing.
Background
Natural language processing is an important direction in computer science and artificial intelligence. It studies the theories and methods that enable effective communication between people and computers in natural language, and it is a science that integrates linguistics, computer science and mathematics. Because research in this field deals with natural language, that is, the language people use in daily life, it is closely related to linguistics, but there are important differences. Natural language processing is not the general study of natural language; it aims to develop computer systems, in particular software systems, that can effectively carry out natural language communication, and is therefore a part of computer science.
Chinese patent CN106030568B discloses a natural language processing system, a natural language processing method, and a natural language processing program that can automatically correct the segmentation model used for morphological analysis within a certain time.
Existing natural language processing systems have the following shortcomings. They analyze and recognize voice information, but during voice input and image input, problems such as accent and picture clarity can introduce errors when the natural language is entered and converted. Existing systems are poor at recognizing erroneous input, and when they correct it they have little basis for the correction because they rarely ask the input source for timely confirmation, so their error-correction capability is weak.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a text examination and correction system based on natural language processing.
The aim of the invention is achieved by the following technical scheme: a text examination and correction system based on natural language processing, comprising the following steps:
1) Inputting natural language information, and classifying it into voice information, picture information and text information;
2) Converting the voice information into text information and judging the fluency of the converted text; text whose fluency meets the standard is input into the text information module directly, and text whose fluency does not meet the standard is input into the text information module after error correction;
3) Recognizing the picture information and converting it into text information, and performing intelligent typesetting on the text taken from the picture;
4) Judging the fluency of the text information converted from the picture information, performing intelligent error correction on the text with lower fluency, and inputting the text into the text information module;
5) Performing morphological analysis and translation on the input text information and generating semantic data;
6) Judging the semantic fluency of the semantic data;
7) Directly outputting the semantic data whose fluency meets the standard;
8) Correcting the semantic data with lower fluency by retrieving from the device database.
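The eight steps above can be sketched in Python as a small pipeline. This is only an illustration under stated assumptions: the patent does not say how fluency is computed, so the vocabulary-coverage score, the `KNOWN_WORDS` set, the `0.8` threshold, and the names `Item`, `fluency`, `to_text` and `process` are all hypothetical, and speech recognition, picture recognition and morphological analysis are reduced to placeholders.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical vocabulary standing in for a real language model.
KNOWN_WORDS = {"please", "open", "close", "the", "door", "window"}


@dataclass
class Item:
    kind: str     # "voice", "picture" or "text" (step 1 classification)
    payload: str  # for voice/picture this stands in for the raw data


def fluency(text: str) -> float:
    """Toy fluency score: share of tokens found in KNOWN_WORDS."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(t in KNOWN_WORDS for t in tokens) / len(tokens)


def to_text(item: Item) -> str:
    """Steps 2 and 3: placeholder for speech recognition and OCR."""
    if item.kind not in {"voice", "picture", "text"}:
        raise ValueError(f"unknown input kind: {item.kind}")
    return item.payload


def process(item: Item, correct: Callable[[str], str],
            threshold: float = 0.8) -> str:
    text = to_text(item)
    # Steps 2 and 4: first fluency check on converted voice/picture text.
    if item.kind != "text" and fluency(text) < threshold:
        text = correct(text)
    # Step 5: morphological analysis is reduced here to lowercasing
    # and whitespace normalisation.
    semantic = " ".join(text.lower().split())
    # Steps 6 to 8: second fluency check on the semantic data.
    if fluency(semantic) < threshold:
        semantic = correct(semantic)
    return semantic
```

For example, `process(Item("voice", "please open the dor"), lambda s: s.replace("dor", "door"))` routes the transcript through the first fluency check, applies the supplied correction function, and passes the second check unchanged.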
Optionally, the text information with lower fluency in step 2) is extracted, and a voice back-query asking for confirmation is put to the sound source. After the sound source answers the confirmation question, the answer is converted into text and compared, analyzed and merged with the text from the first input; once the two texts have been analyzed and merged, error correction is complete and the result is input into the text information module.
By adopting the above technical scheme: because the fluency of the voice information is judged, after a user speaks the system can promptly find natural language that could not be recognized in the transcribed text and ask a back-query to gain a further basis for correcting it, which improves the accuracy of voice error correction. This avoids the large correction deviations that arise when dialect pronunciation and similar problems make the confidence judgment of the input go wrong and the basis for correction is insufficient. Correcting the voice information directly after it has been converted into text also reduces the language-processing difficulty caused by inaccurate input text.
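The compare-and-merge step after the back-query could look roughly like the sketch below. The patent does not describe the merge rule; here the two transcripts are aligned token by token with the standard library's `difflib.SequenceMatcher`, and where they disagree the word found in a vocabulary wins. `KNOWN_WORDS` and `merge_transcripts` are hypothetical names introduced for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical vocabulary used to decide which transcript to trust.
KNOWN_WORDS = {"turn", "on", "the", "kitchen", "light"}


def merge_transcripts(first: str, second: str) -> str:
    """Merge the first transcript with the transcript of the answer
    to the back-query, preferring known words where they differ."""
    a, b = first.lower().split(), second.lower().split()
    merged = []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        left, right = a[i1:i2], b[j1:j2]
        if tag == "equal":
            merged.extend(left)
        elif len(left) == len(right):
            # Same length: decide word by word; ties keep the answer.
            for x, y in zip(left, right):
                keep_first = x in KNOWN_WORDS and y not in KNOWN_WORDS
                merged.append(x if keep_first else y)
        else:
            # Different lengths: keep the side with more known words;
            # ties keep the answer to the back-query.
            left_score = sum(w in KNOWN_WORDS for w in left)
            right_score = sum(w in KNOWN_WORDS for w in right)
            merged.extend(left if left_score > right_score else right)
    return " ".join(merged)
```

With the two imperfect transcripts `"turn on the kitchin light"` and `"turn on the kitchen lite"`, each one supplies the word the other got wrong, which is the point of asking the speaker a second time.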
Optionally, when intelligent error correction is performed on the text information with lower fluency in step 4), an error-correction prompt is sent to the operator and the correction is recorded in the device database.
By adopting the above technical scheme: intelligent error correction is performed on the text recognized from the picture, so when the text in the picture is smeared or the fonts are irregular, it can be repaired to some extent from the logical relation between the preceding and following text. An error-correction prompt is sent to the operator, so the operator can help the natural language processing device by reading the unclear image information manually, which lowers the system's error rate on irregular fonts; alternatively, after an image has been processed incorrectly, the operator can use the prompt to find the cause of the error. The correction information is recorded, including the unclear or irregular fonts, and when a similar pattern is recognized again the error-correction record in the device database is retrieved to assist font recognition. After recognizing and storing font pictures many times, the system therefore recognizes fonts more accurately and achieves self-service error correction.
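The idea of recording a correction and reusing it when a similar pattern appears again can be illustrated with a small in-memory log. This is a sketch, not the patent's design: the class `CorrectionLog`, its methods, the prompt wording and the `0.8` similarity cutoff are assumptions, and fuzzy string matching with `difflib.get_close_matches` stands in for whatever image or font similarity the real system would use.

```python
from difflib import get_close_matches
from typing import Optional


class CorrectionLog:
    """Hypothetical stand-in for the error-correction records kept
    in the device database."""

    def __init__(self) -> None:
        self._records = {}   # recognized text -> corrected text
        self.prompts = []    # messages that would be shown to the operator

    def record(self, recognized: str, corrected: str) -> None:
        """Store a correction and queue a prompt for the operator."""
        self._records[recognized] = corrected
        self.prompts.append(f"corrected '{recognized}' to '{corrected}'")

    def lookup(self, recognized: str, cutoff: float = 0.8) -> Optional[str]:
        """Return the stored correction for the closest earlier
        recognition result, or None if nothing is similar enough."""
        match = get_close_matches(recognized, list(self._records),
                                  n=1, cutoff=cutoff)
        return self._records[match[0]] if match else None
```

After `log.record("lnvoice", "invoice")`, a later misreading such as `"lnvoicc"` is close enough to the stored entry for `log.lookup` to return `"invoice"`, while an unrelated string returns `None`.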
Optionally, the proper names and analysis results obtained from the morphological analysis in step 5) are stored in the device database, and the device database is searched and compared against during morphological analysis.
By adopting the above technical scheme: the device database stores the results of morphological analysis of the text information, so later analyses can be compared with the data in the database, which improves the efficiency of morphological analysis. Over many analyses the system accumulates more natural language processing data, which improves recognition accuracy. Some proper names in the text may denote only one particular person or article; after such a name has been looked up and the search information stored, the next time the name is recognized the data can be processed according to the stored search result. For example, when a vehicle name is recognized, the system can tell that it refers to a vehicle, which improves the accuracy of morphological analysis.
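A minimal sketch of such a lookup-before-analysis store is shown below, using Python's built-in `sqlite3` module as a stand-in for the device database. The table layout, the class name `MorphemeStore` and the injected `analyzer` callable are hypothetical; the patent does not specify a schema or a particular analyzer.

```python
import sqlite3
from typing import Callable


class MorphemeStore:
    """Caches analysis results so repeated names skip the analyzer."""

    def __init__(self, analyzer: Callable[[str], str]) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE analysis (surface TEXT PRIMARY KEY, category TEXT)"
        )
        self._analyzer = analyzer
        self.analyzer_calls = 0  # counts real analyses, for illustration

    def analyze(self, surface: str) -> str:
        # Search the database first, as the optional feature describes.
        row = self._db.execute(
            "SELECT category FROM analysis WHERE surface = ?", (surface,)
        ).fetchone()
        if row is not None:
            return row[0]
        # Not seen before: run the analyzer and store the result.
        self.analyzer_calls += 1
        category = self._analyzer(surface)
        self._db.execute(
            "INSERT INTO analysis (surface, category) VALUES (?, ?)",
            (surface, category),
        )
        return category
```

If the analyzer classifies a made-up name such as `"Roadster X"` as a vehicle once, every later call to `analyze("Roadster X")` is answered from the table and `analyzer_calls` stays at 1.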
Optionally, after the semantic data in step 8) has been corrected by retrieving from the device database, an error-correction prompt pops up for the operator; the prompt marks the corrected text, and the prompt is stored in the device database.
By adopting the above technical scheme: a second fluency judgment is performed on the analyzed semantic data, so erroneous text is screened further and the accuracy of the system's output is improved. The operator is prompted in time and can follow the progress and difficulty of natural language processing from the error-correction results. The marks at the corrected positions in the prompt let the operator see the correction details directly and debug the system's error-correction program, which improves the accuracy and efficiency of text examination and correction.
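One way to mark the corrected positions in a prompt is to diff the text before and after correction. The sketch below does this at word level with `difflib.SequenceMatcher`; the function name `mark_corrections` and the bracket notation are assumptions made for illustration, since the patent does not define the format of the marks.

```python
from difflib import SequenceMatcher


def mark_corrections(original: str, corrected: str) -> str:
    """Return the corrected text with every changed span marked."""
    a, b = original.split(), corrected.split()
    out = []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old, new = " ".join(a[i1:i2]), " ".join(b[j1:j2])
        if tag == "equal":
            out.append(new)
        elif tag == "replace":
            out.append(f"[{old} -> {new}]")
        elif tag == "delete":
            out.append(f"[-{old}-]")
        else:  # "insert"
            out.append(f"[+{new}+]")
    return " ".join(out)
```

For the pair `"please cloze the door"` and `"please close the door"` the marked prompt reads `please [cloze -> close] the door`, so the operator sees both what was changed and where.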
The invention has the following advantages:
1. In this text examination and correction system based on natural language processing, the fluency of the voice information is judged, so after a user speaks the system can promptly find natural language that could not be recognized in the transcribed text and ask a back-query to gain a further basis for correcting it, which improves the accuracy of voice error correction. This avoids the large correction deviations that arise when dialect pronunciation and similar problems make the confidence judgment of the input go wrong and the basis for correction is insufficient. Correcting the voice information after it has been converted into text also reduces the language-processing difficulty caused by inaccurate input text.
2. In this text examination and correction system based on natural language processing, intelligent error correction is performed on the text recognized from the picture, so when the text in the picture is smeared or the fonts are irregular, it can be repaired to some extent from the logical relation between the preceding and following text. An error-correction prompt is sent to the operator, so the operator can help the natural language processing device by reading the unclear image information manually, which lowers the system's error rate on irregular fonts; alternatively, after an image has been processed incorrectly, the operator can use the prompt to find the cause of the error. The correction information is recorded, including the unclear or irregular fonts, and when a similar pattern is recognized again the error-correction record in the device database is retrieved to assist font recognition. After recognizing and storing font pictures many times, the system therefore recognizes fonts more accurately and achieves self-service error correction.
3. In this text examination and correction system based on natural language processing, the device database stores the results of morphological analysis of the text information, so later analyses can be compared with the data in the database, which improves the efficiency of morphological analysis. Over many analyses the system accumulates more natural language processing data, which improves recognition accuracy. Some proper names in the text may denote only one particular person or article; after such a name has been looked up and the search information stored, the next time the name is recognized the data can be processed according to the stored search result. For example, when a vehicle name is recognized, the system can tell that it refers to a vehicle, which improves the accuracy of morphological analysis.
4. In this text examination and correction system based on natural language processing, a second fluency judgment is performed on the analyzed semantic data, so erroneous text is screened further and the accuracy of the system's output is improved. The operator is prompted in time and can follow the progress and difficulty of natural language processing from the error-correction results. The marks at the corrected positions in the prompt let the operator see the correction details directly and debug the system's error-correction program, which improves the accuracy and efficiency of text examination and correction.
Drawings
Fig. 1 is a block diagram of the system operation structure of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
As shown in Fig. 1, a text examination and correction system based on natural language processing comprises the following steps:
1) Inputting natural language information, and classifying it into voice information, picture information and text information;
2) Converting the voice information into text information and judging the fluency of the converted text; text whose fluency meets the standard is input into the text information module directly, and text whose fluency does not meet the standard is input into the text information module after error correction;
3) Recognizing the picture information and converting it into text information, and performing intelligent typesetting on the text taken from the picture;
4) Judging the fluency of the text information converted from the picture information, performing intelligent error correction on the text with lower fluency, and inputting the text into the text information module;
5) Performing morphological analysis and translation on the input text information and generating semantic data;
6) Judging the semantic fluency of the semantic data;
7) Directly outputting the semantic data whose fluency meets the standard;
8) Correcting the semantic data with lower fluency by retrieving from the device database.
Example 1: The text information with lower fluency in step 2) is extracted, and a voice back-query asking for confirmation is put to the sound source. After the sound source answers the confirmation question, the answer is converted into text and compared, analyzed and merged with the text from the first input; once the two texts have been analyzed and merged, error correction is complete and the result is input into the text information module. Because the fluency of the voice information is judged, after a user speaks the system can promptly find natural language that could not be recognized in the transcribed text and ask a back-query to gain a further basis for correcting it, which improves the accuracy of voice error correction. This avoids the large correction deviations that arise when dialect pronunciation and similar problems make the confidence judgment of the input go wrong and the basis for correction is insufficient. Correcting the voice information directly after it has been converted into text also reduces the language-processing difficulty caused by inaccurate input text.
Example 2: When intelligent error correction is performed on the text information with lower fluency in step 4), an error-correction prompt is sent to the operator and the correction is recorded in the device database. Because intelligent error correction is performed on the text recognized from the picture, when the text in the picture is smeared or the fonts are irregular, it can be repaired to some extent from the logical relation between the preceding and following text. The prompt lets the operator help the natural language processing device by reading the unclear image information manually, which lowers the system's error rate on irregular fonts; alternatively, after an image has been processed incorrectly, the operator can use the prompt to find the cause of the error. The correction information is recorded, including the unclear or irregular fonts, and when a similar pattern is recognized again the error-correction record in the device database is retrieved to assist font recognition. After recognizing and storing font pictures many times, the system therefore recognizes fonts more accurately and achieves self-service error correction.
Example 3: The proper names and analysis results obtained from the morphological analysis in step 5) are stored in the device database, and the device database is searched and compared against during morphological analysis. Because the device database stores the results of morphological analysis of the text information, later analyses can be compared with the data in the database, which improves the efficiency of morphological analysis. Over many analyses the system accumulates more natural language processing data, which improves recognition accuracy. Some proper names in the text may denote only one particular person or article; after such a name has been looked up and the search information stored, the next time the name is recognized the data can be processed according to the stored search result. For example, when a vehicle name is recognized, the system can tell that it refers to a vehicle, which improves the accuracy of morphological analysis.
Example 4: After the semantic data in step 8) has been corrected by retrieving from the device database, an error-correction prompt pops up for the operator; the prompt marks the corrected text, and the prompt is stored in the device database. Because a second fluency judgment is performed on the analyzed semantic data, erroneous text is screened further and the accuracy of the system's output is improved. The operator is prompted in time and can follow the progress and difficulty of natural language processing from the error-correction results. The marks at the corrected positions in the prompt let the operator see the correction details directly and debug the system's error-correction program, which improves the accuracy and efficiency of text examination and correction.
The working principle of the invention is as follows:
S1. Converting the voice information and the picture information into text information, judging their fluency, and performing a preliminary screening;
S2. Performing morphological analysis on the combined text information and correcting the part of the semantic data that has lower fluency.

Claims (5)

1. A text examination and correction system based on natural language processing, characterized in that it comprises the following steps:
1) Inputting natural language information, and classifying it into voice information, picture information and text information;
2) Converting the voice information into text information and judging the fluency of the converted text; text whose fluency meets the standard is input into the text information module directly, and text whose fluency does not meet the standard is input into the text information module after error correction;
3) Recognizing the picture information and converting it into text information, and performing intelligent typesetting on the text taken from the picture;
4) Judging the fluency of the text information converted from the picture information, performing intelligent error correction on the text with lower fluency, and inputting the text into the text information module;
5) Performing morphological analysis and translation on the input text information and generating semantic data;
6) Judging the semantic fluency of the semantic data;
7) Directly outputting the semantic data whose fluency meets the standard;
8) Correcting the semantic data with lower fluency by retrieving from the device database.
2. The text examination and correction system based on natural language processing according to claim 1, characterized in that: the text information with lower fluency in step 2) is extracted, and a voice back-query asking for confirmation is put to the sound source; after the sound source answers the confirmation question, the answer is converted into text and compared, analyzed and merged with the text from the first input; and once the two texts have been analyzed and merged, error correction is complete and the result is input into the text information module.
3. The text examination and correction system based on natural language processing according to claim 2, characterized in that: when intelligent error correction is performed on the text information with lower fluency in step 4), an error-correction prompt is sent to the operator and the correction is recorded in the device database.
4. The text examination and correction system based on natural language processing according to claim 3, characterized in that: the proper names and analysis results obtained from the morphological analysis in step 5) are stored in the device database, and the device database is searched and compared against during morphological analysis.
5. The text examination and correction system based on natural language processing according to claim 4, characterized in that: after the semantic data in step 8) has been corrected by retrieving from the device database, an error-correction prompt pops up for the operator, the prompt marks the corrected text, and the prompt is stored in the device database.
CN202310294883.0A 2023-03-24 2023-03-24 Text examination and correction system based on natural language processing Pending CN116341525A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310294883.0A CN116341525A (en) 2023-03-24 2023-03-24 Text examination and correction system based on natural language processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310294883.0A CN116341525A (en) 2023-03-24 2023-03-24 Text examination and correction system based on natural language processing

Publications (1)

Publication Number Publication Date
CN116341525A true CN116341525A (en) 2023-06-27

Family

ID=86880217

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310294883.0A Pending CN116341525A (en) 2023-03-24 2023-03-24 Text examination and correction system based on natural language processing

Country Status (1)

Country Link
CN (1) CN116341525A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117807990A (en) * 2023-12-27 2024-04-02 北京海泰方圆科技股份有限公司 Text processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination