CN116777693A

CN116777693A - Intelligent composition correcting method, system and storage medium

Info

Publication number: CN116777693A
Application number: CN202310723458.9A
Authority: CN
Inventors: 施其明; 刘永坚; 白立华; 韩双力; 桂前礼
Original assignee: Wuhan Ligong Digital Communications Engineering Co ltd
Current assignee: Wuhan Ligong Digital Communications Engineering Co ltd
Priority date: 2023-06-19
Filing date: 2023-06-19
Publication date: 2023-09-19

Abstract

The application discloses a method, a system and a storage medium for intelligent composition correction. Through the design of multiple modules and multiple algorithms, a traversal comparison step is added and rules are set for sentences that are repeated earlier and later in the text. A teacher end is added for compiling the composition ratings, comments and common problems of all students. An excellent-material module is added, which stores both the automatically identified good sentences and passages and those identified at the teacher end, for use in finding and filling gaps. The teacher can modify the content automatically corrected by the program, and the program learns from the teacher's modifications; the algorithm used for self-learning is not limited. Generating a full-text general comment report makes it easy to understand a student's writing level. Interaction is formed, producing the effect of a one-to-one, heart-to-heart companion. Sentences to be improved are input into a preset intelligent composition modification model to obtain suggested polished sentences, which helps students learn and improve.

Description

Intelligent composition correcting method, system and storage medium

Technical Field

The present application relates to the field of composition correction, and in particular, to a method, system, and storage medium for intelligent composition correction.

Background

Writing is a requirement of social development and a basic, important ability that modern people should possess. A composition embodies the comprehensive abilities of listening, speaking and reading; unit-exercise and examination compositions in particular are a key element in judging a person's overall language literacy.

Evaluation is a mode of education. A composition evaluation service is a special type of composition lesson, and an innovative class format that truly takes students as its core.

In the related art, because of heavy teaching workloads, a teacher cannot review each student's composition in detail and tends to emphasize marking over writing practice, so the student's writing ability improves slowly and the teacher cannot identify all of the student's strengths and weaknesses.

Existing composition correction is scored by a teacher after reading. This approach depends on the teacher's overall impression and does not allow precise, detailed correction, so the correction result is neither accurate enough nor objective enough.

Disclosure of Invention

In order to overcome the defects and shortcomings in the prior art, the application provides an intelligent composition correcting method, an intelligent composition correcting system and a storage medium.

A method of intelligent composition correction, comprising:

step A1, a student uploads a photo of a handwritten composition and of the composition question requirements, taken with the camera or chosen from the photo album;

step A2, a picture processing module processes the photo, the processing including correcting the photo orientation, separating the composition question from the handwritten answer, binarization, and removing non-handwriting parts such as ruled lines and grid squares, to obtain a question image 1 and an answer image 2;

step A3, performing OCR character recognition on the question image 1 and the answer image 2 to obtain corresponding text 1 and text 2; if no text can be recognized from the question image 1, searching for the question image 1 in a picture question bank;

step A4, starting an answer-sheet cleanliness module and counting the number of non-text parts in the answer image 2, wherein the higher the number, the lower the cleanliness;

step A5, deleting the non-handwriting parts of the answer image 2 to obtain an image 2 containing only handwritten content;

step A6, starting a language identification module;

step A7, starting a composition correction flow;

step A8, sending the correction result of step A7 to a teacher end and compiling statistics on problems common among students; the teacher can modify the correction result, and the modifications are automatically fed back to the student end;

step A9, establishing a self-learning model according to the teacher's corrections, wherein the technologies that may be used include artificial neural networks, inductive logic programming, Bayesian networks, similarity and metric learning, genetic algorithms and the like.
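Step A4 can be illustrated with a minimal sketch. It assumes that an earlier stage has already produced a binary mask in which 1 marks pixels classified as non-text (blots, cross-outs and similar); how that classification is done is not specified in the application. The function names, the 4-connectivity rule and the linear penalty of 5 points per region are hypothetical choices made only for illustration.

```python
from collections import deque


def count_non_text_regions(mask):
    """Count 4-connected regions of 1-pixels in a binary mask (list of rows)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                regions += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                # Breadth-first flood fill over the current region.
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return regions


def cleanliness_score(region_count, full_score=100.0, penalty=5.0):
    """More non-text regions give a lower score (assumed linear penalty)."""
    return max(0.0, full_score - penalty * region_count)
```

The only property taken from the text is monotonicity: a higher count of non-text parts yields a lower cleanliness value. Any other decreasing mapping would serve equally well.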

A system for intelligent composition correction, comprising:

a handwriting scoring module, which compares the stroke images in image 2 with national-standard regular script images, wherein the higher the similarity, the higher the handwriting score;

a deduction module, which judges whether the text contains fabricated content (such as invented history or statements that contradict natural laws); the judgment can be based on current laws and regulations, common historical knowledge, common natural laws and the like. If a deduction condition exists, points are deducted;

a duplicate checking module, which compares the text against common online sources (such as Baidu Wenku, Docin and the like), wherein a score of 0 is given when the duplication rate exceeds 70%;

traversing and comparing the text paragraph by paragraph, deleting the parts whose repetition across different paragraphs is higher than 70%, and correcting only the part that appears first;

determining the genre according to text 1 or the search result in the picture question bank, specifically by keywords;

an answer-sheet cleanliness evaluation module, wherein the higher the cleanliness, the higher the score;

the original image is preprocessed again, with operations such as image enhancement, denoising and cropping, to prepare for subsequent feature extraction and analysis;

detecting the sharpness and flatness of the edges of the answer sheet using an edge detection algorithm; detecting white balance and illumination uniformity to ensure overall quality and readability;

stain detection, namely identifying potential stains using color analysis, and using a texture analysis algorithm to detect whether abnormal textures or stain marks exist;

analyzing and evaluating the student's handwriting using computer vision and pattern recognition technology, mainly examining the clarity of the handwriting, the consistency of the handwriting (pen pressure and line thickness) and the handwriting error index.
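The paragraph-by-paragraph traversal comparison described for the duplicate checking module can be sketched as follows. The application does not say how the repetition degree is measured; this sketch assumes the similarity ratio of Python's standard `difflib.SequenceMatcher`, and the function name `first_occurrences` is hypothetical.

```python
from difflib import SequenceMatcher


def first_occurrences(paragraphs, threshold=0.7):
    """Keep each paragraph only if it is not a near-repeat of an earlier kept one.

    A paragraph is dropped when its similarity ratio to any previously kept
    paragraph exceeds the threshold (70% in the text), so that only the part
    appearing first is passed on for correction.
    """
    kept = []
    for paragraph in paragraphs:
        is_repeat = any(
            SequenceMatcher(None, paragraph, earlier).ratio() > threshold
            for earlier in kept
        )
        if not is_repeat:
            kept.append(paragraph)
    return kept
```

The comparison is quadratic in the number of paragraphs, which is acceptable for a single composition.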

A duplicate checking module:

using a word embedding model (Word2Vec) to convert sentences into high-dimensional vector representations, and then averaging or summing the word vectors to obtain a vector representation of the whole article;

querying similar articles from the vector database Milvus, which stores a large amount of composition data, and judging whether the article is plagiarized by means of a dynamic threshold.
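A minimal sketch of this vector-based duplicate check is given below. It makes several assumptions: a plain dictionary of toy word vectors stands in for a trained Word2Vec model, an in-memory dictionary of article vectors stands in for the Milvus database, and a fixed threshold stands in for the dynamic threshold, whose computation the application does not describe. All function names are hypothetical.

```python
import math


def average_vector(tokens, embeddings):
    """Average the vectors of known tokens to represent the whole article."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token has a known embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def check_plagiarism(query, corpus, threshold):
    """Return (flagged, best_match_name, best_score) for a query vector.

    `corpus` maps article names to vectors; a real system would instead run
    a nearest-neighbour search in the vector database.
    """
    name, score = max(
        ((n, cosine(query, vec)) for n, vec in corpus.items()),
        key=lambda item: item[1],
    )
    return score >= threshold, name, score
```

Averaging discards word order, so this check catches reuse of similar vocabulary and topics rather than exact copying; the 70% text-overlap rule described earlier covers the latter.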

A detailed index scoring module:

grammar and spelling errors: the structure of each sentence is analyzed using a context-free grammar and syntax trees, and it is then checked whether any structure or relation does not conform to the grammar rules. If there is an error, a corresponding error prompt or correction suggestion may be given. A pre-built dictionary is used to check whether each word is present; if not, the word is considered misspelled. The position of each error is marked in the text and in the original picture, and a prompt is given for the error;

article structure and logic: the article is split into paragraphs by analyzing paragraph markers (e.g., line breaks, indents, etc.) in the text. Paragraphs are the basic organizational units of an article, and information such as the number and length of the extracted paragraphs reflects the article's organizational structure. Paragraph keywords are extracted and topics identified using a keyword extraction algorithm, the keywords are vectorized using word embeddings, and finally similarity is calculated to evaluate topic consistency and logical coherence;

vocabulary and syntactic expression: the richness of the vocabulary (vocabulary coverage or a diversity index) is evaluated using a statistics-based method, and sentence complexity is measured by indicators such as sentence length and number of clauses;

content relevance and viewpoint expression: the relevance between the composition and the topic is evaluated using a cosine similarity algorithm, and the emotional tendency of the composition is analyzed using a deep-learning-based sentiment classification model to judge whether the viewpoint is clear-cut;

content authenticity scoring: entities and events extracted from the text are compared with an existing knowledge base using a text similarity algorithm to give a preliminary authenticity score;

content health scoring: sensitive content is intercepted and points are deducted by means of a sensitive-word lexicon.
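Several of the statistics-based indicators above are simple enough to sketch directly. The example below assumes English text split with a regular expression; Chinese compositions would need a word segmenter, which is outside this sketch. The type-token ratio is used as one possible diversity index, and all function names and the per-hit deduction value are hypothetical.

```python
import re


def tokenize(text):
    """Lower-case word tokens (assumes space-delimited English text)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Vocabulary richness: distinct words divided by total words."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_sentence_length(text):
    """Sentence complexity proxy: average number of words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


def misspelled_words(text, dictionary):
    """Words absent from the pre-built dictionary are treated as misspelled."""
    return [t for t in tokenize(text) if t not in dictionary]


def sensitive_word_deduction(text, lexicon, points_per_hit=1.0):
    """Points deducted for each occurrence of a word in the sensitive lexicon."""
    hits = sum(1 for t in tokenize(text) if t in lexicon)
    return hits * points_per_hit
```

For instance, `misspelled_words("The catt sat", {"the", "cat", "sat"})` flags only `catt`, since every other token is in the dictionary.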

An evaluation module:

A. genre module: different evaluation submodules and weights are set for different genres. For example, for argumentative essays, the weight of submodule 2 (the expression module) is increased; for narrative essays, the weight of submodule 1 (the content module) is increased; for lyrical essays, the weight of submodule 3 (the feature module) is increased;

B. evaluation submodule 1, content module: the specific dimensions include a. whether the composition is on topic; b. whether the central idea stands out; c. whether the content is substantial; d. whether the thinking is healthy; e. whether the emotion is genuine;

C. evaluation submodule 2, expression module: the specific dimensions include a. whether the structure is rigorous; b. whether the language is fluent; c. whether there are wrongly written characters or punctuation errors;

D. evaluation submodule 3, feature module: a. whether the composition is profound; b. whether it is rich; c. whether it has literary grace; d. whether it is creative;
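The genre-dependent weighting of the three evaluation submodules can be sketched as a weighted average. The application states only which submodule gains weight for which genre; the numeric weights below, the genre keys and the function name are hypothetical values chosen for illustration.

```python
# Hypothetical weights: each genre emphasizes one submodule, as described above.
GENRE_WEIGHTS = {
    "argumentative": {"content": 0.3, "expression": 0.5, "feature": 0.2},
    "narrative": {"content": 0.5, "expression": 0.3, "feature": 0.2},
    "lyrical": {"content": 0.3, "expression": 0.2, "feature": 0.5},
}


def overall_score(genre, sub_scores):
    """Weighted average of the content, expression and feature sub-scores."""
    weights = GENRE_WEIGHTS[genre]
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * sub_scores[name] for name in weights)
    return weighted_sum / total_weight
```

With sub-scores of 80 (content), 90 (expression) and 70 (feature), the same composition scores highest when graded as an argumentative essay, because its strongest dimension then carries the most weight.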

a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the intelligent composition correction method as claimed in claim 1.

The beneficial effects are that:

The application provides a method, a system and a storage medium for intelligent composition correction. Through the design of multiple modules and multiple algorithms, a traversal comparison step is added and rules are set for sentences that are repeated earlier and later in the text. A teacher end is added for compiling the composition ratings, comments and common problems of all students. An excellent-material module is added, which stores both the automatically identified good sentences and passages and those identified at the teacher end, for use in finding and filling gaps. The teacher can modify the content automatically corrected by the program, and the program learns from the teacher's modifications; the algorithm used for self-learning is not limited. Generating a full-text general comment report makes it easy to understand a student's writing level. Interaction is formed, producing the effect of a one-to-one, heart-to-heart companion. Sentences to be improved are input into a preset intelligent composition modification model to obtain suggested polished sentences, which helps students learn and improve.

Drawings

FIG. 1 is a flow chart of the general steps of the method of the present application;

FIG. 2 is a diagram of the system module components of the present application.

Detailed Description

It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other, and the present application will be further described in detail with reference to the drawings and the specific embodiments.

As shown in FIG. 1, a method for intelligent composition correction includes:

preprocessing the image by sequentially performing image correction (handling tilt or perspective distortion), image enhancement, image noise reduction and image binarization, while detecting and cropping the text regions by computer vision techniques and scaling them to a suitable size, so as to improve the accuracy of subsequent processing;

performing OCR on the preprocessed picture;

step A3, cleaning and preprocessing the recognized text, removing unnecessary symbols, blank spaces, noise and the like to obtain clean text data; OCR character recognition is performed on the question image 1 and the answer image 2 to obtain corresponding text 1 and text 2; if no text can be recognized from the question image 1, the question image 1 is searched for in a picture question bank;

step A4, passing the data to a model, trained on a large amount of manually labeled test question data provided by a publishing company, which identifies and extracts the requirements of the composition question and the composition content; starting an answer-sheet cleanliness module and counting the number of non-text parts in the image 1, wherein the higher the number, the lower the cleanliness;

step A6, starting a language identification module;

step A7, starting a composition correction flow; the weights of the various indicators are loaded dynamically based on the student profile and the requirements of the question, and a weighted average is finally computed;

step A9, the system automatically collects the teacher's secondary correction data and trains and optimizes the model using a reinforcement learning algorithm; a self-learning model is established according to the teacher's corrections;

techniques that may be used include artificial neural networks, inductive logic programming, Bayesian networks, similarity and metric learning, genetic algorithms, and the like.
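The text cleaning mentioned in step A3 can be sketched with the Python standard library. The application does not list which symbols count as unnecessary; this sketch assumes that control and invisible format characters are noise and that runs of spaces, tabs and full-width spaces should collapse to a single space. The function name `clean_ocr_text` is hypothetical.

```python
import re
import unicodedata


def clean_ocr_text(raw):
    """Normalize line endings, drop invisible noise and collapse whitespace."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Remove control (Cc) and format (Cf) characters, keeping newline and tab.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    # Collapse spaces, tabs and full-width spaces (U+3000) into one space.
    text = re.sub(r"[ \t\u3000]+", " ", text)
    # Trim spaces around line breaks and merge consecutive blank lines.
    text = re.sub(r" *\n *", "\n", text)
    text = re.sub(r"\n{2,}", "\n", text)
    return text.strip()
```

Full-width punctuation is deliberately left untouched, since it is legitimate in Chinese compositions.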

As shown in fig. 2, a system for intelligent composition correction, comprising:

A duplicate checking module: using a word embedding model (Word2Vec) to convert sentences into high-dimensional vector representations, and then averaging or summing the word vectors to obtain a vector representation of the whole article;

A language identification module: counting the Chinese and English characters in text 1, comparing the two counts to determine the language, and starting the corresponding correction flow; alternatively, building and training an NLP model, inputting text 1, inferring the language required by the question, and starting the corresponding correction flow.
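The counting-based variant of language identification can be sketched in a few lines. This sketch assumes that "Chinese" means code points in the CJK Unified Ideographs block (U+4E00 to U+9FFF) and that "English" means ASCII letters; the tie-breaking rule and the `"unknown"` result are hypothetical choices not stated in the application.

```python
def detect_language(text):
    """Classify text as 'chinese' or 'english' by comparing character counts."""
    chinese = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    english = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    if chinese == 0 and english == 0:
        return "unknown"
    # Ties go to Chinese here; this is an arbitrary choice for the sketch.
    return "chinese" if chinese >= english else "english"
```

Because one English word spans several letters while one Chinese character is a single code point, a production system might count English words rather than letters before comparing.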

A detailed index scoring module: grammar and spelling errors: the structure of each sentence is analyzed using a context-free grammar and syntax trees, and it is then checked whether any structure or relation does not conform to the grammar rules. If there is an error, a corresponding error prompt or correction suggestion may be given. A pre-built dictionary is used to check whether each word is present; if not, the word is considered misspelled. The position of each error is marked in the text and in the original picture, and a prompt is given for the error;

An evaluation module.

The application provides a schematic structural diagram of a computer system suitable for implementing an electronic device according to the embodiments of the application.

The computer system includes a Central Processing Unit (CPU) that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) or a program loaded from a storage section into a Random Access Memory (RAM). In the RAM, various programs and data required for the system operation are also stored. The CPU, ROM and RAM are connected to each other by a bus. An input/output (I/O) interface is also connected to the bus.

The following components are connected to the I/O interface: an input section including a keyboard, a mouse, etc.; an output section including a display such as a Liquid Crystal Display (LCD) and a speaker; a storage section including a hard disk or the like; and a communication section including a network interface card such as a LAN card, a modem, and the like. The communication section performs communication processing via a network such as the internet. The drives are also connected to the I/O interfaces as needed. Removable media such as magnetic disks, optical disks, magneto-optical disks, semiconductor memories, and the like are mounted on the drive as needed so that a computer program read therefrom is mounted into the storage section as needed.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program comprising program code for performing the method shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network via a communication portion, and/or installed from a removable medium. The above-described functions defined in the method of the present application are performed when the computer program is executed by a Central Processing Unit (CPU). The computer readable medium according to the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In the present application, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules involved in the embodiments of the present application may be implemented in software or in hardware. The modules described may also be provided in a processor, and the names of these modules do not in some cases constitute a limitation of the modules themselves.

Embodiments of the present application also relate to a computer readable storage medium having stored thereon a computer program which, when executed by a computer processor, implements the method as described above. The computer program contains program code for performing the method shown in the flow chart. The computer readable medium of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two.

The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the application is not limited to the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of the technical features described above or their equivalents without departing from the inventive concept described above, for example, technical solutions in which the above features are replaced with technical features having similar functions disclosed in (but not limited to) the present application.

Claims

1. A method for intelligent composition correction, comprising:

step A6, starting a language identification module;

step A7, starting a composition correction flow;

step A8, sending the correction result in the step A7 to a teacher end, and counting general problems of students; the teacher modifies the correction result and automatically feeds the modification content back to the student end;

step A9, establishing a self-learning model according to the teacher's corrections.

2. A system for intelligent composition correction, comprising:

a deduction module, which judges whether the text contains fabricated content; the judgment is based on current laws and regulations, common historical knowledge, common natural laws and the like, and if a deduction condition exists, points are deducted;

a duplicate checking module, which compares the text against common online sources, wherein a score of 0 is given when the duplication rate exceeds 70%; the text is traversed and compared paragraph by paragraph, the parts whose repetition across different paragraphs is higher than 70% are deleted, and only the part that appears first is corrected;

and determining the genre according to text 1 or the search result in the picture question bank, wherein the genre can be determined by keywords.

3. The system for intelligent composition correction as claimed in claim 2, further comprising:

preprocessing the original image again, with operations such as image enhancement, denoising and cropping, to prepare for subsequent feature extraction and analysis;

analyzing and evaluating the student's handwriting using computer vision and pattern recognition technology, mainly examining the clarity of the handwriting, the consistency of the handwriting and the handwriting error index.

4. The system for intelligent composition correction as claimed in claim 2, wherein the duplicate checking module uses a word embedding model to convert sentences into high-dimensional vector representations, and then averages or sums the word vectors to obtain a vector representation of the whole article; similar articles are queried from the vector database Milvus, which stores a large amount of composition data, and whether the article is plagiarized is judged by means of a dynamic threshold.

5. The system for intelligent composition correction as claimed in claim 2, further comprising: a detailed index scoring module for finding grammar and spelling errors, which analyzes the structure of each sentence using a context-free grammar and syntax trees, and then checks whether any structure or relation does not conform to the grammar rules; if errors exist, corresponding error prompts or correction suggestions can be given; a pre-built dictionary is used to check whether each word is present, and if not, the word is considered misspelled; the position of each error is marked in the text and in the original picture, and a prompt is given for the error; the article is split into paragraphs by analyzing paragraph markers in the text; paragraphs are the basic organizational units of an article, and information such as the number and length of the extracted paragraphs reflects the article's organizational structure; paragraph keywords are extracted and topics identified using a keyword extraction algorithm, the keywords are vectorized using word embeddings, and finally similarity is calculated to evaluate topic consistency and logical coherence; vocabulary richness is evaluated using a statistics-based method, and sentence complexity is measured by indicators such as sentence length and number of clauses; the relevance between the composition and the topic is evaluated using a cosine similarity algorithm, and the emotional tendency of the composition is analyzed using a deep-learning-based sentiment classification model to judge whether the viewpoint is clear-cut; entities and events extracted from the text are compared with an existing knowledge base using a text similarity algorithm to give a preliminary authenticity score; sensitive content is intercepted and points are deducted by means of a sensitive-word lexicon.

6. The system for intelligent composition correction as set forth in claim 1, further comprising an evaluation module, the evaluation module comprising:

A. genre module: different evaluation submodules and weights are set for different genres; for example, for argumentative essays, the weight of submodule 2 (the expression module) is increased; for narrative essays, the weight of submodule 1 (the content module) is increased; for lyrical essays, the weight of submodule 3 (the feature module) is increased;

B. evaluation submodule 1, content module: the specific dimensions comprise a. whether the composition is on topic; b. whether the central idea stands out; c. whether the content is substantial; d. whether the thinking is healthy; e. whether the emotion is genuine;

C. evaluation submodule 2, expression module: the specific dimensions comprise a. whether the structure is rigorous; b. whether the language is fluent; c. whether there are wrongly written characters or punctuation errors;

D. evaluation submodule 3, feature module: a. whether the composition is profound; b. whether it is rich; c. whether it has literary grace; d. whether it is creative.

7. A storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the intelligent composition correction method as claimed in claim 1.