CN113360608B - Man-machine combined Chinese composition correcting system and method - Google Patents

Man-machine combined Chinese composition correcting system and method

Info

Publication number
CN113360608B
Authority
CN
China
Prior art keywords
composition
correction
picture
information
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110774531.6A
Other languages
Chinese (zh)
Other versions
CN113360608A (en)
Inventor
杨林
雷思东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing One Stroke Two Stroke Technology Co ltd
Beijing Yueshen Intelligent Technology Co ltd
Original Assignee
Beijing One Stroke Two Stroke Technology Co ltd
Beijing Yueshen Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing One Stroke Two Stroke Technology Co ltd and Beijing Yueshen Intelligent Technology Co ltd
Priority to CN202110774531.6A
Publication of CN113360608A
Application granted
Publication of CN113360608B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/117 Tagging; Marking up; Designating a block; Setting of attributes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/258 Heading extraction; Automatic titling; Numbering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Character Input (AREA)

Abstract

The application relates to a man-machine combined Chinese composition correction system and method. The system comprises a composition acquisition system, a preprocessing system, a correction system and a material recommendation system. The preprocessing system preprocesses the to-be-corrected composition, which the composition acquisition system acquires in picture format; the correction system then corrects it automatically and places the correction information on the original picture of the composition paper, so that teachers and students can see the correction results intuitively. The correction information is provided in an editable form, so a teacher can further modify it according to the teacher's own experience, bringing the correction result closer to the actual situation. In addition, the material recommendation system can automatically recommend excellent composition materials according to the shortcomings of the composition found in the correction result, which helps students improve their composition ability. That is, the technical scheme of the application solves the problems of the prior art by presenting correction results intuitively and providing more functions.

Description

Man-machine combined Chinese composition correcting system and method
Technical Field
The application relates to the technical field of computers, in particular to a man-machine combined Chinese composition correction system and method.
Background
NLP (Natural Language Processing) technology has gradually entered fields such as Chinese composition, and a computer can take over part of a teacher's more routine work, such as basic-dimension diagnosis and statistical analysis of compositions.
Most existing automatic composition correction systems operate in two stages: OCR (Optical Character Recognition) is first used to convert the uploaded composition picture into text, and the converted text is then analyzed and corrected with NLP technology. The correction result is ultimately displayed only as text and cannot be synchronized to the composition paper, so its presentation is not intuitive. Moreover, most existing systems implement only the correction function and are therefore limited in functionality.
Disclosure of Invention
The application provides a man-machine combined Chinese composition correction system and method, which aim to solve the problems that existing automatic composition correction systems present correction results in a non-intuitive way and offer only a single function.
The above object of the present application is achieved by the following technical solutions:
In a first aspect, an embodiment of the present application provides a man-machine combined Chinese composition correction system, including:
a composition acquisition system, used for acquiring a to-be-corrected composition in picture format uploaded by a user, wherein the picture format includes the PDF format;
a preprocessing system, used for performing layout analysis on the acquired to-be-corrected composition with an OCR recognition engine so as to extract the actual composition area, obtaining text position coordinate information and text content information, and performing title extraction and paragraph segmentation;
a correction system, used for correcting the text content information obtained by the preprocessing system and adding the correction information to the corresponding position of the to-be-corrected composition in its original picture format, wherein the correction information is in an editable form and the correction system provides correction tools so that a user can modify the correction information;
and a material recommendation system, used for automatically recommending excellent composition materials according to the shortcomings of the composition.
Optionally, the composition acquisition system can acquire a single picture or a plurality of pictures uploaded in a batch; for a batch upload, the pictures are automatically matched with the corresponding names. The matching process comprises: performing layout analysis on each picture to extract the name area, yielding a plurality of name-area pictures; recognizing each name-area picture with an OCR recognition engine to obtain name information; and matching each picture with the corresponding name according to the obtained name information.
Optionally, the process of extracting the actual composition area by the preprocessing system includes:
extracting the largest connected region at the periphery of the picture and, when the area of the largest connected region exceeds a set area threshold, determining the area inside the connected region to be the actual composition area;
calculating the distance between each point on the contour of the largest connected region and each of the four vertices of the uploaded picture, and selecting, for each of the four vertices of the original picture, the closest contour point, the four selected points being the four vertices of the actual composition area;
and performing a perspective transformation based on the four selected vertices of the actual composition area so as to rectify the picture.
Optionally, the process by which the preprocessing system performs title extraction and paragraph segmentation includes:
inputting the rectified picture into an OCR recognition engine, and performing title extraction and paragraph segmentation on the line coordinate information in the returned text position coordinate information; for the first two consecutive lines on a sheet of composition paper, if the abscissa of the leftmost vertex of the first line is greater than that of the next line and greater than a preset first threshold, the first line is determined to be the title area; and if the abscissa of the leftmost vertex of the current line is greater than that of the next line and greater than a preset second threshold, the current line is considered to start a new paragraph.
Optionally, the Chinese composition correction system is provided with a pre-trained composition genre classification model and a comment library, wherein the composition genre classification model is trained with a deep learning algorithm;
and during correction, the correction system uses the composition genre classification model to identify the genre of the composition from the text content information, and automatically selects related comments from the comment library according to the identified genre and pushes them to the user for selection and modification.
Optionally, during correction, the correction system determines, according to a plurality of preset capability points to be detected, the capability points that do not appear in the composition content information, wherein each composition genre is correspondingly provided with a plurality of capability points;
and the material recommendation system automatically recommends the corresponding excellent composition materials according to the capability points that do not appear in the composition content information.
Optionally, the correction information includes text comment information and marks, and the marks include lines, graphics and symbols;
when adding the correction information to the corresponding position of the to-be-corrected composition in its original picture format, the correction system adds marks of different forms at the corresponding positions in the picture for different text content information, according to the user's habits, and adds the text comment information.
Optionally, the system further comprises a general evaluation system for evaluating the composition as a whole according to each piece of correction information, which includes scoring different aspects of the composition, giving a general score and general evaluation suggestions, and counting the numbers of characters, words and sentences in the composition.
In a second aspect, an embodiment of the present application further provides a man-machine combined Chinese composition correction method, which is applied to the man-machine combined Chinese composition correction system of any one of the first aspect, and the method includes:
the composition acquisition system acquires a to-be-corrected composition in picture format uploaded by a user;
the preprocessing system uses an OCR recognition engine to perform layout analysis on the acquired to-be-corrected composition so as to extract the actual composition area, obtains text position coordinate information and text content information, and performs title extraction and paragraph segmentation;
the correction system corrects the text content information obtained by the preprocessing system and adds the correction information to the corresponding position of the to-be-corrected composition in its original picture format;
the material recommendation system automatically recommends excellent composition materials according to the shortcomings of the composition.
The technical scheme provided by the embodiment of the application can comprise the following beneficial effects:
The man-machine combined Chinese composition correction system provided by the embodiment of the application comprises a composition acquisition system, a preprocessing system, a correction system and a material recommendation system. The preprocessing system preprocesses the to-be-corrected composition, which the composition acquisition system acquires in picture format; the correction system then corrects the composition automatically and places the correction information on the original picture of the composition paper, so that teachers and students can see the correction results intuitively. The correction information is provided in an editable form, so a teacher can further modify it according to the teacher's own experience, bringing the correction result closer to the actual situation. In addition, the material recommendation system can automatically recommend excellent composition materials according to the shortcomings of the composition found in the correction result, which helps students improve their composition ability. That is, the technical scheme of the application solves the problems of the prior art by presenting correction results intuitively and providing more functions.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of a human-computer combined Chinese composition correction system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an exemplary correction result according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a name matching process according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an overall evaluation result provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a material recommendation process according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
In order to solve the problems mentioned in the background, the application provides a man-machine combined Chinese composition correction system and method. First, correction information is synchronized onto the picture of the student's composition paper by means of image processing technology, so as to reproduce a teacher's actual correction habits as closely as possible. Second, a correction tool is provided so that the teacher can modify the system's pre-correction result, making the correction result better match the actual situation. Third, a recommendation function for excellent materials is provided after correction, so that students can better improve their composition level. Specific embodiments are described in detail below by way of examples.
Examples
Refer to FIG. 1, which is a schematic workflow diagram of a man-machine combined Chinese composition correction system according to an embodiment of the present application. As shown in FIG. 1, the man-machine combined Chinese composition correction system mainly comprises the following parts:
the composition acquisition system 1, used for acquiring a to-be-corrected composition in picture format uploaded by a user, wherein the picture format includes the PDF format;
the preprocessing system 2, used for performing layout analysis on the acquired to-be-corrected composition with an OCR recognition engine so as to extract the actual composition area, obtaining text position coordinate information and text content information, and performing title extraction and paragraph segmentation;
the correction system 3, used for correcting the text content information obtained by the preprocessing system and adding the correction information to the corresponding position of the to-be-corrected composition in its original picture format, wherein the correction information is in an editable form and the correction system provides correction tools so that a user can modify the correction information;
and the material recommendation system 4, used for automatically recommending excellent composition materials according to the shortcomings of the composition.
It should be noted that, strictly speaking, the PDF format is not a picture format. However, text in a PDF file, like text in a picture file, cannot be edited directly (unlike directly editable text formats such as Word or txt), and both kinds of file must be processed with OCR technology. In this embodiment the PDF format is therefore regarded as one of the picture formats; that is, the pictures mentioned below all include PDF files.
In addition, after the composition picture is uploaded to the system, OCR recognition and AI pre-review are carried out automatically in the background. When the pre-review is finished, a text correction result is obtained; the correction information is then added to the corresponding position of the original picture, in a text box or similar form, for visual display, and correction tools are provided so that teachers can modify the correction information produced by the AI pre-review. The correction information comprises text comment information and marks, wherein the marks comprise lines, graphics, symbols and the like. When adding the correction information to the corresponding position of the to-be-corrected composition in its original picture format, the correction system adds marks of different forms at the corresponding positions in the picture for different text content information, according to the habits of the user (teacher), and adds the text comment information.
For example, as shown in FIG. 2, the correction information includes marking good sentences with wavy lines (in a configurable color such as red, not shown in FIG. 2) and giving text comment information on the right side (or below, etc.); marking wrongly written characters with circles (in a configurable color); and marking sentences that do not read smoothly with straight lines (in a configurable color), so as to fit a teacher's correction habits as closely as possible. In addition, the teacher may use the correction tool on the right side to make secondary edits to the AI's correction results, including editing the text comment information in text boxes, modifying line forms or colors, and adding text boxes, symbols, lines, and the like.
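The editable correction information described above can be pictured as a small data structure kept separate from the picture itself. The following is only a minimal sketch under assumptions, not the implementation of the application: the class names `Annotation` and `CorrectionLayer`, their fields, the mark-style constants and the `edit` method are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical mark styles, named after the examples in the text.
WAVY_LINE = "wavy_line"    # good sentence
CIRCLE = "circle"          # wrongly written character
STRAIGHT_LINE = "line"     # sentence that does not read smoothly


@dataclass
class Annotation:
    """One piece of correction information anchored to the original picture."""
    box: Tuple[int, int, int, int]  # (x, y, width, height) in picture pixels
    mark: str                       # one of the mark styles above
    color: str = "red"              # configurable color
    comment: str = ""               # text comment shown beside the mark


@dataclass
class CorrectionLayer:
    """All annotations for one picture, stored apart from the pixels so they stay editable."""
    annotations: List[Annotation] = field(default_factory=list)

    def add(self, annotation: Annotation) -> int:
        self.annotations.append(annotation)
        return len(self.annotations) - 1  # index used for later edits

    def edit(self, index: int, **changes) -> None:
        """Teacher-side secondary edit: change the comment, mark style, or color."""
        for name, value in changes.items():
            if not hasattr(self.annotations[index], name):
                raise AttributeError(f"unknown annotation field: {name}")
            setattr(self.annotations[index], name, value)


layer = CorrectionLayer()
i = layer.add(Annotation(box=(120, 340, 400, 32), mark=WAVY_LINE, comment="Vivid description."))
layer.edit(i, comment="Vivid description; try adding a simile.", color="blue")
```

Keeping the annotations as data and drawing them over the picture at display time is one way to satisfy the requirement that the correction information remain editable; the source does not state how the system stores it.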
According to the above technical scheme, in the man-machine combined Chinese composition correction system provided by the embodiment of the application, after the preprocessing system preprocesses the to-be-corrected composition acquired in picture format by the composition acquisition system, the correction system corrects it automatically and places the correction information on the original picture of the composition paper, so that teachers and students can see the correction results intuitively. The correction information is provided in an editable form, so a teacher can further modify it according to the teacher's own experience, bringing the correction result closer to the actual situation. In addition, the material recommendation system can automatically recommend excellent composition materials according to the shortcomings of the composition found in the correction result, which helps students improve their composition ability. That is, the technical scheme of the application solves the problems of the prior art by presenting correction results intuitively and providing more functions.
Further, in a specific application, the composition acquisition system can acquire a single picture or a plurality of pictures uploaded in a batch (the pictures may be combined into one PDF file for uploading); for a batch upload, the pictures are automatically matched with the corresponding names. As shown in FIG. 3, the matching process includes: performing layout analysis on each picture to extract the name area, yielding a plurality of name-area pictures; recognizing each name-area picture with an OCR recognition engine to obtain name information; and matching each picture with the corresponding name according to the obtained name information to obtain a matching result.
Through automatic name matching, each composition picture is assigned to the corresponding student's name, which saves the teacher the time of manual assignment and improves efficiency.
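The last step of the matching process, pairing recognized name text with names, can be sketched as follows. This is an illustration under assumptions: the source does not say what the recognized names are matched against or how OCR errors are tolerated, so the class roster, the fuzzy matching via the standard-library `difflib.get_close_matches`, the `cutoff` value and the function name `match_names` are all assumptions.

```python
from difflib import get_close_matches
from typing import Dict, List, Optional


def match_names(
    ocr_names: Dict[str, str], roster: List[str], cutoff: float = 0.6
) -> Dict[str, Optional[str]]:
    """Map each picture id to a roster name, or None when nothing is close enough.

    ocr_names: picture id -> name text recognized from the name area.
    roster:    known student names (assumed to be available to the system).
    """
    result: Dict[str, Optional[str]] = {}
    for picture_id, recognized in ocr_names.items():
        # Tolerate small OCR errors by taking the single closest roster name.
        candidates = get_close_matches(recognized.strip(), roster, n=1, cutoff=cutoff)
        result[picture_id] = candidates[0] if candidates else None
    return result


matches = match_names(
    {"page1.jpg": "Li Hua", "page2.jpg": "Wang Fang ", "page3.jpg": "???"},
    ["Li Hua", "Wang Fang", "Zhang Wei"],
)
```

Pictures that map to `None` would be left for the teacher to assign by hand.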
Furthermore, in some embodiments, the process by which the preprocessing system extracts the actual composition area includes: extracting the largest connected region at the periphery of the picture and, when the area of the largest connected region exceeds a set area threshold, determining the area inside the connected region to be the actual composition area; calculating the distance between each point on the contour of the largest connected region and each of the four vertices of the uploaded picture, and selecting, for each of the four vertices of the original picture, the closest contour point, the four selected points being the four vertices of the actual composition area; and performing a perspective transformation based on the four selected vertices of the actual composition area so as to rectify the picture.
It should be noted that the above process is designed for pictures of the composition paper commonly used for Chinese compositions (as shown in FIG. 2); the four vertices of the actual composition area obtained are the four vertices of the rectangular frame line on the composition paper shown in FIG. 2.
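The vertex-selection step can be sketched in plain Python. This is a minimal sketch under assumptions: the contour is taken to be already available as a list of points, the area threshold is expressed as a fraction of the picture area (`min_area_ratio`, an assumed parameter), and the function names are hypothetical.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def contour_area(contour: List[Point]) -> float:
    """Polygon area by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(contour, contour[1:] + contour[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def composition_corners(
    contour: List[Point], width: int, height: int, min_area_ratio: float = 0.5
) -> List[Point]:
    """Return the contour points nearest the picture's four vertices.

    Order: top-left, top-right, bottom-right, bottom-left.
    Raises ValueError when the region is below the area threshold.
    """
    if contour_area(contour) < min_area_ratio * width * height:
        raise ValueError("largest connected region is below the area threshold")
    picture_vertices = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    return [
        min(contour, key=lambda p, v=v: hypot(p[0] - v[0], p[1] - v[1]))
        for v in picture_vertices
    ]


# A slightly skewed frame inside a 1000 x 1400 picture.
frame = [(60, 40), (500, 30), (950, 55), (960, 700),
         (940, 1350), (480, 1360), (45, 1340), (50, 700)]
corners = composition_corners(frame, 1000, 1400)
```

In practice the contour might come from a routine such as OpenCV's `cv2.findContours`, and the four points could be passed to `cv2.getPerspectiveTransform` and `cv2.warpPerspective` to rectify the picture; the source does not name a library, so this pairing is only one possibility.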
Further, the process by which the preprocessing system performs title extraction and paragraph segmentation includes: inputting the rectified picture into an OCR recognition engine, and performing title extraction and paragraph segmentation on the line coordinate information in the returned text position coordinate information; for the first two consecutive lines on a sheet of composition paper, if the abscissa of the leftmost vertex of the first line is greater than that of the next line and greater than a preset first threshold, the first line is determined to be the title area; and if the abscissa of the leftmost vertex of the current line is greater than that of the next line and greater than a preset second threshold, the current line is considered to start a new paragraph.
Since the first line of each paragraph necessarily begins with an indent of two characters (two squares on composition paper), the abscissa of the first character of each paragraph (i.e. the character at the leftmost vertex of the paragraph's first line) is necessarily greater than the abscissa of the first character of the paragraph's next line. Based on this principle, title extraction and paragraph segmentation can be performed by the above procedure.
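The two indentation rules can be sketched directly on the line coordinates returned by OCR. This is a minimal sketch under assumptions: each line is reduced to the abscissa of its leftmost vertex plus its text, the threshold values in the example are invented, the first body line is assumed to open a paragraph, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple

Line = Tuple[float, str]  # (abscissa of the line's leftmost vertex, recognized text)


def split_title_and_paragraphs(
    lines: List[Line], title_threshold: float, indent_threshold: float
) -> Tuple[Optional[str], List[str]]:
    """Apply the two indentation rules to OCR lines listed top to bottom."""
    title: Optional[str] = None
    body = lines
    # Rule 1: the first line starts further right than the second line and
    # further right than the first threshold, so it is the title.
    if len(lines) >= 2 and lines[0][0] > lines[1][0] and lines[0][0] > title_threshold:
        title = lines[0][1]
        body = lines[1:]

    paragraphs: List[str] = []
    for i, (x, text) in enumerate(body):
        next_x = body[i + 1][0] if i + 1 < len(body) else None
        # Rule 2: the line starts further right than the next line and
        # further right than the second threshold, so it opens a paragraph.
        starts_paragraph = next_x is not None and x > next_x and x > indent_threshold
        if starts_paragraph or not paragraphs:
            paragraphs.append(text)
        else:
            paragraphs[-1] += text
    return title, paragraphs


ocr_lines = [
    (420.0, "My Mother"),
    (130.0, "My mother is a teacher. "),
    (50.0, "She gets up early every day."),
    (130.0, "On weekends she cooks for us. "),
    (50.0, "I love her very much."),
]
title, paragraphs = split_title_and_paragraphs(
    ocr_lines, title_threshold=300.0, indent_threshold=100.0
)
```

Because rule 2 compares each line with the line after it, a one-line paragraph followed by another indented line would not be detected by this sketch; how the described system treats that case is not stated in the source.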
In addition, in some embodiments, for the overall comment on the composition, the Chinese composition correction system is provided with a pre-trained composition genre classification model and a comment library, wherein the composition genre classification model is trained with a deep learning algorithm. During correction, the correction system uses the composition genre classification model to identify the genre of the composition from the text content information, and automatically selects related comments from the comment library according to the identified genre and pushes them to the user for selection and modification.
More specifically, millions of composition samples can be collected in advance from major composition websites, and a genre classification model (genre classifier) is trained on them with a deep learning algorithm, so as to identify the genre of the composition to be corrected. Comment labels under each genre are organized in advance; when a teacher needs to set or modify comments, the teacher can open the comment library through the provided comment assistant tool and select a suitable comment for quick setting, as shown in FIG. 4.
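The flow from genre identification to comment suggestion can be sketched as a lookup keyed by the predicted genre. This is only an illustration under assumptions: the comment library contents and genre labels are invented, and `keyword_genre_classifier` is a trivial keyword placeholder that stands in for the pre-trained deep learning model, which is not reproduced here.

```python
from typing import Callable, Dict, List

# Hypothetical comment library: genre label -> candidate comments.
COMMENT_LIBRARY: Dict[str, List[str]] = {
    "narrative": [
        "The events are told in a clear order.",
        "Add more detail to the climax.",
    ],
    "argumentative": [
        "The thesis is stated clearly.",
        "Support the argument with more evidence.",
    ],
}


def keyword_genre_classifier(text: str) -> str:
    """Placeholder for the pre-trained genre classification model (not deep learning)."""
    return "argumentative" if "therefore" in text.lower() else "narrative"


def suggest_comments(
    text: str, classify: Callable[[str], str] = keyword_genre_classifier
) -> List[str]:
    """Identify the genre, then return that genre's comments for the teacher to pick from."""
    genre = classify(text)
    return list(COMMENT_LIBRARY.get(genre, []))


suggestions = suggest_comments("Reading widens our view; therefore everyone should read more.")
```

Passing the classifier in as a parameter means a trained model can replace the placeholder without changing the lookup.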
In addition, in some embodiments, as shown in FIG. 4, the system further includes a general evaluation system for evaluating the composition as a whole according to each piece of correction information. This includes scoring different aspects of the composition, giving a general score and general evaluation suggestions, and counting the numbers of characters, words and sentences in the composition; the comments may be generated from templates specified for each dimension in the scoring details. As shown in FIG. 4, the different aspects of the composition include content, expression, structure, and context specifications. The teacher can also revise the scores, general evaluation suggestions and other results given by the system; when the teacher adjusts a score in the scoring details, the comment given in advance by the system changes accordingly.
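The statistics and the score-driven comments can be sketched as follows. This is a minimal sketch under assumptions: the general score is taken to be the sum of the dimension scores, the score bands and comment wording are invented, word counting is omitted because it would need a Chinese word segmenter, and all function names are hypothetical.

```python
import re
from typing import Dict


def composition_statistics(text: str) -> Dict[str, int]:
    """Count Chinese characters and sentences in the recognized text."""
    characters = re.findall(r"[\u4e00-\u9fff]", text)  # CJK Unified Ideographs
    sentences = [s for s in re.split(r"[。！？!?]+", text) if s.strip()]
    return {"characters": len(characters), "sentences": len(sentences)}


def dimension_comment(dimension: str, score: int, full_marks: int) -> str:
    """Pick a template comment from the score band (the bands are assumptions)."""
    ratio = score / full_marks
    level = "strong" if ratio >= 0.8 else "adequate" if ratio >= 0.6 else "needs work"
    return f"{dimension}: {level}"


def overall_evaluation(scores: Dict[str, int], full_marks: Dict[str, int]) -> Dict[str, object]:
    """Recompute the general score and the comments from the per-dimension scores."""
    return {
        "total": sum(scores.values()),
        "comments": [dimension_comment(d, s, full_marks[d]) for d, s in scores.items()],
    }


stats = composition_statistics("我的妈妈是一名老师。她每天很早起床！")
full = {"content": 20, "expression": 15, "structure": 10}
before = overall_evaluation({"content": 17, "expression": 10, "structure": 5}, full)
# The teacher raises the structure score; the total and the comment are regenerated.
after = overall_evaluation({"content": 17, "expression": 10, "structure": 9}, full)
```

Deriving the comments from the scores each time, rather than storing them, is one simple way to make the comment change together with a score adjustment.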
In addition, regarding the recommendation of excellent composition materials by the material recommendation system, a plurality of capability points can be set in advance for each composition genre, so that during correction the correction system can determine, according to the preset capability points to be detected, the capability points that do not appear in the composition content information; the material recommendation system then automatically recommends the corresponding excellent composition materials according to the capability points that do not appear. Taking a character-description composition as an example, its capability points include description of the character's appearance, description of the character's psychology, and the like. After the system diagnoses the capability points of the composition, it identifies which capability points appear in the composition and which do not, and recommends related excellent materials for the missing capability points in the recommended-learning section. In addition, to facilitate recommendation, a labeled material data set may be prepared in advance, so that corresponding materials can be obtained from it according to the capability point diagnosis result or directly according to the genre label; the specific process is shown in FIG. 5.
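The recommendation logic amounts to finding the expected capability points that were not detected and looking up labeled materials for them. The following is a minimal sketch under assumptions: the capability point labels, the material entries and the function names are invented for illustration, and the detection of capability points in the text is assumed to have been done already.

```python
from typing import Dict, List, Set

# Hypothetical capability points per genre and a tiny labeled material data set.
CAPABILITY_POINTS: Dict[str, List[str]] = {
    "character_description": ["appearance", "psychology", "action", "dialogue"],
}
MATERIALS: Dict[str, List[str]] = {
    "appearance": ["Model passage: describing a face in three strokes"],
    "psychology": ["Model passage: showing nervousness before a race"],
    "action": ["Model passage: a grandmother kneading dough"],
}


def missing_points(genre: str, detected: Set[str]) -> List[str]:
    """Capability points expected for the genre that were not detected in the composition."""
    return [p for p in CAPABILITY_POINTS.get(genre, []) if p not in detected]


def recommend_materials(genre: str, detected: Set[str]) -> Dict[str, List[str]]:
    """Look up labeled materials for every missing capability point."""
    return {p: MATERIALS.get(p, []) for p in missing_points(genre, detected)}


recommendations = recommend_materials("character_description", {"appearance", "dialogue"})
```

Here the composition already shows appearance and dialogue, so only materials labeled for psychology and action are returned.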
In addition, the specific working process of the man-machine combined Chinese composition correction system comprises the following steps:
the composition acquisition system 1 acquires a to-be-corrected composition in picture format uploaded by a user;
the preprocessing system 2 uses an OCR recognition engine to perform layout analysis on the acquired to-be-corrected composition so as to extract the actual composition area, obtains text position coordinate information and text content information, and performs title extraction and paragraph segmentation;
the correction system 3 corrects the text content information obtained by the preprocessing system and adds the correction information to the corresponding position of the to-be-corrected composition in its original picture format;
the material recommendation system 4 automatically recommends excellent composition materials according to the shortcomings of the composition.
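The four steps above form a linear pipeline in which each subsystem consumes the output of the previous one. The sketch below only illustrates that ordering and is not the implementation of the application: the `Job` record, its fields and the stub stages are hypothetical, and real stages would call OCR and NLP models.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Job:
    """State passed from one subsystem to the next."""
    picture: bytes
    text_lines: List[dict] = field(default_factory=list)   # text plus coordinates
    corrections: List[dict] = field(default_factory=list)  # editable correction information
    materials: List[str] = field(default_factory=list)


Stage = Callable[[Job], Job]


def run_pipeline(picture: bytes, preprocess: Stage, correct: Stage, recommend: Stage) -> Job:
    """Acquire, preprocess, correct, recommend, in the order of the workflow."""
    job = Job(picture=picture)  # composition acquisition system 1
    for stage in (preprocess, correct, recommend):  # systems 2, 3 and 4
        job = stage(job)
    return job


# Stub stages so that the sketch runs end to end.
def fake_preprocess(job: Job) -> Job:
    job.text_lines = [{"text": "Sample line", "box": (0, 0, 100, 20)}]
    return job


def fake_correct(job: Job) -> Job:
    job.corrections = [{"box": line["box"], "comment": "Good opening."} for line in job.text_lines]
    return job


def fake_recommend(job: Job) -> Job:
    job.materials = ["Model passage on endings"] if job.corrections else []
    return job


result = run_pipeline(b"...", fake_preprocess, fake_correct, fake_recommend)
```

Each correction keeps the coordinates of the text it refers to, which is what allows the correction information to be placed back on the original picture.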
The scheme thus provides a closed loop for Chinese composition learning, from machine evaluation to manual correction to material recommendation. Name matching and AI pre-review are carried out during uploading. The platform's correction tool lets the teacher modify the machine's pre-review result and dynamically updates the machine's pre-review comments, while the comment library function of the review assistant gives the teacher ideas for writing comments and makes it convenient to change them. As a result, the efficiency and quality of the teacher's correction can be greatly improved, and students can conveniently view the correction results. In addition, the system can recommend learning materials in a personalized way according to the diagnosis result of the student's composition, which further helps improve the student's composition level.
It is to be understood that the same or similar parts of the above embodiments may be referred to each other, and that content not described in detail in one embodiment may be found in the same or similar parts of other embodiments.
It should be noted that in the description of the present application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present application, unless otherwise indicated, "plurality" means at least two.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, the steps or methods may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (7)

1. A man-machine combined Chinese composition correction system, comprising:
a composition acquisition system for acquiring a composition to be corrected, uploaded by a user in a picture format, wherein the picture format includes the PDF format;
a preprocessing system for performing layout analysis on the acquired composition using an OCR recognition engine, so as to extract the actual composition area, obtain text position coordinate information and text content information, and perform title extraction and paragraph segmentation;
a correction system for correcting the text content information obtained by the preprocessing system and adding the correction information at the corresponding positions of the original picture of the composition, wherein the correction information is editable and the correction system provides correction tools so that a user can modify the correction information;
a material recommendation system for automatically recommending excellent composition materials according to the shortcomings of the composition;
wherein the process by which the preprocessing system extracts the actual composition area comprises:
extracting the largest connected region around the periphery of the picture and, when the largest connected region exceeds a set area threshold, determining the area inside the connected region to be the actual composition area;
calculating the distance between each point on the contour of the largest connected region and the four vertices of the uploaded picture, and selecting the four points closest to the respective vertices of the original picture as the four vertices of the actual composition area;
performing a perspective transformation based on the four selected vertices of the actual composition area to rectify the picture;
and the process by which the preprocessing system extracts the title and performs paragraph segmentation comprises:
inputting the rectified picture into the OCR recognition engine, and extracting the title and segmenting paragraphs according to the line coordinate information in the returned text position coordinate information; if, for the first two consecutive lines on a page, the abscissa of the leftmost vertex of the first line is greater than that of the next line and greater than a preset first threshold, the first line is determined to be the title area; and if the abscissa of the leftmost vertex of the current line is greater than that of the next line and greater than a preset second threshold, the current line is considered to start a new paragraph.
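The area check and vertex selection recited above can be illustrated with a short Python sketch. It assumes the contour of the largest connected region is already available as a list of (x, y) points, for example from a contour finder such as OpenCV's `cv2.findContours`; the source does not name a library. The function names and the `min_area_ratio` parameter, which expresses the area threshold as a fraction of the picture area, are assumptions made for illustration.

```python
from math import hypot


def polygon_area(contour):
    """Shoelace area of a closed contour given as [(x, y), ...]."""
    total = 0.0
    n = len(contour)
    for i in range(n):
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def select_region_vertices(contour, width, height, min_area_ratio=0.5):
    """Return region vertices ordered TL, TR, BR, BL, or None if too small.

    min_area_ratio is an assumed way of expressing the set area threshold.
    """
    if polygon_area(contour) < min_area_ratio * width * height:
        return None
    picture_corners = [(0, 0), (width - 1, 0),
                       (width - 1, height - 1), (0, height - 1)]
    # For each picture corner, pick the contour point nearest to it.
    return [min(contour, key=lambda p: hypot(p[0] - cx, p[1] - cy))
            for cx, cy in picture_corners]


contour = [(12, 8), (50, 5), (90, 10), (95, 60),
           (88, 110), (40, 115), (10, 112), (6, 60)]
vertices = select_region_vertices(contour, width=100, height=120)
# vertices == [(12, 8), (90, 10), (88, 110), (10, 112)]
```

The four returned points could then be passed, together with the target rectangle, to a perspective-warp routine such as OpenCV's `cv2.getPerspectiveTransform` followed by `cv2.warpPerspective` to rectify the picture.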
2. The system of claim 1, wherein the composition acquisition system is capable of acquiring a single picture or a plurality of pictures uploaded in a batch and, for a plurality of pictures uploaded in a batch, automatically matches the pictures with the corresponding names; the matching process comprises: performing layout analysis on each picture to extract the name area, thereby obtaining a plurality of name-area pictures; recognizing each name-area picture with an OCR recognition engine to obtain name information; and matching each picture with the corresponding name according to the obtained name information.
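As an illustration of the final matching step only, the sketch below assumes the OCR engine has already returned a name string for each picture and that a class roster is available to match against. The roster, the `match_names` function, and the `min_ratio` cutoff are hypothetical; the claim does not say how recognized names are reconciled with known names. The similarity measure is Python's standard `difflib.SequenceMatcher`, chosen here because OCR output often carries stray characters.

```python
from difflib import SequenceMatcher


def match_names(recognized, roster, min_ratio=0.5):
    """Map each picture id to the roster name closest to its OCR'd name text.

    Pictures whose best similarity falls below min_ratio map to None so that
    a teacher can assign them manually.
    """
    result = {}
    for picture_id, text in recognized.items():
        best = max(roster,
                   key=lambda name: SequenceMatcher(None, text, name).ratio())
        score = SequenceMatcher(None, text, best).ratio()
        result[picture_id] = best if score >= min_ratio else None
    return result


recognized = {"img_01.jpg": "张小明", "img_02.jpg": "李华 ", "img_03.jpg": "???"}
roster = ["张小明", "李华", "王芳"]
matches = match_names(recognized, roster)
```

In this example the trailing space in the second OCR result still resolves to 李华, while the unreadable third result is left unmatched.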
3. The system according to claim 1, wherein the Chinese composition correction system is provided with a pre-trained composition genre classification model and a comment library, the composition genre classification model being trained with a deep learning algorithm;
during correction, the correction system uses the composition genre classification model to identify the genre of the composition from the text content information, and automatically selects related comments from the comment library according to the identified genre and pushes them for the user to select and modify.
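The lookup from identified genre to candidate comments might be sketched as follows. This covers only the selection step: the deep learning classifier is replaced by a keyword-based stand-in (`toy_classifier`), and the genre labels, comment texts, and function names are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical comment library keyed by genre label.
COMMENT_LIBRARY: Dict[str, List[str]] = {
    "narrative": ["The sequence of events is clear.",
                  "Add more detail to the climax."],
    "argumentative": ["The thesis is stated clearly.",
                      "Support the claim with evidence."],
}


def suggest_comments(text: str,
                     classify: Callable[[str], str],
                     library: Dict[str, List[str]],
                     limit: int = 3) -> Tuple[str, List[str]]:
    """Identify the genre, then return up to `limit` comments for that genre."""
    genre = classify(text)
    return genre, library.get(genre, [])[:limit]


def toy_classifier(text: str) -> str:
    # Stand-in for the pre-trained genre model (assumption): a real system
    # would call the trained classifier here.
    return "argumentative" if ("因此" in text or "我认为" in text) else "narrative"


genre, comments = suggest_comments("我认为读书很重要。", toy_classifier, COMMENT_LIBRARY)
```

Passing the classifier in as a callable keeps the comment lookup independent of whichever model is actually deployed.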
4. The system according to claim 3, wherein, during correction, the correction system determines, from a plurality of preset capability points to be detected, the capability points that do not appear in the composition content information, each composition genre being provided with a corresponding plurality of capability points;
and the material recommendation system automatically recommends the corresponding excellent composition materials according to the capability points that do not appear in the composition content information.
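A minimal sketch of this gap-based recommendation is given below. It assumes the correction step has already produced the set of capability points detected in a composition; the capability point names, the material titles, and both function names are hypothetical examples, not taken from the application.

```python
# Hypothetical capability points per genre and materials per capability point.
CAPABILITY_POINTS = {
    "narrative": ["dialogue", "scene description", "psychological description"],
}
MATERIALS = {
    "dialogue": ["Model passage: lively dialogue"],
    "scene description": ["Model passage: describing a rainy street"],
    "psychological description": ["Model passage: inner monologue"],
}


def missing_points(genre, detected):
    """Capability points preset for the genre that were not detected."""
    return [p for p in CAPABILITY_POINTS.get(genre, []) if p not in detected]


def recommend_materials(genre, detected):
    """Collect materials for every missing capability point, in preset order."""
    recommendations = []
    for point in missing_points(genre, detected):
        recommendations.extend(MATERIALS.get(point, []))
    return recommendations


recs = recommend_materials("narrative", detected={"dialogue"})
```

Here a narrative composition that already uses dialogue receives materials only for scene description and psychological description.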
5. The system of claim 1, wherein the correction information includes text comment information and marks, the marks including lines, graphics, and symbols;
when adding the correction information at the corresponding positions of the original picture of the composition, the correction system adds marks of different forms, according to the user's habits, at the positions in the picture corresponding to different text content information, and adds the text comment information.
6. The system of claim 1, further comprising an overall evaluation system for evaluating the composition as a whole based on the correction information, including scoring different aspects of the composition, giving an overall score and overall evaluation suggestions, and counting the characters, words, and sentences of the composition.
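The counting and overall-score steps could look like the sketch below. It counts Chinese characters and sentences with regular expressions and combines per-aspect scores with a weighted average. Word counting is omitted because it would need a Chinese word segmenter, which the source does not name. The aspect names, the weights, and both function names are assumptions.

```python
import re


def count_stats(text):
    """Count CJK characters and sentences in a composition."""
    characters = len(re.findall(r"[\u4e00-\u9fff]", text))
    # Split on Chinese and ASCII sentence-ending punctuation.
    sentences = len([s for s in re.split(r"[。！？!?]+", text) if s.strip()])
    return {"characters": characters, "sentences": sentences}


def overall_score(aspect_scores, weights):
    """Weighted average of per-aspect scores, rounded to one decimal place."""
    total_weight = sum(weights[a] for a in aspect_scores)
    weighted = sum(aspect_scores[a] * weights[a] for a in aspect_scores)
    return round(weighted / total_weight, 1)


stats = count_stats("春天来了。花开了！你看见了吗？")
score = overall_score(
    {"content": 80, "structure": 90, "language": 70},
    {"content": 0.5, "structure": 0.25, "language": 0.25},
)
```

For the sample text this yields 12 characters and 3 sentences, and the sample aspect scores combine to an overall score of 80.0.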
7. A man-machine combined Chinese composition correction method, applied to the man-machine combined Chinese composition correction system according to any one of claims 1 to 6, the method comprising:
the composition acquisition system acquires a composition to be corrected, uploaded by a user in a picture format;
the preprocessing system uses an OCR recognition engine to perform layout analysis on the acquired composition, so as to extract the actual composition area, obtain text position coordinate information and text content information, and perform title extraction and paragraph segmentation; the process of extracting the actual composition area comprises: extracting the largest connected region around the periphery of the picture and, when the largest connected region exceeds a set area threshold, determining the area inside the connected region to be the actual composition area; calculating the distance between each point on the contour of the largest connected region and the four vertices of the uploaded picture, and selecting the four points closest to the respective vertices of the original picture as the four vertices of the actual composition area; performing a perspective transformation based on the four selected vertices of the actual composition area to rectify the picture; the process of title extraction and paragraph segmentation comprises: inputting the rectified picture into the OCR recognition engine, and extracting the title and segmenting paragraphs according to the line coordinate information in the returned text position coordinate information; if, for the first two consecutive lines on a page, the abscissa of the leftmost vertex of the first line is greater than that of the next line and greater than a preset first threshold, the first line is determined to be the title area; if the abscissa of the leftmost vertex of the current line is greater than that of the next line and greater than a preset second threshold, the current line is considered to start a new paragraph;
the correction system corrects the text content information obtained by the preprocessing system and adds the correction information at the corresponding positions of the original picture of the composition;
the material recommendation system automatically recommends excellent composition materials according to the shortcomings of the composition.
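The indentation rule recited in the claims for title extraction and paragraph segmentation can be sketched in Python as follows. The sketch takes one reading of the rule: the first line is a title when its left x-coordinate exceeds both that of the second line and the first threshold, and a later line starts a paragraph when its left x-coordinate exceeds both that of the following line and the second threshold. The function name, the input format (one left x-coordinate per OCR line, top to bottom), and the threshold values are assumptions.

```python
def segment_lines(left_xs, title_threshold, para_threshold):
    """Return (has_title, paragraph_start_indices) from per-line left x values.

    left_xs[i] is the abscissa of the leftmost vertex of OCR line i.
    The last line has no successor, so the rule as read here cannot mark it.
    """
    has_title = (
        len(left_xs) >= 2
        and left_xs[0] > left_xs[1]
        and left_xs[0] > title_threshold
    )
    first_body_line = 1 if has_title else 0
    starts = []
    for i in range(first_body_line, len(left_xs) - 1):
        # An indented line followed by a less-indented line opens a paragraph.
        if left_xs[i] > left_xs[i + 1] and left_xs[i] > para_threshold:
            starts.append(i)
    return has_title, starts


# Centered title, then two paragraphs whose first lines are indented.
left_xs = [300, 90, 40, 42, 88, 41]
has_title, starts = segment_lines(left_xs, title_threshold=200, para_threshold=70)
# has_title is True and starts == [1, 4]
```

Because the rule compares each line with the next one, a one-line paragraph followed by another indented line would not be detected under this reading; a production system would likely need an extra check for that case.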
CN202110774531.6A 2021-07-08 2021-07-08 Man-machine combined Chinese composition correcting system and method Active CN113360608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110774531.6A CN113360608B (en) 2021-07-08 2021-07-08 Man-machine combined Chinese composition correcting system and method

Publications (2)

Publication Number Publication Date
CN113360608A CN113360608A (en) 2021-09-07
CN113360608B true CN113360608B (en) 2023-10-20

Family

ID=77538734

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110774531.6A Active CN113360608B (en) 2021-07-08 2021-07-08 Man-machine combined Chinese composition correcting system and method

Country Status (1)

Country Link
CN (1) CN113360608B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113743091A (en) * 2021-11-08 2021-12-03 山东山大鸥玛软件股份有限公司 Composition text intelligent scoring method, system and equipment
CN114489439A (en) * 2022-01-20 2022-05-13 安徽淘云科技股份有限公司 Article correcting method and related equipment thereof
CN117892720B (en) * 2024-03-15 2024-06-11 北京和气聚力教育科技有限公司 Chinese composition AI sentence evaluation pipeline output method, device and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016115866A1 (en) * 2015-01-19 2016-07-28 深圳市时尚德源文化传播有限公司 Intelligent terminal network teaching method
CN107908792A (en) * 2017-12-13 2018-04-13 北京百度网讯科技有限公司 Information-pushing method and device
CN108090445A (en) * 2017-12-17 2018-05-29 张玉存 The electronics of a kind of papery operation or paper corrects method
CN110189237A (en) * 2019-05-13 2019-08-30 上海奇初教育科技有限公司 Operation corrects system automatically and corrects method
CN111737968A (en) * 2019-03-20 2020-10-02 小船出海教育科技(北京)有限公司 Method and terminal for automatically correcting and scoring composition
CN111898346A (en) * 2020-07-15 2020-11-06 金现代信息产业股份有限公司 Online article correction system and method
CN112580503A (en) * 2020-12-17 2021-03-30 深圳市元德教育科技有限公司 Operation correction method, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
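A comparative study of automated essay scoring technology in China and abroad, taking E-rater and 批改网 (Pigai) as examples; Ning Meihua; 校园英语 (Campus English), No. 20; 11 *
A study of the effectiveness of student revision based on feedback from an automated essay evaluation system; Cao Ting; 湘南学院学报 (Journal of Xiangnan University), Vol. 40, No. 06; 102-105+121 *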

Also Published As

Publication number Publication date
CN113360608A (en) 2021-09-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221025

Address after: 100089 room 525, floor 5, building 25, yard 8, Dongbeiwang West Road, Haidian District, Beijing

Applicant after: Beijing Yueshen Intelligent Technology Co.,Ltd.

Applicant after: Beijing One-stroke Two-stroke Technology Co.,Ltd.

Address before: 100089 room 525, floor 5, building 25, yard 8, Dongbeiwang West Road, Haidian District, Beijing

Applicant before: Beijing Yueshen Intelligent Technology Co.,Ltd.

GR01 Patent grant