CN113255629A - Document processing method and device, electronic equipment and computer readable storage medium - Google Patents

Document processing method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN113255629A
Authority
CN
China
Prior art keywords
test question
page
answering
area
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110800924.XA
Other languages
Chinese (zh)
Other versions
CN113255629B (en)
Inventor
周宏浩
秦勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Century TAL Education Technology Co Ltd
Original Assignee
Beijing Century TAL Education Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Century TAL Education Technology Co Ltd filed Critical Beijing Century TAL Education Technology Co Ltd
Priority to CN202110800924.XA priority Critical patent/CN113255629B/en
Publication of CN113255629A publication Critical patent/CN113255629A/en
Application granted granted Critical
Publication of CN113255629B publication Critical patent/CN113255629B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/32 Digital ink
    • G06V30/36 Matching; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/635 Overlay text, e.g. embedded captions in a TV program
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

The present disclosure provides a document processing method, a document processing device, an electronic device and a computer-readable storage medium. The document processing method includes: extracting a first test question segment contained in a page to be judged, the first test question segment being associated with the position of a corresponding first answering area contained in the page to be judged; extracting a second test question segment contained in an answer page matched with the page to be judged, the second test question segment being associated with the position of a corresponding second answering area contained in the answer page; and, when the texts of the first test question segment and the second test question segment match, outputting a question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment.

Description

Document processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a document processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
Photo-based question judging is an important application of artificial intelligence technology in the education field. The process is as follows: a user takes a picture with a terminal device such as a mobile phone or a tablet computer and uploads the picture to a photo-based question judging application. The application sends the picture to a question type judging model, which detects the text areas of pre-specified question types and marks them with bounding boxes. Meanwhile, the application sends the picture to a text line detection model, which detects the text line areas and marks them with bounding boxes. A layout analysis model then performs layout analysis on the detection results of the text line detection model, merges the boxes, and determines which boxes make up each question according to the characteristics of each question type. The original picture is then cropped according to the coordinates of the boxes, the cropped images are sent to a recognition model to obtain the content of each question, and a question judging result is determined from that content and output.
Disclosure of Invention
According to a first aspect of the present disclosure, there is provided a document processing method including:
extracting a first test question segment contained in a page to be judged, wherein the first test question segment is associated with the position of a corresponding first answering area contained in the page to be judged;
extracting a second test question segment contained in an answer page matched with the page to be judged, wherein the second test question segment is associated with the position of a corresponding second answering area contained in the answer page;
and when the texts of the first test question segment and the second test question segment are matched, outputting a question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment.
According to a second aspect of the present disclosure, there is provided a document processing apparatus including:
a first extraction module, configured to extract a first test question segment contained in a page to be judged, wherein the first test question segment is associated with the position of a corresponding first answering area contained in the page to be judged;
a second extraction module, configured to extract a second test question segment contained in an answer page matched with the page to be judged, wherein the second test question segment is associated with the position of a corresponding second answering area contained in the answer page;
an output module, configured to output, when the texts of the first test question segment and the second test question segment match, a question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
a processor; and
a memory storing a program,
wherein the program comprises instructions which, when executed by the processor, cause the processor to perform the method according to the first aspect of the present disclosure.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method according to the first aspect of the present disclosure.
According to one or more technical solutions provided in the embodiments of the present disclosure, a match between the texts of the first test question segment and the second test question segment is used to indicate that the first answering area associated with the position of the first test question segment matches the second answering area associated with the position of the corresponding second test question segment, and the question judging result is obtained from the texts of the matched first and second answering areas. This is in contrast to determining the matched first and second answering areas directly from their coordinates: when the page to be judged is photographed with an angle error or a similar deviation, the first and second test question segments remain associated with the positions of their answering areas, so the matched first and second answering areas can still be determined from the segments. The document processing method can therefore improve the accuracy of test question correction.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the disclosure and together with the description serve to explain the principles of the disclosure.
FIG. 1 illustrates a schematic diagram of an application scenario implementing various methods described herein, according to an exemplary embodiment of the present disclosure;
FIG. 2 illustrates a display of a cell phone interface in an application scenario implementing various methods described herein, according to an exemplary embodiment of the present disclosure;
FIG. 3 shows a flowchart of a document processing method according to an exemplary embodiment of the present disclosure;
FIG. 4 is a diagram illustrating a page to be judged and an answer page of a document processing method according to an exemplary embodiment of the present disclosure;
FIG. 5 illustrates a sub-flowchart of a document processing method according to an exemplary embodiment of the present disclosure;
FIG. 6 illustrates another sub-flow diagram of a document processing method according to an exemplary embodiment of the present disclosure;
FIG. 7 illustrates yet another sub-flowchart of a document processing method according to an exemplary embodiment of the present disclosure;
FIG. 8 illustrates yet another sub-flowchart of a document processing method according to an exemplary embodiment of the present disclosure;
FIG. 9 illustrates yet another sub-flowchart of a document processing method according to an exemplary embodiment of the present disclosure;
FIG. 10 shows a schematic block diagram of a document processing apparatus according to an example embodiment of the present disclosure;
FIG. 11 illustrates a block diagram of an exemplary electronic device that can be used to implement embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description. It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Terms in embodiments of the present disclosure are explained below:
Test question text: the text of the questions in a page of a test paper, homework, etc. For example, in "2. Fill in proper words in the parentheses: coated in plain silver (earth) something without profound difference (evaluation)", all text other than the answer texts "earth" and "evaluation" is test question text.
Test question segment: a segment of the test question text, which may be one character, two characters or a plurality of characters of the test question text. For example, in "2. Fill in proper words in the parentheses: coated in plain silver (earth) something without profound difference (evaluation)", "Coated in plain silver" and "Am of great concern" are segments of the test question text and are therefore test question segments.
Answer text: the text of the answers in a page of a test paper, homework, etc. For example, in "2. Fill in proper words in the parentheses: coated in plain silver (earth) something without profound difference (evaluation)", "earth" and "evaluation" are answer texts.
Answering area: the area for filling in answers in a page of a test paper, homework, etc. For example, in "2. Fill in proper words in the parentheses: coated in plain silver (earth) something without profound difference (evaluation)", the regions where "earth" and "evaluation" are located are answering areas.
Fig. 1 is a schematic diagram of an application scenario of a document processing method provided in an exemplary embodiment of the present disclosure, and it should be understood that the application scenario does not constitute a limitation to the application scenario of the present application, and the document processing method provided in the exemplary embodiment of the present disclosure may also be used in other application scenarios.
Taking a specific application scenario of the document processing method as an example, referring to fig. 1, the application scenario 100 includes a shooting device 101, a display terminal 102, and a server 104. The shooting device 101 and the display terminal 102 can each communicate with the server 104 through the network 103. The shooting device 101 may be a video camera, a photosensor, or another device having an image capturing function. The display terminal 102 may be a mobile phone, a notebook computer, a desktop computer, a handheld terminal, or another terminal with a display function. The functions of the shooting device 101 may be integrated in the display terminal 102 or may be independent of the display terminal 102, as actual needs dictate.
As shown in fig. 1, the shooting device 101 may directly shoot a document to be judged to obtain a page to be judged in a corresponding image format, where the image format may be JPEG (Joint Photographic Experts Group), TIFF (Tagged Image File Format), RAW (raw image format), or the like. The image of the page to be judged can be a dynamic image or a static image.
As shown in fig. 1, if the shooting device 101 has a networking function, it can upload the image of the page to be judged directly to the server 104 through the network 103 for document processing, or transmit the image to the display terminal 102, which uploads it to the server 104 through the network 103 for document processing. The user can access the server 104 from the display terminal 102 via the network 103 to acquire and display the processing result, or even the processing procedure. If the shooting device 101 also has an image display function, the server 104 may also feed back the processing result, or even the processing procedure, to the shooting device 101 through the network 103 for display, either automatically or in response to an operation on the shooting device.
As shown in fig. 1, if the shooting device 101 has no networking function, then after the shooting device 101 shoots the document to be judged, the obtained image of the page to be judged can be stored on a storage medium such as a portable hard disk, a USB flash drive, or an optical disk. The display terminal 102 reads the image of the page to be judged from the storage medium and uploads it to the server 104 through the network 103 for document processing, and a user can access the server 104 from the display terminal 102 through the network 103 to obtain the processing result, or even the processing procedure. In addition, the server 104 may also automatically feed back the processing result, or even the processing procedure, to the display terminal 102 through the network 103 for display.
In the document processing method provided by the exemplary embodiment of the disclosure, the document may be content displayed on the display interface of the mobile phone illustrated in fig. 2. As shown in fig. 2, a page to be judged 202 is displayed in a display interface 201 of the mobile phone 200. The page to be judged 202 may be a static picture or a dynamic image. When the page to be judged 202 is a dynamic image, the server can select one frame from the dynamic image of the page to be judged in advance. The selected frame can be the frame with the best sharpness and shooting angle, or a randomly selected frame. The following describes a document processing method provided by an exemplary embodiment of the present disclosure with reference to the drawings, taking a server as the execution subject.
Referring to fig. 3, an exemplary embodiment of the present disclosure provides a document processing method, including:
S301, extracting a first test question segment contained in the page to be judged, wherein the first test question segment is associated with the position of a corresponding first answering area contained in the page to be judged.
The page to be judged comprises a test question text and an answer text, wherein the answer text in the page to be judged is the answer text to be judged, and for convenience of description, the test question text of the page to be judged is defined as a first test question text, and the answer text of the page to be judged is defined as a first answer text.
S302, extracting a second test question segment contained in the answer page matched with the page to be judged, wherein the second test question segment is associated with the position of a corresponding second answering area contained in the answer page.
The answer page includes a test question text and an answer text, the answer text in the answer page is a reference answer text, for convenience of description, the test question text in the answer page is defined as a second test question text, and the answer text in the answer page is defined as the second answer text.
The answer page matched with the page to be judged is an answer page that contains a second test question text corresponding to the first test question text and a second answering area corresponding to the first answering area, wherein the second answering area contains a second answer text serving as the reference answer. Here, the second test question text corresponding to the first test question text is the test question text that belongs to the same question as the first test question text. The second answering area corresponding to the first answering area is the answering area that belongs to the same question as the first answering area. The main difference between the page to be judged and the matched answer page is that the answer text of the page to be judged is the answer text to be judged, whereas the answer text of the answer page is the reference answer text.
The answer page matched with the page to be judged can be obtained from a preset question bank. Specifically, it can be obtained according to the similarity between the whole-page features of the page to be judged and the whole-page features of the answer pages in the question bank, or by searching the question bank for the answer page that matches an identifier of the page to be judged. It should be understood that these are only examples of implementable ways of obtaining an answer page matching the page to be judged, and should not be construed as limiting the technical solution of the present disclosure.
The answer page matched with the page to be judged can also be a corresponding answer page stored locally.
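The two retrieval routes described above (lookup by identifier, or similarity between whole-page features) can be illustrated with a minimal sketch. The disclosure does not specify the feature representation or the similarity measure; the plain feature vectors, the use of cosine similarity, and the names `cosine_similarity` and `find_answer_page` are all assumptions made for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def find_answer_page(page_features, question_bank, page_id=None):
    """Return the key of the answer page matching the page to be judged.

    question_bank is a hypothetical dict mapping an answer page identifier
    to that page's whole-page feature vector.
    """
    # Route 1: the page carries an identifier that exists in the question bank.
    if page_id is not None and page_id in question_bank:
        return page_id
    # Route 2: fall back to the most similar whole-page features.
    return max(
        question_bank,
        key=lambda key: cosine_similarity(page_features, question_bank[key]),
    )
```

A production system would more likely rely on an approximate nearest-neighbour index than on a linear scan, but the linear scan keeps the idea visible.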
Since the second test question segment is associated with the position of the corresponding second answering area contained in the answer page, and the first test question segment is associated with the position of the corresponding first answering area contained in the page to be judged, whether the first test question segment matches the second test question segment is consistent with whether the corresponding first answering area matches the corresponding second answering area. When the texts of the first and second test question segments match, the corresponding first and second answering areas can therefore be considered matched. Because the texts in the first and second answering areas are not necessarily the same, it is difficult to determine whether the two areas match from those texts alone; the above steps of the present disclosure overcome this problem.
And S303, when the texts of the first test question segment and the second test question segment are matched, outputting a question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment.
The corresponding second test question segment is the second test question segment which is matched with the first test question segment text.
When the question judging result is output based on the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment, the result may be determined by whether these two texts are the same.
In the document processing method provided by the exemplary embodiment of the present disclosure, since the first test question segment is associated with the corresponding first answering area position included in the page to be judged, and the second test question segment is associated with the corresponding second answering area position included in the answer page, when the texts of the first test question segment and the second test question segment are matched, the first answering area associated with the first test question segment position and the second answering area associated with the corresponding second test question segment position are also substantially represented to be matched with each other. At this time, the question determination result may be determined according to the text in the first answering area associated with the first test question segment position and the text in the second answering area associated with the corresponding second test question segment position.
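Steps S301 to S303 can be sketched as follows, assuming that the segments and answering-area texts have already been extracted and recognized. The `Item` type and the `judge_page` function are hypothetical names, and treating "text matching" as exact string equality is a simplifying assumption; the disclosure does not fix the matching criterion.

```python
from dataclasses import dataclass

@dataclass
class Item:
    segment: str  # recognized text of a test question segment
    answer: str   # recognized text of the positionally associated answering area

def judge_page(page_items, answer_items):
    """Pair answering areas through their test question segments, then compare.

    Returns a list of (segment, is_correct) for every first test question
    segment that has a matching second test question segment.
    """
    # Index the answer page by segment text; coordinates are never consulted.
    reference = {item.segment: item.answer for item in answer_items}
    results = []
    for item in page_items:
        if item.segment not in reference:
            continue  # no matching second segment, so this area is not judged
        results.append((item.segment, item.answer == reference[item.segment]))
    return results
```

Because the pairing is keyed on segment text, the order and coordinates of the items on the two pages need not agree, which is the property the method relies on when the photo is skewed.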
The document processing method provided by the exemplary embodiment of the present disclosure outputs the question judging result based on the text of the first answering area of the page to be judged and the text of the second answering area of the answer page, so that a test question can be judged without understanding the meaning of the question. The method is applicable to different question types such as fill-in-the-blank questions, multiple-choice questions, and true-or-false questions, and thus covers a wide range of question types. Meanwhile, the document processing method of the disclosure uses a match between the texts of the first test question segment and the second test question segment to indicate that the first answering area associated with the position of the first test question segment matches the second answering area associated with the position of the corresponding second test question segment, and obtains the question judging result from the texts of the matched first and second answering areas. This is in contrast to determining the matched first and second answering areas directly from their coordinates: when the page to be judged is photographed with an angle error or a similar deviation, the first and second test question segments remain associated with the positions of their answering areas, so the matched first and second answering areas can still be determined from the segments. The document processing method can therefore improve the accuracy of test question correction.
The page to be judged and the answer page can each have a page-line two-level structure, in which each page comprises a plurality of text lines. The answer page matched with the page to be judged can be obtained from the question bank. The question bank can be established in advance based on blank, unanswered pages, with handwriting manually marked according to the reference answers.
Before executing the document processing method of the present disclosure, a page-line two-level structure of a page to be judged and a page-line two-level structure of an answer page are obtained by preprocessing.
The document processing method of the embodiment of the disclosure may use an existing region detection module to detect the first answering area and the second answering area, for example by identifying them with a neural network model capable of detecting answering areas. The neural network model may be the target detection network model CenterNet. This model mainly comprises a residual network for extracting image features, a deconvolution module for upsampling the feature map, and branch convolutional networks for predicting the width and height of a target and the coordinates of its center point. The known neural network model is trained on training samples corresponding to the specific application of the present disclosure. It is understood that the neural network model may also be another network model capable of detecting answering areas.
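To make the role of the center-point and width/height branches concrete, the following is a minimal sketch of how such outputs could be turned into answering-area boxes. It is not the patent's implementation and it simplifies the published detector: the nested-list inputs, the 3x3 local-maximum test, the 0.5 threshold, and the function name `decode_boxes` are assumptions, and the sub-pixel offset branch and output stride of the real model are omitted.

```python
def decode_boxes(heatmap, widths, heights, threshold=0.5):
    """Turn a center-point heatmap plus per-pixel size maps into boxes.

    heatmap, widths and heights are equally sized 2-D lists (rows of floats).
    Returns (x_min, y_min, x_max, y_max, score) tuples in feature-map units.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    boxes = []
    for y in range(rows):
        for x in range(cols):
            score = heatmap[y][x]
            if score < threshold:
                continue
            # Keep only local maxima within the surrounding 3x3 window.
            window = [
                heatmap[ny][nx]
                for ny in range(max(0, y - 1), min(rows, y + 2))
                for nx in range(max(0, x - 1), min(cols, x + 2))
            ]
            if score < max(window):
                continue
            w, h = widths[y][x], heights[y][x]
            boxes.append((x - w / 2, y - h / 2, x + w / 2, y + h / 2, score))
    return boxes
```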
The document processing method of the embodiment of the disclosure may use an existing text recognition model to recognize the text of the first answering area and the text of the second answering area. For example, these texts can be obtained with a neural network model capable of recognizing text, such as a CRNN (convolutional recurrent neural network) model. The CRNN model can recognize longer text sequences; it comprises a CNN (convolutional neural network) feature extraction layer and a BiLSTM (bi-directional long short-term memory) sequence feature extraction layer, and can be trained jointly end to end. It is understood that the neural network model here may also be another network model capable of text recognition. Similarly, the document processing method of the embodiment of the disclosure may use an existing text recognition model to recognize the first test question segment and the second test question segment.
In one embodiment, the positional association of the first question segments with the respective first answering areas includes the positional adjacency of the first question segments with the respective first answering areas, and the positional association of the second question segments with the respective second answering areas includes the positional adjacency of the second question segments with the respective second answering areas.
In this embodiment, "adjacent positions" is used as one kind of position association, so that the first test question segment is a test question segment having a strong position association with the first answering area, and the second test question segment is a test question segment having a strong position association with the second answering area. Therefore, when the text of the first test question segment matches the text of the second test question segment, the first answering area associated with the position of the first test question segment is more likely to correspond to the second answering area associated with the position of the corresponding second test question segment, and the accuracy of the corresponding question judging result is higher.
Illustratively, when extracting the first test question segment and the second test question segment, the adjacent region corresponding to the first answering region may be obtained, and the text of the adjacent region of the first answering region is extracted to obtain the first test question segment; likewise, the adjacent region corresponding to the second answering region may be obtained, and the text of the adjacent region of the second answering region is extracted to obtain the second test question segment. For example, referring to fig. 4, in the page to be judged 400A, the region 402A and the region 403A are each an adjacent region of the first answering region 401A. In the answer sheet 400B, the region 402B and the region 403B are each an adjacent region of the second answering region 401B.
Illustratively, when extracting the first test question segment and the second test question segment, the adjacent region corresponding to the first answering region may be obtained, and the printed text of the adjacent region of the first answering region is extracted to obtain the first test question segment; likewise, the adjacent region corresponding to the second answering region may be obtained, and the printed text of the adjacent region of the second answering region is extracted to obtain the second test question segment. In this example, the test question text of the page to be judged is a printed text, and the answering text of the page to be judged is a handwritten text, so that the first test question segment can be obtained by extracting only the printed text of the adjacent region of the first answering region.
In one embodiment, the positional association of the first test question segments with the corresponding first answering areas includes a preset number of characters spaced between the first test question segments and the corresponding first answering areas, and the positional association of the second test question segments with the corresponding second answering areas includes a preset number of characters spaced between the second test question segments and the corresponding second answering areas.
In this embodiment, "being spaced by a preset number of characters" is used as one kind of position association, so that the association between the first test question segment and the first answering area is more flexible, and the association between the second test question segment and the second answering area is more flexible.
For example, when extracting the first test question segment and the second test question segment, the adjacent region corresponding to the first answering region may be obtained, and the text at the preset relative position in the adjacent region of the first answering region may be extracted to obtain the first test question segment; likewise, the adjacent region corresponding to the second answering region may be obtained, and the text at the preset relative position in the adjacent region of the second answering region may be extracted to obtain the second test question segment. The text at the preset relative position may be the characters in the adjacent region that are spaced a preset number of characters from the corresponding first answering area or second answering area (the preset number of characters may also be expressed as a preset distance). The preset number and the preset distance may be set according to specific needs. For example, the preset number may be an integer between 1 and N, where N generally does not exceed 1/2 of the number of characters in the whole line and can be 3, 5, 10, 15, etc.; the preset distance can be the distance corresponding to 1 to N characters.
For example, when extracting the first test question segment and the second test question segment, the extension region corresponding to the first answering region may be obtained, and the text at the preset relative position in the extension region of the first answering region may be extracted to obtain the first test question segment; likewise, the extension region corresponding to the second answering region may be obtained, and the text at the preset relative position in the extension region of the second answering region may be extracted to obtain the second test question segment. The text at the preset relative position may be the characters that are spaced a preset number of characters from the corresponding first answering area or second answering area (the preset number of characters may also be expressed as a preset distance). The preset number and the preset distance may be set as required, and the preset number may be an integer between 1 and N, where N generally does not exceed 1/2 of the number of characters in the whole line and can be 3, 5, 10, 15, etc.
For example, the text at the preset relative position in the extension area of the first answering area can be the last M characters in the extension area of the first answering area, and the text at the preset relative position in the extension area of the second answering area can be the last M characters in the extension area of the second answering area. M is a positive integer such as 1, 2, …, 10; specifically, M can be 1 or 2.
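The selection of characters at a preset relative position can be sketched as follows. This is a minimal illustration only: it assumes the question text preceding the answering area has already been recognized as a string, and the function name `segment_before_answer` and its parameters `length` and `gap` are hypothetical names introduced here, not part of the disclosure.

```python
def segment_before_answer(text_before: str, length: int, gap: int = 0) -> str:
    """Pick a test question segment from the text that precedes an answering area.

    text_before: recognized question text ending right before the answering area
                 (assumed to be available from a text recognition step).
    length:      number of characters to keep (the "M" of the last-M-characters case).
    gap:         preset number of characters skipped next to the answering area.
    """
    if length < 1 or gap < 0:
        raise ValueError("length must be >= 1 and gap must be >= 0")
    end = len(text_before) - gap      # position where the segment stops
    if end <= 0:
        return ""                     # the line is too short for this gap
    start = max(0, end - length)      # never run past the start of the line
    return text_before[start:end]
```

With `gap=0` the function returns the last M characters; a non-zero `gap` corresponds to a segment spaced a preset number of characters from the answering area.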
For example, referring to fig. 4, in the page to be judged 400A, the region 402A is an adjacent region of the first answering region 401A, the region 403A of the page to be judged 400A is an adjacent region of the first answering region 401A, and a region obtained by combining the region 402A and/or the region 403A with the first answering region 401A is an extended region of the first answering region 401A. In the answer sheet 400B, the region 402B is a neighboring region of the second answering region 401B, the region 403B is a neighboring region of the second answering region 401B, and the region obtained by combining the region 402B and/or the region 403B with the second answering region 401B is an extension region of the second answering region 401B.
As a combination of the two embodiments, the first test question segment and the second test question segment may first be extracted by using one of the above kinds of "position association", and whether the first test question segment is text-matched with the corresponding second test question segment is determined. When the first test question segment is not text-matched with the corresponding second test question segment, the first test question segment and the second test question segment are extracted by using the other kind of "position association", and whether the first test question segment is text-matched with the corresponding second test question segment is determined again. The two embodiments described above may also be combined in other ways, which are not limited herein.
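The fallback between the two kinds of position association might look like the sketch below. It assumes that the text around each answering area is already available as a string, that each kind of position association is expressed as an extractor function, and that "text matching" means exact string equality; all of these, as well as the name `match_with_fallback`, are illustrative assumptions.

```python
from typing import Callable, Optional, Sequence

# An extractor turns the recognized text near an answering area into a test question segment.
Extractor = Callable[[str], str]


def match_with_fallback(first_context: str,
                        second_context: str,
                        extractors: Sequence[Extractor]) -> Optional[int]:
    """Try each kind of position association in turn.

    Returns the index of the first extractor whose segments match,
    or None when no kind of position association yields a match.
    """
    for index, extract in enumerate(extractors):
        first_segment = extract(first_context)
        second_segment = extract(second_context)
        # Exact equality stands in for text matching (assumption).
        if first_segment and first_segment == second_segment:
            return index
    return None


# Two hypothetical extractors: the whole adjacent text, then only the last 4 characters.
adjacent_text = lambda text: text
last_four_chars = lambda text: text[-4:]
```

For example, `match_with_fallback("Q1: 12+30=", "1. 12+30=", [adjacent_text, last_four_chars])` fails on the whole adjacent text because the question numbering differs, and then succeeds with the shorter segment.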
In one embodiment, the first question segment includes two first sub-segments, and the second question segment includes two second sub-segments, wherein the first answering area is located between the two first sub-segments, and the second answering area is located between the two second sub-segments.
In this embodiment, the first answering area is located between the two first sub-segments, and the second answering area is located between the two second sub-segments. Therefore, when the first test question segment is the same as the second test question segment, the first answering area associated with the first test question segment is more likely to correspond to the second answering area associated with the second test question segment, and the question judging result that is output according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment is more accurate.
For example, when the first test question segment and the second test question segment are extracted, a forward expansion region and a backward expansion region corresponding to the first answering region can be obtained, one first sub-segment of the first test question segment is obtained according to the text in the forward expansion region, and the other first sub-segment of the first test question segment is obtained according to the text in the backward expansion region. Likewise, a forward expansion region and a backward expansion region corresponding to the second answering region can be obtained, one second sub-segment of the second test question segment is obtained according to the text in the forward expansion region, and the other second sub-segment of the second test question segment is obtained according to the text in the backward expansion region. Here, the forward expansion region and the backward expansion region are defined relative to the text line direction. For example, referring to fig. 4, in the page to be judged 400A, the region 402A and the region 403A are each an adjacent region of the first answering region 401A, and the region obtained by combining the region 402A with the first answering region 401A is a forward expansion region of the first answering region 401A. In the answer sheet 400B, the region 402B and the region 403B are each an adjacent region of the second answering region 401B, and the region obtained by combining the region 402B with the second answering region 401B is a forward expansion region of the second answering region 401B.
In one embodiment, the first test question section may include a first sub-section, and the second test question section may include a second sub-section.
In one embodiment, referring to fig. 5, step S301 includes:
S501, acquiring a first expansion area containing a first answering area and a first test question segment based on the first answering area contained in the page to be judged;
For example, the first answering area may be expanded forward, expanded backward, or expanded both forward and backward to obtain the first expansion area. The expansion may be performed by the length of the first answering area itself, for example, the first answering area is expanded forward by an area of the same size as the first answering area, or expanded backward by an area of the same size as the first answering area.
For example, referring to fig. 4, in the page to be judged 400A, the region 402A and the region 403A are each an adjacent region of the first answering region 401A; the region obtained by combining the region 402A with the first answering region 401A is a forward expansion region of the first answering region 401A, the region obtained by combining the region 403A with the first answering region 401A is a backward expansion region of the first answering region 401A, and the region obtained by combining the region 402A and the region 403A with the first answering region 401A is a forward and backward expansion region of the first answering region 401A. Similarly, in the answer sheet 400B, the region 402B and the region 403B are each an adjacent region of the second answering region 401B; the region obtained by combining the region 402B with the second answering region 401B is a forward expansion region of the second answering region 401B, the region obtained by combining the region 403B with the second answering region 401B is a backward expansion region of the second answering region 401B, and the region obtained by combining the region 402B and the region 403B with the second answering region 401B is a forward and backward expansion region of the second answering region 401B.
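The expansion of an answering area by its own length can be sketched geometrically as follows. The sketch assumes axis-aligned boxes, a left-to-right text line, and that "forward" means toward the start of the line; the `Box` type, the `expand_box` function and the optional clipping to the page width are hypothetical details added for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    left: float
    top: float
    width: float
    height: float


def expand_box(box: Box, forward: bool = True, backward: bool = True,
               page_width: Optional[float] = None) -> Box:
    """Expand an answering area along the text line by its own width.

    forward:  add a same-sized region toward the start of the line (assumed: left).
    backward: add a same-sized region toward the end of the line (assumed: right).
    """
    left = box.left - box.width if forward else box.left
    right = box.left + box.width + (box.width if backward else 0)
    left = max(0.0, left)                  # stay inside the page on the left
    if page_width is not None:
        right = min(page_width, right)     # stay inside the page on the right
    return Box(left, box.top, right - left, box.height)
```

Expanding `Box(100, 50, 40, 20)` in both directions yields a region three times as wide that starts at x = 60; the text recognized in that region outside the original box would then supply the test question segment.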
S502, extracting a first test question segment contained in the first expansion area;
For example, when the first test question segment contained in the first expansion area is extracted, a text recognition model can be used to extract the first test question segment contained in the first expansion area. The text recognition model may be trained in advance, and in order to prevent the text recognition model from recognizing handwritten non-question text, the text recognition model may be a recognition model that recognizes printed text only.
Therefore, as an alternative embodiment, extracting the first test question segment contained in the first expansion area may include: extracting the printed text in the first expansion area to obtain the first test question segment. It can be understood that, when the first test question segment is obtained in this way, the answering text of the corresponding page to be judged is a handwritten text, and the test question text of the corresponding page to be judged is a printed text.
Of course, in a particular application, such as an examination on a computer, the questions of the test paper may be set as handwritten text while the examinee inputs answers on the computer in printed form. In that case, the answering text of the page to be judged is a printed text, the corresponding test question text of the page to be judged is a handwritten text, and the handwritten text in the first expansion area can be extracted to obtain the first test question segment.
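Keeping only the text of one writing style can be sketched as below. The sketch assumes that a recognition step returns each piece of text in the expansion area together with a style label ("printed" or "handwritten"); that output format and the name `extract_question_segment` are assumptions made for illustration, not the interface of any particular model.

```python
from typing import Iterable, Tuple


def extract_question_segment(items: Iterable[Tuple[str, str]],
                             question_style: str = "printed") -> str:
    """Join the recognized pieces whose style matches the test question text.

    items: (text, style) pairs in reading order, where style is assumed to be
           "printed" or "handwritten".
    """
    return "".join(text for text, style in items if style == question_style)


# Hypothetical recognition result for one expansion area.
recognized = [("12+30=", "printed"), ("42", "handwritten")]
```

Calling `extract_question_segment(recognized)` keeps the printed question text, while passing `question_style="handwritten"` covers the case in which the test questions are handwritten and the answers are printed.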
Referring to fig. 6, extracting a second test question segment included in the answer page matched with the page to be judged includes:
S601, acquiring a second expansion area containing a second answering area and a second test question segment based on the second answering area contained in the answer page;
S602, extracting a second test question segment contained in the second expansion area.
For example, when the second test question segment contained in the second expansion area is extracted, a text recognition model can be used to extract the second test question segment contained in the second expansion area. The text recognition model may be trained in advance, and in order to prevent the text recognition model from recognizing handwritten non-question text, the text recognition model may be a recognition model that recognizes printed text only.
Therefore, as an alternative embodiment, extracting the second test question segment contained in the second expansion area may include: extracting the printed text in the second expansion area to obtain the second test question segment. It can be understood that, when the second test question segment is obtained in this way, the answering text of the corresponding answer page is a handwritten text, and the test question text of the corresponding answer page is a printed text.
Of course, in a particular application, such as an examination on a computer, the questions of the test paper may be set as handwritten text while the answers are input on the computer in printed form. In that case, the answering text of the answer page is a printed text, the corresponding test question text of the answer page is a handwritten text, and the handwritten text in the second expansion area can be extracted to obtain the second test question segment.
Referring to fig. 7, as an embodiment of determining whether the first test question segment associated with the first answering area is text-matched with the second test question segment associated with each second answering area, before outputting, when the text of the first test question segment matches the text of the second test question segment, a question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment, the method includes the following steps:
S701, determining the position deviation between the first answering area and each second answering area;
S702, determining the priority of each second answering area according to the sequence of the position deviation from small to large;
S703, judging whether the first test question segment associated with the first answering area is text-matched with the second test question segment associated with each second answering area according to the sequence of the priorities of the second answering areas;
If the first test question segment associated with the first answering area is text-matched with the second test question segment associated with the second answering area of the i-th priority, outputting a question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment; otherwise, judging whether the first test question segment associated with the first answering area is text-matched with the second test question segment associated with the second answering area of the (i+1)-th priority; wherein i is an integer greater than or equal to 1 and less than or equal to N, N is an integer greater than or equal to 1, and N is less than the total number of the second answering areas of the answer page.
In this embodiment, the position deviation between the first answering area and each second answering area is used to determine the priority of each second answering area, and whether the first test question segment associated with the first answering area is text-matched with the second test question segment associated with each second answering area is then determined according to the sequence of the priorities of the second answering areas. Therefore, the first test question segment and the second test question segment corresponding to the first answering area and the second answering area that are most likely to match are compared first, which improves the judgment efficiency. Meanwhile, even when a first answering area and a second answering area do not correspond to each other, the first test question segment of the first answering area may still be the same as the second test question segment of the second answering area, because different test questions may contain the same test question segment. In this embodiment, whether the first test question segment associated with the first answering area is text-matched with the second test question segment associated with each second answering area is judged in the sequence of the priorities determined based on the position deviation, so that the probability of misjudgment can be reduced.
Accordingly, when it is determined that the second answering areas of the first priority and the second priority do not match the corresponding first answering area, the probability that the second answering areas of the remaining priorities match the corresponding first answering area is extremely low, and continuing may cause the above-mentioned misjudgment. The matching judgment between the second answering areas of the remaining priorities and the corresponding first answering area can therefore be stopped, so as to reduce unnecessary judgment and also reduce the probability of misjudgment. After the judgment is stopped, the test question concerned can be judged manually.
In step S701, the position deviation between the first answering area and the second answering area is used to indicate the position relationship between the first answering area and the second answering area, so that the smaller the position deviation, the more likely the corresponding first answering area and second answering area match. For example, if a first answering area is located near the center coordinates of the page to be judged and a second answering area is located near the center coordinates of the answer page, the position deviation between the first answering area and the second answering area is small, and accordingly, the probability that the first answering area matches the second answering area is high.
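Steps S701 to S703, together with the early stop after the first few priorities, can be sketched as follows. The sketch assumes that the position deviation of each second answering area has already been computed and that text matching is exact string equality; the name `find_matching_answer_area` and the parameter `max_tries` (playing the role of N) are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def find_matching_answer_area(first_segment: str,
                              candidates: Sequence[Tuple[float, str]],
                              max_tries: int = 2) -> Optional[int]:
    """Return the index of the matching second answering area, or None.

    candidates: one (position_deviation, second_segment) pair per second answering area.
    max_tries:  number of highest-priority candidates to test before giving up,
                after which the question would be left for manual judging.
    """
    # S702: smaller position deviation means higher priority.
    order = sorted(range(len(candidates)), key=lambda i: candidates[i][0])
    # S703: compare test question segments in priority order, stopping early.
    for index in order[:max_tries]:
        if candidates[index][1] == first_segment:
            return index
    return None


# Two second answering areas share the segment "2+3="; the nearer one is chosen.
areas = [(30.0, "2+3="), (5.0, "7-1="), (12.0, "2+3=")]
```

Here `find_matching_answer_area("2+3=", areas)` tests the candidate with deviation 5.0 first, then the one with deviation 12.0, and returns index 2 rather than the more distant duplicate at index 0.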
Referring to fig. 8, the specific steps of step S701 may include:
S801, determining a position quantization parameter of the first answering area and a position quantization parameter of each second answering area;
S802, obtaining the position deviation between the first answering area and the corresponding second answering area according to the position quantization parameter of the first answering area and the position quantization parameter of the corresponding second answering area.
The position quantization parameter of the first answering area in step S801 is a quantization parameter that can be used to indicate the position of the first answering area. For example, the position quantization parameter of the first answering area may be a first multidimensional vector, and the first multidimensional vector is a distance matrix between the center point of the first answering area and the reference position of the page to be judged. The position quantization parameter of the second answering area is a quantization parameter that can be used to indicate the position of the second answering area. For example, the position quantization parameter of the second answering area is a second multidimensional vector, and the second multidimensional vector is a distance matrix between the center point of the second answering area and the reference position of the answer page.
When the position quantization parameter of the first answering area is a first multidimensional vector and the position quantization parameter of the second answering area is a second multidimensional vector, elements at the same position of the first multidimensional vector and the second multidimensional vector have the same attribute. For example, when an element of the first multidimensional vector is the distance between the top left vertex of the page to be judged and the center of the first answering area, the element at the same position of the second multidimensional vector is the distance between the top left vertex of the answer page and the center of the second answering area.
The present disclosure exemplifies one form of the position quantization parameter of the first answering area and the position quantization parameter of the second answering area, but the position quantization parameters are not limited to this form and may be parameters of other forms capable of determining the position deviation between the first answering area and the second answering area. For example, the position quantization parameter of the first answering area may be a relative coordinate parameter of the first answering area in the page to be judged, and the position quantization parameter of the second answering area may be a relative coordinate parameter of the second answering area in the answer page.
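One possible way to turn two position quantization parameters into a position deviation (step S802) is the Euclidean distance between the two vectors. The disclosure does not fix a particular formula, so the choice of Euclidean distance and the name `position_deviation` are assumptions of this sketch.

```python
import math
from typing import Sequence


def position_deviation(first_vector: Sequence[float],
                       second_vector: Sequence[float]) -> float:
    """Euclidean distance between two position quantization vectors (assumed metric).

    Elements at the same index must describe the same attribute, for example the
    distance from the answering area center to the top left vertex of its page.
    """
    if len(first_vector) != len(second_vector):
        raise ValueError("both vectors must have the same number of elements")
    return math.dist(first_vector, second_vector)
```

Identical vectors give a deviation of 0, and the deviation grows as the answering areas sit at more different positions relative to their respective pages.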
Referring to fig. 9, the step S801 may include:
S901, quantizing the position information of the first answering area by using the reference position of the page to be judged to obtain a quantization parameter of the first answering area;
S902, quantizing the position information of the second answering area by using the reference position of the answer page to obtain a quantization parameter of the second answering area.
In step S901, the reference position of the page to be judged may include at least one vertex position and/or a point position on at least one edge of the page to be judged, and the reference position of the answer page includes at least one vertex position and/or a point position on at least one edge of the answer page.
For example, the reference positions of the page to be judged are a top left corner vertex position and a top edge midpoint position, and the reference positions of the answer page are a top left corner vertex position and a top edge midpoint position.
Since the page to be judged may have a background, so that the 4 vertexes of the page to be judged cannot be obtained directly, the vertex positions can be identified by various detection models, for example, by using the target detection network model CenterNet. The target detection network model mainly comprises a residual network for extracting picture features, a deconvolution module for upsampling the feature map, and branch convolution networks for predicting the width and height of a target and the coordinates of the center point of the target, and this known neural network model is trained with training samples corresponding to the specific application of the present disclosure. It is understood that the neural network model may also be another network model capable of such detection.
When the reference position of the page to be judged comprises at least one vertex position of the page to be judged, the at least one vertex position at least comprises two vertex positions of the page to be judged that are diagonal to each other; when the reference position of the answer page comprises at least one vertex position of the answer page, the at least one vertex position at least comprises two vertex positions of the answer page that are diagonal to each other. The two diagonal vertex positions of the page to be judged better indicate the overall position information of the page to be judged, and the two diagonal vertex positions of the answer page better indicate the overall position information of the answer page. Therefore, when the position information of the first answering area and the second answering area is quantized by using reference positions that include two diagonal vertex positions, the obtained quantization parameters can better express the position information of the first answering area or the position information of the second answering area.
When the reference position of the page to be judged comprises a point position on at least one edge, the point position on the at least one edge is the midpoint position of the corresponding edge of the page to be judged. When the reference position of the answer page comprises a point position on at least one edge, the point position on the at least one edge is the midpoint position of the corresponding edge of the answer page.
In other words, the reference position of the page to be judged may include the midpoint between adjacent vertex positions of the page to be judged, and the reference position of the answer page may include the midpoint between adjacent vertex positions of the answer page.
The document processing method of the embodiment of the disclosure may use an existing detection model to detect the vertex positions, for example, identify the vertexes of the page to be judged or the answer page by using a neural network model capable of detecting vertexes, where the neural network model may be a target detection network model. The target detection network model mainly comprises a residual network for extracting picture features, a deconvolution module for upsampling the feature map, and branch convolution networks for predicting the width and height of a target and the coordinates of the center point of the target, and this known neural network model is trained with training samples corresponding to the specific application of the present disclosure. It will be appreciated that the neural network model may be any other network model that can implement vertex position detection.
In one embodiment, the reference position of the page to be judged comprises the four vertexes of the page to be judged and four midpoints, wherein the four midpoints are respectively the midpoints of the edges formed by every two adjacent vertexes among the four vertexes; first distances from the coordinates of the center point of each first answering area to the four vertexes and the four midpoints of the page to be judged are calculated, and the quantization parameter of the first answering area is obtained according to the first distances.
Similarly, the reference position of the answer page comprises the four vertexes of the answer page and four midpoints, wherein the four midpoints are respectively the midpoints of the edges formed by every two adjacent vertexes among the four vertexes of the answer page; second distances from the coordinates of the center point of each second answering area to the four vertexes and the four midpoints of the answer page are calculated, and the quantization parameter of the second answering area is obtained according to the second distances.
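The eight-distance quantization described above can be sketched as follows. The sketch assumes that the four page vertexes are already known (for example from a detection model) and are given in order around the page; the function names `reference_points` and `quantize_position` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def reference_points(vertices: Sequence[Point]) -> List[Point]:
    """Return the four vertexes followed by the midpoints of the four edges.

    vertices: the four page corners listed in order around the page, so that
              consecutive entries (and the last with the first) share an edge.
    """
    corners = list(vertices)
    if len(corners) != 4:
        raise ValueError("exactly four vertexes are expected")
    neighbours = corners[1:] + corners[:1]
    midpoints = [((x1 + x2) / 2, (y1 + y2) / 2)
                 for (x1, y1), (x2, y2) in zip(corners, neighbours)]
    return corners + midpoints


def quantize_position(center: Point, vertices: Sequence[Point]) -> List[float]:
    """Distances from an answering area center to the eight reference points."""
    return [math.dist(center, point) for point in reference_points(vertices)]
```

The same function would be applied to the page to be judged and to the answer page, each with its own vertexes, so that elements at the same index of the two resulting vectors describe the same reference point.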
Referring to fig. 10, fig. 10 discloses a document processing apparatus including:
the first extraction module 1001 is used for extracting a first test question segment contained in the page to be judged, and the first test question segment is associated with a first answering area contained in the page to be judged;
a second extraction module 1002, configured to extract a second test question segment included in the answer page matched with the page to be judged, where the second test question segment is associated with a corresponding second answering area included in the answer page;
the output module 1003 is configured to, when the text of the first test question segment matches the text of the second test question segment, output a question deciding result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment.
In an alternative embodiment, the positional association of the first question segments with the respective first answering areas includes the positional adjacency of the first question segments with the respective first answering areas, and the positional association of the second question segments with the respective second answering areas includes the positional adjacency of the second question segments with the respective second answering areas.
In an optional embodiment, the positional association of the first test question segment with the corresponding first answering area includes a preset number of characters spaced between the first test question segment and the corresponding first answering area, and the positional association of the second test question segment with the corresponding second answering area includes a preset number of characters spaced between the second test question segment and the corresponding second answering area.
In an alternative embodiment, the first test question segment comprises a first sub-segment, and the second test question segment comprises a second sub-segment.
In an alternative embodiment, the first test question segment includes two first sub-segments, and the second test question segment includes two second sub-segments, wherein the first answering area is located between the two first sub-segments, and the second answering area is located between the two second sub-segments.
In an optional embodiment, the first extracting module 1001, when configured to extract a first test question segment included in a page to be judged, is specifically configured to:
acquiring a first expansion area containing a first answering area and a first test question segment based on the first answering area contained in the page to be judged;
extracting a first test question segment contained in the first expansion area;
the second extracting module 1002, when configured to extract a second test question segment contained in the answer page matched with the page to be judged, is specifically configured to perform:
acquiring a second expansion area containing a second answering area and a second test question segment based on a second answering area contained in the answer page;
and extracting a second test question segment contained in the second expansion area.
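The expansion-area steps can be sketched as follows. This is an illustration only: the `Box` type, the margins `dx` and `dy`, and the token list of `(box, text)` pairs (assumed to come from an upstream text detector) are not specified by the disclosure.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    x0: float
    y0: float
    x1: float
    y1: float


def expand(box: Box, dx: float, dy: float, page_w: float, page_h: float) -> Box:
    """Grow the answering area by hypothetical margins, clamped to the page."""
    return Box(max(0.0, box.x0 - dx), max(0.0, box.y0 - dy),
               min(page_w, box.x1 + dx), min(page_h, box.y1 + dy))


def inside(inner: Box, outer: Box) -> bool:
    """True when `inner` lies completely within `outer`."""
    return (inner.x0 >= outer.x0 and inner.y0 >= outer.y0
            and inner.x1 <= outer.x1 and inner.y1 <= outer.y1)


def question_segment(answer_area: Box, tokens: List[Tuple[Box, str]],
                     dx: float, dy: float, page_w: float, page_h: float) -> str:
    """Collect the text inside the expansion area but outside the answering area."""
    area = expand(answer_area, dx, dy, page_w, page_h)
    kept = [(b, t) for b, t in tokens
            if inside(b, area) and not inside(b, answer_area)]
    kept.sort(key=lambda item: item[0].x0)  # read left to right
    return " ".join(t for _, t in kept)
```

The same function would be applied to the page to be judged and to the answer page, so that both segments are extracted under identical rules.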
In an optional embodiment, the answer text of the page to be judged is handwritten text, and the test question text of the page to be judged is printed text;
the first extracting module 1001, when extracting the first test question segment contained in the first expansion area, is specifically configured to extract the printed text in the first expansion area to obtain the first test question segment.
Correspondingly, the answer text of the answer page is handwritten text, and the test question text of the answer page is printed text;
the second extracting module 1002, when extracting the second test question segment contained in the answer page matched with the page to be judged, is specifically configured to extract the printed text in the second expansion area to obtain the second test question segment.
In an optional embodiment, the answer text of the page to be judged is printed text, and the test question text of the page to be judged is handwritten text;
the first extracting module 1001, when extracting the first test question segment contained in the first expansion area, is specifically configured to extract the handwritten text in the first expansion area to obtain the first test question segment.
The answer text of the answer page is printed text, and the test question text of the answer page is handwritten text;
the second extracting module 1002, when extracting the second test question segment contained in the second expansion area, is specifically configured to extract the handwritten text in the second expansion area to obtain the second test question segment.
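Both variants (printed questions with handwritten answers, and the reverse) reduce to keeping only the text of one kind inside the expansion area. A minimal sketch, assuming each recognized token already carries a `kind` label from some upstream printed/handwritten classifier; the `Token` type and the label values are hypothetical.

```python
from typing import Iterable, NamedTuple


class Token(NamedTuple):
    text: str
    kind: str  # "printed" or "handwritten"; assumed output of an upstream classifier


def extract_segment(tokens: Iterable[Token], question_kind: str) -> str:
    """Keep only the tokens whose kind matches the kind used for test questions."""
    if question_kind not in ("printed", "handwritten"):
        raise ValueError("question_kind must be 'printed' or 'handwritten'")
    return " ".join(t.text for t in tokens if t.kind == question_kind)


area_tokens = [Token("3 + 4 =", "printed"), Token("7", "handwritten")]
print(extract_segment(area_tokens, "printed"))      # question text when questions are printed
print(extract_segment(area_tokens, "handwritten"))  # question text when questions are handwritten
```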
In an optional embodiment, the apparatus further comprises a text matching determination module, configured to:
determining a position deviation between the first answering area and each second answering area;
determining the priority of each second answering area according to the sequence of the position deviation from small to large;
judging, in order of the priorities of the second answering areas, whether the first test question segment associated with the first answering area matches the text of the second test question segment associated with each second answering area;
if the first test question segment associated with the first answering area matches the text of the second test question segment associated with the second answering area with the i-th priority, outputting a question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment; otherwise, judging whether the first test question segment associated with the first answering area matches the text of the second test question segment associated with the second answering area with the (i+1)-th priority; wherein i is an integer greater than or equal to 1 and less than or equal to N, N is an integer greater than or equal to 1, and N is less than the total number of the second answering areas of the answer page.
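The priority-ordered search can be sketched as a short loop. The `Candidate` tuple layout, the function name and the pluggable `texts_match` callable are assumptions for illustration; only the ordering by increasing position deviation and the limit of N candidates come from the description.

```python
from typing import Callable, List, Optional, Tuple

# (position deviation, second test question segment text, second answering area text)
Candidate = Tuple[float, str, str]


def find_matching_candidate(first_segment: str,
                            candidates: List[Candidate],
                            n: int,
                            texts_match: Callable[[str, str], bool]
                            ) -> Optional[Candidate]:
    """Try second answering areas from smallest to largest position deviation.

    Only the first `n` priorities are examined; None means no match was found.
    """
    ranked = sorted(candidates, key=lambda c: c[0])  # smallest deviation first
    for candidate in ranked[:n]:
        if texts_match(first_segment, candidate[1]):
            return candidate
    return None
```

Bounding the search by N avoids comparing against every answering area on the answer page while still tolerating small layout shifts between the two pages.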
Specifically, the text matching determination module, when determining the position deviation between the first answering area and each second answering area, is configured to perform:
determining a position quantization parameter of the first answering area and a position quantization parameter of each second answering area;
and obtaining the position deviation of the first answering area and the corresponding second answering area according to the position quantization parameter of the first answering area and the position quantization parameter of the corresponding second answering area.
In an optional implementation manner, the text matching determination module, when determining the position quantization parameter of the first answering area and the position quantization parameter of each second answering area, is specifically configured to perform:
quantizing the position information of the first answering area by using the reference position of the page to be judged to obtain a quantization parameter of the first answering area;
and quantizing the position information of the second answering area by using the reference position of the answer page to obtain a quantization parameter of the second answering area.
In an alternative embodiment, the quantization parameter of the first answering area can be a first multi-dimensional vector, and the first multi-dimensional vector can be a distance matrix between the center point of the first answering area and the reference position of the page to be judged;
the quantization parameter of the second answering area may be a second multi-dimensional vector, and the second multi-dimensional vector may be a distance matrix between the center point of the second answering area and the reference position of the answer page.
In an alternative embodiment, elements at the same position in the first multi-dimensional vector and the second multi-dimensional vector have the same attribute.
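A minimal sketch of the position quantization and deviation follows. The function names are hypothetical, and taking the Euclidean distance between the two vectors as the position deviation is an assumption; the description only states that the deviation is obtained from the two quantization parameters.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
BoxT = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def center(box: BoxT) -> Point:
    """Center point of an answering area."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def quantize(box: BoxT, refs: Sequence[Point]) -> List[float]:
    """Distances from the area's center to each reference position of its page.

    The reference positions must be listed in the same order for both pages so
    that elements at the same index describe the same kind of distance.
    """
    c = center(box)
    return [math.dist(c, ref) for ref in refs]


def position_deviation(v1: Sequence[float], v2: Sequence[float]) -> float:
    """Assumed deviation measure: Euclidean distance between the two vectors."""
    if len(v1) != len(v2):
        raise ValueError("quantization vectors must have the same dimension")
    return math.dist(v1, v2)
```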
In an alternative embodiment, the reference position of the page to be judged may include at least one vertex position and/or a point position on at least one edge of the page to be judged, and the reference position of the answer page includes at least one vertex position and/or a point position on at least one edge of the answer page.
In an optional embodiment, when the reference position of the page to be judged comprises at least one vertex position of the page to be judged, the at least one vertex position of the page to be judged at least comprises two vertex positions which are opposite angles with each other of the page to be judged;
when the reference position of the answer page comprises at least one vertex position of the answer page, the at least one vertex position of the answer page at least comprises two vertex positions which are opposite angles with each other of the answer page.
In an optional embodiment, when the reference position of the page to be judged includes a point position on at least one edge, the point position is the midpoint of the corresponding edge of the page to be judged; that is, the reference position of the page to be judged includes the point halfway between two adjacent vertex positions of the page to be judged.
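One possible choice of reference positions consistent with these embodiments is two diagonally opposite vertices plus the four edge midpoints. The sketch below is an illustration; the function name, the coordinate origin at the top-left corner and the fixed ordering are assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def reference_positions(width: float, height: float) -> List[Point]:
    """Reference positions for a page whose top-left corner is the origin.

    Returns two diagonally opposite vertices followed by the midpoints of the
    top, right, bottom and left edges, always in this order.
    """
    top_left = (0.0, 0.0)
    bottom_right = (width, height)
    midpoints = [
        (width / 2.0, 0.0),      # top edge
        (width, height / 2.0),   # right edge
        (width / 2.0, height),   # bottom edge
        (0.0, height / 2.0),     # left edge
    ]
    return [top_left, bottom_right] + midpoints
```

Calling the same function for the page to be judged and for the answer page keeps the order of reference positions identical, which is what makes the resulting distance vectors comparable element by element.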
An exemplary embodiment of the present disclosure also provides an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor. The memory stores a computer program executable by the at least one processor, the computer program, when executed by the at least one processor, is for causing the electronic device to perform a method according to an embodiment of the present disclosure.
The disclosed exemplary embodiments also provide a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the methods of the disclosed embodiments.
The exemplary embodiments of the present disclosure also provide a computer program product comprising a computer program, wherein the computer program, when being executed by a processor of a computer, is adapted to cause the computer to carry out the method according to the embodiments of the present disclosure.
Referring to fig. 11, a block diagram of the structure of an electronic device 1100 will now be described. The electronic device 1100 may be a server or a client of the present disclosure and is an example of a hardware device to which aspects of the present disclosure may be applied. The electronic device is intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 11, the electronic device 1100 includes a computing unit 1101, which can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1102 or a computer program loaded from a storage unit 1108 into a Random Access Memory (RAM) 1103. The RAM 1103 may also store various programs and data necessary for the operation of the electronic device 1100. The computing unit 1101, the ROM 1102, and the RAM 1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to the bus 1104.
A number of components in the electronic device 1100 are connected to the I/O interface 1105, including: an input unit 1106, an output unit 1107, a storage unit 1108, and a communication unit 1109. The input unit 1106 may be any type of device capable of inputting information to the electronic device 1100, and the input unit 1106 may receive input numeric or character information and generate key signal inputs related to user settings and/or function controls of the electronic device. The output unit 1107 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 1108 may include, but is not limited to, a magnetic disk and an optical disk. The communication unit 1109 allows the electronic device 1100 to exchange information/data with other devices via a computer network, such as the Internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as Bluetooth devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.
The computing unit 1101 can be any of a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 1101 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 1101 performs the respective methods and processes described above. For example, in some embodiments, the aforementioned document processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 1108. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 1100 via the ROM 1102 and/or the communication unit 1109. In some embodiments, the computing unit 1101 may be configured to perform the document processing method by any other suitable means (e.g., by means of firmware).
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
As used in this disclosure, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

Claims (17)

1. A method of document processing, comprising:
extracting a first test question segment contained in a page to be judged, wherein the first test question segment is associated with a first answering area contained in the page to be judged;
extracting a second test question segment contained in an answer page matched with the page to be judged, wherein the second test question segment is associated with a corresponding second answering area contained in the answer page;
and when the texts of the first test question segment and the second test question segment are matched, outputting a question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment.
2. The method of claim 1, wherein the positional association of the first test question segment with the corresponding first answering area comprises positional adjacency between the first test question segment and the corresponding first answering area, and the positional association of the second test question segment with the corresponding second answering area comprises positional adjacency between the second test question segment and the corresponding second answering area.
3. The method of claim 1, wherein the positional association of the first test question segment with the corresponding first answering area comprises a preset number of characters spaced between the first test question segment and the corresponding first answering area, and the positional association of the second test question segment with the corresponding second answering area comprises a preset number of characters spaced between the second test question segment and the corresponding second answering area.
4. The method of claim 1, wherein the first question segment comprises a first sub-segment, and the second question segment comprises a second sub-segment;
or,
the first test question segment comprises two first sub-segments, the second test question segment comprises two second sub-segments, the first answering area is located between the two first sub-segments, and the second answering area is located between the two second sub-segments.
5. The method according to any one of claims 1 to 4, wherein the extracting of the first test question segment contained in the page to be judged comprises:
acquiring a first expansion area containing the first answering area and the first test question segment based on the first answering area contained in the page to be judged;
extracting a first test question segment contained in the first expansion area;
the extracting of the second test question segment contained in the answer page matched with the page to be judged comprises:
acquiring a second expansion area containing a second answering area and a second test question segment based on the second answering area contained in the answer page;
and extracting a second test question segment contained in the second expansion area.
6. The method according to claim 5, wherein the answer text of the page to be judged is handwritten text, and the test question text of the page to be judged is printed text;
the extracting of the first test question segment contained in the first expansion area comprises: extracting the printed text in the first expansion area to obtain the first test question segment;
or,
the answer text of the page to be judged is printed text, and the test question text of the page to be judged is handwritten text;
the extracting of the first test question segment contained in the first expansion area comprises: extracting the handwritten text in the first expansion area to obtain the first test question segment.
7. The method of claim 1, wherein before outputting the question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment when the texts of the first test question segment and the second test question segment are matched, the method further comprises:
determining a position deviation between the first answering area and each of the second answering areas;
determining the priority of each second answering area according to the sequence of the position deviation from small to large;
judging, in order of the priorities of the second answering areas, whether the first test question segment associated with the first answering area matches the text of the second test question segment associated with each second answering area;
if the first test question segment associated with the first answering area matches the text of the second test question segment associated with the second answering area with the i-th priority, executing the outputting of the question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment; otherwise, judging whether the first test question segment associated with the first answering area matches the text of the second test question segment associated with the second answering area with the (i+1)-th priority; wherein i is an integer greater than or equal to 1 and less than or equal to N, N is an integer greater than or equal to 1, and N is less than the total number of the second answering areas of the answer page.
8. The method of claim 7, wherein the determining of the position deviation between the first answering area and each of the second answering areas comprises:
determining a position quantization parameter of the first answering area and a position quantization parameter of each second answering area;
and obtaining the position deviation of the first answering area and the corresponding second answering area according to the position quantization parameter of the first answering area and the position quantization parameter of the corresponding second answering area.
9. The method of claim 8, wherein the determining of the position quantization parameter of the first answering area and the position quantization parameter of each of the second answering areas comprises:
quantizing the position information of the first answering area by using the reference position of the page to be judged to obtain a quantization parameter of the first answering area;
and quantizing the position information of the second answering area by using the reference position of the answer page to obtain a quantization parameter of the second answering area.
10. The method according to claim 9, wherein the quantization parameter of the first answering area is a first multidimensional vector, and the first multidimensional vector is a distance matrix between the center point of the first answering area and the reference position of the page to be judged;
the quantization parameter of the second answering area is a second multidimensional vector, and the second multidimensional vector is a distance matrix between the center point of the second answering area and the reference position of the answer page.
11. The method of claim 10, wherein the same positional element attributes of the first multidimensional vector and the second multidimensional vector are the same.
12. The method according to any one of claims 8 to 10, wherein the reference position of the page to be judged comprises at least one vertex position and/or a point position on at least one edge of the page to be judged, and the reference position of the answer page comprises at least one vertex position and/or a point position on at least one edge of the answer page.
13. The method according to claim 12, characterized in that, when the reference position of the page to be judged comprises at least one vertex position of the page to be judged, the at least one vertex position of the page to be judged comprises at least two vertex positions of the page to be judged which are diagonal to each other;
when the reference position of the answer page comprises at least one vertex position of the answer page, the at least one vertex position of the answer page at least comprises two vertex positions which are opposite angles with each other of the answer page.
14. The method according to claim 13, wherein when the reference position of the page to be judged comprises a point position on at least one edge, the point position on the at least one edge is the midpoint of the corresponding edge of the page to be judged;
that is, the reference position of the page to be judged comprises the point halfway between two adjacent vertex positions of the page to be judged.
15. A document processing apparatus, comprising:
the first extraction module is used for extracting a first test question segment contained in a page to be judged, and the first test question segment is associated with a first answering area contained in the page to be judged;
the second extraction module is used for extracting a second test question segment contained in the answer page matched with the page to be judged, and the second test question segment is associated with a corresponding second answering area contained in the answer page;
and the output module is used for outputting a question judging result according to the text in the first answering area associated with the position of the first test question segment and the text in the second answering area associated with the position of the corresponding second test question segment when the texts of the first test question segment and the second test question segment are matched.
16. An electronic device, comprising:
a processor; and
a memory storing a program,
wherein the program comprises instructions which, when executed by the processor, cause the processor to carry out the method according to any one of claims 1-14.
17. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-14.
CN202110800924.XA 2021-07-15 2021-07-15 Document processing method and device, electronic equipment and computer readable storage medium Active CN113255629B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110800924.XA CN113255629B (en) 2021-07-15 2021-07-15 Document processing method and device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110800924.XA CN113255629B (en) 2021-07-15 2021-07-15 Document processing method and device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN113255629A true CN113255629A (en) 2021-08-13
CN113255629B CN113255629B (en) 2022-08-02

Family

ID=77180395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110800924.XA Active CN113255629B (en) 2021-07-15 2021-07-15 Document processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113255629B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113486171A (en) * 2021-09-07 2021-10-08 北京世纪好未来教育科技有限公司 Image processing method and device and electronic equipment

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105469337A (en) * 2016-01-11 2016-04-06 北京华派科技股份有限公司 Data processing method and data processing device
US20160328469A1 (en) * 2015-05-04 2016-11-10 Shanghai Xiaoi Robot Technology Co., Ltd. Method, Device and Equipment for Acquiring Answer Information
CN108764201A (en) * 2018-06-06 2018-11-06 宁波宁大教育设备有限公司 A kind of intelligence correction method
CN109815932A (en) * 2019-02-02 2019-05-28 杭州大拿科技股份有限公司 A kind of paper corrects method, apparatus, electronic equipment and storage medium
CN111738249A (en) * 2020-08-26 2020-10-02 北京易真学思教育科技有限公司 Image detection method, image detection device, electronic equipment and storage medium
CN112000796A (en) * 2020-08-05 2020-11-27 深圳中兴网信科技有限公司 Answer generation method, electronic device and computer-readable storage medium
CN112307158A (en) * 2019-08-30 2021-02-02 北京字节跳动网络技术有限公司 Information retrieval method, device, equipment and medium
CN112990180A (en) * 2021-04-29 2021-06-18 北京世纪好未来教育科技有限公司 Question judging method, device, equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113486171A (en) * 2021-09-07 2021-10-08 北京世纪好未来教育科技有限公司 Image processing method and device and electronic equipment
CN113486171B (en) * 2021-09-07 2021-12-03 北京世纪好未来教育科技有限公司 Image processing method and device and electronic equipment

Also Published As

Publication number Publication date
CN113255629B (en) 2022-08-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant