CN110705534B - Wrong problem book generation method suitable for electronic typoscope - Google Patents


Info

Publication number: CN110705534B
Application number: CN201910873321.5A
Authority: CN (China)
Other versions: CN110705534A (Chinese)
Prior art keywords: image, line, wrong question, test paper
Inventors: 郑雅羽 (Zheng Yayu), 林斯霞 (Lin Sixia), 石俊山 (Shi Junshan)
Original and current assignee: Zhejiang University of Technology (ZJUT)
Legal status: Active (granted)

Application CN201910873321.5A was filed by Zhejiang University of Technology, published as CN110705534A, and granted as CN110705534B.


Classifications

    • G06V 10/22 — Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition (G Physics › G06 Computing › G06V Image or video recognition or understanding › G06V 10/20 Image preprocessing)
    • G06Q 50/205 — Education administration or guidance (G Physics › G06 Computing › G06Q ICT specially adapted for administrative, commercial, financial, managerial or supervisory purposes › G06Q 50/20 Education)
    • G06V 10/243 — Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations (G06V 10/24 Aligning, centring, orientation detection or correction)
    • G06V 30/153 — Segmentation of character regions using recognition of characters or words (G06V 30/00 Character recognition › G06V 30/14 Image acquisition › G06V 30/148 Segmentation of character regions)

Abstract

The invention relates to a wrong-question book generation method suitable for an electronic visual aid (typoscope). The visual aid captures a complete test paper image in a non-column format; the image is preprocessed, segmented, and stored in a buffer. A neural network model recognizes the teacher's correction marks on the paper; the coordinate midpoint of each recognized mark's target frame is compared with the coordinate ranges of the segmented questions to identify the wrong questions; handwritten and printed fonts are then distinguished and character recognition is performed on the wrong-question images; finally, the processed wrong-question images are stitched together to generate the wrong-question book. The invention avoids both the uncertain segmentation accuracy of a purely neural-network approach and the cost of collecting and labeling its training data, and overcomes the inability of the prior art to automatically crop wrong-question regions and erase correction marks and wrong answers, improving the accuracy of the generated book. Because the method can be embedded in an electronic visual aid, low-vision users can generate a wrong-question book without hand-copying and without manually cropping wrong-question regions or erasing answer regions and correction marks, saving time; they can then read the book and redo the questions.

Description

Wrong-question book generation method suitable for an electronic visual aid (typoscope)
Technical Field
The invention belongs to the technical field of data recognition, data representation, record carriers, and the processing of record carriers, and in particular relates to a wrong-question book generation method, usable as a learning aid, that is suitable for an electronic visual aid.
Background
A wrong-question book is an important means of improving learning efficiency and quality and consolidating the fundamentals of what has been learned.
At present, most students still produce wrong-question books by copying wrong questions into a notebook by hand, which costs considerable time and lowers learning efficiency. For low-vision users, hand-copying is even more cumbersome: they generally need to wear a visual aid while copying; if they instead photograph the paper with a mobile phone for wrong-question recognition, the phone's angle and height must be adjusted for every shot, which is very inconvenient; and transferring a scanner image to a computer or phone is a complex, non-portable workflow.
Much existing wrong-question-book software can be installed on a smart terminal and generate a wrong-question book from photographs, but it requires the user to manually crop the wrong-question regions and erase the correction marks and wrong answers; it cannot generate the book fully automatically. The patent published as CN109472014A obtains image information by user photographing or scanning, applies character and picture recognition based on an A.I. algorithm to obtain the stem and answer of each wrong question, compares the stem against the questions in a question bank, scores the similarity between bank questions and the wrong question, and stores the most similar bank questions in the user's wrong-question bank to form the user's wrong-question book. That method only locates the wrong-question region: it depends on a question bank, requires the user to manually erase the answer and other handwriting to recover the stem, and cannot generate a wrong-question book when no question bank is available or when a whole test paper is supplied. The patent published as CN109710590A identifies the region of each question on the graded paper with a first, neural-network-based region model, identifies questions graded as wrong with a pre-trained wrong-question recognition model, identifies the answer region and/or grading region of each wrong question with a pre-trained second region model, and masks those regions to generate the wrong-question book. That method overcomes the inability of wrong-question-book software and of CN109472014A to automatically identify wrong-question regions and erase correction marks and wrong answers, but it requires training three neural-network models, for which a large number of graded samples must be collected; since question types differ across subjects and grades, the complexity is high and the accuracy of the trained models cannot be guaranteed, and when the second region model masks the wrong-answer correction marks it may also mask question text that the marks overlap.
Disclosure of Invention
The invention addresses these problems in the prior art and provides an optimized wrong-question book generation method suitable for an electronic visual aid. It segments the test paper questions and recognizes the wrong questions by combining digital image processing with neural network techniques, so that a low-vision user can generate and read the wrong-question book directly on the visual aid. The method automatically identifies the wrong-question regions, erases the answer regions and correction marks, and lets the user listen to the OCR-recognized original questions through the TTS voice function, improving learning efficiency.
The technical scheme adopted by the invention is a wrong-question book generation method suitable for an electronic visual aid, comprising the following steps:
Step 1: the electronic visual aid captures a complete test paper image in a non-column format and preprocesses it;
Step 2: the preprocessed test paper image is segmented, and the segmented images are stored in a buffer of the visual aid;
Step 3: the pixel values of the buffered image are read and the correction-mark color regions are identified;
Step 4: a trained neural network model for recognizing correction marks is obtained and used to recognize the correction marks on the test paper;
Step 5: the coordinate midpoint of each recognized correction mark's target frame is compared with the coordinate points of the segmented questions; if any midpoint falls within the coordinate range of a question, that question is deemed a wrong question;
Step 6: handwritten and printed fonts are distinguished in the image of each wrong question, and character recognition is performed on the wrong question;
Step 7: after recognition is complete, the visual aid stitches the processed wrong-question images together to generate the wrong-question book.
Preferably, in step 1 the preprocessing comprises the following steps:
Step 1.1: performing tilt correction on the test paper image so that its left and lower edges are flush with the reference lines;
Step 1.2: converting the tilt-corrected image to grayscale and binarizing it with a gray-value threshold;
Step 1.3: enlarging the length and width of the processed image in equal proportion by a factor of N, where N ≥ 1.
Preferably, step 2 comprises the following steps:
Step 2.1: horizontally projecting the preprocessed test paper image onto the Y axis and segmenting the image into line images based on the horizontal projection;
Step 2.2: storing all line images in order to obtain the complete set of line images to be processed;
Step 2.3: vertically projecting each line image onto the X axis in order and, combined with OCR recognition, determining the cutting coordinates of each question in the test paper image; cutting the test paper image at those coordinates to obtain a target image for each question.
Preferably, step 2.1 comprises the following steps:
Step 2.1.1: scanning the preprocessed test paper image row by row from top to bottom and counting the pixels of each scan row;
Step 2.1.2: obtaining the horizontal projection of the image and determining the positions of the text lines from the projection values;
Step 2.1.3: identifying the intervals between text lines from the blank gaps in the horizontal projection and splitting the text into line images;
Step 2.1.4: recording the top-left and bottom-right coordinates of each line image, taking the top-left corner of the original test paper image as the origin.
Preferably, step 2.3 comprises the following steps:
Step 2.3.1: calling the OCR unit of the visual aid to vertically project each segmented line image, taking the position of the first pixel value encountered from left to right as the start position of the line image, and obtaining the OCR result at the start position; the first line image is taken as the current line;
Step 2.3.2: the OCR unit processes the current line to obtain the recognition result of the vertically projected line image;
if the recognition result at the start position is a main question stem, the top-left and bottom-right coordinate points of the current line image are recorded and the next step is taken;
otherwise no coordinate point is recorded and the next step is taken directly;
Step 2.3.3: the next line becomes the current line, and the OCR unit processes it to obtain the recognition result of the vertically projected line image;
if consecutive Arabic ("lowercase") numerals and symbols appear in the recognition result at the start position, or if the horizontal coordinate of the start position is greater than that of the previous line image, the current line is the first line of a sub-question stem: its top-left coordinate becomes the new top-left coordinate and its bottom-right coordinate the new bottom-right coordinate, and the next step is taken;
otherwise the bottom-right coordinate point of the current line is recorded directly and step 2.3.3 is repeated;
Step 2.3.4: the next line becomes the current line, and the OCR unit processes it to obtain the recognition result of the vertically projected line image;
if no consecutive Arabic numerals and symbols appear in the recognition result at the start position and the horizontal coordinate of the start position is less than or equal to that of the previous line image, the current line belongs to the same question as the previous line: its bottom-right coordinate becomes the updated bottom-right coordinate, and step 2.3.4 is repeated;
if the recognition result at the start position is consecutive English letters and symbols and the horizontal coordinate of the start position is greater than or equal to that of the previous line image, the current line is taken to belong to the same question as the previous line, as an option of a multiple-choice question: its bottom-right coordinate becomes the updated bottom-right coordinate, and step 2.3.4 is repeated;
if consecutive Arabic numerals and symbols appear in the recognition result at the start position, or if the horizontal coordinate of the start position is greater than that of the previous line image, the current line is taken to be the first line of the next sub-question stem, and the method returns to step 2.3.3;
if the recognition result at the start position is a main question stem, the method returns to step 2.3.2;
if the current line image is empty, the next step is taken;
Step 2.3.5: segmenting each main question stem and sub-question stem according to each recorded pair of top-left and bottom-right coordinate points, sorting all segmented main and sub-question stems, and storing them in order;
Step 2.3.6: labeling each main question stem and sub-question stem accordingly.
Preferably, a line is the starting line of a main question stem when the recognition result at its start position contains consecutive Chinese capital numerals (e.g. 一, 二) and symbols, begins with Chinese characters or a combination of Chinese characters with digits and/or English, or begins with a special symbol.
Preferably, step 3 comprises the following steps:
Step 3.1: separating the foreground from the background of the input image based on a threshold;
Step 3.2: determining the colors of the foreground and the background according to the threshold and generating a color table;
Step 3.3: obtaining the addresses of the pixels stored in the visual aid's buffer and determining the value of each pixel;
Step 3.4: determining the color range from the pixel values, locating the pixels with the correction-mark color in the image, and finding and labeling the correction-mark color regions using connected components;
Step 3.5: extracting the contour of each correction-mark color region, recording the position of the contour in the image, and storing that position at an output address.
Preferably, in step 4, recognizing the correction marks on the test paper comprises the following steps:
Step 4.1: testing the test paper image with the trained model and recording the coordinate points of the recognized wrong-answer marks;
Step 4.2: reading the corresponding coordinate points via the address values of the correction-mark color regions in the output buffer;
Step 4.3: determining the wrong-answer marks in the test paper image from the IoU (intersection over union) of the coordinate boxes from steps 4.1 and 4.2;
Step 4.4: obtaining the coordinate values of the four corner points of the target frame of each confirmed wrong-answer mark, and the midpoint of the target frame.
Preferably, step 6 comprises the following steps:
Step 6.1: training a neural network model to distinguish handwritten fonts from printed fonts, the printed fonts being the test paper questions;
Step 6.2: inputting the wrong-question images from step 5; after the model recognizes a handwritten font, judging the color of the font, and not erasing it if the handwriting lies within a correction-mark color region; storing the processed wrong-question images in the image file management library of the visual aid;
Step 6.3: performing OCR character recognition on the images processed in step 6.2 with the visual aid's built-in OCR function, and storing the OCR results in the image file management library of the visual aid in a fixed format.
Preferably, step 7 comprises the following steps:
Step 7.1: confirming that recognition of all images is complete;
Step 7.2: obtaining the images corresponding to the main question stems processed in step 2 and the wrong-question images processed in step 6;
Step 7.3: arranging the wrong-question images in order and taking the first as the current wrong-question image;
Step 7.4: taking the image of the first main question stem found by searching upward from the coordinate point of the current wrong-question image as the main question stem of the current wrong question; if that main-stem image has been called 0 times, stitching the current wrong-question image under it and taking the next step; otherwise stitching the current wrong-question image under the previous wrong-question image and taking the next step;
Step 7.5: if there is a next wrong-question image, taking it as the current wrong-question image and returning to step 7.4; otherwise stitching is complete and the next step is taken;
Step 7.6: generating the wrong-question book.
The invention thus provides an optimized wrong-question generation method suitable for an electronic visual aid: a complete test paper image in non-column format is captured by the visual aid, preprocessed, segmented, and stored in the visual aid's buffer; the correction-mark color regions are identified from the pixel values of the buffered image; a trained neural network model recognizes the correction marks on the paper; the coordinate midpoint of each recognized mark's target frame is compared with the coordinate points of the segmented questions, and a question is deemed wrong when any midpoint falls within its coordinate range; handwritten and printed fonts in the wrong-question image are distinguished and the wrong question is character-recognized; after recognition, the processed wrong-question images are stitched together to generate the wrong-question book.
The method segments each question of the test paper with the horizontal and vertical projection methods of image processing, overcoming both the uncertain segmentation accuracy of a neural network and the complexity of collecting and labeling its training data. A target detection network recognizes the correction marks and, combined with image processing, identifies the wrong questions; a binary-classification neural network distinguishes printed from handwritten fonts so that the handwriting can be erased; the OCR function of the visual aid recognizes the wrong question after the handwriting is erased, removing the influence of the correction marks. This solves the prior art's inability to automatically crop wrong-question regions and erase correction marks and wrong answers, and improves the accuracy of the wrong-question book. Because wrong-question recognition can be embedded in the visual aid, a low-vision user no longer needs to hand-copy to produce a wrong-question book, nor to manually crop wrong-question regions or erase answer regions and correction marks, which saves time; the user can then read the book in several ways and redo the questions.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The present invention is described in further detail with reference to the following examples, but the scope of the present invention is not limited thereto.
The invention relates to a wrong-question book generation method suitable for an electronic visual aid. Among existing visual aids, the electronic visual aid is the most effective, feature-rich, and ergonomic device: its angle and height are adjusted once, it is then fixed, and images are captured through its rear camera. It offers real-time image display, stepless zoom, multi-mode color change, OCR recognition, TTS voice reading, and other functions, and is easy to carry; its native functions can be used to recognize wrong questions in real time and help low-vision users generate wrong-question books.
The electronic visual aid used by the method comprises a visual aid body with a built-in controller and a display screen; a rear camera and an LED are mounted behind the display screen, and a USB port is provided at its side; the display screen, rear camera, LED, and USB port are all connected to the controller. The LED is added on the rear-camera side to obtain good image quality in low light, and the USB port allows the wrong-question book to be exported from the visual aid for printing and storage.
The method obtains the test paper image through the rear camera of the visual aid, tilt-corrects the captured image, segments it into lines by horizontal projection to obtain the coordinates of each line, obtains the top-left and bottom-right coordinate points of each question by vertical projection combined with the visual aid's OCR function, and segments each question of the paper. A correction-mark recognition model then identifies and confirms the wrong-answer marks; a model distinguishing handwritten from printed fonts erases the student's answer regions; the visual aid's OCR recognition reduces or even eliminates the influence of the correction marks on the question text; and image stitching finally generates the wrong-question book.
After the visual aid is switched on it enters real-time mode, and a low-vision user can open the mode selection interface with a key and select the corresponding mode control. If the user chooses image browsing, the camera starts capturing images, which are magnified, recolored, and displayed on screen for reading. If the user chooses wrong-question recognition, the device enters wrong-question recognition mode: the camera starts capturing images, and wrong-question recognition is performed on them to generate the wrong-question book.
The method comprises the following steps.
Step 1: the electronic visual aid captures a complete test paper image in a non-column format and preprocesses it.
In step 1, the preprocessing comprises the following steps:
Step 1.1: performing tilt correction on the test paper image so that its left and lower edges are flush with the reference lines;
Step 1.2: converting the tilt-corrected image to grayscale and binarizing it with a gray-value threshold;
Step 1.3: enlarging the length and width of the processed image in equal proportion by a factor of N, where N ≥ 1.
The invention assumes a conventional lower-grade test paper, for example for the third grade of primary school. On such a paper, main questions and their stems look like "一、Fill in the blanks. (1 point per blank, 22 points in total)" or "二、True or false. (Mark √ if correct, × if wrong.) (5 points)"; the sub-questions are defined as the questions that start, indented, under each main question.
The paper must be in a non-column format; if it is set in columns, it must be folded before being placed under the camera for capture.
Image tilt correction is a common technique in the field and can be realized by many processing means. It mainly corrects the tilt introduced because a low-vision user cannot place and shoot the paper at exactly the same angle each time the image is captured for display or for the wrong-question recognition module. Tilt correction straightens the test paper image, makes it easy to recognize, and is a necessary precursor to horizontal projection.
Tilt correction generally covers two kinds of tilt: in-plane tilt, where the visual aid is parallel to the paper but the paper is rotated, and z-axis tilt, where the camera is at an angle to the paper and the photographed image is distorted. The corrected image is level against the reference lines, which guarantees the segmentation of the questions.
On some papers the gaps between text lines are small, so the whole image is enlarged to widen those gaps and reduce the rate at which two or more text lines are cut together during horizontal segmentation. The magnification can be fixed at a default value, or the low-vision user, seeing the state of the paper, can adjust it directly with the magnification key on the visual aid.
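As an illustration only, a minimal preprocessing sketch in Python with OpenCV follows; the function name, the deskew heuristic based on the minimum-area rectangle of the ink pixels, and the threshold values are assumptions, not taken from the patent.

    # Minimal preprocessing sketch (assumed names and thresholds).
    import cv2

    def preprocess(test_paper_bgr, scale_n=2, gray_threshold=180):
        # Step 1.1 (simplified): estimate the skew angle from the minimum-area
        # rectangle around the ink pixels and rotate to compensate.
        gray = cv2.cvtColor(test_paper_bgr, cv2.COLOR_BGR2GRAY)
        _, inv = cv2.threshold(gray, gray_threshold, 255, cv2.THRESH_BINARY_INV)
        ink = cv2.findNonZero(inv)                    # coordinates of dark pixels
        angle = cv2.minAreaRect(ink)[-1]
        if angle > 45:                                # recent OpenCV reports (0, 90]
            angle -= 90
        h, w = gray.shape
        rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        deskewed = cv2.warpAffine(gray, rot, (w, h), borderValue=255)
        # Step 1.2: binarize the grayscale image with a fixed gray-value threshold.
        _, binary = cv2.threshold(deskewed, gray_threshold, 255, cv2.THRESH_BINARY)
        # Step 1.3: enlarge length and width N times in equal proportion (N >= 1)
        # to widen the gaps between text lines before horizontal projection.
        return cv2.resize(binary, None, fx=scale_n, fy=scale_n,
                          interpolation=cv2.INTER_NEAREST)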
Step 2: and carrying out image segmentation on the preprocessed test paper image, and storing the segmented image into a buffer area of the electronic typoscope.
In the invention, the preprocessed test paper image is horizontally projected to segment the trip image, the line image is vertically projected, the cutting position coordinates of each question of the test paper image are judged by combining OCR recognition, and the test paper image is cut by the coordinates to obtain the target image, namely each question of the test paper.
Step 2 comprises the following steps:
Step 2.1: horizontally projecting the preprocessed test paper image onto the Y axis and segmenting the image into line images based on the horizontal projection.
Step 2.1 comprises the following steps:
Step 2.1.1: scanning the preprocessed test paper image row by row from top to bottom and counting the pixels of each scan row;
Step 2.1.2: obtaining the horizontal projection of the image and determining the positions of the text lines from the projection values;
Step 2.1.3: identifying the intervals between text lines from the blank gaps in the horizontal projection and splitting the text into line images;
Step 2.1.4: recording the top-left and bottom-right coordinates of each line image, taking the top-left corner of the original test paper image as the origin.
The preprocessed image is scanned row by row from top to bottom while the pixels of each scan row are counted, which yields the horizontal projection of the image; the positions of the text lines are determined from the projection values.
The horizontal projection counts the number of pixels in each row, and its profile characterizes the image in the horizontal direction.
The projection value is the number of black pixels in a pixel row (or column). Text-line objects are located from the horizontal projection: the start and end rows of each line are determined from the black-pixel counts, and the lines are then separated at the blank gaps.
The blank gaps between text lines appear as troughs in the horizontal projection, so the image can be split into lines, and the top-left and bottom-right coordinates of each line are recorded with the top-left corner of the image as the origin.
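A minimal sketch of this horizontal-projection segmentation follows, assuming the preprocessed binary image holds black text (0) on a white (255) background; the function name and box representation are illustrative.

    # Horizontal-projection line segmentation sketch (steps 2.1.1-2.1.4).
    import numpy as np

    def split_lines(binary):
        profile = np.sum(binary == 0, axis=1)   # black pixels per scan row
        has_text = profile > 0
        line_boxes = []                         # (top-left, bottom-right) per line
        top = None
        for y, inked in enumerate(has_text):
            if inked and top is None:
                top = y                         # first inked row starts a line
            elif not inked and top is not None:
                # A blank trough ends the line; record its bounding coordinates
                # relative to the top-left origin of the test paper image.
                line_boxes.append(((0, top), (binary.shape[1] - 1, y - 1)))
                top = None
        if top is not None:                     # a line touching the bottom edge
            line_boxes.append(((0, top),
                               (binary.shape[1] - 1, binary.shape[0] - 1)))
        return line_boxes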
Step 2.2: and storing all the line images in sequence to obtain a complete line image set to be processed.
In the invention, all the line images are stored in sequence, namely, when the test paper has a plurality of pages, all the lines with practical significance are firstly sorted, the possible large-area blank at the top and the bottom of each test paper is removed, and simultaneously, the topics which are disconnected due to paging are spliced in advance.
Step 2.3: carrying out vertical projection on each line of images on an X axis according to the sequence, and judging the cutting position coordinates of each question in the test paper image by combining OCR recognition; and cutting the test paper image according to the cutting position coordinates of each question to obtain a target image of each question of the test paper.
Said step 2.3 comprises the steps of:
step 2.3.1: calling an OCR recognition unit of the electronic typoscope to vertically project the segmented line image, and taking the position of the first pixel value appearing from left to right of each line as the initial position of the current line image to obtain a corresponding initial position OCR recognition result; taking the first line image as a current line;
step 2.3.2: the OCR recognition unit processes the current line to obtain a recognition result after the line image of the current line is vertically projected;
if the identification result of the initial position is the theme, recording the coordinate point at the upper left corner and the coordinate point at the lower right corner of the current line image, and carrying out the next step;
otherwise, the coordinate point is not recorded, and the next step is directly carried out; since the processing of step 2.2 and here is identified by the first line of the test paper, the content that may appear here is the header part of the test paper, which is what can be skipped over, and so the next step is taken.
Step 2.3.3: the next line is the current line, and the OCR recognition unit processes the current line to obtain a recognition result after the line image of the current line is vertically projected;
if continuous lower case numbers and symbols exist in the recognition result of the initial position of the current line, or the transverse coordinate value of the initial position is larger than that of the initial position of the line image of the previous line, the current line is the first line starting from a support stem, and the upper left corner coordinate of the current line is taken as a new upper left corner coordinate, and the lower right corner coordinate of the current line is taken as a new lower right corner coordinate, and the next step is carried out;
otherwise, directly recording the coordinate point of the lower right corner of the current row, and repeating the step 2.3.3; this may be the case if the subject matter of the preamble is still in progress, so the update is done with the newly recorded bottom right coordinate point and the verification continues for the next line.
Step 2.3.4: the next line is the current line, and the OCR recognition unit processes the current line to obtain a recognition result after the line image of the current line is vertically projected;
if continuous lower case numbers and symbols do not exist in the recognition result of the initial position of the current line and the transverse coordinate value of the initial position is less than or equal to the transverse coordinate value of the initial position of the line image of the previous line, the current line and the previous line are the same topic, the coordinate of the lower right corner of the current line is taken as the updated coordinate of the lower right corner, and the step 2.3.4 is repeated;
if the recognition result of the initial position of the current line is continuous English and symbols and the horizontal coordinate value of the initial position is greater than or equal to the horizontal coordinate value of the initial position of the line image of the previous line, the current line and the previous line are considered to be the same question and an option of a choice question, the coordinate of the lower right corner of the current line is taken as the updated coordinate of the lower right corner, and the step 2.3.4 is repeated;
if continuous lower case numbers and symbols exist in the recognition result of the initial position of the current line, or the transverse coordinate value of the initial position is larger than that of the initial position of the line image of the previous line, the current line is considered to be the first line from which the next support stem starts, and the step 2.3.3 is returned; in order to avoid too long topic of each division, the branch stem also needs to be divided, so when the next branch stem is identified, the division is performed, that is, no coordinate point is recorded, and the step 2.3.3 is returned to perform new division.
If the identification result of the initial position of the current line is the theme trunk, returning to the step 2.3.2;
if the current line image is empty, the next step is carried out; here indicating that a dead end has been identified.
Step 2.3.5: dividing each main question stem and each branch question stem according to a group of all upper left corner coordinate points and corresponding lower right corner coordinate points, sequencing all the divided main question stems and branch question stems, and storing the main question stems and the branch question stems in sequence;
step 2.3.6: marking each main question stem and each branch question stem correspondingly; the marking mainly refers to marking the main question stem and the branch question stem, so that the subsequent operation can be distinguished conveniently.
A line is the starting line of a main question stem when the recognition result at its start position contains consecutive Chinese capital numerals and symbols, begins with Chinese characters or a combination of Chinese characters with digits and/or English, or begins with a special symbol.
The vertical projection mainly counts the number of pixels in each column, and its profile characterizes the text image in the vertical direction.
The line images are vertically projected in order; from the position of the first black pixel in the vertical projection and the OCR result of the line image, the relationship between the previous line and the current line and between the current line and the next line is determined, i.e. whether they belong to the same question.
The conditions for a main question stem are that the recognition result at the start position contains consecutive Chinese capital numerals and symbols, contains Chinese characters or a combination of Chinese characters with digits and/or English, or begins with a special symbol; the latter two cases occur as the start of additional questions. Here, special characters are characters that are not digits, Chinese, or English, and are not brackets or punctuation marks.
Step 2.3.2 handles the recognition of the starting line, which may be a main question stem or irrelevant information, so main stems are recorded normally and irrelevant information is skipped.
The condition in step 2.3.3 is the recognition of a sub-question stem; if the line is instead a continuation of the main stem, the bottom-right point is updated and the processing is repeated for the next line.
Step 2.3.4 covers the unfinished sub-question stem, the option line of a multiple-choice question, a newly started sub-question stem, a new main question stem, and the last line.
A start-position horizontal coordinate greater than that of the previous line image means that the current line is indented relative to the previous line.
A recognition result that does not begin with an Arabic numeral plus a symbol (a pause mark "、" or a dot), or a first-ink position further right than that of the previous line image (i.e. indented), is the case of a main question whose stem runs over multiple lines.
Each question records its top-left coordinate point when it starts; a bottom-right coordinate point is recorded whether or not the question has ended, and while the question continues, the bottom-right point of the current line is replaced by that of the next line, so the point is updated in real time.
Each question is segmented according to its top-left and bottom-right coordinate points, and the segmented question images are passed to the next step in order.
Since the image segmented by projection is the enlarged image while the image processed in the next step is the original, the coordinate values are recorded according to the scale ratio between the enlarged and original images.
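The sketch below illustrates the vertical-projection cues used above, under the same black-on-white assumption as before; the helper names and the integer division used for rescaling are illustrative.

    # Vertical-projection cues for step 2.3 (illustrative helper names).
    import numpy as np

    def start_column(line_img):
        # Vertical projection: black pixels per column; the first nonzero column
        # is the start position used for the indentation comparisons.
        profile = np.sum(line_img == 0, axis=0)
        cols = np.nonzero(profile)[0]
        return int(cols[0]) if cols.size else None   # None marks an empty line

    def is_indented(curr_img, prev_img):
        # Start position greater than the previous line's: one cue for the
        # first line of a sub-question stem.
        curr, prev = start_column(curr_img), start_column(prev_img)
        return curr is not None and prev is not None and curr > prev

    def to_original_scale(box, scale_n):
        # Projection works on the N-times enlarged image, so recorded
        # coordinates are scaled back down before cutting the original image.
        (x0, y0), (x1, y1) = box
        return ((x0 // scale_n, y0 // scale_n), (x1 // scale_n, y1 // scale_n))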
Step 3: the pixel values of the buffered image are read and the correction-mark color regions are identified.
Step 3 comprises the following steps:
Step 3.1: separating the foreground from the background of the input image based on a threshold;
Step 3.2: determining the colors of the foreground and the background according to the threshold and generating a color table;
Step 3.3: obtaining the addresses of the pixels stored in the visual aid's buffer and determining the value of each pixel;
Step 3.4: determining the color range from the pixel values, locating the pixels with the correction-mark color in the image, and finding and labeling the correction-mark color regions using connected components;
Step 3.5: extracting the contour of each correction-mark color region, recording the position of the contour in the image, and storing that position at an output address.
Connected-component analysis and contour extraction are common techniques in the field and can be realized by various processing means.
The color-processing functions of the visual aid can be implemented by those skilled in the art as required.
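As one possible realization, a sketch of the color-based search for red correction-mark regions follows; the HSV thresholds and the minimum area are assumed values, not specified by the patent.

    # Color-based correction-mark region search (steps 3.4-3.5); the HSV
    # ranges and min_area are assumed values for red marks.
    import cv2

    def red_mark_regions(bgr_img, min_area=30):
        hsv = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2HSV)
        # Red wraps around the hue axis, so two hue ranges are combined.
        mask = cv2.inRange(hsv, (0, 80, 80), (10, 255, 255)) | \
               cv2.inRange(hsv, (170, 80, 80), (180, 255, 255))
        # Connected components group red pixels into candidate mark regions.
        n, _, stats, _ = cv2.connectedComponentsWithStats(mask)
        boxes = [(x, y, x + w, y + h)
                 for x, y, w, h, area in stats[1:]   # label 0 is the background
                 if area >= min_area]                # drop isolated noise pixels
        # The region contours correspond to the positions recorded in step 3.5.
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        return boxes, contours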
Step 4: a trained neural network model for recognizing correction marks is obtained and used to recognize the correction marks on the test paper.
In step 4, recognizing the correction marks on the test paper comprises the following steps:
Step 4.1: testing the test paper image with the trained model and recording the coordinate points of the recognized wrong-answer marks;
Step 4.2: reading the corresponding coordinate points via the address values of the correction-mark color regions in the output buffer;
Step 4.3: determining the wrong-answer marks in the test paper image from the IoU of the coordinate boxes from steps 4.1 and 4.2;
Step 4.4: obtaining the coordinate values of the four corner points of the target frame of each confirmed wrong-answer mark, and the midpoint of the target frame.
The principle of training the neural network model is as follows:
a prior-art target detection network with high recognition accuracy is selected, and its basic parameters are modified as required;
graded test papers are collected and imaged by scanning or by photographing equipment such as a mobile phone camera; the images are tilt-corrected and divided proportionally into a training set and a test set;
the correction marks in the test paper images are labeled: the mark for a correct answer is "√", the mark for a wrong answer is "×", and correction marks are generally red. Both the correct-answer and wrong-answer marks are labeled, because if only wrong-answer marks were labeled the network would treat every red mark as a wrong-answer mark and everything else as background, causing the detection model to report correct questions as wrong. The collected test paper images are tested with the trained model and the coordinate points of the recognized wrong-answer marks are recorded; the coordinate points of the red regions recorded in step 3 are read from the output buffer through their address values, and the genuinely red wrong-answer marks are determined from the IoU of the two, which avoids mistaking the printed "√" and "×" of true-or-false questions for the teacher's correction marks;
the code of the target detection network that produces the test result is modified so that the result contains the four recognized coordinates of the target frame, and the midpoint of the four coordinates, i.e. the midpoint of the target frame, is calculated.
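A sketch of the IoU-based confirmation in step 4.3 follows; the 0.5 threshold is an assumed value, as the patent does not fix one.

    # IoU confirmation of wrong-answer marks (step 4.3): a detection counts
    # as a teacher's mark only if it overlaps a red region from step 3.
    # Boxes are (x0, y0, x1, y1).
    def iou(a, b):
        ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
        ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union else 0.0

    def confirm_marks(detections, red_boxes, iou_threshold=0.5):
        return [d for d in detections
                if any(iou(d, r) >= iou_threshold for r in red_boxes)]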
And 5: comparing the coordinate midpoint of the target frame of the recognized correction mark with the coordinate points of all the subjects of the segmented test paper image; and if any midpoint coordinate falls into the coordinate range of any topic, the current topic is considered as a wrong topic.
Step 6: and distinguishing the handwritten fonts and the printed fonts of the wrong question image corresponding to the wrong question, and identifying the characters of the wrong question.
Step 6 comprises the following steps:
Step 6.1: training a neural network model to distinguish handwritten fonts from printed fonts, the printed fonts being the test paper questions;
Step 6.2: inputting the wrong-question images from step 5; after the model recognizes a handwritten font, judging the color of the font, and not erasing it if the handwriting lies within a correction-mark color region; storing the processed wrong-question images in the image file management library of the visual aid;
Step 6.3: performing OCR character recognition on the images processed in step 6.2 with the visual aid's built-in OCR function, and storing the OCR results in the image file management library of the visual aid in a fixed format.
The student's handwriting must be erased selectively.
A neural network model that distinguishes handwriting from print is needed because the model may mistake the teacher's correction marks for the student's handwriting, and parts of those marks are superimposed on the printed questions. After the model recognizes a handwritten font, its color is therefore checked against the recorded red regions: if the handwriting lies within a red region it is not erased, which prevents the question text from being erased by mistake. Each recognized wrong-question image is passed through this handwriting/print model and stored under the image file management of the visual aid with a name in a fixed format.
The visual aid's built-in OCR function is called to perform character recognition on the stored image; OCR recognition can reduce or even eliminate the influence of any unerased correction marks on the question text, and the OCR result is saved as an image under the image file management folder of the visual aid in a fixed format.
The printed fonts are the test questions; the handwritten fonts are the student's answers.
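The sketch below illustrates the selective erasing of step 6.2, assuming a classifier callback and the red mask from step 3; all names, and the box representation of the handwritten regions, are stand-ins.

    # Selective erasing sketch for step 6.2. `is_handwritten` stands in for
    # the handwriting/print classifier and `red_mask` for the step-3 mask.
    import numpy as np

    def erase_handwriting(img, stroke_boxes, is_handwritten, red_mask):
        out = img.copy()
        for x0, y0, x1, y1 in stroke_boxes:
            if not is_handwritten(img[y0:y1, x0:x1]):
                continue                 # printed text is part of the question
            if np.any(red_mask[y0:y1, x0:x1]):
                continue                 # a teacher's mark may overlap the
                                         # question text here, so leave it alone
            out[y0:y1, x0:x1] = 255      # paint the student's answer white
        return out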
The low-vision user can keep recognizing wrong questions on each graded paper that should go into the wrong-question book. Whenever one test paper image has been recognized, the display of the visual aid pops up a reminder, or a voice prompt reminds the user, to capture the next image, to end wrong-question recognition, or to read the image with the freeze-frame function. The freeze-frame function fixes the picture on screen so that it can be read with the zoom and color-change functions; when the user chooses it, the wrong questions confirmed in step 5 are enclosed in rectangular frames in a color that contrasts with the background color, where the background color is the color scheme the user has chosen with the color-change function as most comfortable for reading. Framing the wrong questions lets the user find them quickly.
And 7: and after the recognition is finished, the electronic typoscope splices the processed wrong question images to generate a wrong question book.
Step 7 comprises the following steps:
Step 7.1: confirming that recognition of all images is complete;
Step 7.2: obtaining the images corresponding to the main question stems processed in step 2 and the wrong-question images processed in step 6;
Step 7.3: arranging the wrong-question images in order and taking the first as the current wrong-question image;
Step 7.4: taking the image of the first main question stem found by searching upward from the coordinate point of the current wrong-question image as the main question stem of the current wrong question; if that main-stem image has been called 0 times, stitching the current wrong-question image under it and taking the next step; otherwise stitching the current wrong-question image under the previous wrong-question image and taking the next step;
Step 7.5: if there is a next wrong-question image, taking it as the current wrong-question image and returning to step 7.4; otherwise stitching is complete and the next step is taken;
Step 7.6: generating the wrong-question book.
After all graded test papers have been recognized, the wrong-question recognition mode is exited, and exiting triggers the image stitching: the visual aid automatically calls the image stitching module to stitch the stored images into one long picture, which is named and stored in a fixed format. This long picture is the set of wrong questions from the graded papers, i.e. the wrong-question book generated by the invention; since its answer regions have been erased, a low-vision user can redo the questions to consolidate the knowledge points.
Because the wrong-question images are arranged in order, each sub-question stem can be queried in order, and each main question stem is stitched in only the first time it is visited; this keeps every sub-stem under its corresponding main stem, so the logic of the wrong-question book stays clear in use.
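A minimal stitching sketch follows, assuming white-background question images of equal channel count padded to a common width; the function name is illustrative.

    # Stitching sketch for step 7: pad the question images to a common width
    # and concatenate them vertically into the single long book image.
    import cv2
    import numpy as np

    def stitch_wrong_question_book(images):
        width = max(img.shape[1] for img in images)
        padded = [cv2.copyMakeBorder(img, 0, 0, 0, width - img.shape[1],
                                     cv2.BORDER_CONSTANT, value=(255, 255, 255))
                  for img in images]
        return np.vstack(padded)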
In image management, the low-vision user can delete every picture except the long wrong-question-book picture and keep only that one. The user can open the long picture and, with the color-change and magnification functions, read it and redo the questions in whatever form suits them; the picture can also be read aloud with the visual aid's TTS voice function, or exported and printed over USB. Because the TTS voice function is not very accurate at reading mathematical formulas, a plug-in dedicated to displaying formulas can be installed and combined with TTS to read them with improved accuracy.
The method is aimed mainly at test papers with a regular typesetting format, for which it achieves a high success rate and good applicability.
In summary, the invention captures a complete test paper image in non-column format through the electronic visual aid, preprocesses and segments it, and stores it in the visual aid's buffer; identifies the correction-mark color regions from the pixel values of the buffered image; recognizes the correction marks on the paper with a trained neural network model; compares the coordinate midpoint of each recognized mark's target frame with the coordinate points of the segmented questions, deeming a question wrong when any midpoint falls within its coordinate range; distinguishes handwritten from printed fonts in the wrong-question images and performs character recognition on the wrong questions; and, after recognition, stitches the processed wrong-question images together to generate the wrong-question book.
The method segments each question of the test paper with the horizontal and vertical projection methods of image processing, overcoming both the uncertain segmentation accuracy of a neural network and the complexity of collecting and labeling its training data. A target detection network recognizes the correction marks and, combined with image processing, identifies the wrong questions; a binary-classification neural network distinguishes printed from handwritten fonts so that the handwriting can be erased; the visual aid's OCR function recognizes the wrong question after the handwriting is erased, removing the influence of the correction marks. This solves the prior art's inability to automatically crop wrong-question regions and erase correction marks and wrong answers, improves the accuracy of the wrong-question book, and, because wrong-question recognition can be embedded in the visual aid, frees the low-vision user from hand-copying and from manually cropping wrong-question regions and erasing answer regions and correction marks, effectively saving time; the user can then read the book in several ways and redo the questions.

Claims (9)

1. An error book generation method suitable for an electronic typoscope, characterized by: the method comprises the following steps:
step 1: the electronic typoscope collects complete test paper images in a non-column format and preprocesses the collected test paper images;
step 2: carrying out image segmentation on the preprocessed test paper image, and storing the segmented image into a buffer area of the electronic typoscope;
step 3: reading the pixel values of the image in the buffer area and identifying the correction marks (a sketch follows this claim), comprising the following steps:
step 3.1: distinguishing a foreground from a background for the input image based on a threshold;
step 3.2: determining the colors of the foreground and the background according to the threshold value and generating a color table;
step 3.3: acquiring the addresses of the pixels stored in the buffer area of the electronic typoscope, and determining the value of each pixel point;
step 3.4: determining a color range from the pixel values, determining the pixel points of the correction-mark color in the image, and finding and marking the correction-mark color areas in the image using connected components;
step 3.5: extracting the contours of the correction-mark color areas, recording the position of each contour in the image, and placing the positions at an output address;
step 4: obtaining a trained neural network model for recognizing correction marks, and recognizing the correction marks in the test paper;
step 5: comparing the coordinate midpoint of the target frame of each recognized correction mark with the coordinate points of all the questions of the segmented test paper image; if any midpoint coordinate falls within the coordinate range of any question, the current question is considered a wrong question;
step 6: distinguishing handwritten fonts from printed fonts in the wrong-question images corresponding to the wrong questions, and performing character recognition on the wrong questions;
step 7: after recognition is finished, the electronic typoscope splices the processed wrong-question images to generate the wrong-question book.
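The pixel-value scan of step 3 might look like the following OpenCV sketch. The HSV red ranges and the min_area filter are illustrative assumptions; the patent derives the color range from the foreground/background color table of steps 3.1-3.2:

```python
import cv2

def find_mark_regions(img_bgr, min_area=50):
    """Locate correction-mark (red-ink) color areas by pixel value."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    # steps 3.1-3.4: threshold pixel values into a red-ink mask (red wraps
    # around hue 0 in HSV, hence the two ranges)
    mask = cv2.inRange(hsv, (0, 80, 80), (10, 255, 255)) | \
           cv2.inRange(hsv, (170, 80, 80), (180, 255, 255))
    # step 3.4: group the marked pixels into connected regions
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    boxes = []
    for i in range(1, n):                      # label 0 is the background
        x, y, w, h, area = stats[i]
        if area >= min_area:                   # drop specks of noise
            boxes.append((x, y, x + w, y + h)) # step 3.5: record the position
    return boxes
```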
2. The method according to claim 1, characterized in that in step 1 the preprocessing comprises the following steps (a sketch follows this claim):
step 1.1: performing tilt correction on the test paper image so that its left and lower edges are flush with the standard lines;
step 1.2: graying the tilt-corrected test paper image, and binarizing the result using a gray value as the threshold;
step 1.3: enlarging the length and width of the processed test paper image in equal proportion by a factor of N, where N ≥ 1.
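A minimal OpenCV sketch of this preprocessing, assuming Otsu thresholding for the gray-value threshold and a minimum-area-rectangle estimate of the tilt angle (neither choice is specified in the patent):

```python
import cv2

def preprocess(img_bgr, scale=2.0):
    # step 1.1: estimate skew from the minimum-area rectangle of the ink
    # pixels; minAreaRect's angle convention differs across OpenCV versions,
    # so the normalization below may need adjusting
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    ink = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    angle = cv2.minAreaRect(cv2.findNonZero(ink))[-1]
    if angle > 45:
        angle -= 90
    h, w = gray.shape
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    deskewed = cv2.warpAffine(gray, rot, (w, h), borderMode=cv2.BORDER_REPLICATE)
    # step 1.2: binarize with an automatically chosen gray threshold
    binary = cv2.threshold(deskewed, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
    # step 1.3: enlarge length and width N-fold in equal proportion (N >= 1)
    return cv2.resize(binary, None, fx=scale, fy=scale,
                      interpolation=cv2.INTER_NEAREST)
```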
3. The method according to claim 1, characterized in that step 2 comprises the following steps:
step 2.1: horizontally projecting the preprocessed test paper image onto the Y axis, and segmenting the image based on the horizontal projection to obtain a plurality of line images;
step 2.2: storing all the line images in order to obtain the complete set of line images to be processed;
step 2.3: vertically projecting each line image onto the X axis in order, and judging the cutting-position coordinates of each question in the test paper image in combination with OCR recognition; cutting the test paper image according to these coordinates to obtain a target image for each question.
4. The method according to claim 3, characterized in that step 2.1 comprises the following steps (a sketch follows this claim):
step 2.1.1: scanning the preprocessed test paper image line by line from top to bottom, and counting the pixels of each scan line;
step 2.1.2: acquiring the horizontal projection of the image, and determining the positions of the character lines from the horizontal projection values;
step 2.1.3: identifying the intervals between character lines from the blank gaps of the horizontal projection, and dividing the image into line images accordingly;
step 2.1.4: recording the upper-left and lower-right corner coordinates of each line image, taking the upper-left corner of the original test paper image as the origin.
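Steps 2.1.1-2.1.4 amount to a horizontal projection profile. A sketch follows, assuming black text (pixel values below 128) on a white background; the min_height filter is our own noise guard, not a parameter from the patent:

```python
import numpy as np

def split_rows(binary, min_height=8):
    """Split a preprocessed page into line images via horizontal projection."""
    ink = (binary < 128).sum(axis=1)      # step 2.1.1: ink pixels per scan line
    in_text = ink > 0                     # step 2.1.2: rows containing characters
    width = binary.shape[1]
    lines, start = [], None
    for y, flag in enumerate(in_text):
        if flag and start is None:
            start = y                     # a character line begins
        elif not flag and start is not None:
            if y - start >= min_height:   # step 2.1.3: split at blank gaps
                # step 2.1.4: corners relative to the page's top-left origin
                lines.append(((0, start), (width, y)))
            start = None
    if start is not None:
        lines.append(((0, start), (width, len(in_text))))
    return lines
```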
5. The method according to claim 3, characterized in that step 2.3 comprises the following steps (the line-classification heuristics of steps 2.3.2-2.3.4 are sketched after claim 6):
step 2.3.1: calling the OCR recognition unit of the electronic typoscope to vertically project the segmented line images, taking the position of the first foreground pixel from left to right in each line as the initial position of that line image, and obtaining the OCR recognition result at each initial position; taking the first line image as the current line;
step 2.3.2: the OCR recognition unit processes the current line to obtain the recognition result of the vertically projected line image;
if the recognition result at the initial position is a main question stem, recording the upper-left and lower-right corner coordinate points of the current line image and carrying out the next step;
otherwise, recording no coordinate point and proceeding directly to the next step;
step 2.3.3: taking the next line as the current line, the OCR recognition unit processes it to obtain the recognition result of the vertically projected line image;
if consecutive lower-case (Arabic) numerals and symbols appear in the recognition result at the initial position of the current line, or the horizontal coordinate of its initial position is greater than that of the previous line image, the current line is the first line of a sub-question stem; its upper-left corner coordinate is taken as the new upper-left corner coordinate, its lower-right corner coordinate as the new lower-right corner coordinate, and the next step is carried out;
otherwise, the lower-right corner coordinate point of the current line is recorded directly and step 2.3.3 is repeated;
step 2.3.4: taking the next line as the current line, the OCR recognition unit processes it to obtain the recognition result of the vertically projected line image;
if no consecutive lower-case numerals and symbols appear in the recognition result at the initial position of the current line, and the horizontal coordinate of its initial position is less than or equal to that of the previous line image, the current line belongs to the same question as the previous line; its lower-right corner coordinate becomes the updated lower-right corner coordinate, and step 2.3.4 is repeated;
if the recognition result at the initial position of the current line is consecutive English letters and symbols, and the horizontal coordinate of its initial position is greater than or equal to that of the previous line image, the current line belongs to the same question as the previous line and is an option of a multiple-choice question; its lower-right corner coordinate becomes the updated lower-right corner coordinate, and step 2.3.4 is repeated;
if consecutive lower-case numerals and symbols appear in the recognition result at the initial position of the current line, or the horizontal coordinate of its initial position is greater than that of the previous line image, the current line is the first line of the next sub-question stem, and the method returns to step 2.3.3;
if the recognition result at the initial position of the current line is a main question stem, the method returns to step 2.3.2;
if the current line image is empty, the next step is carried out;
step 2.3.5: segmenting each main question stem and each sub-question stem according to the set of all upper-left corner coordinate points and their corresponding lower-right corner coordinate points, sorting all the segmented main and sub-question stems, and storing them in order;
step 2.3.6: labeling each main question stem and each sub-question stem accordingly.
6. The method according to claim 5, characterized in that: when the recognition result at the initial position of any line contains consecutive capital (Chinese-style) numerals and symbols, contains Chinese characters or a combination of Chinese characters with numerals and/or English, and the line begins with a special symbol, that line is the first line of a main question stem.
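The stem heuristics of claims 5 and 6 could be prototyped with regular expressions. The patterns below are assumptions about typical Chinese test-paper numbering (一、 for main stems, 1. for sub-stems, A. for options); they are not patterns given in the patent:

```python
import re

MAIN_STEM = re.compile(r'^[一二三四五六七八九十]+[、.．]')  # capital numerals: 一、 二、
SUB_STEM = re.compile(r'^\d+\s*[、.．)]')                   # Arabic numerals: 1. 2)
OPTION = re.compile(r'^[A-D]\s*[、.．)]')                   # choice options: A. B.

def classify_line(ocr_text, x0, prev_x0):
    """Label a line by its OCR'd initial text and its indent (x0)."""
    if MAIN_STEM.match(ocr_text):
        return 'main_stem'          # claim 6: first line of a main question stem
    if SUB_STEM.match(ocr_text) or (prev_x0 is not None and x0 > prev_x0):
        return 'sub_stem'           # step 2.3.3: a new sub-question begins
    if OPTION.match(ocr_text) and prev_x0 is not None and x0 >= prev_x0:
        return 'option'             # step 2.3.4: option of a multiple-choice question
    return 'continuation'           # same question as the previous line
```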
7. The method according to claim 1, characterized in that in step 4, recognizing the correction marks in the test paper comprises the following steps (an IoU sketch follows this claim):
step 4.1: testing the test paper image with the trained model, and recording the coordinate points of the recognized correction marks;
step 4.2: reading the corresponding coordinate points from the address values of the correction-mark color areas in the output buffer area;
step 4.3: determining the wrong-question marks in the test paper image from the IoU of the coordinate points of step 4.1 and step 4.2;
step 4.4: acquiring the coordinate values of the 4 corner points of the target frame corresponding to each confirmed mark, together with the midpoint of the target frame.
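Step 4.3's cross-check might be sketched as follows; the 0.5 threshold is an illustrative assumption, since the patent does not state one:

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter) if inter else 0.0

def confirm_marks(detector_boxes, color_boxes, thr=0.5):
    """Keep a detected mark only if it overlaps a red-ink color area (step 4.3)."""
    return [d for d in detector_boxes
            if any(iou(d, c) >= thr for c in color_boxes)]
```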
8. The method according to claim 1, characterized in that step 6 comprises the following steps (an erasure sketch follows this claim):
step 6.1: training a neural network model to distinguish handwritten fonts from printed fonts, the printed fonts being taken as the test paper questions;
step 6.2: inputting the wrong-question images of step 5; after the model identifies a handwritten font, the color of the font is judged, and if the handwriting lies in the color area of a correction mark it is not erased, otherwise it is erased; the processed wrong-question images are stored in the image file management library of the electronic typoscope;
step 6.3: performing OCR character recognition on the pictures processed in step 6.2 with the OCR function of the electronic typoscope, and storing the corresponding OCR results in the picture file management library of the electronic typoscope in a given format.
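A sketch of the selective erasure of step 6.2, assuming the correction marks are red ink and that the font classifier of step 6.1 has already produced bounding boxes for the handwritten regions; the HSV ranges and all names here are our own assumptions:

```python
import cv2

def erase_handwriting(img_bgr, handwritten_boxes):
    """Whiten handwritten regions unless they contain correction-mark color."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    red = cv2.inRange(hsv, (0, 80, 80), (10, 255, 255)) | \
          cv2.inRange(hsv, (170, 80, 80), (180, 255, 255))
    out = img_bgr.copy()
    for x1, y1, x2, y2 in handwritten_boxes:   # boxes from the font classifier
        if cv2.countNonZero(red[y1:y2, x1:x2]) == 0:
            out[y1:y2, x1:x2] = 255            # no mark color: erase the handwriting
    return out
```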
9. The method according to claim 1, characterized in that step 7 comprises the following steps (a splicing sketch follows this claim):
step 7.1: confirming that all the images have been recognized;
step 7.2: acquiring the images corresponding to the main question stems processed in step 2 and the wrong-question images processed in step 6;
step 7.3: arranging the wrong-question images in order, and taking the first wrong-question image as the current wrong-question image;
step 7.4: taking the image of the first main question stem found by searching upward from the coordinate point of the current wrong-question image as the stem corresponding to the current wrong question; if that stem image has been called 0 times, splicing the current wrong-question image to the stem image and carrying out the next step; otherwise, splicing the current wrong-question image below the previous wrong-question image and carrying out the next step;
step 7.5: if a next wrong-question image exists, taking it as the current wrong-question image and returning to step 7.4; otherwise, splicing is finished and the next step is carried out;
step 7.6: generating the wrong-question book.
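The splicing of steps 7.3-7.6 reduces to padding images to a common width and stacking them vertically. A sketch, assuming grayscale images and pre-sorted (stem, wrong-question) pairs; splicing a stem only on its first occurrence mirrors the "called 0 times" test of step 7.4:

```python
import numpy as np

def vstack_pad(images, pad_value=255):
    """Pad grayscale images to a common width, then stack them vertically."""
    width = max(im.shape[1] for im in images)
    padded = [np.pad(im, ((0, 0), (0, width - im.shape[1])),
                     constant_values=pad_value) for im in images]
    return np.vstack(padded)

def build_wrong_book(wrong_items):
    """wrong_items: sorted list of (stem_image, wrong_image) pairs."""
    strips, used_stems = [], set()
    for stem, wrong in wrong_items:
        if id(stem) not in used_stems:   # step 7.4: stem not yet called
            used_stems.add(id(stem))
            strips.append(stem)
        strips.append(wrong)             # splice under the previous image
    return vstack_pad(strips)            # step 7.6: one long picture
```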
CN201910873321.5A 2019-09-17 2019-09-17 Wrong problem book generation method suitable for electronic typoscope Active CN110705534B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910873321.5A CN110705534B (en) 2019-09-17 2019-09-17 Wrong problem book generation method suitable for electronic typoscope

Publications (2)

Publication Number Publication Date
CN110705534A CN110705534A (en) 2020-01-17
CN110705534B true CN110705534B (en) 2022-06-14

Family

ID=69195401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910873321.5A Active CN110705534B (en) 2019-09-17 2019-09-17 Wrong problem book generation method suitable for electronic typoscope

Country Status (1)

Country Link
CN (1) CN110705534B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110956173B (en) * 2020-02-18 2020-06-23 江西软云科技股份有限公司 Topic content identification method and device, readable storage medium and computer equipment
CN111597908A (en) * 2020-04-22 2020-08-28 深圳中兴网信科技有限公司 Test paper correcting method and test paper correcting device
CN112597999B (en) * 2021-03-03 2021-06-29 北京易真学思教育科技有限公司 Question identification method and device, electronic equipment and computer storage medium
CN113205527A (en) * 2021-04-02 2021-08-03 广州远大信息发展有限公司 Intelligent test paper cutting method and system and storage medium
CN113596418A (en) * 2021-07-06 2021-11-02 作业帮教育科技(北京)有限公司 Correction-assisted projection method, device, system and computer program product
CN113962347B (en) * 2021-12-17 2022-03-29 江西新华云教育科技有限公司 Wrong question acquisition method and system based on paper teaching assistance, storage medium and equipment
CN114241043A (en) * 2022-02-23 2022-03-25 北京世纪好未来教育科技有限公司 Topic type detection method, training method, device, electronic equipment and medium
CN115174814A (en) * 2022-07-29 2022-10-11 科大讯飞股份有限公司 Focusing paper surface and working condition acquisition method, device, equipment and storage medium
CN117115195A (en) * 2023-10-24 2023-11-24 成都信息工程大学 Tamper-proof identification method and tamper-proof identification system based on block chain

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101916327A (en) * 2010-07-09 2010-12-15 北京商纳科技有限公司 Method and system for generating wrong answer list
CN107798321A (en) * 2017-12-04 2018-03-13 海南云江科技有限公司 A kind of examination paper analysis method and computing device
CN108932508A (en) * 2018-08-13 2018-12-04 杭州大拿科技股份有限公司 A kind of topic intelligent recognition, the method and system corrected
CN109710590A (en) * 2018-12-26 2019-05-03 杭州大拿科技股份有限公司 A kind of wrong answer list generation method and device
CN110163211A (en) * 2018-09-06 2019-08-23 腾讯科技(深圳)有限公司 A kind of image-recognizing method, device and storage medium

Similar Documents

Publication Publication Date Title
CN110705534B (en) Wrong problem book generation method suitable for electronic typoscope
CN110008944B (en) OCR recognition method and device based on template matching and storage medium
CN110210413A (en) A kind of multidisciplinary paper content detection based on deep learning and identifying system and method
EP0472313A2 (en) Image processing method and apparatus therefor
CN110597806A (en) Wrong question set generation and answer statistics system and method based on reading and amending identification
CN112434699A (en) Automatic extraction and intelligent scoring system for handwritten Chinese characters or components and strokes
CN106373447A (en) Intelligent paper marking system and method
CN108805519A (en) Papery schedule electronization generation method, device and electronic agenda table generating method
CN112446259A (en) Image processing method, device, terminal and computer readable storage medium
CN108052955B (en) High-precision Braille identification method and system
CN113159014A (en) Objective question reading method, device, equipment and storage medium based on handwritten question numbers
CN113065404B (en) Method and system for detecting train ticket content based on equal-width character segments
CN114821620A (en) Text content extraction and identification method based on longitudinal combination of line text boxes
CN107958261B (en) Braille point detection method and system
CN114463770A (en) Intelligent question-cutting method for general test paper questions
CN114005127A (en) Image optical character recognition method based on deep learning, storage device and server
US20060194187A1 (en) Material processing apparatus, material processing method, and program product
US20070047815A1 (en) Image recognition apparatus, image recognition method, and image recognition program
CN110298236B (en) Automatic Braille image identification method and system based on deep learning
CN107480667A (en) The method and its system of intelligent topic image acquisition and processing based on template
CN111881880A (en) Bill text recognition method based on novel network
CN113903039A (en) Color-based answer area acquisition method for answer sheet
US7853194B2 (en) Material processing apparatus, material processing method and material processing program
CN114926840A (en) Method and system for transferring photocopy PDF (Portable document Format) to reproducible PDF
CN114639110A (en) Intelligent reading method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant