CN113901952A - Print form and handwritten form separated character recognition method based on deep learning - Google Patents
- Publication number
- CN113901952A (publication of application CN202111309350.2A)
- Authority
- CN
- China
- Prior art keywords
- model
- picture
- handwriting
- handwritten
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Character Discrimination (AREA)
Abstract
A deep-learning-based method for recognizing text in document pictures that contain both printed and handwritten characters. The method comprises the following steps: after preprocessing, a deep learning model classifies the print and the handwriting; a print picture and a handwriting picture are then produced from the classification result, and character recognition is finally performed on each picture separately. The method achieves separate recognition of printed and handwritten text in a single document picture, is fully automatic, intelligent, and adaptive, and requires no manually set parameters.
Description
Technical Field
The invention relates to a deep-learning-based method for separate recognition of printed and handwritten characters, and belongs to the field of artificial intelligence and computer vision.
Background
Text recognition on document pictures is a relatively mature technology, but when a document contains both printed and handwritten text, existing techniques recognize the two together and cannot separate them. Recognizing print and handwriting separately in such documents has many important applications, such as character recognition on bank notes, automatic grading of student test papers, and conversion of litigation documents into electronic files. The key technical problem is the classification of print versus handwriting, which can be stated as follows: for a picture containing both, achieve pixel-level classification of the picture into print, handwriting, and background. Traditional machine-vision methods struggle with such pixel-level classification, especially where print and handwriting overlap. Semantic segmentation techniques from deep learning can perform pixel-level classification and solve this problem well; popular algorithms include the fully convolutional network (FCN) and FCN variants with dilated (atrous) convolution.
Disclosure of Invention
To achieve separate recognition of printed and handwritten characters, the invention adopts the following technical scheme:
A deep-learning-based print/handwriting-separated character recognition method, comprising the following steps.
Step (1): produce a training sample data set, comprising the following sub-steps:
(1.1) prepare a sheet of paper on which the printed text is black and the blank areas are white;
(1.2) write on the paper with a red pen;
(1.3) photograph the paper carrying the written characters and record the resulting picture as picture a; alternatively, the paper may be scanned with a scanner;
(1.4) preprocess picture a with an algorithm program, the preprocessing comprising border cropping and binarization into a black-and-white image; record the result as picture b, which serves as an input sample of the training model. In picture b the print and the handwriting are black and the background is white;
(1.5) classify the print, handwriting, and background in picture a at pixel level with an algorithm program, and record the result as picture c. The classification relies on the fact that pixels of different colours have distinct values, so the red handwriting is easily separated from the black print and the white background. Picture c is the output sample of the training model; background pixels are labelled 0, print pixels 1, and handwriting pixels 2, thereby realizing pixel-level labelling of handwriting, print, and background.
Step (2): establish a deep learning model, mainly an artificial intelligence model from the field of image semantic segmentation, such as a fully convolutional network (FCN), an FCN with dilated (atrous) convolution, a SegNet network, or a U-Net network.
Step (3): train the model by feeding it the training samples, comprising the following sub-steps:
(3.1) input of training samples
Picture b is fed to the artificial intelligence model as the training input;
(3.2) output of training samples
Picture c serves as the target output of the training sample and contains 3 categories, with print labelled 1, handwriting labelled 2, and background labelled 0;
(3.3) once the model and the training samples are ready, train the model; training can run on a personal computer, a CPU server, or a GPU server, and for large sample sets a GPU server is preferable. Save the model after training, then evaluate it on the test samples; if the classification accuracy is high enough, the model can be used in practical applications.
Step (4): acquire a picture containing both print and handwriting by photographing the paper with a mobile phone, a scanner, or a document camera (high-speed scanner).
Step (5): preprocess the picture, including rectification and shadow removal; rectification here amounts to border cropping, and shadow removal amounts to binarization.
Step (6): feed the preprocessed picture to the trained model to obtain a pixel-level classification result map, comprising the following sub-steps:
(6.1) load the trained model;
(6.2) divide the preprocessed picture into N small pictures, for example N = 4; splitting into small pictures improves detection speed;
(6.3) feed each small picture to the model and obtain the classification result via multithreaded parallel computation.
Step (7): obtain a print picture and a handwriting picture from the classification result, thereby separating print from handwriting.
Step (8): perform character recognition on the separated print and handwriting pictures; the character recognition software may be Baidu's open-source PaddleOCR.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is an original picture containing both print and handwriting;
FIG. 3 shows the print separated from the original picture;
FIG. 4 shows the handwriting separated from the original picture;
FIG. 5 shows character recognition performed on the separated print;
FIG. 6 shows character recognition performed on the separated handwriting.
Detailed Description
The following describes embodiments of the present invention in further detail with reference to the accompanying drawings.
FIG. 1 is a flow chart of the present invention, which includes the following steps.
Step (1): produce a training sample data set, comprising the following sub-steps:
(1.1) prepare a sheet of paper on which the printed text is black and the blank areas are white;
(1.2) write on the paper with a red pen;
(1.3) photograph the paper carrying the written characters and record the resulting picture as picture a; alternatively, the paper may be scanned with a scanner;
(1.4) preprocess picture a with an algorithm program, the preprocessing comprising border cropping and binarization into a black-and-white image; record the result as picture b, which serves as an input sample of the training model. In picture b the print and the handwriting are black and the background is white;
(1.5) classify the print, handwriting, and background in picture a at pixel level with an algorithm program, and record the result as picture c. The classification relies on the fact that pixels of different colours have distinct values, so the red handwriting is easily separated from the black print and the white background. Picture c is the output sample of the training model; background pixels are labelled 0, print pixels 1, and handwriting pixels 2, thereby realizing pixel-level labelling of handwriting, print, and background;
(1.6) for a good result, the more samples the better; for example, 2200 pictures can be produced, with 2000 used for training and 200 for testing. The pictures produced so far mainly contain Chinese, English, and mathematical characters; samples in other languages can be produced in the same way.
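As an illustrative sketch of sub-step (1.5), the colour-based labelling can be written with NumPy; the threshold values below are assumptions chosen for demonstration, not values given by the patent:

```python
import numpy as np

def label_pixels(rgb):
    """Classify each pixel of an RGB scan as background (0),
    print (1), or handwriting (2), exploiting the colour
    convention of step (1): black print, red pen, white paper.
    The thresholds are illustrative assumptions."""
    r = rgb[..., 0].astype(np.int16)
    g = rgb[..., 1].astype(np.int16)
    b = rgb[..., 2].astype(np.int16)
    labels = np.zeros(rgb.shape[:2], dtype=np.uint8)   # background = 0
    dark = (r < 100) & (g < 100) & (b < 100)
    labels[dark] = 1                                   # black print = 1
    reddish = (r > 150) & (r - g > 60) & (r - b > 60)
    labels[reddish] = 2                                # red handwriting = 2
    return labels
```

Running this on picture a yields picture c directly, since the labels already use the 0/1/2 convention of the training targets.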
Step (2): establish a deep learning model, mainly an artificial intelligence model from the field of image semantic segmentation, such as a fully convolutional network (FCN), an FCN with dilated (atrous) convolution, a SegNet network, or a U-Net network.
For practical engineering applications the model should be as simple as possible while keeping high classification accuracy, since overly complex models are computationally expensive and slow. One example model is as follows:
(1) the input layer is a 1024 × 1024 3-channel image;
(2) a dilated convolution layer with 32 feature maps, a 5 × 5 kernel, and dilation rate = 2;
(3) a dropout layer with dropout probability 20%;
(4) a dilated convolution layer with 32 feature maps, a 5 × 5 kernel, and dilation rate = 3;
(5) a pooling layer with a 2 × 2 down-sampling factor;
(6) a dilated convolution layer with 64 feature maps, a 5 × 5 kernel, and dilation rate = 2;
(7) a dropout layer with dropout probability 20%;
(8) a dilated convolution layer with 64 feature maps, a 5 × 5 kernel, and dilation rate = 3;
(9) a pooling layer with a 2 × 2 down-sampling factor;
(10) a dilated convolution layer with 64 feature maps, a 5 × 5 kernel, and dilation rate = 2;
(11) a dropout layer with dropout probability 20%;
(12) a dilated convolution layer with 64 feature maps, a 5 × 5 kernel, and dilation rate = 3;
(13) a convolution layer with 32 feature maps and a 1 × 1 kernel;
(14) a deconvolution (transposed convolution) layer with 3 feature maps, a 4 × 4 kernel, and a 4 × 4 stride.
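The layer listing above could be sketched in Keras roughly as follows; the `same` padding, ReLU activations, and softmax output are assumptions, since the patent does not state them:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model():
    # Layers follow items (1)-(14) of the example embodiment.
    inp = layers.Input((1024, 1024, 3))                                                # (1)
    x = layers.Conv2D(32, 5, dilation_rate=2, padding="same", activation="relu")(inp)  # (2)
    x = layers.Dropout(0.2)(x)                                                         # (3)
    x = layers.Conv2D(32, 5, dilation_rate=3, padding="same", activation="relu")(x)    # (4)
    x = layers.MaxPooling2D(2)(x)                                                      # (5)
    x = layers.Conv2D(64, 5, dilation_rate=2, padding="same", activation="relu")(x)    # (6)
    x = layers.Dropout(0.2)(x)                                                         # (7)
    x = layers.Conv2D(64, 5, dilation_rate=3, padding="same", activation="relu")(x)    # (8)
    x = layers.MaxPooling2D(2)(x)                                                      # (9)
    x = layers.Conv2D(64, 5, dilation_rate=2, padding="same", activation="relu")(x)    # (10)
    x = layers.Dropout(0.2)(x)                                                         # (11)
    x = layers.Conv2D(64, 5, dilation_rate=3, padding="same", activation="relu")(x)    # (12)
    x = layers.Conv2D(32, 1, activation="relu")(x)                                     # (13)
    # Transposed convolution with stride 4 restores the 4x spatial
    # reduction from the two pooling layers; 3 output maps = 3 classes.
    out = layers.Conv2DTranspose(3, 4, strides=4, padding="same",
                                 activation="softmax")(x)                              # (14)
    return tf.keras.Model(inp, out)
```

With these assumptions the two 2 × 2 pooling layers reduce 1024 to 256, and the stride-4 transposed convolution restores the full 1024 × 1024 resolution with one channel per class.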
Step (3): train the model by feeding it the training samples, comprising the following sub-steps:
(3.1) input of training samples
Picture b is fed to the artificial intelligence model as the training input;
(3.2) output of training samples
Picture c serves as the target output of the training sample and contains 3 categories, with print labelled 1, handwriting labelled 2, and background labelled 0;
(3.3) once the model and the training samples are ready, train the model; training can run on a personal computer, a CPU server, or a GPU server, and for large sample sets a GPU server is preferable. Save the model after training, then evaluate it on the test samples; if the classification accuracy is high enough, the model can be used in practical applications.
Step (4): acquire a picture containing both print and handwriting by photographing the paper with a mobile phone, a scanner, or a document camera (high-speed scanner). Fig. 2 shows part of an original picture obtained this way.
Step (5): preprocess the picture, including rectification and shadow removal; rectification here amounts to border cropping, and shadow removal amounts to binarization.
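The binarization part of this preprocessing could, for instance, use Otsu's method; the following NumPy sketch is an illustration under that assumption, not code taken from the patent:

```python
import numpy as np

def otsu_binarize(gray):
    """Binarize a grayscale page (uint8 array) at Otsu's threshold:
    the threshold maximizing between-class variance. Pixels above
    the threshold become white (255), the rest black (0). A real
    pipeline would also deskew and crop borders first."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    mean_all = np.dot(np.arange(256), hist) / total
    best_t, best_var = 0, -1.0
    cum, cum_mean = 0.0, 0.0
    for t in range(256):
        cum += hist[t]
        cum_mean += t * hist[t]
        if cum == 0 or cum == total:
            continue
        w0 = cum / total                                   # weight of class below t
        m0 = cum_mean / cum                                # mean of class below t
        m1 = (mean_all * total - cum_mean) / (total - cum) # mean of class above t
        var = w0 * (1 - w0) * (m0 - m1) ** 2               # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return np.where(gray > best_t, 255, 0).astype(np.uint8)
```

Because the threshold adapts to the histogram of each page, this step needs no manually set parameters, in line with the adaptive behaviour claimed for the method.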
Step (6): feed the preprocessed picture to the trained model to obtain a pixel-level classification result map, comprising the following sub-steps:
(6.1) load the trained model;
(6.2) divide the preprocessed picture into N small pictures, for example N = 4; splitting into small pictures improves detection speed;
(6.3) feed each small picture to the model and obtain the classification result via multithreaded parallel computation.
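Sub-steps (6.2) and (6.3) can be sketched as below; splitting into horizontal strips and the `predict_fn` stand-in for the trained model are assumptions for illustration:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def predict_in_patches(img, predict_fn, n=4):
    """Split the page array into n strips along the row axis, run
    predict_fn (a stand-in for the trained segmentation model) on
    each strip in a thread pool, and stitch the results back in
    order."""
    strips = np.array_split(img, n, axis=0)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(predict_fn, strips))   # map preserves order
    return np.concatenate(results, axis=0)
```

Threads give a speedup here only when `predict_fn` releases the GIL (as deep-learning inference libraries typically do during native computation); otherwise process-based parallelism would be the alternative.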
Step (7): obtain a print picture and a handwriting picture from the classification result, thereby separating print from handwriting. Fig. 3 shows the print separated from the original picture, and Fig. 4 shows the separated handwriting.
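A minimal sketch of this separation step, assuming the binarized page from step (5) and the class map from step (6) are available as NumPy arrays:

```python
import numpy as np

def separate(binary, labels):
    """Given the binarized page (0 = ink, 255 = white) and the
    pixel-level class map (0 = background, 1 = print,
    2 = handwriting), return two white-background images holding
    only the print strokes and only the handwriting strokes."""
    print_img = np.full_like(binary, 255)
    hand_img = np.full_like(binary, 255)
    print_img[labels == 1] = binary[labels == 1]
    hand_img[labels == 2] = binary[labels == 2]
    return print_img, hand_img
```

Each output keeps the original stroke pixels for its own class and paints everything else white, so the two pictures can be passed independently to the character recognizer in step (8).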
Step (8): perform character recognition on the separated print and handwriting pictures; the character recognition software may be Baidu's open-source PaddleOCR. Fig. 5 shows the recognition result for the separated print and Fig. 6 for the separated handwriting; as these results show, most characters are recognized correctly.
Those skilled in the art will appreciate that the invention may be practiced without these specific details.
Claims (9)
1. A deep-learning-based print/handwriting-separated character recognition method, characterized by comprising the following steps:
(1) producing a training sample data set containing handwritten and printed text;
(2) establishing a deep learning model;
(3) training the model;
(4) acquiring a picture containing print and handwriting;
(5) preprocessing the picture;
(6) feeding the picture to the model to obtain a pixel-level classification result map;
(7) obtaining a print picture and a handwriting picture from the classification result;
(8) performing character recognition on the print picture and the handwriting picture separately.
2. The deep-learning-based print/handwriting-separated character recognition method according to claim 1, wherein step (1), producing the training sample data set, comprises:
(1.1) preparing a sheet of paper on which the printed text is black and the blank areas are white;
(1.2) writing on the paper with a red pen;
(1.3) photographing the paper carrying the written characters and recording the picture as picture a; alternatively, the paper may be scanned with a scanner;
(1.4) preprocessing picture a with an algorithm program, the preprocessing comprising border cropping and binarization into a black-and-white image; the result is recorded as picture b, the input sample of the training model, in which the print and handwriting are black and the background is white;
(1.5) classifying the print, handwriting, and background in picture a at pixel level with an algorithm program and recording the result as picture c; the classification relies on pixels of different colours having distinct values, so the red handwriting is easily separated from the black print and the white background; picture c is the output sample of the training model, with background pixels labelled 0, print pixels 1, and handwriting pixels 2, thereby realizing pixel-level labelling of handwriting, print, and background.
3. The deep-learning-based print/handwriting-separated character recognition method according to claim 1, wherein step (2) establishes a deep learning model, mainly an artificial intelligence model from the field of image semantic segmentation, such as a fully convolutional network (FCN), an FCN with dilated (atrous) convolution, a SegNet network, or a U-Net network.
4. The deep-learning-based print/handwriting-separated character recognition method according to claim 1, wherein step (3), training the model on the training samples, comprises:
(3.1) feeding picture b to the artificial intelligence model as the training input;
(3.2) taking picture c as the target output, picture c containing 3 categories with print labelled 1, handwriting labelled 2, and background labelled 0; once the model and the training samples are ready the model is trained, on a personal computer, a CPU server, or a GPU server, a GPU server being preferable for large sample sets; the model is saved after training and evaluated on the test samples, and if the classification accuracy is high enough it can be used in practical applications.
5. The deep-learning-based print/handwriting-separated character recognition method according to claim 1, wherein step (4) acquires a picture containing both print and handwriting by photographing the paper with a mobile phone, a scanner, or a document camera (high-speed scanner).
6. The deep-learning-based print/handwriting-separated character recognition method according to claim 1, wherein step (5) preprocesses the picture, including rectification and shadow removal; rectification here amounts to border cropping, and shadow removal amounts to binarization.
7. The deep-learning-based print/handwriting-separated character recognition method according to claim 1, wherein step (6), feeding the preprocessed picture to the trained model to obtain a pixel-level classification result map, comprises:
(6.1) loading the trained model;
(6.2) dividing the preprocessed picture into N small pictures, for example N = 4, to improve detection speed;
(6.3) feeding each small picture to the model and obtaining the classification result via multithreaded parallel computation.
8. The deep-learning-based print/handwriting-separated character recognition method according to claim 1, wherein step (7) obtains a print picture and a handwriting picture from the classification result, thereby separating print from handwriting.
9. The deep-learning-based print/handwriting-separated character recognition method according to claim 1, wherein step (8) performs character recognition on the separated print and handwriting pictures, and the character recognition software may be Baidu's open-source PaddleOCR.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111309350.2A CN113901952A (en) | 2021-11-06 | 2021-11-06 | Print form and handwritten form separated character recognition method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113901952A true CN113901952A (en) | 2022-01-07 |
Family
ID=79193517
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111309350.2A Pending CN113901952A (en) | 2021-11-06 | 2021-11-06 | Print form and handwritten form separated character recognition method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113901952A (en) |
- 2021-11-06: application CN202111309350.2A filed (CN); publication CN113901952A; status: active, pending
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114690982A (en) * | 2022-03-31 | 2022-07-01 | 呼和浩特民族学院 | Intelligent teaching method for physics teaching |
CN114690982B (en) * | 2022-03-31 | 2023-03-31 | 呼和浩特民族学院 | Intelligent teaching method for physics teaching |
CN115100656A (en) * | 2022-08-25 | 2022-09-23 | 江西风向标智能科技有限公司 | Blank answer sheet identification method, system, storage medium and computer equipment |
CN115880704A (en) * | 2023-02-16 | 2023-03-31 | 中国人民解放军总医院第一医学中心 | Automatic case cataloging method, system, equipment and storage medium |
CN117115195A (en) * | 2023-10-24 | 2023-11-24 | 成都信息工程大学 | Tamper-proof identification method and tamper-proof identification system based on block chain |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |