CN113901952A - Print form and handwritten form separated character recognition method based on deep learning - Google Patents

Print form and handwritten form separated character recognition method based on deep learning

Info

Publication number
CN113901952A
CN113901952A
Authority
CN
China
Prior art keywords
model, print, picture, handwriting, handwritten
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111309350.2A
Other languages
Chinese (zh)
Inventor
方海泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Xingsuan Technology Co ltd
Original Assignee
Zhejiang Xingsuan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Xingsuan Technology Co ltd filed Critical Zhejiang Xingsuan Technology Co ltd
Priority to CN202111309350.2A
Publication of CN113901952A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

A deep-learning-based method for recognizing characters with the print and the handwriting separated. The method comprises the following steps: after preprocessing, the print and the handwriting are classified and identified with a deep learning model; a print picture and a handwriting picture are then obtained separately according to the classification results; finally, character recognition is performed on the print picture and the handwriting picture respectively. For a document picture containing both print and handwriting, the method achieves separated character recognition of the two; it is fully automatic, intelligent and adaptive, and requires no manual parameter setting.

Description

Print form and handwritten form separated character recognition method based on deep learning
Technical Field
The invention relates to a deep-learning-based method for print/handwriting separated character recognition, and belongs to the field of artificial intelligence and computer vision.
Background
Character recognition on document pictures is a relatively mature technology. However, when a document picture contains both print and handwriting, existing character recognition technology recognizes the print and the handwriting together and cannot separate them. Recognizing the print and the handwriting in such pictures separately has many important applications, such as character recognition on bank notes, automatic correction of student test papers, and conversion of litigation documents into electronic files. The key technical problem in recognizing print and handwriting separately is their classification. The print/handwriting classification problem can be defined as follows: for a picture containing both print and handwriting, achieve pixel-level classification of the print, the handwriting and the background. Traditional machine vision methods struggle to achieve such pixel-level classification, especially when the print and the handwriting cross or overlap. Semantic segmentation techniques from deep learning can achieve pixel-level classification and solve this problem well. Popular semantic segmentation algorithms include the fully convolutional network (FCN) and the FCN with dilated convolution.
Disclosure of Invention
To achieve the purpose of print/handwriting separated character recognition, the invention adopts the following technical scheme:
A deep-learning-based method for print/handwriting separated character recognition, comprising the following steps.
Step (1), making a training sample data set, which comprises the following sub-steps:
(1.1) preparing a piece of paper, on which the printed font is required to be black and the blank areas white;
(1.2) writing on the paper with a red pen;
(1.3) photographing the paper with the written characters; the resulting picture is recorded as picture a. Alternatively, this step can be performed with a scanner;
(1.4) preprocessing picture a with an algorithm program, the preprocessing including border cropping and binarization into a black-and-white picture. The resulting picture is recorded as picture b, which serves as an input sample for the training model. In picture b the print and the handwriting are black and the background is white;
(1.5) classifying the print, the handwriting and the background in picture a at pixel level with an algorithm program; the result is recorded as picture c. The classification principle is that, because different colors correspond to different pixel values, the red handwriting is easily distinguished from the black print and the white background. Picture c serves as an output sample for the training model. Here a background pixel is labeled 0, a print pixel is labeled 1, and a handwriting pixel is labeled 2, thereby achieving pixel-level labeling of the handwriting, the print and the background (a sketch of this labeling is given below).
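As an illustration of sub-step (1.5), the following is a minimal Python sketch of the color-based labeling, assuming picture a is a BGR image readable with OpenCV; the threshold values and the helper name make_label_map are illustrative assumptions, not taken from the patent.

```python
import cv2
import numpy as np

def make_label_map(path):
    """Pixel-level labeling of picture a (sub-step 1.5):
    0 = white background, 1 = black print, 2 = red handwriting."""
    img = cv2.imread(path).astype(np.int16)            # BGR channels
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    labels = np.zeros(img.shape[:2], dtype=np.uint8)   # default: background
    # Dark pixels (all channels low) are taken as black print.
    labels[(r < 100) & (g < 100) & (b < 100)] = 1
    # Reddish pixels (red channel well above green and blue) are handwriting;
    # assigned last so dark-red strokes count as handwriting, not print.
    labels[(r > 120) & (r - g > 50) & (r - b > 50)] = 2
    return labels
```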
Step (2), building a deep learning model, which is mainly an artificial intelligence model from the field of image semantic segmentation; the model can be a fully convolutional network (FCN), an FCN with dilated (atrous) convolution, a SegNet network, a U-Net network, or the like.
Step (3), model training, i.e. inputting the training samples into the model for training, comprises the following sub-steps:
(3.1) Input of training samples
Picture b is fed to the artificial intelligence model as the input of each training sample;
(3.2) Output of training samples
Picture c serves as the target output of each training sample. Picture c contains 3 categories, labeled as follows: print is labeled 1, handwriting is labeled 2, and background is labeled 0;
(3.3) With the model and the training samples prepared, the model can be trained. Training can be performed on a personal computer, a CPU server or a GPU server; if the sample size is large, a GPU server is best. After training, the model is saved. The model is then evaluated on test samples to measure its classification accuracy; if the accuracy is high enough, the model can be used in practical applications.
Step (4), acquiring a picture containing print and handwriting: the paper containing the print and the handwriting is photographed with a mobile phone, a scanner or a high-speed document camera.
Step (5), preprocessing the picture, including picture rectification, shadow removal and the like. Picture rectification can also be understood as border cropping, and shadow removal as binarization.
Step (6), inputting the preprocessed picture into the trained model for detection to obtain a pixel-level classification result map, which comprises the following sub-steps:
(6.1) loading the trained model;
(6.2) dividing the preprocessed picture into N small pictures, for example N = 4; the division into small pictures improves detection speed;
(6.3) inputting each small picture into the model for detection, with the classification results computed in parallel using multiple threads.
Step (7), obtaining a print picture and a handwriting picture respectively according to the classification results, thereby separating the print from the handwriting.
Step (8), performing character recognition on the separated print and handwriting pictures respectively; the character recognition software can be the Baidu open-source program PaddleOCR.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is an original picture containing both print and handwriting;
FIG. 3 is the print separated from the original picture;
FIG. 4 is the handwriting separated from the original picture;
FIG. 5 is the result of character recognition on the separated print;
FIG. 6 is the result of character recognition on the separated handwriting.
Detailed Description
The following describes embodiments of the present invention in further detail with reference to the accompanying drawings.
FIG. 1 is a flow chart of the present invention, which comprises the following steps.
Step (1), making a training sample data set, which comprises the following sub-steps:
(1.1) preparing a piece of paper, on which the printed font is required to be black and the blank areas white;
(1.2) writing on the paper with a red pen;
(1.3) photographing the paper with the written characters; the resulting picture is recorded as picture a. Alternatively, this step can be performed with a scanner;
(1.4) preprocessing picture a with an algorithm program, the preprocessing including border cropping and binarization into a black-and-white picture. The resulting picture is recorded as picture b, which serves as an input sample for the training model. In picture b the print and the handwriting are black and the background is white;
(1.5) classifying the print, the handwriting and the background in picture a at pixel level with an algorithm program; the result is recorded as picture c. The classification principle is that, because different colors correspond to different pixel values, the red handwriting is easily distinguished from the black print and the white background. Picture c serves as an output sample for the training model. Here a background pixel is labeled 0, a print pixel is labeled 1, and a handwriting pixel is labeled 2, thereby achieving pixel-level labeling of the handwriting, the print and the background;
(1.6) For good results, a large number of samples is beneficial; for example, 2200 pictures can be produced, with 2000 used for training and 200 for testing. The characters in the pictures produced so far are mainly Chinese, English and mathematical symbols; picture samples in other languages can be produced in the same way.
Step (2), building a deep learning model, which is mainly an artificial intelligence model from the field of image semantic segmentation; the model can be a fully convolutional network (FCN), an FCN with dilated (atrous) convolution, a SegNet network, a U-Net network, or the like.
For practical engineering applications, the model should be kept as simple as possible while maintaining high classification accuracy, because an overly complex model is computationally expensive and slow. One example model is listed below (a code sketch follows the list):
(1) the input layer of the model is a 3-channel image of 1024 × 1024;
(2) a dilated convolution layer with 32 feature maps, a 5 × 5 kernel and dilation rate = 2;
(3) a Dropout layer with a dropout probability of 20%;
(4) a dilated convolution layer with 32 feature maps, a 5 × 5 kernel and dilation rate = 3;
(5) a pooling layer with a 2 × 2 downsampling factor;
(6) a dilated convolution layer with 64 feature maps, a 5 × 5 kernel and dilation rate = 2;
(7) a Dropout layer with a dropout probability of 20%;
(8) a dilated convolution layer with 64 feature maps, a 5 × 5 kernel and dilation rate = 3;
(9) a pooling layer with a 2 × 2 downsampling factor;
(10) a dilated convolution layer with 64 feature maps, a 5 × 5 kernel and dilation rate = 2;
(11) a Dropout layer with a dropout probability of 20%;
(12) a dilated convolution layer with 64 feature maps, a 5 × 5 kernel and dilation rate = 3;
(13) a convolution layer with 32 feature maps and a 1 × 1 kernel;
(14) a transposed (deconvolution) layer with 3 feature maps, a 4 × 4 kernel and a 4 × 4 stride.
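The following is a minimal Keras sketch of the layer stack above; 'same' padding, ReLU activations and the softmax output are assumptions, since the patent does not specify them. The spatial arithmetic works out: two 2 × 2 poolings reduce 1024 to 256, and the final stride-4 transposed convolution restores 1024 × 1024.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model():
    """Sketch of the 14-layer example architecture: input 1024x1024x3,
    output 1024x1024x3 class scores (background/print/handwriting)."""
    inp = layers.Input(shape=(1024, 1024, 3))                      # (1)
    x = layers.Conv2D(32, 5, dilation_rate=2, padding="same",
                      activation="relu")(inp)                      # (2)
    x = layers.Dropout(0.2)(x)                                     # (3)
    x = layers.Conv2D(32, 5, dilation_rate=3, padding="same",
                      activation="relu")(x)                        # (4)
    x = layers.MaxPooling2D(2)(x)                                  # (5)
    x = layers.Conv2D(64, 5, dilation_rate=2, padding="same",
                      activation="relu")(x)                        # (6)
    x = layers.Dropout(0.2)(x)                                     # (7)
    x = layers.Conv2D(64, 5, dilation_rate=3, padding="same",
                      activation="relu")(x)                        # (8)
    x = layers.MaxPooling2D(2)(x)                                  # (9)
    x = layers.Conv2D(64, 5, dilation_rate=2, padding="same",
                      activation="relu")(x)                        # (10)
    x = layers.Dropout(0.2)(x)                                     # (11)
    x = layers.Conv2D(64, 5, dilation_rate=3, padding="same",
                      activation="relu")(x)                        # (12)
    x = layers.Conv2D(32, 1, activation="relu")(x)                 # (13)
    out = layers.Conv2DTranspose(3, 4, strides=4, padding="same",
                                 activation="softmax")(x)          # (14)
    return tf.keras.Model(inp, out)
```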
Step (3), model training, i.e. inputting the training samples into the model for training, comprises the following sub-steps:
(3.1) Input of training samples
Picture b is fed to the artificial intelligence model as the input of each training sample;
(3.2) Output of training samples
Picture c serves as the target output of each training sample. Picture c contains 3 categories, labeled as follows: print is labeled 1, handwriting is labeled 2, and background is labeled 0;
(3.3) With the model and the training samples prepared, the model can be trained. Training can be performed on a personal computer, a CPU server or a GPU server; if the sample size is large, a GPU server is best. After training, the model is saved. The model is then evaluated on test samples to measure its classification accuracy; if the accuracy is high enough, the model can be used in practical applications (a training sketch is given below).
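A minimal training sketch for sub-step (3.3), reusing build_model from the sketch above; the optimizer, loss, batch size, epoch count and placeholder arrays are illustrative assumptions.

```python
import numpy as np

# Placeholder data: in practice, load the pictures b (inputs) and their
# label maps c (integer targets 0/1/2) from the training sample data set.
x_train = np.zeros((4, 1024, 1024, 3), dtype=np.float32)
y_train = np.zeros((4, 1024, 1024), dtype=np.int32)

model = build_model()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",   # integer label maps
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=2, epochs=1)
model.save("print_handwriting_segmenter.h5")            # save for step (6)
```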
Step (4), acquiring a picture containing print and handwriting: the paper containing the print and the handwriting is photographed with a mobile phone, a scanner or a high-speed document camera. Fig. 2 shows part of an original picture obtained by photographing.
Step (5), preprocessing the picture, including picture rectification, shadow removal and the like. Picture rectification can also be understood as border cropping, and shadow removal as binarization.
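A minimal sketch of step (5) using OpenCV; the fixed crop margin and the adaptive-threshold parameters are illustrative assumptions, since the patent does not specify the preprocessing algorithms.

```python
import cv2

def preprocess(path, margin=20):
    """Step (5): crop borders (rectification) and binarize (shadow removal)."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    h, w = img.shape
    img = img[margin:h - margin, margin:w - margin]   # border cropping
    # Adaptive thresholding compensates for uneven shadows and yields
    # a black-and-white picture.
    return cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 31, 15)
```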
Step (6), inputting the preprocessed picture into the trained model for detection to obtain a pixel-level classification result map, which comprises the following sub-steps:
(6.1) loading the trained model;
(6.2) dividing the preprocessed picture into N small pictures, for example N = 4; the division into small pictures improves detection speed;
(6.3) inputting each small picture into the model for detection, with the classification results computed in parallel using multiple threads.
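A sketch of sub-steps (6.2) and (6.3), assuming N = 4 tiles arranged in a 2 × 2 grid and a model whose input size matches the tile size; the helper names are illustrative assumptions.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def classify_picture(model, img):
    """Split the preprocessed picture into a 2x2 grid of tiles, detect
    each tile in a separate thread, and reassemble the label map."""
    h, w = img.shape[0] // 2, img.shape[1] // 2
    tiles = [img[:h, :w], img[:h, w:], img[h:, :w], img[h:, w:]]

    def detect(tile):
        # Each tile is assumed to match the model's expected input shape.
        scores = model.predict(tile[np.newaxis, ...], verbose=0)[0]
        return np.argmax(scores, axis=-1)          # labels 0/1/2 per pixel

    with ThreadPoolExecutor(max_workers=4) as pool:
        tl, tr, bl, br = pool.map(detect, tiles)
    return np.block([[tl, tr], [bl, br]])
```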
Step (7), obtaining a print picture and a handwriting picture respectively according to the classification results, thereby separating the print from the handwriting. Fig. 3 shows the print separated from the original picture, and Fig. 4 shows the handwriting separated from it.
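A sketch of step (7), under the assumption that separation is done by masking the binarized picture with the label map (the patent does not spell this step out):

```python
import numpy as np

def separate(binary_img, labels):
    """Step (7): use the label map to produce a print-only picture and
    a handwriting-only picture, each on a white background."""
    print_img = np.full_like(binary_img, 255)
    hand_img = np.full_like(binary_img, 255)
    print_img[labels == 1] = binary_img[labels == 1]   # keep print pixels
    hand_img[labels == 2] = binary_img[labels == 2]    # keep handwriting
    return print_img, hand_img
```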
Step (8), performing character recognition on the separated print and handwriting pictures respectively; the character recognition software can be the Baidu open-source program PaddleOCR. For example, FIG. 5 shows the result of character recognition on the separated print, and FIG. 6 shows the result on the separated handwriting. As the recognition results show, most characters are recognized correctly.
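A minimal PaddleOCR usage sketch, assuming the separated pictures have been saved to disk; the file names are illustrative assumptions, and the result structure follows the PaddleOCR 2.x API.

```python
from paddleocr import PaddleOCR

ocr = PaddleOCR(lang="ch")   # Chinese + English recognition model
for path in ("print_only.png", "handwriting_only.png"):
    result = ocr.ocr(path)
    for box, (text, score) in result[0]:
        print(path, text, score)   # recognized text with confidence
```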
Those skilled in the art will appreciate that the invention may be practiced without these specific details.

Claims (9)

1. A deep-learning-based method for print/handwriting separated character recognition, characterized in that the method comprises the following steps:
(1) making a training sample data set containing handwritten and printed fonts;
(2) building a deep learning model;
(3) training the model;
(4) acquiring a picture containing print and handwriting;
(5) preprocessing the picture;
(6) inputting the picture into the model for detection to obtain a pixel-level classification result map;
(7) obtaining a print picture and a handwriting picture respectively according to the classification results;
(8) performing character recognition on the print picture and the handwriting picture respectively.
2. The deep-learning-based method for print/handwriting separated character recognition according to claim 1, characterized in that step (1), making the training sample data set, comprises the following sub-steps:
(1.1) preparing a piece of paper, on which the printed font is required to be black and the blank areas white;
(1.2) writing on the paper with a red pen;
(1.3) photographing the paper with the written characters and recording the resulting picture as picture a; this step can alternatively be performed with a scanner;
(1.4) preprocessing picture a with an algorithm program, the preprocessing including border cropping and binarization into a black-and-white picture; the result is recorded as picture b, which serves as an input sample for the training model; in picture b the print and the handwriting are black and the background is white;
(1.5) classifying the print, the handwriting and the background in picture a at pixel level with an algorithm program and recording the result as picture c; the classification principle is that, because different colors correspond to different pixel values, the red handwriting is easily distinguished from the black print and the white background; picture c serves as an output sample for the training model; here a background pixel is labeled 0, a print pixel is labeled 1 and a handwriting pixel is labeled 2, thereby achieving pixel-level labeling of the handwriting, the print and the background.
3. The deep-learning-based method for print/handwriting separated character recognition according to claim 1, characterized in that step (2) builds a deep learning model that is mainly an artificial intelligence model from the field of image semantic segmentation; the model can be a fully convolutional network (FCN), an FCN with dilated (atrous) convolution, a SegNet network, a U-Net network, or the like.
4. The deep-learning-based method for print/handwriting separated character recognition according to claim 1, characterized in that step (3), model training, i.e. inputting the training samples into the model for training, comprises the following sub-steps:
(3.1) Input of training samples
Picture b is fed to the artificial intelligence model as the input of each training sample;
(3.2) Output of training samples
Picture c serves as the target output of each training sample; picture c contains 3 categories: print labeled 1, handwriting labeled 2 and background labeled 0; with the model and the training samples prepared, the model can be trained, and training can be performed on a personal computer, a CPU server or a GPU server; if the sample size is large, training is preferably performed on a GPU server, and the model is saved after training; the model is then evaluated on test samples to measure its classification accuracy, and if the accuracy is high enough it can be used in practical applications.
5. The deep-learning-based method for print/handwriting separated character recognition according to claim 1, characterized in that step (4), acquiring a picture containing print and handwriting, comprises: photographing the paper containing the print and the handwriting with a mobile phone, a scanner or a high-speed document camera.
6. The deep-learning-based method for print/handwriting separated character recognition according to claim 1, characterized in that step (5) preprocesses the picture, including picture rectification, shadow removal and the like, wherein picture rectification can also be understood as border cropping and shadow removal as binarization.
7. The deep-learning-based method for print/handwriting separated character recognition according to claim 1, characterized in that step (6), inputting the preprocessed picture into the trained model for detection to obtain a pixel-level classification result map, comprises the following sub-steps:
(6.1) loading the trained model;
(6.2) dividing the preprocessed picture into N small pictures, for example N = 4, the division into small pictures serving to improve detection speed;
(6.3) inputting each small picture into the model for detection, with the classification results computed in parallel using multiple threads.
8. The deep-learning-based method for print/handwriting separated character recognition according to claim 1, characterized in that step (7) obtains a print picture and a handwriting picture respectively according to the classification results, thereby separating the print from the handwriting.
9. The deep-learning-based method for print/handwriting separated character recognition according to claim 1, characterized in that step (8) performs character recognition on the separated print and handwriting pictures respectively, wherein the character recognition software can be the Baidu open-source program PaddleOCR.
CN202111309350.2A 2021-11-06 2021-11-06 Print form and handwritten form separated character recognition method based on deep learning Pending CN113901952A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111309350.2A CN113901952A (en) 2021-11-06 2021-11-06 Print form and handwritten form separated character recognition method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111309350.2A CN113901952A (en) 2021-11-06 2021-11-06 Print form and handwritten form separated character recognition method based on deep learning

Publications (1)

Publication Number Publication Date
CN113901952A (en) 2022-01-07

Family

ID=79193517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111309350.2A Pending CN113901952A (en) 2021-11-06 2021-11-06 Print form and handwritten form separated character recognition method based on deep learning

Country Status (1)

Country Link
CN (1) CN113901952A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114690982A (en) * 2022-03-31 2022-07-01 呼和浩特民族学院 Intelligent teaching method for physics teaching
CN114690982B (en) * 2022-03-31 2023-03-31 呼和浩特民族学院 Intelligent teaching method for physics teaching
CN115100656A (en) * 2022-08-25 2022-09-23 江西风向标智能科技有限公司 Blank answer sheet identification method, system, storage medium and computer equipment
CN115880704A (en) * 2023-02-16 2023-03-31 中国人民解放军总医院第一医学中心 Automatic case cataloging method, system, equipment and storage medium
CN117115195A (en) * 2023-10-24 2023-11-24 成都信息工程大学 Tamper-proof identification method and tamper-proof identification system based on block chain

Similar Documents

Publication Publication Date Title
CN110210413B (en) Multidisciplinary test paper content detection and identification system and method based on deep learning
CN109948510B (en) Document image instance segmentation method and device
Karthick et al. Steps involved in text recognition and recent research in OCR; a study
CN113901952A (en) Print form and handwritten form separated character recognition method based on deep learning
US8494273B2 (en) Adaptive optical character recognition on a document with distorted characters
Marinai Introduction to document analysis and recognition
CN109784342B (en) OCR (optical character recognition) method and terminal based on deep learning model
CN114299528B (en) Information extraction and structuring method for scanned document
CN111461122B (en) Certificate information detection and extraction method
US20240037969A1 (en) Recognition of handwritten text via neural networks
CN109635805B (en) Image text positioning method and device and image text identification method and device
CN113128442A (en) Chinese character calligraphy style identification method and scoring method based on convolutional neural network
Demilew et al. Ancient Geez script recognition using deep learning
CN116071763B (en) Teaching book intelligent correction system based on character recognition
CN110956167A (en) Classification discrimination and strengthened separation method based on positioning characters
CN112446259A (en) Image processing method, device, terminal and computer readable storage medium
Kaundilya et al. Automated text extraction from images using OCR system
CN112508000B (en) Method and equipment for generating OCR image recognition model training data
Sarkar et al. Suppression of non-text components in handwritten document images
CN109508712A (en) A kind of Chinese written language recognition methods based on image
CN113139535A (en) OCR document recognition method
Ovodov Optical braille recognition using object detection neural network
CN110766001B (en) Bank card number positioning and end-to-end identification method based on CNN and RNN
Aravinda et al. Template matching method for Kannada handwritten recognition based on correlation analysis
CN112036330A (en) Text recognition method, text recognition device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination