CN108985175B - Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning - Google Patents
Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning
- Publication number
- CN108985175B (application number CN201810634249.6A)
- Authority
- CN
- China
- Prior art keywords
- picture
- standard peripheral
- character
- peripheral outline
- handwritten chinese
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/30—Writer recognition; Reading and verifying signatures
- G06V40/33—Writer recognition; Reading and verifying signatures based only on signature image, e.g. static signature recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Character Input (AREA)
Abstract
The invention provides a handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning, comprising the following steps: (1) Character cutting: the Chinese sentence set to be recognized is written on special paper printed with standard peripheral outlines, the paper is scanned, and the scanned picture is cut into single-character pictures, each with its standard peripheral outline. (2) Picture processing: the standard peripheral outline is removed from each single-character picture, which is then enlarged and binarized. (3) Character recognition: a recognition module is called on each processed single-character picture to obtain the recognition result. Because the character cutting step relies on the standard peripheral outline, it effectively avoids mis-cutting characters and improves the accuracy of character recognition.
Description
Technical Field
The invention belongs to the technical field of image processing and relates to the recognition of handwritten Chinese. Specifically, it discloses a handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning.
Background
The examination system is an important mechanism for talent selection in China and an objective reflection of what students have learned and what teachers have taught. Among the many examination subjects, Chinese composition, as an expression of reading and writing ability, is indispensable in China's primary and middle schools. Chinese compositions have traditionally been graded on paper. With the advent of the "paperless" era, however, paper grading can no longer meet the daily needs of primary and middle schools, and it has several disadvantages. For example, loosely bound papers can unintentionally reveal student information to the teacher, whose subjective impression of the individual may then make the examination unfair. In addition, when compositions are graded on paper for routine practice, the teacher must write comments by hand, which lengthens the grading cycle and reduces students' opportunities to practice. To solve these problems, electronic grading is becoming the mainstream grading method. Electronic grading eliminates the binding of paper answer sheets, which improves grading efficiency, gives students more opportunities to practice, and avoids the unfairness that paper grading can cause.
In conventional electronic grading of Chinese compositions, the teacher begins grading directly after the test papers are scanned, so the teacher still reads handwritten Chinese characters. Writing styles differ from person to person, and when there are many papers the grader easily suffers visual fatigue, which greatly increases the probability of misjudgment. To relieve the visual fatigue caused by long grading sessions, a character recognition system is therefore needed that converts differing handwritten Chinese sentence sets into a uniform printed form.
Most character recognition systems operate on individual characters, so recognizing a handwritten Chinese sentence set first requires cutting it into individual characters. Many character segmentation methods have been studied in China and abroad, including recognition-based segmentation, projection-based segmentation, and methods that analyze specific background regions and apply different segmentation strategies to specific touching digits. However, existing character cutting methods still suffer from low cutting efficiency and poor real-time recognition performance. How to cut out each character accurately has therefore become an important problem in character recognition research.
Disclosure of Invention
To overcome the shortcomings of the prior art in this field, the invention provides a handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning, with the following technical scheme:
Character cutting: the Chinese sentence set to be recognized is written on special paper printed with standard peripheral outlines, the paper is scanned, and the scanned picture is cut into single-character pictures, each with its standard peripheral outline.
Picture processing: the standard peripheral outline is removed from each single-character picture, which is then enlarged and binarized.
Character recognition: a recognition module is called on each processed single-character picture to obtain the recognition result.
Compared with the prior art, the invention has the following advantages:
(1) The character cutting step introduces the standard peripheral outline, which effectively avoids mis-cutting characters and improves the accuracy of character recognition.
(2) The introduction of deep learning largely makes up for the shortcomings of traditional character recognition techniques: it enables recognition of handwritten Chinese characters, greatly improves recognition accuracy, and is robust for characters with complex backgrounds and low resolution.
(3) The modifications to the deep convolutional neural network structure reduce the complexity of the network and improve its portability.
Drawings
FIG. 1 shows the special paper with standard peripheral outlines;
FIG. 2 shows a handwritten Chinese sentence set to be recognized;
FIG. 3 shows a single-character picture after character cutting;
FIG. 4 shows a single-character picture with the standard peripheral outline removed;
FIG. 5 shows an enlarged, binarized single-character picture;
FIG. 6 shows the result of character recognition.
Detailed Description
The invention will be further described in detail by means of specific embodiments with reference to the accompanying drawings.
A handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning mainly comprises character cutting, picture processing, and character recognition, which together convert picture information into text information. Character cutting and picture processing are implemented by calling the OpenCV interface on the Visual Studio platform, and the character recognition module uses an improved version of the AlexNet model under the open-source Caffe framework.
Character cutting: the Chinese sentence set to be recognized is written on the special paper with standard peripheral outlines shown in FIG. 1, and the paper is scanned, as shown in FIG. 2. The scanned picture is cut using the minimum circumscribed rectangle method: the picture of the handwritten Chinese sentence set with standard peripheral outlines is cut into single-character pictures, each with its standard peripheral outline, which are stored in sequence, as shown in FIG. 3.
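The cutting step can be illustrated with a minimal sketch. It is not the patent's OpenCV implementation. It assumes that the scan has already been thresholded into a mask whose truthy pixels belong to the printed outlines, that every character cell has its own closed frame that does not touch its neighbours, and that the page is not skewed, so that the axis-aligned bounding box of each frame stands in for its minimum circumscribed rectangle. The names `bounding_boxes` and `reading_order` are hypothetical.

```python
from collections import deque


def bounding_boxes(mask):
    """Return (top, left, bottom, right) for each 4-connected group of truthy pixels."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood-fill one outline frame and track its extreme coordinates.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def reading_order(boxes):
    """Sort boxes top-to-bottom, then left-to-right (assumes frames in a row share one top edge)."""
    return sorted(boxes, key=lambda box: (box[0], box[1]))
```

Each returned box can then be used to crop one single-character picture, and saving the crops in the order given by `reading_order` preserves the sequence of the sentence. With OpenCV, `cv2.findContours` followed by `cv2.boundingRect` would likely play the same role.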
Picture processing: the standard peripheral outline of each single-character picture in FIG. 3 is removed by adjusting the RGB color channel threshold, as shown in FIG. 4. In practice, most input handwritten Chinese character pictures are multi-channel images. For the sake of recognition accuracy, each single-character picture with its outline removed is enlarged and converted into a binary image by adjusting the color channel threshold, and the picture is then saved, as shown in FIG. 5.
Character recognition: the recognition module is called on each single-character picture in sequence to obtain the recognition result, as shown in FIG. 6. The recognition module must be trained before it is called, and it can be trained in advance through the following steps:
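The three operations of the picture processing step (outline removal, enlargement, binarization) can be sketched in plain Python on an image stored as rows of `(r, g, b)` tuples. This is an illustration under assumptions, not the patent's code: the patent does not state the outline color, the threshold values, or the interpolation method, so the sketch assumes a reddish outline, hypothetical thresholds (`r_min`, `gb_max`, `threshold`), and nearest-neighbour enlargement by an integer factor.

```python
WHITE = (255, 255, 255)


def is_outline(pixel, r_min=150, gb_max=100):
    """Assumed rule: outline pixels are strongly red and weakly green/blue."""
    r, g, b = pixel
    return r >= r_min and g <= gb_max and b <= gb_max


def remove_outline(image):
    """Paint every outline-colored pixel white, leaving the handwriting untouched."""
    return [[WHITE if is_outline(p) else p for p in row] for row in image]


def enlarge(image, factor):
    """Nearest-neighbour enlargement: repeat each pixel and each row `factor` times."""
    enlarged = []
    for row in image:
        wide_row = [p for p in row for _ in range(factor)]
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged


def binarize(image, threshold=128):
    """Map dark pixels (ink) to 1 and light pixels (background) to 0."""
    return [[1 if sum(p) / 3 < threshold else 0 for p in row] for row in image]


def process(image, factor=2):
    """Apply the steps in the order the description gives them."""
    return binarize(enlarge(remove_outline(image), factor))
```

Removing the outline before binarization matters in this sketch: a dark outline pixel that survived to the threshold step would be indistinguishable from ink.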
Step one: parse the GNT-format data of the HWDB1.1 dataset, binarize it, divide the 240 × 1000 handwritten Chinese character samples into a training set and a test set, and convert both into LMDB-format datasets usable by Caffe.
Step two: perform the initial configuration of the convolutional neural network.
Step three: use an improved version of the deep convolutional neural network AlexNet to perform repeated supervised learning on the 1000-class handwritten Chinese character training set, continuously adjusting the connection weights between the layers of the network according to the learning error. Each time the learning has been repeated a set number of times, the test set is fed to the network and the test accuracy is obtained. The improvements to the AlexNet model are as follows: the number of convolution kernels in convolutional layer 1 is changed from 96 to 80, with the corresponding pooling layer adjusted accordingly, and the first fully connected layer is removed.
Step four: when the learning error of the convolutional neural network is lower than a first preset value and the test accuracy is higher than a second preset value, stop training and save the connection weights between the layers of the current network to obtain the optimal recognition model.
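The data preparation in step one can be sketched as follows. The record layout used here is an assumption taken from commonly circulated descriptions of the CASIA-HWDB GNT format (a 4-byte little-endian record size, a 2-byte GB-encoded label, a 2-byte width, a 2-byte height, then width × height grayscale bytes). It should be checked against the dataset's own documentation. The split rule (`test_every`) is hypothetical because the patent gives no train/test ratio, and the conversion to LMDB is omitted.

```python
import struct


def read_gnt(stream):
    """Yield (label, width, height, pixels) for each record in a GNT byte stream."""
    while True:
        header = stream.read(10)
        if len(header) < 10:
            return  # end of file
        _record_size, tag, width, height = struct.unpack("<I2sHH", header)
        pixels = stream.read(width * height)
        if len(pixels) < width * height:
            raise ValueError("truncated GNT record")
        label = tag.decode("gb2312", errors="replace")
        yield label, width, height, pixels


def binarize(pixels, threshold=128):
    """Map grayscale bytes to 0 (ink) or 255 (background); the threshold is an assumed value."""
    return bytes(0 if p < threshold else 255 for p in pixels)


def split_train_test(records, test_every=5):
    """Send every `test_every`-th record to the test set and the rest to the training set."""
    train, test = [], []
    for index, record in enumerate(records):
        if index % test_every == test_every - 1:
            test.append(record)
        else:
            train.append(record)
    return train, test
```

A file would be read with `open(path, "rb")` and passed to `read_gnt`. Keeping only the 1000 target classes and 240 samples per class is left to the caller.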
The foregoing is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make various changes and modifications without departing from the inventive concept, and all such changes and modifications fall within the scope of protection of the invention.
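The training schedule in steps three and four of the embodiment (test after a set number of iterations, stop once both preset values are met) can be sketched as a framework-independent loop. This is a hedged illustration rather than Caffe code: `train_step` and `evaluate` are hypothetical callables standing in for one supervised update and one pass over the test set, and all parameter names are invented for the example.

```python
def train_until_good(train_step, evaluate, test_interval,
                     max_error, min_accuracy, max_iterations):
    """Train until error < max_error and accuracy > min_accuracy at a test point.

    train_step() performs one weight update and returns the learning error.
    evaluate() returns the test accuracy as a fraction in [0, 1].
    Returns (iteration, error, accuracy) on success, or None if the
    iteration budget runs out first.
    """
    for iteration in range(1, max_iterations + 1):
        error = train_step()
        # Only test after every `test_interval` training iterations.
        if iteration % test_interval != 0:
            continue
        accuracy = evaluate()
        if error < max_error and accuracy > min_accuracy:
            # The caller would save the current connection weights here.
            return iteration, error, accuracy
    return None
```

The iteration budget is an addition of this sketch so that the loop always terminates. The description itself only states the two stopping thresholds.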
Claims (3)
1. A handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning, comprising the following steps:
(1) preparing special paper with standard peripheral outlines, writing the Chinese sentence set to be recognized on the paper, and scanning it;
(2) cutting the scanned picture using the minimum circumscribed rectangle method, so that the picture of the handwritten Chinese sentence set with standard peripheral outlines is cut into single-character pictures with standard peripheral outlines, and storing the single-character pictures in sequence;
(3) removing the standard peripheral outline from each single-character picture by adjusting the RGB color channel threshold;
(4) enlarging each single-character picture from which the standard peripheral outline has been removed, converting it into a binary picture by adjusting the color channel threshold, and then saving the picture;
(5) calling the character recognition module on the single-character pictures in sequence to obtain the recognition result.
2. The handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning of claim 1, wherein: the recognition module must be trained before it is called; using the handwritten Chinese character dataset HWDB1.1, 240 × 1000 handwritten Chinese character samples are divided into a test set and a training set, and a deep convolutional neural network AlexNet model is called to repeatedly train on and predict the 1000 classes of handwritten Chinese characters, finally obtaining the optimal recognition model and its weight parameters.
3. The handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning of claim 2, wherein: the number of convolution kernels in the first convolutional layer of the AlexNet model is 80, and the first fully connected layer is removed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810634249.6A CN108985175B (en) | 2018-06-20 | 2018-06-20 | Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108985175A | 2018-12-11 |
CN108985175B | 2021-06-04 |
Family
ID=64540752
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810634249.6A Active CN108985175B (en) | 2018-06-20 | 2018-06-20 | Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108985175B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110399850B (en) * | 2019-07-30 | 2021-10-15 | 西安工业大学 | Continuous sign language recognition method based on deep neural network |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102750556A (en) * | 2012-06-01 | 2012-10-24 | 山东大学 | Off-line handwritten form Chinese character recognition method |
CN103513898A (en) * | 2012-06-21 | 2014-01-15 | 夏普株式会社 | Handwritten character segmenting method and electronic equipment |
CN104239879A (en) * | 2014-09-29 | 2014-12-24 | 小米科技有限责任公司 | Character segmentation method and device |
CN104484643A (en) * | 2014-10-27 | 2015-04-01 | 中国科学技术大学 | Intelligent identification method and system for hand-written table |
CN105574486A (en) * | 2015-11-25 | 2016-05-11 | 成都数联铭品科技有限公司 | Image table character segmenting method |
CN105654087A (en) * | 2015-12-30 | 2016-06-08 | 李宇 | Color template-based offline handwritten character extraction method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101930545A (en) * | 2009-06-24 | 2010-12-29 | 夏普株式会社 | Handwriting recognition method and device |
Also Published As
Publication number | Publication date |
---|---|
CN108985175A (en) | 2018-12-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||