CN108985175B - Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning
- Publication number
- CN108985175B
- Authority
- CN
- China
- Prior art keywords
- standard peripheral
- picture
- peripheral outline
- character
- handwritten chinese
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/30—Writer recognition; Reading and verifying signatures
- G06V40/33—Writer recognition; Reading and verifying signatures based only on signature image, e.g. static signature recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Character Input (AREA)
Abstract
The invention provides a handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning, comprising the following steps. (1) Character cutting: the Chinese sentence set to be recognized is written on special paper printed with standard peripheral outlines, the paper is scanned, and the scanned picture is cut into single-character pictures, each still carrying its standard peripheral outline. (2) Picture processing: the standard peripheral outline is removed from each cut single-character picture, and the picture is then enlarged and binarized. (3) Character recognition: a recognition module is called on each processed single-character picture to obtain the recognition result. Because the character cutting step relies on the standard peripheral outline, it effectively avoids mis-cut characters and improves the accuracy of character recognition.
Description
Technical Field
The invention belongs to the technical field of image processing, relates to the recognition of handwritten Chinese, and in particular provides a handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning.
Background
The examination system is an important mechanism for talent selection in China and an objective reflection of what students have learned and how well teachers have taught. Among the many examination subjects, Chinese composition, as a demonstration of reading and writing ability, is an indispensable subject in primary and secondary schools in China. Chinese compositions have traditionally been marked on paper. With the arrival of the "paperless" era, however, paper-based marking can no longer meet the daily needs of primary and secondary schools and has many drawbacks. For example, loose binding may let a teacher inadvertently see student information, so that marks are influenced by personal subjective impressions and the examination becomes unfair; and when compositions written for routine practice are marked on paper, the teacher must write comments by hand, which lengthens the marking cycle and reduces the students' opportunities to practice. To address these problems, electronic marking is becoming the mainstream marking method. Electronic marking eliminates the binding of paper answer sheets, which improves marking efficiency, gives students more practice opportunities, and avoids the unfairness that paper marking can cause.
In conventional electronic marking of Chinese compositions, the teacher starts marking directly after the test papers are scanned, so the teacher still reads handwritten Chinese characters. Handwriting styles differ from person to person, and when the number of test papers is large the marking teacher easily suffers visual fatigue, which greatly increases the probability of misjudgment. Therefore, to relieve the visual fatigue caused by long marking sessions, a character recognition system is needed that converts differently handwritten Chinese sentence sets into a uniform printed form.
Most character recognition systems operate on individual characters, so recognizing a handwritten Chinese sentence set requires first cutting it into individual characters. Many character segmentation methods have been studied in China and abroad, including recognition-based segmentation, projection-based segmentation, and methods that analyze specific background regions and apply different segmentation strategies to specific touching digits. However, existing character cutting methods still suffer from low cutting efficiency, poor real-time recognition performance, and other shortcomings. How to cut out each character accurately has therefore become an important problem in character recognition research.
Disclosure of Invention
To overcome the shortcomings of the prior art in this field, the invention provides a handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning. The technical scheme is as follows:
Character cutting: the Chinese sentence set to be recognized is written on special paper printed with standard peripheral outlines, the paper is scanned, and the scanned picture is cut into single-character pictures, each still carrying its standard peripheral outline.
Picture processing: the standard peripheral outline is removed from each cut single-character picture, and the picture is then enlarged and binarized.
Character recognition: a recognition module is called on each processed single-character picture to obtain the recognition result.
Compared with the prior art, the invention has the following advantages:
(1) The character cutting step introduces the standard peripheral outline, which effectively avoids mis-cut characters and improves the accuracy of character recognition.
(2) The introduction of deep learning largely makes up for the shortcomings of traditional character recognition techniques, enables the recognition of handwritten Chinese characters, greatly improves recognition accuracy, and is robust to characters with complex backgrounds and low resolution.
(3) The modified deep convolutional neural network structure reduces the complexity of the network and improves its portability.
Drawings
FIG. 1 shows the special paper with standard peripheral outlines;
FIG. 2 shows a handwritten Chinese sentence set to be recognized;
FIG. 3 shows the single-character pictures obtained after character cutting;
FIG. 4 shows the single-character pictures with the standard peripheral outline removed;
FIG. 5 shows the enlarged, binarized single-character pictures;
FIG. 6 shows the result of character recognition.
Detailed Description
The invention is described in further detail below through specific embodiments with reference to the accompanying drawings.
A handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning mainly comprises character cutting, picture processing, and character recognition, and thereby converts picture information into text information. Character cutting and picture processing are implemented by calling the OpenCV interface on the Visual Studio platform, and the character recognition module is implemented with a modified AlexNet model under the open-source Caffe framework.
Character cutting: the Chinese sentence set to be recognized is written on the special paper with standard peripheral outlines shown in FIG. 1 and then scanned, giving the picture shown in FIG. 2. The scanned picture is cut by the minimum circumscribed rectangle method, so that the handwritten Chinese sentence set picture is divided into single-character pictures, each still carrying its standard peripheral outline, which are saved in sequence, as shown in FIG. 3.
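By way of illustration only, the cutting step may be sketched as follows. The sketch is a dependency-free stand-in and not the claimed OpenCV implementation: it assumes the outlines are printed in red, that neighbouring cells do not share outline pixels, and that cells in one row start on the same scan line. The names `is_outline`, `outline_boxes`, and `cut_characters` and the threshold values are hypothetical. Each connected outline is traced and its axis-aligned bounding rectangle is used to crop one cell.

```python
from collections import deque


def is_outline(pixel, min_red=150, max_other=100):
    """Hypothetical colour rule: the printed cell outline is assumed to be red."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def outline_boxes(image):
    """Return one inclusive (top, left, bottom, right) box per connected outline."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or not is_outline(image[y][x]):
                continue
            # Breadth-first search over 4-connected outline pixels.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and is_outline(image[ny][nx])):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    # Reading order: sort by top edge, then by left edge.
    return sorted(boxes)


def cut_characters(image):
    """Crop one single-character picture (outline included) per box, in order."""
    return [[row[left:right + 1] for row in image[top:bottom + 1]]
            for top, left, bottom, right in outline_boxes(image)]
```

On a real scan the top edges of cells in one row rarely coincide exactly, so the simple sort above would likely need a tolerance when grouping boxes into rows.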
Picture processing: the standard peripheral outline of each single-character picture in FIG. 3 is removed by adjusting the RGB color channel thresholds, as shown in FIG. 4. In practice, most input handwritten Chinese character pictures are multi-channel images; for the sake of recognition accuracy, each single-character picture with its outline removed is enlarged, converted into a binary image by adjusting the color channel threshold, and then saved, as shown in FIG. 5.
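The picture processing step may be pictured with the following minimal sketch, given under stated assumptions rather than as the patented implementation: the outline is assumed to be red, enlargement is done by nearest-neighbour replication with an integer factor, and binarization thresholds the mean of the three channels. The function names and all threshold values are hypothetical.

```python
WHITE = (255, 255, 255)


def remove_outline(cell, min_red=150, max_other=100):
    """Paint pixels matching the assumed red outline colour white."""
    return [[WHITE if (r >= min_red and g <= max_other and b <= max_other) else (r, g, b)
             for r, g, b in row]
            for row in cell]


def enlarge(cell, factor):
    """Nearest-neighbour upscaling: repeat every pixel and every row `factor` times."""
    enlarged = []
    for row in cell:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged


def binarize(cell, threshold=128):
    """Map dark pixels (ink) to 0 and everything else to 255."""
    return [[0 if (r + g + b) / 3 < threshold else 255 for r, g, b in row]
            for row in cell]


def process_cell(cell, factor=2):
    """Remove the outline first, then enlarge and binarize the remaining strokes."""
    return binarize(enlarge(remove_outline(cell), factor))
```

The order matters in this sketch: a saturated red outline has a low mean intensity, so binarizing before removing it would turn the outline into ink.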
Character recognition: the recognition module is called on each single-character picture in sequence to obtain the recognition result, as shown in FIG. 6. The recognition module must be trained before it is called; it can be obtained in advance through the following steps:
Step one: parse the GNT-format data of the HWDB1.1 data set, binarize it, divide the 240 × 1000 handwritten Chinese character samples into a training set and a test set, and convert both sets into the LMDB format used by Caffe.
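As a hedged illustration of step one, the sketch below parses GNT records, binarizes them, and splits them. It assumes the record layout commonly documented for the CASIA HWDB GNT files (a 4-byte little-endian record size, a 2-byte GB-encoded label, 2-byte width and height, then width × height grayscale bytes); the patent does not restate this layout, nor does it say how the split is made, so the shuffled split, the threshold, and the function names are all hypothetical. Conversion to LMDB is left out because it depends on the Caffe tooling.

```python
import io
import random
import struct


def read_gnt(stream):
    """Yield (label, width, height, pixels) for each record in a GNT byte stream."""
    while True:
        header = stream.read(10)
        if len(header) < 10:
            return  # end of file
        # Assumed layout: uint32 record size, 2-byte GB code, uint16 width, uint16 height.
        _record_size, tag, width, height = struct.unpack("<I2sHH", header)
        pixels = stream.read(width * height)
        if len(pixels) < width * height:
            return  # truncated record
        yield tag.decode("gb2312", errors="replace"), width, height, pixels


def binarize_pixels(pixels, threshold=128):
    """Map grayscale bytes to 0 (ink) or 255 (background); threshold is an assumption."""
    return bytes(0 if value < threshold else 255 for value in pixels)


def split_samples(samples, test_fraction=0.2, seed=0):
    """Shuffle reproducibly and return (training set, test set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    # Build one synthetic 2x2 record in memory instead of reading a real file.
    record = struct.pack("<I2sHH", 14, "\u554a".encode("gb2312"), 2, 2) + bytes([0, 255, 100, 200])
    for label, width, height, pixels in read_gnt(io.BytesIO(record)):
        print(label, width, height, list(binarize_pixels(pixels)))
```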
Step two: set the initial configuration of the convolutional neural network.
Step three: use the modified deep convolutional neural network AlexNet to perform repeated supervised learning on the 1000-class handwritten Chinese character training set, continuously adjusting the connection weights between the layers of the network according to the learning error. Each time learning has been repeated a set number of times, the test set is fed to the network for testing and the test accuracy is obtained. The modifications to the AlexNet model are: the number of convolution kernels in convolutional layer 1 is changed from 96 to 80, with the corresponding pooling layer adjusted accordingly; and the first fully connected layer is removed.
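To give a rough sense of how reducing the kernel count lowers network complexity, the weights of the first two convolutional layers can be counted as below. The figures are assumptions taken from the commonly used reference AlexNet configuration (11 × 11 kernels on a 3-channel input for layer 1; 256 kernels of 5 × 5 in 2 groups for layer 2); the patent does not state the input channel count or the configuration of later layers, so the numbers are indicative only.

```python
def conv_params(out_channels, in_channels, kernel, groups=1):
    """Learnable parameters of a convolutional layer: weights plus one bias per kernel."""
    weights = out_channels * (in_channels // groups) * kernel * kernel
    return weights + out_channels


def first_two_layers(conv1_kernels, in_channels=3):
    """Parameter counts of conv1 and conv2 under the assumed reference settings."""
    conv1 = conv_params(conv1_kernels, in_channels, kernel=11)
    # conv2 sees the conv1 output channels, split into two groups.
    conv2 = conv_params(256, conv1_kernels, kernel=5, groups=2)
    return conv1, conv2


if __name__ == "__main__":
    for kernels in (96, 80):
        conv1, conv2 = first_two_layers(kernels)
        print(f"conv1 kernels={kernels}: conv1={conv1}, conv2={conv2}")
```

Under these assumptions the change from 96 to 80 kernels reduces both the first layer and, because it narrows the input to the second layer, the layer that follows it.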
Step four: when the learning error of the convolutional neural network is lower than a first preset value and the test accuracy is higher than a second preset value, stop training and save the current connection weights between the layers of the network to obtain the optimal recognition model.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various changes and modifications without departing from the inventive concept, and all such changes and modifications fall within the scope of protection of the present invention.
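The interplay of periodic testing in step three and the stopping rule in step four may be sketched as a plain loop. This is an illustrative outline, not Caffe solver code: `train_step` and `evaluate` are hypothetical callbacks standing in for one supervised update and one pass over the test set, and the two preset values, the test interval, and the iteration cap are placeholders.

```python
def train_until_converged(train_step, evaluate, max_loss, min_accuracy,
                          test_every=100, max_iters=10000):
    """Run training until loss < max_loss and accuracy > min_accuracy at a test point.

    Returns (iterations run, last loss, last accuracy). The caller is expected
    to save the network weights when the function returns.
    """
    loss, accuracy = None, None
    for iteration in range(1, max_iters + 1):
        loss = train_step()              # one supervised update, returns learning error
        if iteration % test_every == 0:  # test after a set number of repetitions
            accuracy = evaluate()
            if loss < max_loss and accuracy > min_accuracy:
                return iteration, loss, accuracy
    return max_iters, loss, accuracy


if __name__ == "__main__":
    # Toy stand-ins: the error shrinks and the accuracy grows with the step count.
    state = {"step": 0}

    def fake_step():
        state["step"] += 1
        return 1.0 / state["step"]

    def fake_eval():
        return min(1.0, state["step"] / 500)

    print(train_until_converged(fake_step, fake_eval, max_loss=0.01, min_accuracy=0.9))
```

The iteration cap is an addition of the sketch; it keeps the loop from running indefinitely if the two preset values are never reached together.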
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810634249.6A CN108985175B (en) | 2018-06-20 | 2018-06-20 | Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108985175A (en) | 2018-12-11 |
CN108985175B (en) | 2021-06-04 |
Family
ID=64540752
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810634249.6A Active CN108985175B (en) | 2018-06-20 | 2018-06-20 | Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108985175B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110399850B (en) * | 2019-07-30 | 2021-10-15 | 西安工业大学 | A Continuous Sign Language Recognition Method Based on Deep Neural Network |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102750556A (en) * | 2012-06-01 | 2012-10-24 | 山东大学 | Off-line handwritten form Chinese character recognition method |
CN103513898A (en) * | 2012-06-21 | 2014-01-15 | 夏普株式会社 | Handwritten character segmenting method and electronic equipment |
CN104239879A (en) * | 2014-09-29 | 2014-12-24 | 小米科技有限责任公司 | Character segmentation method and device |
CN104484643A (en) * | 2014-10-27 | 2015-04-01 | 中国科学技术大学 | Intelligent identification method and system for hand-written table |
CN105574486A (en) * | 2015-11-25 | 2016-05-11 | 成都数联铭品科技有限公司 | Image table character segmenting method |
CN105654087A (en) * | 2015-12-30 | 2016-06-08 | 李宇 | Color template-based offline handwritten character extraction method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101930545A (en) * | 2009-06-24 | 2010-12-29 | 夏普株式会社 | Handwriting recognition method and device |
Also Published As
Publication number | Publication date |
---|---|
CN108985175A (en) | 2018-12-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||