CN108985175B - Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning

Info

Publication number
CN108985175B
Authority
CN
China
Prior art keywords
picture
standard peripheral
character
peripheral outline
handwritten chinese
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810634249.6A
Other languages
Chinese (zh)
Other versions
CN108985175A (en)
Inventor
王琦琦
尹成娟
王以忠
杨国威
许素霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University of Science and Technology
Original Assignee
Tianjin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-06-20
Filing date
2018-06-20
Publication date
2021-06-04
Application filed by Tianjin University of Science and Technology
Priority to CN201810634249.6A
Publication of CN108985175A
Application granted
Publication of CN108985175B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/30 Writer recognition; Reading and verifying signatures
    • G06V40/33 Writer recognition; Reading and verifying signatures based only on signature image, e.g. static signature recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Character Input (AREA)

Abstract

The invention provides a handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning, comprising the following steps: (1) Character cutting: the Chinese sentence set to be recognized is written on special paper printed with a standard peripheral outline, the paper is scanned, and the scanned picture is cut into single-character pictures, each with its standard peripheral outline. (2) Picture processing: the standard peripheral outline is removed from each cut single-character picture, and the picture is enlarged and binarized. (3) Character recognition: a recognition module is called to recognize each processed single-character picture and obtain the recognition result. Because the character cutting step introduces the standard peripheral outline, mis-cutting of characters is effectively avoided and the accuracy of character recognition is improved.

Description

Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning
Technical Field
The invention belongs to the technical field of image processing, relates to the recognition of handwritten Chinese, and in particular to a handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning.
Background
The examination system is an important mechanism for talent selection in China and an objective reflection of what students have learned and what teachers have taught. Among the many examination subjects, Chinese composition, as an expression of reading and writing ability, has become an indispensable examination subject in primary and middle schools in China. Traditional marking of Chinese compositions is done on paper. However, with the advent of the "paperless" era, paper marking can no longer meet the daily needs of primary and middle schools and has many disadvantages. For example, loose binding may let a teacher unintentionally see student information, so that personal subjective impressions lead to unfair marking; and when compositions are marked on paper for everyday practice, the teacher must write comments by hand, which lengthens the marking cycle and reduces students' opportunities to practice. To solve these problems, electronic marking is becoming the mainstream marking method. Electronic marking eliminates the binding of paper answer sheets, which improves marking efficiency, gives students more opportunities to practice, and avoids the unfairness caused by paper marking.
In traditional electronic marking of Chinese compositions, the teacher begins marking directly after the test papers are scanned, so the teacher still reads handwritten Chinese characters. Different people have different writing styles, and when the number of test papers is large, the marking teacher easily suffers visual fatigue, which greatly increases the probability of misjudgment. Therefore, to relieve the visual fatigue caused by long marking sessions, a character recognition system is needed that converts different handwritten Chinese sentence sets into a uniform printed form.
Most character recognition systems operate on individual characters, so to recognize a handwritten Chinese sentence set, the individual characters must first be cut out. Many character segmentation methods have been studied in China and abroad, including recognition-based segmentation, projection-based segmentation, and methods that analyze specific background regions and apply different segmentation strategies to specific touching digits. However, existing character cutting methods still suffer from low cutting efficiency, poor real-time recognition performance, and other shortcomings. How to cut out each character accurately has therefore become an important problem in character recognition research.
Disclosure of Invention
To overcome the shortcomings of the prior art in the related field, the invention provides a handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning, which adopts the following technical scheme:
Character cutting: the Chinese sentence set to be recognized is written on special paper printed with a standard peripheral outline, the paper is scanned, and the scanned picture is cut into single-character pictures, each with its standard peripheral outline.
Picture processing: the standard peripheral outline is removed from each cut single-character picture, and the picture is enlarged and binarized.
Character recognition: a recognition module is called to recognize each processed single-character picture and obtain the recognition result.
Compared with the prior art, the invention has the following advantages:
(1) The character cutting part of the invention introduces the standard peripheral outline, which effectively avoids mis-cutting of characters and improves the accuracy of character recognition.
(2) The introduction of deep learning largely makes up for the shortcomings of traditional character recognition technology, realizes the recognition of handwritten Chinese characters, greatly improves the accuracy of character recognition, and is robust for characters with complex backgrounds and low resolution.
(3) The improvement of the deep convolutional neural network structure reduces the complexity of the network and improves the portability of the network.
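The reduction in complexity claimed in advantage (3) can be estimated with simple parameter arithmetic. The sketch below is illustrative only. It uses the two modifications stated in the detailed description (80 instead of 96 kernels in the first convolutional layer, and removal of the first fully connected layer) and assumes the layer sizes of the commonly published AlexNet: 11 × 11 first-layer kernels on a 3-channel input, and fully connected layers of 9216, 4096, 4096 and 1000 units. These sizes, the helper names, and the assumption that the remaining fully connected layer connects directly to the 9216 pooled features are not taken from the patent.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of one convolutional layer with square kernels."""
    return kernel * kernel * in_channels * out_channels + out_channels


def fc_stack_params(sizes):
    """Weights plus biases of a chain of fully connected layers.

    sizes lists the widths from the input features to the output classes.
    """
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Assumed standard AlexNet sizes (not stated in the patent).
conv1_original = conv_params(kernel=11, in_channels=3, out_channels=96)
conv1_modified = conv_params(kernel=11, in_channels=3, out_channels=80)

# Original: 9216 -> 4096 -> 4096 -> 1000 classes.
fc_original = fc_stack_params([9216, 4096, 4096, 1000])
# Assumed modified stack with one hidden fully connected layer removed.
fc_modified = fc_stack_params([9216, 4096, 1000])

print("conv1 parameters:", conv1_original, "->", conv1_modified)
print("fully connected parameters:", fc_original, "->", fc_modified)
```

Under these assumptions the first convolutional layer shrinks from 34,944 to 29,120 parameters, and the fully connected stack loses the 16,781,312 parameters of one 4096 × 4096 layer. The actual figures depend on how the remaining layers of the patented network are wired, which the text does not specify.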
Drawings
FIG. 1 shows the special paper with the standard peripheral outline;
FIG. 2 shows a handwritten Chinese sentence set to be recognized;
FIG. 3 shows the single-character pictures obtained by character cutting;
FIG. 4 shows a single-character picture with the standard peripheral outline removed;
FIG. 5 shows an enlarged and binarized single-character picture;
FIG. 6 shows the character recognition result.
Detailed Description
The invention will be further described in detail by means of specific embodiments with reference to the accompanying drawings.
A handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning mainly comprises character cutting, picture processing, and character recognition, by which picture information is converted into text information. The character cutting and picture processing parts are implemented by calling the OpenCV interface on the Visual Studio platform, and the character recognition module is implemented with an improved version of the AlexNet model under the Caffe open-source framework.
Character cutting: the Chinese sentence set to be recognized is written on the special paper with the standard peripheral outline shown in fig. 1 and then scanned, as shown in fig. 2. The scanned picture is cut by the minimum circumscribed rectangle method: the picture of the handwritten Chinese sentence set with the standard peripheral outline is cut into single-character pictures, each with its standard peripheral outline, which are stored in sequence, as shown in fig. 3.
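The cutting step can be illustrated with a small, dependency-free sketch. The patent performs this step through OpenCV; the code below is only a pure-Python stand-in that finds the upright bounding rectangle of each connected outline and crops the cells in reading order. It assumes that each character cell has its own closed outline that does not touch its neighbours and that cells in one row share the same top edge. The function names and the toy sheet are hypothetical.

```python
from collections import deque


def bounding_boxes(mask):
    """Return (top, left, bottom, right) of every 8-connected component.

    mask is a list of rows; truthy entries mark outline pixels.
    Boxes are sorted by top edge, then left edge (reading order when
    the cells of a row are aligned).
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny in range(max(y - 1, 0), min(y + 2, rows)):
                    for nx in range(max(x - 1, 0), min(x + 2, cols)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes)


def cut_cells(image, outline_mask):
    """Crop one sub-image per outline, in reading order."""
    return [[row[left:right + 1] for row in image[top:bottom + 1]]
            for top, left, bottom, right in bounding_boxes(outline_mask)]


def make_sheet():
    """Toy 7 x 13 sheet holding two 5 x 5 square outlines side by side."""
    sheet = [[0] * 13 for _ in range(7)]
    for left in (1, 7):
        for i in range(5):
            sheet[1][left + i] = 1      # top edge
            sheet[5][left + i] = 1      # bottom edge
            sheet[1 + i][left] = 1      # left edge
            sheet[1 + i][left + 4] = 1  # right edge
    return sheet


sheet = make_sheet()
cells = cut_cells(sheet, sheet)
print(bounding_boxes(sheet))
print(len(cells), "cells cut")
```

On a real scan, the outline mask would first have to be extracted from the colour image, and rows that are slightly skewed would need a tolerance when sorting; neither refinement is shown here.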
Picture processing: the standard peripheral outline of each single-character picture in fig. 3 is removed by adjusting the RGB color channel thresholds, as shown in fig. 4. In practice, most input handwritten Chinese character pictures are multi-channel images; for the sake of recognition accuracy, each single-character picture with the outline removed is enlarged, converted into a binary image by adjusting a color channel threshold, and then stored, as shown in fig. 5.
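A minimal sketch of the three picture-processing operations follows. It is not the patented OpenCV implementation. It assumes, purely for illustration, that the outline is printed in red and the handwriting is dark, that pixels are (R, G, B) tuples, that enlargement is nearest-neighbour by an integer factor, and that ink maps to 0 and background to 255. All threshold values and function names are hypothetical.

```python
WHITE = (255, 255, 255)


def remove_outline(image, min_red=150, max_other=100):
    """Replace pixels that match the assumed outline colour with white."""
    return [[WHITE if (r >= min_red and g <= max_other and b <= max_other)
             else (r, g, b)
             for (r, g, b) in row]
            for row in image]


def enlarge(image, factor):
    """Nearest-neighbour upscaling by an integer factor."""
    enlarged = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged


def binarize(image, threshold=128):
    """Map dark pixels (all channels below threshold) to 0, others to 255."""
    return [[0 if max(pixel) < threshold else 255 for pixel in row]
            for row in image]


RED = (220, 30, 30)    # assumed outline colour
BLACK = (20, 20, 20)   # assumed ink colour

# Toy 3 x 3 cell: a red outline around one ink pixel.
cell = [[RED, RED, RED],
        [RED, BLACK, RED],
        [RED, RED, RED]]

processed = binarize(enlarge(remove_outline(cell), 2))
for row in processed:
    print(row)
```

The order of operations mirrors the text: the outline is removed while colour information is still available, and binarization is applied last so that the stored picture is single-channel.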
Character recognition: the recognition module is called on each single-character picture in sequence to obtain the recognition result, as shown in fig. 6. The recognition module must be trained before it is called, and can be obtained in advance through the following steps:
Step one: parse the GNT-format data of the HWDB1.1 dataset, binarize it, divide the 240 × 1000 handwritten Chinese character samples into a training set and a test set, and convert both into LMDB-format datasets usable by Caffe.
Step two: perform the initial configuration of the convolutional neural network.
Step three: perform repeated supervised learning on the 1000-class handwritten Chinese character training set using an improved version of the deep convolutional neural network AlexNet, continuously adjusting the connection weights between the layers of the network according to the learning error. Each time learning has been repeated a set number of times, the test set is fed to the network and the test accuracy is obtained. The improvements to the AlexNet model are: the number of convolution kernels in convolutional layer 1 is changed from 96 to 80, with the corresponding pooling layer adjusted accordingly; and the first fully connected layer is removed.
Step four: when the learning error of the convolutional neural network is lower than a first preset value and the test accuracy is higher than a second preset value, stop training and save the current connection weights between the layers of the network to obtain the optimal recognition model.
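The control flow of steps three and four (test at a fixed interval, stop when both preset values are satisfied) can be sketched independently of any framework. The code below does not call Caffe. The callables `train_step` and `evaluate`, the simulated model, and all numeric values are hypothetical placeholders for one training iteration and one pass over the test set.

```python
def train_until_good(train_step, evaluate, test_interval,
                     max_loss, min_accuracy, max_iters):
    """Train, testing every test_interval iterations.

    Stops when the latest learning error is below max_loss (the first
    preset value) and the test accuracy is above min_accuracy (the second
    preset value). Returns (iteration, loss, accuracy), or None if the
    criteria are not met within max_iters iterations.
    """
    for iteration in range(1, max_iters + 1):
        loss = train_step()  # one supervised update; returns learning error
        if iteration % test_interval == 0:
            accuracy = evaluate()  # accuracy on the held-out test set
            if loss < max_loss and accuracy > min_accuracy:
                return iteration, loss, accuracy
    return None


def make_fake_model():
    """Simulated model whose error decays as 1/n (illustration only)."""
    state = {"steps": 0}

    def train_step():
        state["steps"] += 1
        return 1.0 / state["steps"]

    def evaluate():
        return 1.0 - 1.0 / state["steps"]

    return train_step, evaluate


train_step, evaluate = make_fake_model()
result = train_until_good(train_step, evaluate, test_interval=5,
                          max_loss=0.08, min_accuracy=0.9, max_iters=100)
print(result)
```

In an actual training run the weights would be saved at the point where the function returns, which corresponds to storing the connection weights of the optimal recognition model in step four.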
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various changes and modifications can be made without departing from the inventive concept, and these changes and modifications are all within the scope of the present invention.

Claims (3)

1. A handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning, comprising the following steps:
(1) preparing special paper with a standard peripheral outline, writing the Chinese sentence set to be recognized on the special paper, and scanning the paper;
(2) cutting the scanned picture by the minimum circumscribed rectangle method, whereby the picture of the handwritten Chinese sentence set with the standard peripheral outline is cut into single-character pictures with the standard peripheral outline, and storing the single-character pictures in sequence;
(3) removing the standard peripheral outline of each single-character picture by adjusting the RGB color channel thresholds;
(4) enlarging each single-character picture from which the standard peripheral outline has been removed, converting it into a binary picture by adjusting a color channel threshold, and then storing the picture;
(5) calling a character recognition module on each single-character picture in sequence to obtain a recognition result.
2. The handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning of claim 1, wherein: the recognition module requires model training before being called; using the handwritten Chinese character dataset HWDB1.1, 240 × 1000 handwritten Chinese character samples are divided into a test set and a training set, and a deep convolutional neural network AlexNet model is repeatedly trained and tested on the 1000 classes of handwritten Chinese characters to finally obtain an optimal recognition model and its weight parameters.
3. The handwritten Chinese sentence set recognition method based on a standard peripheral outline and deep learning of claim 2, wherein: the number of convolution kernels in the first convolutional layer of the AlexNet model is 80, and the first fully connected layer is removed.
CN201810634249.6A 2018-06-20 2018-06-20 Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning Active CN108985175B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810634249.6A CN108985175B (en) 2018-06-20 2018-06-20 Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning

Publications (2)

Publication Number Publication Date
CN108985175A (en) 2018-12-11
CN108985175B (en) 2021-06-04

Family

ID=64540752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810634249.6A Active CN108985175B (en) 2018-06-20 2018-06-20 Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning

Country Status (1)

Country Link
CN (1) CN108985175B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110399850B (en) * 2019-07-30 2021-10-15 西安工业大学 Continuous sign language recognition method based on deep neural network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750556A (en) * 2012-06-01 2012-10-24 山东大学 Off-line handwritten form Chinese character recognition method
CN103513898A (en) * 2012-06-21 2014-01-15 夏普株式会社 Handwritten character segmenting method and electronic equipment
CN104239879A (en) * 2014-09-29 2014-12-24 小米科技有限责任公司 Character segmentation method and device
CN104484643A (en) * 2014-10-27 2015-04-01 中国科学技术大学 Intelligent identification method and system for hand-written table
CN105574486A (en) * 2015-11-25 2016-05-11 成都数联铭品科技有限公司 Image table character segmenting method
CN105654087A (en) * 2015-12-30 2016-06-08 李宇 Color template-based offline handwritten character extraction method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101930545A (en) * 2009-06-24 2010-12-29 夏普株式会社 Handwriting recognition method and device

Also Published As

Publication number Publication date
CN108985175A (en) 2018-12-11

Similar Documents

Publication Publication Date Title
CN108932508B (en) Method and system for intelligently identifying and correcting subjects
CN110348400B (en) Score obtaining method and device and electronic equipment
US10339428B2 (en) Intelligent scoring method and system for text objective question
CN107506762B (en) Score automatic input method based on image analysis
CN101685482A (en) Electric marking system capable of automatically processing marking results and method thereof
CN105427696A (en) Method for distinguishing answer to target question
CN104463101A (en) Answer recognition method and system for textual test question
CN110414563A (en) Total marks of the examination statistical method, system and computer readable storage medium
CN111814616A (en) Automatic examination paper marking processing system without answer sheet and implementation method thereof
WO2022161293A1 (en) Image processing method and apparatus, and electronic device and storage medium
CN113159014B (en) Objective question reading method, device, equipment and storage medium based on handwritten question number
CN110956138A (en) Family education equipment-based auxiliary learning method and family education equipment
CN106815814B (en) Image pollution processing method applied to paper marking system
CN111008594B (en) Error-correction question review method, related device and readable storage medium
CN111611854B (en) Classroom condition evaluation method based on pattern recognition
CN110837793A (en) Intelligent recognition handwriting mathematical formula reading and amending system
CN110298236B (en) Automatic Braille image identification method and system based on deep learning
CN116303871A (en) Exercise book reading method
CN107220610A (en) A kind of subjective item fraction recognition methods applied to marking system
CN108985175B (en) Handwritten Chinese sentence set recognition method based on standard peripheral outline and deep learning
CN108681713A (en) A kind of system for teaching quality evaluation for teachers
CN111428623A (en) Chinese blackboard-writing style analysis system based on big data and computer vision
CN115294573A (en) Job correction method, device, equipment and medium
CN113903039A (en) Color-based answer area acquisition method for answer sheet
JP4710707B2 (en) Additional recording information processing method, additional recording information processing apparatus, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant