CN112507914A - OCR (optical character recognition) method and recognition system based on bankbook and bill characters - Google Patents


Info

Publication number
CN112507914A
CN112507914A (application CN202011482590.8A)
Authority
CN
China
Prior art keywords
image
passbook
bankbook
layer
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011482590.8A
Other languages
Chinese (zh)
Inventor
孔飞
张文强
褚建民
李卫国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Guoguang Electronic Information Technology Co Ltd
Original Assignee
Jiangsu Guoguang Electronic Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Guoguang Electronic Information Technology Co Ltd
Priority to CN202011482590.8A
Publication of CN112507914A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/34 Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/1475 Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478 Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Character Input (AREA)

Abstract

The invention discloses an OCR (optical character recognition) method and recognition system based on passbook and bill characters, belonging to the technical field of pattern recognition and computer vision. The method comprises the following steps: step 1, capture a passbook image at an arbitrary angle and preprocess it to obtain a new, angle-corrected passbook image; step 2, perform angle correction on the orientation of the passbook image and adjust it to the 0-degree state; step 3, locate the positions of the regions to be recognized in the corrected passbook image and construct the corresponding region labels; and step 4, recognize the regions with a variable-length OCR recognition model and output the recognition results. The invention extracts passbook information automatically through OCR recognition, reducing the time and labor costs of manually checking and entering information and greatly improving work efficiency. Performing OCR with a deep learning model increases recognition speed and accuracy and gives high robustness to different printed fonts.

Description

OCR (optical character recognition) method and recognition system based on bankbook and bill characters
Technical Field
The invention belongs to the technical field of pattern recognition and computer vision, and particularly relates to an OCR recognition method and a recognition system based on bankbook and bill characters.
Background
In recent years, computer vision technology has developed rapidly, and OCR (optical character recognition) of images and text has become a popular research direction. OCR in complex settings such as natural scenes and financial documents has been studied extensively and is already in mature use, and automatic recognition greatly improves work efficiency. The present system is designed to automate passbook information extraction, addressing problems such as the low efficiency and declining accuracy of manual information extraction.
Through long-term practice and research, the applicant has found that the prior art has at least the following problems: 1. Extracting passbook information manually is inefficient, and its accuracy declines as working time increases. 2. Existing OCR recognition systems have poor compatibility across application scenarios, impose strict requirements on the skew angle and quality of the text image to be recognized, and require the image to be captured manually in a fixed orientation.
Disclosure of Invention
The purpose of the invention is as follows: an OCR recognition method and recognition system based on passbook and bill characters are provided to solve the problems described in the Background section.
The technical scheme is as follows: an OCR recognition method based on passbook and bill characters comprises the following steps:
step 1, capturing a passbook image at an arbitrary angle and preprocessing it to obtain a new, angle-corrected passbook image;
step 2, performing angle correction on the orientation of the passbook image and adjusting it to the 0-degree state;
step 3, locating the positions of the regions to be recognized in the corrected passbook image and constructing the corresponding region labels;
and step 4, recognizing the regions with a variable-length OCR recognition model and outputting the recognition results.
Further, the preprocessing in step 1 comprises the following steps:
step 11, obtaining a passbook image captured at an arbitrary angle by a high-speed document camera;
step 12, sharpening and Gaussian-smoothing the passbook image to enhance the contrast between image edges and the surrounding background;
step 13, detecting image edges with a Sobel operator by convolving the image in the horizontal and vertical directions to approximate the gradient; computing the image contour and storing it as a point set; computing the convex hull of the contour point set and the hull's bounding rectangle; and thereby obtaining the four vertex coordinates of the contour;
step 14, correcting the angle-offset passbook image through perspective transformation: first determining the coordinates of the corrected new image, then constructing a perspective matrix, a 3 x 3 matrix, from the new coordinates and the coordinates of the original passbook image, thereby realizing the linear transformation, translation, and perspective transformation from the original image to the new image;
and step 15, obtaining the new angle-corrected image after the passbook image is transformed by the perspective matrix.
Further, the angle correction method in step 2 is as follows:
use the trained SVM classifier to detect the angle of the perspective-transformed image, and flip the image according to the detected angle to obtain a passbook image in the 0-degree state.
Further, the training method of the SVM classifier in step 2 comprises the following steps:
step 21, first collecting a preset number of text images of four classes, rotated by 0, 90, 180, and 270 degrees; classifying the four orientations with the SVM classifier and constructing images and their corresponding class labels, with labels 1, 2, 3, and 4 corresponding to 0, 90, 180, and 270 degrees, respectively;
step 22, extracting the histogram-of-oriented-gradients (HOG) features of the image, which reflect the gradient-change information of the image; text deflected at different angles has different gradient information;
step 23, reducing the dimension of the HOG features with principal component analysis (PCA), because the HOG feature dimension is high, which is unfavorable for classifier training;
and step 24, training the SVM classifier with the dimension-reduced HOG features as input features to obtain the trained SVM classifier.
Further, the principal component analysis (PCA) method comprises the following steps:
step 241, recording the s items of d-dimensional HOG feature data and combining them into a data matrix X with s rows and d columns;
step 242, computing the mean of each column of the matrix to form the 1-row, d-column mean vector x̄, and subtracting it from each row of X to obtain the zero-centered matrix, denoted X′;
step 243, computing the covariance matrix C = (1/s) · X′ᵀX′ of the new matrix X′ and calculating its eigenvalues and eigenvectors;
and step 244, arranging the eigenvectors into a matrix from top to bottom by decreasing eigenvalue, taking the first K rows to form a matrix P, and reducing the dimension to K to obtain the HOG feature matrix Y = PX′ᵀ.
Further, the positioning method is as follows: take the top-left vertex coordinate of the angle-corrected passbook image as a fixed point; because the positional offset between the fixed point and each region to be extracted in the passbook is fixed, each region's coordinates are located by adding the offset to the fixed-point coordinates, and the region is cropped out as the recognition region.
Further, the variable-length OCR recognition model is based on the DenseNet network structure, and the model input is the image to be recognized by OCR. The input is first batch-normalized by a BN layer and then fed into the first 3 x 3 convolutional layer, whose activation function is the ReLU function. The image features extracted by the convolutional layer are fed into dense blocks; the model has three dense blocks connected in the middle by transition layers. Each dense block comprises BN layers, ReLU activations, and 3 x 3 convolutional layers; the feature maps within a block have the same size, and each layer's input comes from the outputs of all preceding layers. A transition layer connects two dense blocks, reducing the feature-map size and compressing the model; it comprises a 1 x 1 convolutional layer and a 2 x 2 average-pooling layer. Finally, the features output by the third dense block are output through a BN layer and a fully connected layer.
Further, the training method of the variable-length OCR recognition model is as follows: first, a corpus is constructed according to the character types to be recognized; the corpus is used to generate a training data set and its label file, which contains the names of the training samples and the positions, in the corpus, of the Chinese characters appearing in each sample. The network structure is then trained with the training set: the model's weights are updated automatically from the forward-pass results of the training set in the model; after many iterations, when the recognition rate on the whole training set is high, training stops and the weights are saved to a model file.
Further, the recognition method of the variable-length OCR recognition model is as follows: during recognition, the network model and model file are loaded by a program; the softmax function computes and outputs the class with the highest probability, and the final recognition result is output after it is looked up in the label file.
The invention also provides a recognition system based on the passbook and bill character OCR recognition method, comprising an image preprocessing module, an orientation detection module, a positioning module, and an OCR recognition module.
The image preprocessing module captures the passbook image at an arbitrary angle and processes it to obtain a new, angle-corrected passbook image;
the orientation detection module performs angle correction on the orientation of the passbook image and adjusts it to the 0-degree state;
the positioning module locates the positions of the regions to be recognized in the corrected passbook image and constructs the corresponding region labels;
and the OCR recognition module recognizes the regions with a variable-length OCR recognition model and outputs the recognition results.
Beneficial effects: compared with the prior art, the OCR recognition method and recognition system based on passbook and bill characters have the following advantages:
1. The image preprocessing module sharpens and Gaussian-smooths the passbook image, enhancing the contrast between image edges and the surrounding background and improving the recognizable quality of the passbook image. Angle-offset passbook images are corrected through perspective transformation, so the skew angle of the text image is detected automatically, information can be extracted from passbook images captured at any angle, and images no longer need to be collected manually in a fixed orientation, which simplifies use.
2. HOG features are extracted from the image and reduced in dimension with principal component analysis (PCA), which facilitates SVM classifier training. In practical applications a bill may arrive rotated by 90 or 180 degrees when submitted for recognition, so this improves the speed and accuracy with which the classifier detects and corrects the text orientation.
3. The OCR recognition module automatically extracts passbook information, reducing the time and labor costs of manually checking and entering information and greatly improving work efficiency.
4. A deep learning model trained on self-labeled data performs the OCR. Adding self-labeled data from practical applications makes the model more accurate on bills and passbooks than conventional OCR recognition models while keeping a high recognition speed, and the added data diversity makes the trained model more robust to different printed forms of the same Chinese character.
5. The image-edge method adopted by the invention has a smoothing effect on noise, provides more accurate edge information, reduces the number of templates required, lowers computational complexity, and is more robust against noise.
In conclusion, using a deep learning model for OCR recognition reduces labor cost, simplifies use, increases recognition speed and accuracy, greatly improves work efficiency, and is robust to different printed fonts.
Drawings
FIG. 1 is a system flow diagram of the recognition system of the present invention.
FIG. 2 is a preprocessed passbook image of the present invention.
FIG. 3 is a passbook image of the present invention after contour extraction.
FIG. 4 is a passbook image of the present invention after perspective transformation.
FIG. 5 is a diagram of passbook information localization and cropping according to the present invention.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without one or more of these specific details. In other instances, well-known features have not been described in order to avoid obscuring the invention.
As shown in FIG. 1, an OCR recognition system based on passbook and bill characters comprises an image preprocessing module, an orientation detection module, a positioning module, and an OCR recognition module.
The image preprocessing module captures the passbook image at an arbitrary angle and processes it to obtain a new, angle-corrected passbook image; the orientation detection module performs angle correction on the orientation of the passbook image and adjusts it to the 0-degree state; the positioning module locates the positions of the regions to be recognized in the corrected passbook image and constructs the corresponding region labels; and the OCR recognition module recognizes the regions with a variable-length OCR recognition model and outputs the recognition results.
The recognition method is further described below based on this passbook and bill character OCR recognition system; the method specifically comprises the following steps:
Step 1: capture a passbook image at an arbitrary angle and preprocess it to obtain a new, angle-corrected passbook image.
The preprocessing in step 1 comprises the following steps:
Step 11: obtain a passbook image captured at an arbitrary angle by a high-speed document camera.
Step 12: sharpen and Gaussian-smooth the passbook image to enhance the contrast between image edges and the surrounding background. Specifically, let the grayscale image be F₀(x, y), where (x, y) are the pixel coordinates in the image. To compute the Gaussian-smoothed image, a weight matrix is determined first. Taking any point (x₀, y₀) as an example, form the 3 x 3 matrix Z₀ of the image coordinates in its neighborhood, with (x₀, y₀) at the matrix center, and evaluate the Gaussian function
G(x, y) = (1 / (2πσ²)) · exp(−((x − x₀)² + (y − y₀)²) / (2σ²)),
where σ is the standard deviation, at each coordinate. Evaluating G at every point of the matrix yields a new 3 x 3 matrix; normalizing it so that its entries sum to 1 gives the Gaussian weight matrix Z₁. Let D₀ denote the 3 x 3 matrix of pixel values at the corresponding coordinates. The Gaussian-smoothed pixel value is then
F₁(x₀, y₀) = ΣΣ (Z₁ · D₀),
i.e., the sum of the element-wise products of Z₁ and D₀. Computing this value for every point in the image in the same way yields the new image F₁(x, y).
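For illustration, step 12 can be sketched in Python with OpenCV; the kernel size, σ = 1, the unsharp-mask weights, and the file name are illustrative assumptions rather than values taken from the patent:

```python
import cv2
import numpy as np

img = cv2.imread("passbook.jpg")                       # assumed input file
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)           # F0(x, y)

# Unsharp-mask sharpening: subtract a blurred copy to boost edge contrast.
blur = cv2.GaussianBlur(gray, (3, 3), 1.0)
sharp = cv2.addWeighted(gray, 1.5, blur, -0.5, 0)

# Manual 3x3 Gaussian weight matrix Z1 (sigma = 1), normalized to sum to 1,
# mirroring the formula above; filter2D applies the weighted average.
xs, ys = np.meshgrid([-1, 0, 1], [-1, 0, 1])
Z1 = np.exp(-(xs**2 + ys**2) / 2.0) / (2 * np.pi)
Z1 /= Z1.sum()
smoothed = cv2.filter2D(sharp, -1, Z1)                 # F1(x, y)
```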
Step 13: detect image edges with the Sobel operator by convolving the image in the horizontal and vertical directions to approximate the gradient; compute the image contour and store it as a point set; compute the convex hull of the contour point set and the hull's bounding rectangle; and thereby obtain the four vertex coordinates of the contour.
The convolution operator in the x direction is
Dx = [[−1, 0, 1], [−2, 0, 2], [−1, 0, 1]],
and the operator in the y direction is
Dy = [[−1, −2, −1], [0, 0, 0], [1, 2, 1]].
Compute Gx = Dx ∗ F₁ in the x direction and Gy = Dy ∗ F₁ in the y direction on the Gaussian-smoothed image; the edge-detected image value is |F₂| = |Gx| + |Gy|. Given a preset threshold f_max, a point with |F₂| > f_max may be considered a boundary point; its gray value is set to 255 and all other points are set to 0. The final edge-detected image is denoted F₂.
To compute the contour of image F₂, scan the whole image from the top-left corner, left to right and top to bottom. A scanned non-zero point with a 0-valued pixel in its 8-neighborhood is judged to be a boundary point; denote the 8-neighborhood of such a point (i, j) by F(i, j).
The boundary is then traced as follows: with (i, j) as the center and (i₂, j₂) as the starting direction, search the 8-neighborhood of (i, j) clockwise for a non-zero pixel; if one is found, let (i₁, j₁) be the first non-zero pixel in the clockwise direction. Then, with (i₃, j₃) as the center and (i₂, j₂) as the starting direction, search the 8-neighborhood of (i₃, j₃) counterclockwise, and let (i₄, j₄) be the first non-zero pixel in the counterclockwise direction. If (i₄, j₄) = (i, j) and (i₃, j₃) = (i₁, j₁), the trace has returned to the point where the boundary started, so the contour is closed and scanning continues for the next contour; otherwise set (i₂, j₂) ← (i₃, j₃) and (i₃, j₃) ← (i₄, j₄) and continue the operation. In this way all contour information is finally obtained and stored in point sets.
To compute the bounding rectangle of the contour point set, first take a point with the smallest abscissa in the contour as the starting point p₀. Connect p₀ to the other contour points to form line segments, compute the angle between each segment and the downward vertical direction x = 0, and sort the points by this angle from small to large, denoting them p₁, p₂, p₃, and so on. Processing the points in this order so that all contour points are enclosed while turning in a single direction, the polygon finally formed is the convex hull of the contour. Enumerate the edges of the polygon, construct a bounding rectangle on each edge, compare the rectangle areas, and select the smallest as the bounding rectangle of the contour. The four vertex coordinates of this rectangle, (m₀, n₀), (m₁, n₁), (m₂, n₂), (m₃, n₃), are the four vertex coordinates of the contour.
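Continuing the sketch, step 13 maps naturally onto standard OpenCV calls; the threshold f_max = 120 and the assumption that the passbook outline is the largest contour are illustrative:

```python
import cv2
import numpy as np

gx = cv2.Sobel(smoothed, cv2.CV_64F, 1, 0, ksize=3)    # Gx = Dx * F1
gy = cv2.Sobel(smoothed, cv2.CV_64F, 0, 1, ksize=3)    # Gy = Dy * F1
mag = np.abs(gx) + np.abs(gy)                          # |F2| = |Gx| + |Gy|
f_max = 120                                            # assumed threshold
edges = np.where(mag > f_max, 255, 0).astype(np.uint8) # binary edge image F2

contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
outline = max(contours, key=cv2.contourArea)           # assume largest = passbook
hull = cv2.convexHull(outline)                         # convex hull of contour
rect = cv2.minAreaRect(hull)                           # minimum-area rectangle
corners = cv2.boxPoints(rect)                          # four vertex coordinates
```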
Step 14: correct the angle-offset passbook image through perspective transformation. First determine the coordinates of the corrected new image, then construct a perspective matrix from the new coordinates and the coordinates of the original passbook image. The perspective matrix is a 3 x 3 matrix, denoted A:
A = [[a₁₁, a₁₂, a₁₃], [a₂₁, a₂₂, a₂₃], [a₃₁, a₃₂, a₃₃]],
where the [[a₁₁, a₁₂], [a₂₁, a₂₂]] block performs the linear transformation of the image, the [a₁₃, a₂₃] part performs the perspective transformation, and the [a₃₁, a₃₂] part performs the image translation. Since the computation is in the two-dimensional plane, a₃₃ defaults to 1. Substituting the known coordinate points into the perspective-transformation formula yields the values of the elements of matrix A, finally realizing the perspective correction from the original image to the new image.
The perspective transformation is computed as
[m, n, 1] · A = [M′, N′, u′],
where (m, n) is a coordinate point before transformation, (M, N) = (M′/u′, N′/u′) is the corresponding coordinate point after transformation, and u = 1.
After the values of the perspective matrix are computed from the original vertex coordinates, the transformed coordinates are finally computed by the formulas
M = (a₁₁m + a₂₁n + a₃₁) / (a₁₃m + a₂₃n + a₃₃)
and
N = (a₁₂m + a₂₂n + a₃₂) / (a₁₃m + a₂₃n + a₃₃),
yielding the rotated new image F₃.
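Steps 14 and 15 correspond to OpenCV's perspective-transform utilities, sketched below; the output size (w, h) is an illustrative assumption, and in practice the detected corners must be ordered consistently with the destination points:

```python
import cv2
import numpy as np

w, h = 1000, 600                                       # assumed output size
src = np.float32(corners)                              # (m0,n0) .. (m3,n3)
dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])     # corrected coordinates
A = cv2.getPerspectiveTransform(src, dst)              # 3x3 perspective matrix A
corrected = cv2.warpPerspective(img, A, (w, h))        # rotated new image F3
```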
Step 2: perform angle correction on the orientation of the passbook image and adjust it to the 0-degree state.
The specific orientation detection method is as follows:
Step 21: first collect a preset number of text images of four classes, rotated by 0, 90, 180, and 270 degrees. The SVM classifier classifies and identifies the four orientations; images and their corresponding class labels are constructed, with 0, 90, 180, and 270 degrees corresponding to labels 1, 2, 3, and 4, respectively.
Step 22: extract the histogram-of-oriented-gradients (HOG) features of the image; HOG features reflect the gradient-change information of the image, and text deflected at different angles has different gradient information.
Step 23: because the HOG feature dimension is high, which is unfavorable for classifier training, reduce the dimension with principal component analysis (PCA). Record the s items of d-dimensional HOG feature data and combine them into a data matrix X with s rows and d columns. Compute the mean of each column of the matrix to form the 1-row, d-column mean vector x̄, and subtract it from each row of X to obtain the zero-centered matrix, denoted X′. Compute the covariance matrix C = (1/s) · X′ᵀX′ of X′ and calculate its eigenvalues and eigenvectors. Arrange the eigenvectors into a matrix from top to bottom by decreasing eigenvalue, take the first K rows to form the matrix P, and reduce the HOG features to K dimensions: Y = PX′ᵀ.
Step 24: train the SVM classifier with the dimension-reduced HOG features as input features to obtain the trained SVM classifier.
Step 25: use the trained SVM classifier to detect the angle of the perspective-transformed image, and flip the image according to the detected angle to obtain a passbook image in the 0-degree state.
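A hedged sketch of this orientation classifier, using scikit-image HOG, scikit-learn PCA, and an SVM; the fixed input size, K = 64, the RBF kernel, the training data names, and the label-to-rotation mapping are illustrative assumptions:

```python
import cv2
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def hog_features(images):
    feats = []
    for im in images:
        g = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY) if im.ndim == 3 else im
        g = cv2.resize(g, (128, 128))      # fixed size -> fixed HOG length
        feats.append(hog(g, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)))
    return np.array(feats)                  # s x d HOG matrix X

X = hog_features(train_images)              # train_images/labels assumed given
pca = PCA(n_components=64)                  # reduce d-dim HOG to K dimensions
Y = pca.fit_transform(X)                    # zero-centers X internally
clf = SVC(kernel="rbf").fit(Y, train_labels)

# Detect the orientation of the perspective-corrected image and flip it back.
label = clf.predict(pca.transform(hog_features([corrected])))[0]
rotations = {2: cv2.ROTATE_90_COUNTERCLOCKWISE, 3: cv2.ROTATE_180,
             4: cv2.ROTATE_90_CLOCKWISE}    # assumed label -> fix-up mapping
if label in rotations:
    corrected = cv2.rotate(corrected, rotations[label])
```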
Step 3: locate the positions of the regions to be recognized in the corrected passbook image and construct the corresponding region labels. Specifically, take the top-left vertex coordinate of the angle-corrected passbook image as a fixed point; because the positional offset between this fixed point and each information region to be extracted in the passbook is fixed, each region's coordinates are located by adding the offset to the fixed-point coordinates, and the region is cropped out as the recognition region.
Specifically, denote the top-left vertex coordinate as (x, y) and any vertex of the rectangular region to be recognized as (x′, y′); the offsets between the two points are Vx = x′ − x and Vy = y′ − y. Once the document to be recognized (passbook, bill, etc.) is determined, the values of Vx and Vy are determined, so every region to be recognized can be computed from the vertex coordinate together with Vx and Vy, and cropped according to those coordinates.
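A minimal sketch of the fixed-offset localization; the region names, the offsets Vx and Vy, and the region sizes are illustrative values that would be measured once per passbook layout:

```python
# Regions relative to the top-left fixed point (x, y) of the corrected image.
# Each entry is (Vx, Vy, width, height); all values here are assumed.
REGIONS = {
    "account_no": (120, 80, 420, 40),
    "name":       (120, 140, 200, 40),
}
x, y = 0, 0                                   # fixed point of corrected image
rois = {label: corrected[y + vy:y + vy + hh, x + vx:x + vx + ww]
        for label, (vx, vy, ww, hh) in REGIONS.items()}
```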
Step 4: recognize the regions to be recognized using the variable-length OCR recognition model and output the recognition results.
Specifically, the regions cropped by the positioning module contain the text characters to be recognized, and since the number of characters varies in practice, a variable-length OCR recognition model is used. First the recognition model structure is constructed; in view of practical OCR recognition needs, the model improves on the DenseNet network structure. The model input is the image to be recognized by OCR. The input is first batch-normalized by a BN layer and then fed into the first 3 x 3 convolutional layer, whose activation function is the ReLU function. The image features extracted by the convolutional layer are fed into dense blocks; the model has three dense blocks connected in the middle by transition layers. Each dense block comprises BN layers, ReLU activations, and 3 x 3 convolutional layers; the feature maps within a block have the same size, and each layer's input comes from the outputs of all preceding layers. A transition layer connects two dense blocks, reducing the feature-map size and compressing the model; it comprises a 1 x 1 convolutional layer and a 2 x 2 average-pooling layer. Finally, the features output by the third dense block are output through a BN layer and a fully connected layer. The fully connected output is then passed through the softmax function, computed as
softmax(xᵢ) = exp(xᵢ) / Σⱼ exp(xⱼ), where the sum runs over j = 1, …, M,
in which xᵢ is the feature-vector value of class i output by the fully connected layer and M is the number of classes. Each output is mapped into (0, 1) and all mapped values sum to 1; the largest output value after this computation gives the class judged by the model, i.e., the specific Chinese character.
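A compact PyTorch sketch of a DenseNet-style recognizer of this shape (BN, then a 3 x 3 convolution with ReLU, three dense blocks joined by 1 x 1 convolution plus 2 x 2 average-pooling transitions, then BN and a fully connected layer); the growth rate, block depth, and channel widths are illustrative assumptions, while the 5,990-class output follows the text:

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Dense block: each layer (BN -> ReLU -> 3x3 conv) receives the
    concatenated outputs of all preceding layers, as described above."""
    def __init__(self, in_ch, growth=8, n_layers=8):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(),
                nn.Conv2d(ch, growth, kernel_size=3, padding=1)))
            ch += growth
        self.out_ch = ch

    def forward(self, x):
        for layer in self.layers:
            x = torch.cat([x, layer(x)], dim=1)
        return x

def transition(in_ch, out_ch):
    # Transition layer: 1x1 conv compresses channels, 2x2 average pooling
    # halves the feature-map size.
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, kernel_size=1),
                         nn.AvgPool2d(2))

class OCRNet(nn.Module):
    def __init__(self, n_classes=5990):
        super().__init__()
        self.stem = nn.Sequential(            # BN, then first 3x3 conv + ReLU
            nn.BatchNorm2d(1),
            nn.Conv2d(1, 64, kernel_size=3, padding=1), nn.ReLU())
        self.block1 = DenseBlock(64)
        self.trans1 = transition(self.block1.out_ch, 128)
        self.block2 = DenseBlock(128)
        self.trans2 = transition(self.block2.out_ch, 128)
        self.block3 = DenseBlock(128)
        self.head = nn.Sequential(            # BN, then fully connected layer
            nn.BatchNorm2d(self.block3.out_ch),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(self.block3.out_ch, n_classes))

    def forward(self, x):                     # x: N x 1 x H x W text image
        x = self.trans1(self.block1(self.stem(x)))
        x = self.trans2(self.block2(x))
        return self.head(self.block3(x))      # logits; softmax at inference
```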
After the model is constructed, it is trained. First a corpus is built according to the character types to be recognized; the corpus contains 5,990 classes of common Chinese characters, digits, letters, and symbols. The corpus is used to generate a training data set and its label file, which contains the names of the training samples and the positions, in the corpus, of the Chinese characters appearing in each sample. One million training samples and one thousand test samples are generated for model training, and the training accuracy is about 95%. During training, the model's weights are updated automatically from the forward-pass results of the training set in the model; after many iterations, when the recognition rate on the whole training set is high, training stops and the weights are saved to a model file, which represents the model's optimal weights in binary form. During recognition, the network model and model file are loaded by a program; the softmax function computes and outputs the class with the highest probability, and the final recognition result is output after it is looked up in the label file. The method has high recognition accuracy and strong robustness in complex environments.
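Recognition then reduces to a forward pass plus a label-file lookup, sketched below; the weight-file name and the one-character-per-line label-file format are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

model = OCRNet()
model.load_state_dict(torch.load("ocr_weights.pt"))   # assumed weight file
model.eval()

with open("labels.txt", encoding="utf-8") as f:        # assumed label file:
    charset = [line.strip() for line in f]             # one character per line

with torch.no_grad():
    # char_img: a 1 x 1 x H x W tensor holding one cropped character region
    probs = F.softmax(model(char_img), dim=1)          # softmax formula above
    result = charset[int(probs.argmax(dim=1))]         # highest-probability class
```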
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. Further details are omitted in order to avoid unnecessary repetition.

Claims (10)

1. An OCR recognition method based on passbook and bill characters, characterized by comprising the following steps:
step 1, capturing a passbook image at an arbitrary angle and preprocessing it to obtain a new, angle-corrected passbook image;
step 2, performing angle correction on the orientation of the passbook image and adjusting it to the 0-degree state;
step 3, locating the positions of the regions to be recognized in the corrected passbook image and constructing the corresponding region labels;
and step 4, recognizing the regions with a variable-length OCR recognition model and outputting the recognition results.
2. The OCR recognition method based on passbook and bill characters according to claim 1, wherein the preprocessing in step 1 comprises the following steps:
step 11, obtaining a passbook image captured at an arbitrary angle by a high-speed document camera;
step 12, sharpening and Gaussian-smoothing the passbook image to enhance the contrast between image edges and the surrounding background;
step 13, detecting image edges with an operator by convolving the image in the horizontal and vertical directions to approximate the gradient; computing the image contour and storing it as a point set; computing the convex hull of the contour point set and the hull's bounding rectangle; and thereby obtaining the four vertex coordinates of the contour;
step 14, correcting the angle-offset passbook image through perspective transformation: first determining the coordinates of the corrected new image, then constructing a perspective matrix, a 3 x 3 matrix, from the new coordinates and the coordinates of the original passbook image, thereby realizing the linear transformation, translation, and perspective transformation from the original image to the new image;
and step 15, obtaining the new angle-corrected image after the passbook image is transformed by the perspective matrix.
3. The OCR recognition method based on passbook and bill characters according to claim 1, wherein the angle correction method in step 2 is:
using the trained SVM classifier to detect the angle of the perspective-transformed image, and flipping the image according to the detected angle to obtain a passbook image in the 0-degree state.
4. The OCR recognition method based on passbook and bill characters according to claim 3, wherein the training method of the SVM classifier in step 2 comprises the following steps:
step 21, first collecting a preset number of text images of four classes, rotated by 0, 90, 180, and 270 degrees; classifying the four orientations with the SVM classifier and constructing images and their corresponding class labels, with labels 1, 2, 3, and 4 corresponding to 0, 90, 180, and 270 degrees, respectively;
step 22, extracting the histogram-of-oriented-gradients (HOG) features of the image, which reflect the gradient-change information of the image, the gradient information of text deflected at different angles being different;
step 23, reducing the dimension of the HOG features with principal component analysis (PCA);
and step 24, training the SVM classifier with the dimension-reduced HOG features as input features to obtain the trained SVM classifier.
5. The OCR recognition method based on passbook and bill characters according to claim 4, wherein the principal component analysis (PCA) method comprises the following steps:
step 241, recording the s items of d-dimensional HOG feature data and combining them into a data matrix X with s rows and d columns;
step 242, computing the mean of each column of the matrix to form the 1-row, d-column mean vector x̄, and subtracting it from each row of X to obtain the zero-centered matrix, denoted X′;
step 243, computing the covariance matrix C = (1/s) · X′ᵀX′ of the new matrix X′ and calculating its eigenvalues and eigenvectors;
and step 244, arranging the eigenvectors into a matrix from top to bottom by decreasing eigenvalue, taking the first K rows to form a matrix P, and reducing the dimension to K to obtain the HOG feature matrix Y = PX′ᵀ.
6. The OCR recognition method based on passbook and bill characters according to claim 1, wherein the positioning method is: taking the top-left vertex coordinate of the angle-corrected passbook image as a fixed point; the positional offset between the fixed point and each region to be extracted in the passbook being fixed, locating each region's coordinates by adding the offset to the fixed-point coordinates, and cropping the region out as the recognition region.
7. The OCR recognition method based on passbook and bill characters according to claim 1, wherein the variable-length OCR recognition model is based on the DenseNet network structure, and the model input is the image to be recognized by OCR; the input is first batch-normalized by a BN layer and then fed into the first 3 x 3 convolutional layer, whose activation function is the ReLU function; the image features extracted by the convolutional layer are fed into dense blocks, the model having three dense blocks connected in the middle by transition layers; each dense block comprises BN layers, ReLU activations, and 3 x 3 convolutional layers, the feature maps within a block having the same size and each layer's input coming from the outputs of all preceding layers; a transition layer connects two dense blocks, reducing the feature-map size and compressing the model, and comprises a 1 x 1 convolutional layer and a 2 x 2 average-pooling layer; finally, the features output by the third dense block are output through a BN layer and a fully connected layer.
8. The OCR recognition method based on passbook and bill characters according to claim 7, wherein the training method of the variable-length OCR recognition model comprises the following steps:
first, a corpus is constructed according to the character types to be recognized; the corpus is used to generate a training data set and its label file, which contains the names of the training samples and the positions, in the corpus, of the Chinese characters appearing in each sample; the network structure is then trained with the training set, the model's weights being updated automatically from the forward-pass results of the training set in the model; after many iterations, when the recognition rate on the whole training set is high, training stops and the weights are saved to a model file.
9. The OCR recognition method based on passbook and bill characters according to claim 8, wherein the recognition method of the variable-length OCR recognition model is: during recognition, the network model and model file are loaded by a program; the softmax function computes and outputs the class with the highest probability, and the final recognition result is output after it is looked up in the label file.
10. A recognition system based on the passbook and bill character OCR recognition method according to any one of claims 1 to 9, comprising:
an image preprocessing module, which captures the passbook image at an arbitrary angle and processes it to obtain a new, angle-corrected passbook image;
an orientation detection module, which performs angle correction on the orientation of the passbook image and adjusts it to the 0-degree state;
a positioning module, which locates the positions of the regions to be recognized in the corrected passbook image and constructs the corresponding region labels;
and an OCR recognition module, which recognizes the regions with a variable-length OCR recognition model and outputs the recognition results.
CN202011482590.8A 2020-12-15 2020-12-15 OCR (optical character recognition) method and recognition system based on bankbook and bill characters Pending CN112507914A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011482590.8A CN112507914A (en) 2020-12-15 2020-12-15 OCR (optical character recognition) method and recognition system based on bankbook and bill characters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011482590.8A CN112507914A (en) 2020-12-15 2020-12-15 OCR (optical character recognition) method and recognition system based on bankbook and bill characters

Publications (1)

Publication Number Publication Date
CN112507914A 2021-03-16

Family

ID=74972248

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011482590.8A Pending CN112507914A (en) 2020-12-15 2020-12-15 OCR (optical character recognition) method and recognition system based on bankbook and bill characters

Country Status (1)

Country Link
CN (1) CN112507914A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446621A (en) * 2018-03-14 2018-08-24 平安科技(深圳)有限公司 Bank slip recognition method, server and computer readable storage medium
CN109242787A (en) * 2018-08-15 2019-01-18 南京光辉互动网络科技股份有限公司 It paints in a kind of assessment of middle and primary schools' art input method
CN110889402A (en) * 2019-11-04 2020-03-17 广州丰石科技有限公司 Business license content identification method and system based on deep learning
CN111428748A (en) * 2020-02-20 2020-07-17 重庆大学 Infrared image insulator recognition and detection method based on HOG characteristics and SVM

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516597A (en) * 2021-05-19 2021-10-19 中国工商银行股份有限公司 Image correction method and device and server
CN113516597B (en) * 2021-05-19 2024-05-28 中国工商银行股份有限公司 Image correction method, device and server
CN114792422A (en) * 2022-05-16 2022-07-26 合肥优尔电子科技有限公司 Optical character recognition method based on enhanced perspective
CN114792422B (en) * 2022-05-16 2023-12-12 合肥优尔电子科技有限公司 Optical character recognition method based on enhanced perspective

Similar Documents

Publication Publication Date Title
US10055660B1 (en) Arabic handwriting recognition utilizing bag of features representation
EP2383678A1 (en) Handwritten character recognition method and system
Palacios et al. A system for processing handwritten bank checks automatically
CN110458158B (en) Text detection and identification method for assisting reading of blind people
US8224072B2 (en) Method for normalizing displaceable features of objects in images
Ahmed et al. A novel dataset for English-Arabic scene text recognition (EASTR)-42K and its evaluation using invariant feature extraction on detected extremal regions
CN112507914A (en) OCR (optical character recognition) method and recognition system based on bankbook and bill characters
CN114359553B (en) Signature positioning method and system based on Internet of things and storage medium
CN113011426A (en) Method and device for identifying certificate
Singh et al. Handwritten words recognition for legal amounts of bank cheques in English script
CN108921006B (en) Method for establishing handwritten signature image authenticity identification model and authenticity identification method
US8340428B2 (en) Unsupervised writer style adaptation for handwritten word spotting
CN114005127A (en) Image optical character recognition method based on deep learning, storage device and server
CN112364863B (en) Character positioning method and system for license document
Verma et al. A novel approach for structural feature extraction: contour vs. direction
Aravinda et al. Template matching method for Kannada handwritten recognition based on correlation analysis
CN110766001B (en) Bank card number positioning and end-to-end identification method based on CNN and RNN
CN111213157A (en) Express information input method and system based on intelligent terminal
Patel et al. An impact of grid based approach in offline handwritten Kannada word recognition
CN113537216B (en) Dot matrix font text line inclination correction method and device
CN111612045B (en) Universal method for acquiring target detection data set
Kulkarni Handwritten character recognition using HOG, COM by OpenCV & Python
CN108509865B (en) Industrial injury information input method and device
McNeill et al. Coin recognition using vector quantization and histogram modeling
Choksi et al. Hindi optical character recognition for printed documents using fuzzy k-nearest neighbor algorithm: a problem approach in character segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination