CN111027443B - Bill text detection method based on multitask deep learning - Google Patents

Bill text detection method based on multitask deep learning

Info

Publication number
CN111027443B
Authority
CN
China
Prior art keywords
text
bill
center line
text information
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911225976.8A
Other languages
Chinese (zh)
Other versions
CN111027443A (en)
Inventor
刘桂雄
刘思洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201911225976.8A
Publication of CN111027443A
Application granted
Publication of CN111027443B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Character Input (AREA)

Abstract

The invention provides a bill text detection method based on multitask deep learning, comprising the following steps: constructing a multilayer convolutional neural network as the image feature extraction backbone network to extract features from the bill image; annotating the bill text regions and region center lines on the convolutional feature map and training on them to achieve bill text information region segmentation and text center line detection; advancing along the text center line within each bill text information region with a sliding window to segment the region into single characters; and classifying and recognizing the segmented single characters in sequence to form the complete bill text information. By exploiting the strong feature extraction and induction capabilities of deep learning, the invention provides an end-to-end multitask learning method that performs bill text region segmentation, text character segmentation and text character recognition, and addresses the limited applicability and low efficiency of classical bill text information detection methods.

Description

Bill text detection method based on multitask deep learning
Technical Field
The invention relates to the field of bill anti-counterfeiting identification, in particular to a bill text detection method based on multi-task deep learning.
Background
Visual detection and recognition technology is widely used because it is accurate, contactless and broadly applicable. Text information in bill images is characterized by varied text information regions and by a mixture of Chinese characters, digits and English characters. At present this information is read manually; the work is tedious and highly repetitive, and fatigue and loss of concentration easily lead to reading errors. Methods that let machines acquire the text information in bill images are therefore a research focus in this field.
In recent years, rapid development of the electronic hardware and information industries has greatly increased computing power, making large-scale image computation and inference feasible. Deep-learning-based image detection methods have been applied to image text information acquisition with remarkable results. Such methods use multilayer convolution operations to extract image features layer by layer, then operate on, process and summarize those features, and combine multiple tasks such as text information region localization, text character segmentation, and text character classification and recognition into an efficient and general text reading method. Manual inspection and traditional image classification methods have shortcomings in bill text information detection, whereas deep-learning-based bill text information acquisition offers strong generality and high detection efficiency and helps advance the digitization and intelligent development of the financial industry.
Disclosure of Invention
To solve the above problems and shortcomings, the invention provides a bill text detection method based on multitask deep learning. It divides bill text detection into three tasks (bill text region segmentation, bill text character segmentation, and bill text character classification and recognition), integrates the three tasks into one deep learning framework, and uses supervised learning to acquire bill text information, thereby addressing problems such as the reliance on manual labor in conventional bill text information acquisition.
The purpose of the invention is realized by the following technical scheme:
A bill text detection method based on multitask deep learning comprises the following steps:
A. constructing a multilayer convolutional neural network as the image feature extraction backbone network to extract features from the bill image;
B. annotating the bill text regions and region center lines on the convolutional feature map and training on them to achieve bill text information region segmentation and text center line detection;
C. advancing along the text center line within the bill text information region with a sliding window to segment the region into single characters;
D. classifying and recognizing the segmented single characters in sequence to form the complete bill text information.
The invention has the beneficial effects that:
By exploiting the strengths of deep learning in feature extraction, induction and inference, bill text detection is divided into three tasks (bill text region segmentation, bill text character segmentation, and bill text character classification and recognition), and a deep neural network is trained on a large amount of labeled data, achieving efficient and accurate detection and recognition of bill text information.
Drawings
FIG. 1 is a flow chart of a method for detecting a bill text based on multitask deep learning according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples and the accompanying drawings.
The invention discloses a bill text detection method based on multitask deep learning, which comprises the following steps:
Step 10, constructing a multilayer convolutional neural network as the image feature extraction backbone network to extract features from the bill image:
dilated (hole) convolution is introduced into the convolution layers of the feature extraction backbone network; that is, bilinear interpolation is applied to the feature map output by the previous convolution layer to enlarge the resolution of the convolutional feature map, and the convolution of the current layer is then performed, which enlarges the convolutional receptive field while the convolution kernel parameters remain unchanged, so that richer bill image features are obtained;
while the multilayer convolutional neural network extracts the bill image features, the output feature vectors of the lower convolution layers and the output feature vectors of the higher convolution layers are concatenated to form the final output feature map, so as to retain the edge and texture features from the lower layers and the semantic features from the higher layers.
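The two ideas in step 10 can be illustrated with a small sketch in plain Python. It uses the common kernel-dilation formulation of dilated convolution (holes inserted between kernel taps), which is an assumption: the patent text describes the operation through interpolation of the feature map and gives no implementation. The function names, the 1-D setting and the nearest-neighbor upsampling applied before concatenation are hypothetical choices made only for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid 1-D cross-correlation with `dilation - 1` holes between kernel taps.

    Returns the output values and the receptive field (span) of one output.
    The number of kernel parameters does not depend on the dilation.
    """
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, w in enumerate(kernel):
            acc += w * signal[start + k * dilation]
        out.append(acc)
    return out, span


def upsample_nearest(positions, factor):
    """Repeat each position `factor` times (assumed upsampling scheme)."""
    return [p for p in positions for _ in range(factor)]


def concat_features(low_level, high_level):
    """Concatenate channel vectors of a fine low-level map and a coarse high-level map.

    Each map is a list of positions; each position is a list of channel values.
    len(low_level) is assumed to be a multiple of len(high_level).
    """
    factor = len(low_level) // len(high_level)
    high_up = upsample_nearest(high_level, factor)
    return [lo + hi for lo, hi in zip(low_level, high_up)]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # three parameters in both cases
_, span_dense = dilated_conv1d(signal, kernel, dilation=1)    # span_dense == 3
_, span_dilated = dilated_conv1d(signal, kernel, dilation=2)  # span_dilated == 5
```

With the same three kernel weights, the dilated variant covers five input positions instead of three, which is the sense in which the receptive field grows while the kernel parameters stay unchanged.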
Step 20, annotating the bill text regions and region center lines on the convolutional feature map and training on them to achieve bill text information region segmentation and text center line detection:
the parameters for bill text information region segmentation and text center line detection comprise the center line pixel coordinates (x_i, y_i), the offset from the center line pixel to the upper boundary of the text region (symbol given in formula image BDA0002302203850000031, not reproduced in this text), and the offset from the center line pixel to the lower boundary of the text region (symbol given in formula image BDA0002302203850000032, not reproduced in this text).
The network is trained with these parameters as output targets to obtain the bill text information region segmentation and text region center line detection results.
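To make the role of these output parameters concrete, the following sketch rebuilds a text region outline from center line pixels and their two boundary offsets. It rests on assumptions not stated in the patent: image coordinates with y increasing downward, offsets measured vertically and given as non-negative numbers, and the hypothetical function names `region_boundaries` and `region_polygon`.

```python
def region_boundaries(center_line, upper_offsets, lower_offsets):
    """Return the top and bottom boundary polylines of a text region.

    center_line: list of (x_i, y_i) center line pixels, ordered along the text.
    upper_offsets / lower_offsets: per-pixel distances to the region boundaries
    (assumed vertical, with y growing downward).
    """
    top = [(x, y - u) for (x, y), u in zip(center_line, upper_offsets)]
    bottom = [(x, y + d) for (x, y), d in zip(center_line, lower_offsets)]
    return top, bottom


def region_polygon(center_line, upper_offsets, lower_offsets):
    """Closed outline: top boundary left to right, then bottom right to left."""
    top, bottom = region_boundaries(center_line, upper_offsets, lower_offsets)
    return top + bottom[::-1]


center = [(0, 10), (5, 10), (10, 11)]
polygon = region_polygon(center, upper_offsets=[4, 4, 5], lower_offsets=[3, 3, 3])
```

Predicting a center line plus two offsets, rather than a box, lets the region follow slightly curved or skewed text lines.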
Step 30, advancing along the text center line within the bill text information region with a sliding window to segment the region into single characters:
within the bill text information region, a sliding window advances along the text center line, and for each pixel (x_i, y_i) on the center line the distances from the four vertices of each character (upper left, upper right, lower left and lower right) to the center line pixel are predicted (symbols given in formula image BDA0002302203850000033, not reproduced in this text). The true distances from each character's vertices to the center line are given in formula image BDA0002302203850000034 (not reproduced in this text).
A loss function is constructed (formula image BDA0002302203850000035, not reproduced in this text),
where α_lt, α_rt, α_ld and α_rd are correction terms for the individual distance losses, used to control the weight of each distance loss.
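Because the loss formula itself survives only as an image reference, the sketch below shows one plausible shape for it and should not be read as the patent's formula. It assumes a weighted sum of squared errors between predicted and true vertex distances, averaged over center line pixels; the squared-error form, the averaging, the dictionary layout and the name `vertex_distance_loss` are all assumptions. Only the four weights α_lt, α_rt, α_ld, α_rd come from the text.

```python
VERTICES = ("lt", "rt", "ld", "rd")  # upper left, upper right, lower left, lower right


def vertex_distance_loss(predicted, target, weights):
    """Weighted squared-error loss over the four vertex distances (assumed form).

    predicted / target: one dict per center line pixel, mapping a vertex key
    to its distance from that pixel.
    weights: dict with the correction terms alpha for each vertex key.
    """
    if not predicted or len(predicted) != len(target):
        raise ValueError("predicted and target must be non-empty and equally long")
    total = 0.0
    for pred, true in zip(predicted, target):
        for key in VERTICES:
            total += weights[key] * (pred[key] - true[key]) ** 2
    return total / len(predicted)


pred = [{"lt": 5, "rt": 5, "ld": 4, "rd": 4}]
true = [{"lt": 4, "rt": 5, "ld": 4, "rd": 6}]
equal = {"lt": 1.0, "rt": 1.0, "ld": 1.0, "rd": 1.0}
loss = vertex_distance_loss(pred, true, equal)  # 1*1 + 1*4 = 5.0
```

Lowering one weight, for example `rd` to 0.5, reduces the contribution of that vertex error, which is the balancing role the text assigns to the correction terms.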
Step 40, classifying and recognizing the segmented single characters in sequence to form the complete bill text information:
a Softmax multi-class classifier is pre-trained on a character image data set and then applied in sequence to the single characters obtained from the segmentation in step 30, forming the complete bill text information.
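The final assembly in step 40 can be sketched as follows. The sketch assumes that a trained classifier has already produced one score vector (logits) per segmented character; the alphabet, the logits and the function names are hypothetical, and training is not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def recognize_text(char_logits, alphabet):
    """Classify each segmented character in reading order and join the results."""
    chars = []
    for logits in char_logits:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        chars.append(alphabet[best])
    return "".join(chars)


alphabet = ["0", "1", "A", "票"]          # illustrative classes only
char_logits = [[0.1, 2.0, 0.3, 0.0],      # scores for the first character
               [0.0, 0.0, 0.0, 3.0]]      # scores for the second character
text = recognize_text(char_logits, alphabet)  # "1票"
```

Since the characters are processed in the order produced by the sliding window along the center line, joining the per-character predictions yields the text of the region.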
Although the embodiments of the present invention have been described above, the above descriptions are only for the convenience of understanding the present invention, and are not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (4)

1. A bill text detection method based on multitask deep learning, characterized by comprising the following steps:
A. constructing a multilayer convolutional neural network as the image feature extraction backbone network to extract features from the bill image;
B. annotating the bill text regions and region center lines on the convolutional feature map and training on them to achieve bill text information region segmentation and text center line detection;
C. advancing along the text center line within the bill text information region with a sliding window to segment the region into single characters;
D. classifying and recognizing the segmented single characters in sequence to form the complete bill text information;
in step B, the network is trained with the parameters of bill text information region segmentation and text center line detection as output targets to obtain the bill text information region segmentation and text region center line detection results; the parameters comprise the center line pixel coordinates (x_i, y_i), the offset from the center line pixel to the upper boundary of the text region (symbol given in formula image FDA0004057239490000011, not reproduced in this text), and the offset from the center line pixel to the lower boundary of the text region (symbol given in formula image FDA0004057239490000012, not reproduced in this text);
in step C, for each pixel (x_i, y_i) on the text center line, the distances from the four vertices of each character (upper left, upper right, lower left and lower right) to the center line pixel are predicted (symbols given in formula images FDA0004057239490000013 and FDA0004057239490000014, not reproduced in this text); the true distances from each character's vertices to the center line are given in formula image FDA0004057239490000015 (not reproduced in this text);
a loss function is constructed (formula image FDA0004057239490000016, not reproduced in this text),
where α_lt, α_rt, α_ld and α_rd are correction terms for the individual distance losses, used to control the weight of each distance loss.
2. The bill text detection method based on multitask deep learning as claimed in claim 1, wherein in step A, dilated (hole) convolution is introduced into the convolution layers of the feature extraction backbone network; that is, bilinear interpolation is applied to the feature map output by the previous convolution layer to enlarge the resolution of the convolutional feature map, and the convolution of the current layer is then performed, which enlarges the convolutional receptive field while the convolution kernel parameters remain unchanged, so that richer bill image features are obtained.
3. The bill text detection method based on multitask deep learning as claimed in claim 1, wherein in step A, while the multilayer convolutional neural network extracts the bill image features, the output feature vectors of the lower convolution layers and the output feature vectors of the higher convolution layers are concatenated to form the final output feature map, so as to retain the edge and texture features from the lower layers and the semantic features from the higher layers.
4. The bill text detection method based on multitask deep learning as claimed in claim 1, wherein in step D, a Softmax multi-class character classifier is pre-trained on a character image data set, and the single characters obtained from the segmentation in step C are classified and recognized in sequence to form the complete bill text information.
CN201911225976.8A 2019-12-04 2019-12-04 Bill text detection method based on multitask deep learning Active CN111027443B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911225976.8A CN111027443B (en) 2019-12-04 2019-12-04 Bill text detection method based on multitask deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911225976.8A CN111027443B (en) 2019-12-04 2019-12-04 Bill text detection method based on multitask deep learning

Publications (2)

Publication Number Publication Date
CN111027443A (en) 2020-04-17
CN111027443B (en) 2023-04-07

Family

ID=70204203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911225976.8A Active CN111027443B (en) 2019-12-04 2019-12-04 Bill text detection method based on multitask deep learning

Country Status (1)

Country Link
CN (1) CN111027443B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112926372B (en) * 2020-08-22 2023-03-10 清华大学 Scene character detection method and system based on sequence deformation
CN112101385B (en) * 2020-09-21 2022-06-10 西南大学 Weak supervision text detection method
CN112541491B (en) * 2020-12-07 2024-02-02 沈阳雅译网络技术有限公司 End-to-end text detection and recognition method based on image character region perception
CN113139548B (en) * 2020-12-31 2022-05-06 重庆邮电大学 Mathematical formula identification method based on operator action domain and center line
CN113744278A (en) * 2021-01-20 2021-12-03 北京沃东天骏信息技术有限公司 Text detection method and device
CN112949621A (en) * 2021-03-16 2021-06-11 新东方教育科技集团有限公司 Method and device for marking test paper answering area, storage medium and electronic equipment
CN113191251B (en) * 2021-04-28 2023-04-07 北京有竹居网络技术有限公司 Method and device for detecting stroke order, electronic equipment and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103208004A (en) * 2013-03-15 2013-07-17 北京英迈杰科技有限公司 Automatic recognition and extraction method and device for bill information area
CN107133616A (en) * 2017-04-02 2017-09-05 南京汇川图像视觉技术有限公司 A kind of non-division character locating and recognition methods based on deep learning
CN109977723A (en) * 2017-12-22 2019-07-05 苏宁云商集团股份有限公司 Big bill picture character recognition methods
CN108446621A (en) * 2018-03-14 2018-08-24 平安科技(深圳)有限公司 Bank slip recognition method, server and computer readable storage medium
CN110033000A (en) * 2019-03-21 2019-07-19 华中科技大学 A kind of text detection and recognition methods of bill images
CN110175610A (en) * 2019-05-23 2019-08-27 上海交通大学 A kind of bill images text recognition method for supporting secret protection
CN110490193A (en) * 2019-07-24 2019-11-22 西安网算数据科技有限公司 Single Text RegionDetection method and ticket contents recognition methods

Also Published As

Publication number Publication date
CN111027443A (en) 2020-04-17

Similar Documents

Publication Publication Date Title
CN111027443B (en) Bill text detection method based on multitask deep learning
CN107392141B (en) Airport extraction method based on significance detection and LSD (least squares distortion) line detection
Rehman et al. Performance analysis of character segmentation approach for cursive script recognition on benchmark database
CN111028217A (en) Image crack segmentation method based on full convolution neural network
CN104408449B (en) Intelligent mobile terminal scene literal processing method
CN105931253A (en) Image segmentation method combined with semi-supervised learning
CN114372955A (en) Casting defect X-ray diagram automatic identification method based on improved neural network
CN109766823A (en) A kind of high-definition remote sensing ship detecting method based on deep layer convolutional neural networks
CN108596952B (en) Rapid deep learning remote sensing image target detection method based on candidate region screening
CN110991439A (en) Method for extracting handwritten characters based on pixel-level multi-feature joint classification
CN110751154A (en) Complex environment multi-shape text detection method based on pixel-level segmentation
CN114581782A (en) Fine defect detection method based on coarse-to-fine detection strategy
Alrehali et al. Historical Arabic manuscripts text recognition using convolutional neural network
Van Phan et al. A nom historical document recognition system for digital archiving
Mahajan et al. Word level script identification using convolutional neural network enhancement for scenic images
CN113269049A (en) Method for detecting handwritten Chinese character area
CN114187595A (en) Document layout recognition method and system based on fusion of visual features and semantic features
CN110910388A (en) Cancer cell image segmentation method based on U-Net and density estimation
CN111832497B (en) Text detection post-processing method based on geometric features
CN105844299B (en) A kind of image classification method based on bag of words
CN112633327A (en) Staged metal surface defect detection method, system, medium, equipment and application
Yao et al. Invoice detection and recognition system based on deep learning
Santosh et al. Scalable arrow detection in biomedical images
Radwan et al. Predictive segmentation using multichannel neural networks in Arabic OCR system
Rong et al. Weakly supervised text attention network for generating text proposals in scene images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant