CN108595544A - A kind of document picture classification method - Google Patents
A kind of document picture classification method
- Publication number
- CN108595544A (application number CN201810309072.2A)
- Authority
- CN
- China
- Prior art keywords
- text
- document
- algorithm
- picture
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a document image classification method. Object detection is first used to judge visually whether the image contains an identity card, bank card, vehicle license, driver's license, passport, business license, or another document type that has strong visual features and differs widely from the other classes; object detection handles these classes quickly and accurately. For document images of the remaining classes, a text recognition algorithm based on deep neural networks first converts the image into text, and a text classification method then assigns the recognized text to a class. Text classification can distinguish subtle differences between classes, so the accuracy is high.
Description
Technical field
The present invention relates to a classification technique, specifically a document image classification method.
Background art
When an insurance company sets up policy archives, it must collect a large number of documents and store them by category. With the digital revolution, all documents now have to be photographed as digital images. The present invention addresses the automatic classification of these document images. An insurance company commonly handles hundreds of document classes, and the gap between some classes is very small: for example, an outpatient invoice and an inpatient invoice often differ by only a few characters. The large number of classes and the small differences between them make this task extremely difficult. To address this, we creatively combine image classification, object detection, text recognition, and text classification to obtain high classification accuracy.
Defects of the prior art
1. Image classification: Image classification methods based on deep convolutional neural networks have achieved major breakthroughs and have even surpassed human performance on some image classification tasks. However, existing image classification techniques target classes with obvious feature differences, such as distinguishing cats from dogs, and do not yet deliver reliable accuracy on fine-grained classification. As a result, they cannot accurately distinguish document types that differ only slightly.
2. Object detection: Object detection methods based on deep learning achieve good accuracy on general tasks. For example, they can accurately judge whether a document image contains targets such as an identity card or a bank card. However, faced with the subtle differences between outpatient and inpatient invoices, object detection is also powerless.
3. Text classification: Text classification has a long history and is relatively mature; it can distinguish subtle textual differences. However, it cannot be applied directly to document images.
Summary of the invention
The purpose of the present invention is to provide a document image classification method that solves the problems raised in the background section above.
To achieve this purpose, the present invention provides the following technical solution:
A document image classification method, comprising the following steps: (1) first use an object detection algorithm to detect certificates in the document image, namely identity card, bank card, driver's license, vehicle license, business license, professional qualification certificate, and road transport certificate; if detection succeeds, the document class is determined directly; (2) if detection fails, enter the text classification processing flow: 2.1 use a text detection algorithm to detect the position information of the text strings in the image; 2.2 use a text recognition model to recognize the detected text strings, then combine all text strings into a document in order of position; 2.3 use a document classification algorithm to classify the recognized document; the resulting class is the class of the document image.
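The two-stage flow of steps (1) and (2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the four callables (`detect_certificate`, `detect_text`, `recognize_text`, `classify_text`) are hypothetical placeholders for whichever detection, recognition, and classification models are chosen, and the top-to-bottom, left-to-right sort is only one simple way to combine strings in order of position.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed box layout


def classify_document_image(
    image: object,
    detect_certificate: Callable[[object], Optional[str]],
    detect_text: Callable[[object], List[Box]],
    recognize_text: Callable[[object, Box], str],
    classify_text: Callable[[str], str],
) -> str:
    """Cascade: try certificate detection first, fall back to text classification."""
    # Step (1): a successful certificate detection decides the class directly.
    label = detect_certificate(image)
    if label is not None:
        return label

    # Step 2.1: locate the text strings in the image.
    boxes = detect_text(image)

    # Step 2.2: recognize each string and join them in positional order
    # (simplified here to sorting by top edge, then left edge).
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))
    document = "\n".join(recognize_text(image, box) for box in ordered)

    # Step 2.3: the class of the recognized text is the class of the image.
    return classify_text(document)
```

Passing the models in as callables keeps the cascade independent of any particular detector or classifier, which matches the way the method lists several interchangeable algorithms for each stage.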
As a further solution of the present invention: the object detection algorithm includes Faster R-CNN, SSD, and YOLO.
As a further solution of the present invention: the text detection algorithm can use either a general object detection algorithm or an algorithm optimized specifically for text detection.
As a further solution of the present invention: the general object detection algorithm includes Faster R-CNN, SSD, and YOLO.
As a further solution of the present invention: the algorithm optimized specifically for text detection includes EAST, RRCNN, TextBoxes, and CTPN.
Compared with the prior art, the beneficial effects of the present invention are as follows. The invention uses object detection to judge visually whether the image contains an identity card, bank card, vehicle license, driver's license, passport, business license, or another document type that has strong visual features and differs widely from the other classes; object detection handles these classes quickly and accurately. For document images of the remaining classes, a text recognition algorithm based on deep neural networks first converts the image into text, and a text classification method then assigns the recognized text to a class. Text classification can distinguish subtle differences, so the accuracy is high.
Description of the drawings
Fig. 1 is a flow chart of the document image classification method.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, in an embodiment of the present invention, a document image classification method comprises the following steps: (1) first use an object detection algorithm to detect certificates in the document image, namely identity card, bank card, driver's license, vehicle license, business license, professional qualification certificate, and road transport certificate; if detection succeeds, the document class is determined directly; (2) if detection fails, enter the text classification processing flow: 2.1 use a text detection algorithm to detect the position information of the text strings in the image; 2.2 use a text recognition model to recognize the detected text strings, then combine all text strings into a document in order of position; 2.3 use a document classification algorithm to classify the recognized document; the resulting class is the class of the document image.
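Step 2.2 does not specify how the text strings are ordered by position. One possible reading, sketched here under that assumption, groups the boxes into lines by vertical center and sorts each line from left to right. The `TextBox` layout, the `line_tolerance` parameter, and the separators are illustrative choices; for Chinese text an empty word separator would likely be more appropriate.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, recognized_text): an assumed layout for one recognized string.
TextBox = Tuple[float, float, float, float, str]


def assemble_document(
    boxes: List[TextBox],
    line_tolerance: float = 0.5,
    word_sep: str = " ",
) -> str:
    """Join recognized strings top to bottom, and left to right within a line."""
    lines: List[List[TextBox]] = []
    # Visit boxes from top to bottom by vertical center.
    for box in sorted(boxes, key=lambda b: (b[1] + b[3]) / 2):
        center = (box[1] + box[3]) / 2
        height = box[3] - box[1]
        if lines:
            last = lines[-1][-1]
            last_center = (last[1] + last[3]) / 2
            # Same line if the centers are close relative to the box height.
            if abs(center - last_center) <= line_tolerance * height:
                lines[-1].append(box)
                continue
        lines.append([box])
    return "\n".join(
        word_sep.join(b[4] for b in sorted(line, key=lambda b: b[0]))
        for line in lines
    )
```

For example, two boxes on roughly the same baseline are merged into one line even if the detector returned them in the opposite order, so the downstream text classifier sees the words in reading order.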
The object detection algorithm includes Faster R-CNN, SSD, and YOLO.
Its principle is illustrated below using Faster R-CNN as an example:
1) a deep convolutional neural network (conv layers) extracts abstract features of the image (feature maps);
2) a region proposal network proposes candidate certificate regions (proposal generator);
3) the precise certificate region is regressed from the candidate regions (box classifier).
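Step 3) regresses a precise box from a candidate region. The text gives no formula for this; the sketch below assumes the center/size offset parameterization commonly used with Faster R-CNN, in which the network predicts offsets `(dx, dy, dw, dh)` relative to each proposal. The function name and box layout are illustrative.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def decode_box(proposal: Box, deltas: Tuple[float, float, float, float]) -> Box:
    """Apply predicted offsets (dx, dy, dw, dh) to a candidate region."""
    x1, y1, x2, y2 = proposal
    dx, dy, dw, dh = deltas

    width, height = x2 - x1, y2 - y1
    center_x, center_y = x1 + 0.5 * width, y1 + 0.5 * height

    # Center offsets are scaled by the proposal size; size offsets are in log space.
    new_center_x = center_x + dx * width
    new_center_y = center_y + dy * height
    new_width = width * math.exp(dw)
    new_height = height * math.exp(dh)

    return (
        new_center_x - 0.5 * new_width,
        new_center_y - 0.5 * new_height,
        new_center_x + 0.5 * new_width,
        new_center_y + 0.5 * new_height,
    )
```

With all offsets equal to zero the proposal is returned unchanged, which is why this parameterization is convenient: the network only has to learn small corrections to each candidate region.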
The text detection algorithm can use either a general object detection algorithm or an algorithm optimized specifically for text detection. The general object detection algorithm includes Faster R-CNN, SSD, and YOLO; the algorithm optimized specifically for text detection includes EAST, RRCNN, TextBoxes, and CTPN.
Taking EAST as an example, the following illustrates how text strings are detected in an image:
1) first extract abstract features of the image with a convolutional neural network; any convolutional network such as PVANet, MobileNet, VGG, or ResNet can be used here. Take care to keep the features of each stage: f1, f2, f3, f4;
2) upsample the output features of each stage using transposed convolution and concatenate them with the convolutional-layer features to obtain h1, h2, h3, h4;
3) after the two steps above, apply one more convolution to obtain the score map and the text boxes or text quadrangle coordinates;
4) use the non-maximum suppression (NMS) algorithm to select the regions most likely to be text strings.
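The screening in step 4) can be illustrated with plain non-maximum suppression on axis-aligned boxes. This is a simplified sketch: the original EAST work describes a locality-aware variant that operates on rotated boxes or quadrangles, and the `iou_threshold` value here is an assumption, not a value taken from the text.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,
) -> List[int]:
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap strongly with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

In this sketch two heavily overlapping candidates for the same text line collapse to the higher-scoring one, while a separate line elsewhere in the image is kept.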
The text recognition algorithm combines a deep convolutional neural network with a recurrent neural network to convert images into text. Its principle and steps are as follows:
1) use a convolutional network to extract image features;
2) feed these features into a bidirectional recurrent neural network built from LSTM units;
3) use the CTC algorithm to merge repeated characters and remove placeholders, and output the character sequence with the highest probability.
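The merging described in step 3) can be sketched with greedy CTC decoding: pick the most probable label in each frame, collapse consecutive repeats, and drop the placeholder (blank). Greedy decoding is a common approximation of the most probable sequence rather than an exact search, and the `labels` list with the blank at index 0 is an assumed convention.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    frame_probs: Sequence[Sequence[float]],
    labels: Sequence[str],
    blank: int = 0,
) -> str:
    """Collapse a per-frame label distribution into a character string."""
    # Most probable label index in each frame (the best path).
    best_path = [max(range(len(p)), key=lambda k: p[k]) for p in frame_probs]

    decoded: List[str] = []
    previous = blank
    for index in best_path:
        # Emit a character only when it is not blank and not a repeat
        # of the immediately preceding frame.
        if index != blank and index != previous:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)
```

The blank is what allows genuine double letters to survive: a path such as `a a - a b b -` decodes to `aab`, because the blank between the two runs of `a` prevents them from being merged.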
There are many text classification methods. In general, the text goes through steps such as word segmentation, word vector representation, and document representation. After that, any classifier can be used for text classification, for example a support vector machine (SVM), naive Bayes classifier, k-nearest neighbors (KNN), decision tree, or random forest. Alternatively, once the document is represented as a word-vector matrix, a convolutional neural network or recurrent neural network can be used for classification. The following describes text classification using a deep convolutional neural network:
1) represent each character or word (w0, w1, w2, w3, etc.) as a word vector (embedding). Any word vector method can be used, such as one-hot, skip-gram, GloVe, or fastText;
2) stack all word vectors into a matrix, then use a convolutional neural network (CNN) to extract features;
3) further compute and abstract the text features with two fully connected layers;
4) classify the document features with a softmax layer.
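The four steps above can be traced in a small forward pass written in plain Python. This is a sketch under stated assumptions: the layer sizes, the filter width of 3, the max-over-time pooling, and the random untrained weights are all illustrative and not taken from the text, and the input must contain at least as many tokens as the filter width.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[Vector]


def dense(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def softmax(x: Vector) -> Vector:
    peak = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def conv_max_pool(embedded: Matrix, filters: List[Matrix]) -> Vector:
    """Slide each filter over windows of consecutive tokens, keep its best response."""
    features = []
    for filt in filters:
        width = len(filt)
        responses = [
            sum(w * v
                for f_row, e_row in zip(filt, embedded[i:i + width])
                for w, v in zip(f_row, e_row))
            for i in range(len(embedded) - width + 1)
        ]
        features.append(max(responses))
    return relu(features)


def text_cnn_forward(
    token_ids: Sequence[int],
    embedding: Matrix,
    filters: List[Matrix],
    fc1: Tuple[Matrix, Vector],
    fc2: Tuple[Matrix, Vector],
) -> Vector:
    embedded = [embedding[t] for t in token_ids]   # step 1: tokens to word vectors
    features = conv_max_pool(embedded, filters)    # step 2: matrix to CNN features
    hidden = relu(dense(features, *fc1))           # step 3: first fully connected layer
    logits = dense(hidden, *fc2)                   # step 3: second fully connected layer
    return softmax(logits)                         # step 4: class probabilities


def random_matrix(rng: random.Random, rows: int, cols: int) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Example with untrained random weights (all sizes are hypothetical).
rng = random.Random(0)
vocab_size, embed_dim, n_filters, hidden_dim, n_classes = 20, 8, 4, 6, 3
embedding = random_matrix(rng, vocab_size, embed_dim)
filters = [random_matrix(rng, 3, embed_dim) for _ in range(n_filters)]
fc1 = (random_matrix(rng, hidden_dim, n_filters), [0.0] * hidden_dim)
fc2 = (random_matrix(rng, n_classes, hidden_dim), [0.0] * n_classes)
probs = text_cnn_forward([1, 5, 7, 2, 9], embedding, filters, fc1, fc2)
```

The result `probs` holds one probability per document class; in a trained model the largest entry would give the predicted class of the document image.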
It will be evident to those skilled in the art that the invention is not limited to the details of the exemplary embodiments above, and that the invention can be realized in other specific forms without departing from its spirit or essential characteristics. The embodiments are therefore to be regarded in every respect as illustrative and not restrictive. The scope of the invention is defined by the appended claims rather than by the description above, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced by the invention. No reference sign in the claims should be construed as limiting the claim concerned.
In addition, although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. The specification is written this way merely for clarity; those skilled in the art should treat the specification as a whole, and the technical solutions in the various embodiments may be suitably combined to form other embodiments that those skilled in the art can understand.
Claims (5)
1. A document image classification method, characterized in that it comprises the following steps: (1) first use an object detection algorithm to detect certificates in the document image, namely identity card, bank card, driver's license, vehicle license, business license, professional qualification certificate, and road transport certificate; if detection succeeds, the document class is determined directly; (2) if detection fails, enter the text classification processing flow: 2.1 use a text detection algorithm to detect the position information of the text strings in the image; 2.2 use a text recognition model to recognize the detected text strings, then combine all text strings into a document in order of position; 2.3 use a document classification algorithm to classify the recognized document; the resulting class is the class of the document image.
2. The document image classification method according to claim 1, characterized in that the object detection algorithm includes Faster R-CNN, SSD, and YOLO.
3. The document image classification method according to claim 1, characterized in that the text detection algorithm can use either a general object detection algorithm or an algorithm optimized specifically for text detection.
4. The document image classification method according to claim 3, characterized in that the general object detection algorithm includes Faster R-CNN, SSD, and YOLO.
5. The document image classification method according to claim 3, characterized in that the algorithm optimized specifically for text detection includes EAST, RRCNN, TextBoxes, and CTPN.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810309072.2A CN108595544A (en) | 2018-04-09 | 2018-04-09 | A kind of document picture classification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810309072.2A CN108595544A (en) | 2018-04-09 | 2018-04-09 | A kind of document picture classification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108595544A true CN108595544A (en) | 2018-09-28 |
Family
ID=63621357
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810309072.2A Pending CN108595544A (en) | 2018-04-09 | 2018-04-09 | A kind of document picture classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108595544A (en) |
- 2018-04-09: application CN201810309072.2A filed (published as CN108595544A); status: Pending
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102750541A (en) * | 2011-04-22 | 2012-10-24 | 北京文通科技有限公司 | Document image classifying distinguishing method and device |
US9760806B1 (en) * | 2016-05-11 | 2017-09-12 | TCL Research America Inc. | Method and system for vision-centric deep-learning-based road situation analysis |
CN107153822A (en) * | 2017-05-19 | 2017-09-12 | 北京航空航天大学 | A kind of smart mask method of the semi-automatic image based on deep learning |
CN107423760A (en) * | 2017-07-21 | 2017-12-01 | 西安电子科技大学 | Based on pre-segmentation and the deep learning object detection method returned |
CN107480665A (en) * | 2017-08-09 | 2017-12-15 | 北京小米移动软件有限公司 | Character detecting method, device and computer-readable recording medium |
CN107832765A (en) * | 2017-09-13 | 2018-03-23 | 百度在线网络技术(北京)有限公司 | Picture recognition to including word content and picture material |
CN107766809A (en) * | 2017-10-09 | 2018-03-06 | 平安科技(深圳)有限公司 | Electronic installation, billing information recognition methods and computer-readable recording medium |
CN107798299A (en) * | 2017-10-09 | 2018-03-13 | 平安科技(深圳)有限公司 | Billing information recognition methods, electronic installation and readable storage medium storing program for executing |
Non-Patent Citations (1)
Title |
---|
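Zhang He: "Research and Implementation of Certificate Recognition Technology under Complex Backgrounds", China Master's Theses Full-text Database, Information Science and Technology * |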
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111241897A (en) * | 2018-11-28 | 2020-06-05 | 塔塔咨询服务有限公司 | Industrial checklist digitization by inferring visual relationships |
CN111241897B (en) * | 2018-11-28 | 2023-06-23 | 塔塔咨询服务有限公司 | System and implementation method for digitizing industrial inspection sheets by inferring visual relationships |
CN109543773A (en) * | 2018-12-12 | 2019-03-29 | 泰康保险集团股份有限公司 | Image processing method, device, medium and electronic equipment |
CN109344815B (en) * | 2018-12-13 | 2021-08-13 | 深源恒际科技有限公司 | Document image classification method |
CN109344815A (en) * | 2018-12-13 | 2019-02-15 | 深源恒际科技有限公司 | A kind of file and picture classification method |
CN109919331A (en) * | 2019-02-15 | 2019-06-21 | 华南理工大学 | A kind of airborne equipment intelligent maintaining auxiliary system and method |
CN110175625A (en) * | 2019-04-11 | 2019-08-27 | 淮阴工学院 | A kind of identification of wechat group information and management method based on improved SSD algorithm |
CN110069252A (en) * | 2019-04-11 | 2019-07-30 | 浙江网新恒天软件有限公司 | A kind of source code file multi-service label mechanized classification method |
CN110069252B (en) * | 2019-04-11 | 2023-04-07 | 浙江网新恒天软件有限公司 | Automatic classification method for source code file multi-service labels |
CN110135264A (en) * | 2019-04-16 | 2019-08-16 | 深圳壹账通智能科技有限公司 | Data entry method, device, computer equipment and storage medium |
CN112036421A (en) * | 2019-05-16 | 2020-12-04 | 搜狗(杭州)智能科技有限公司 | Image processing method and device and electronic equipment |
CN112036421B (en) * | 2019-05-16 | 2024-07-02 | 北京搜狗科技发展有限公司 | Image processing method and device and electronic equipment |
CN110490232A (en) * | 2019-07-18 | 2019-11-22 | 北京捷通华声科技股份有限公司 | Method, apparatus, the equipment, medium of training literal line direction prediction model |
CN110598686A (en) * | 2019-09-17 | 2019-12-20 | 携程计算机技术(上海)有限公司 | Invoice identification method, system, electronic equipment and medium |
CN111476165A (en) * | 2020-04-07 | 2020-07-31 | 同方赛威讯信息技术有限公司 | Method for detecting fingerprint characteristics of title seal in electronic document based on deep learning |
CN111444876A (en) * | 2020-04-08 | 2020-07-24 | 证通股份有限公司 | Image-text processing method and system and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20180928 |