CN111008635A - OCR-based multi-bill automatic identification method and system - Google Patents
- Publication number
- CN111008635A (application CN201911192294.1A)
- Authority
- CN
- China
- Prior art keywords
- image
- bill
- ocr
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/142—Image acquisition using hand-held instruments; Constructional details of the instruments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Abstract
The invention discloses an OCR-based multi-bill automatic identification method and system, comprising the following steps: obtaining OCR bill samples; an image acquisition module acquires a bill image to be identified; the bill image is input into an image preprocessing module and processed to obtain a secondary image; a denoising module denoises the secondary image to obtain a standard image; and the standard image is input into a bill recognition module for detection and recognition. The beneficial effect of the invention is that the OCR-based multi-bill automatic recognition method reduces the recognition difference when a plurality of different bills are present in one image.
Description
Technical Field
The invention relates to the technical field of character recognition, and in particular to an OCR-based multi-bill automatic recognition method and an OCR-based multi-bill automatic recognition system.
Background
In recent years, bill identification services have developed rapidly, but the recognition rate remains relatively low, so after automatic recognition, bill-entry personnel must manually verify every recognized field to correct the errors of automatic recognition. Because the recognition rate is low and the manual verification process is time-consuming, the commercial utilization rate of bill recognition services has remained low.
In AI-based intelligent financial reimbursement systems, invoices can be recognized automatically with the help of technologies such as OCR, reducing the data-entry workload of the person filing the reimbursement and the review workload of the auditor, and thereby improving the degree of automation and the efficiency of reimbursement. For a long time, bill recognition engines have not followed a uniform specification; the service APIs provided by different recognition engines differ greatly and are not mutually compatible. Despite the increasing development of electronic payment and electronic bills, traditional paper bills, such as various paper invoices and financial receipts, are still widely used in real work and life. Existing bill recognition targets different types of samples, and its character detection and recognition performance differs greatly between sample types.
Disclosure of Invention
This section is for the purpose of summarizing some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. In this section, as well as in the abstract and the title of the invention of this application, simplifications or omissions may be made to avoid obscuring the purpose of the section, the abstract and the title, and such simplifications or omissions are not intended to limit the scope of the invention.
The present invention has been made in view of the above-mentioned conventional problems.
Therefore, one technical problem solved by the present invention is: providing a method for identifying different types of bills that keeps the recognition difference small when different samples are recognized.
In order to solve the technical problems, the invention provides the following technical scheme: an OCR-based multi-bill automatic identification method comprising the following steps: obtaining an OCR bill sample; an image acquisition module acquires a bill image to be identified; the bill image is input into an image preprocessing module and processed to obtain a secondary image; a denoising module denoises the secondary image to obtain a standard image; and the standard image is input into a bill recognition module for detection and recognition.
As a preferable scheme of the OCR-based multi-bill automatic recognition method of the present invention: the image preprocessing module comprises the following preprocessing steps: rotating or perspective-scaling the bill image; aligning the characters in the bill image along the horizontal and vertical directions after the rotation or perspective scaling; and cropping the aligned image to obtain the secondary image.
As a preferable scheme of the OCR-based multi-bill automatic recognition method of the present invention: the denoising module comprises the following steps: performing decolorizing processing on the secondary image; adjusting the histogram information of the secondary image; retaining light pixels in light areas and dark pixels in dark areas; and obtaining the standard image as a high-contrast sample.
As a preferable scheme of the OCR-based multi-bill automatic recognition method of the present invention: the bill recognition module comprises the following recognition processing steps: analyzing the structure of the standard image containing the characters to be recognized; denoising and correcting the object to be detected using a threshold value; performing row-column segmentation of the text information; and feeding the segmented character images into a recognition model for processing to obtain the character information in the original image.
As a preferable scheme of the OCR-based multi-bill automatic recognition method of the present invention: the recognition model adopts a CTPN algorithm model and comprises the following recognition steps: detecting the unit blocks into which horizontal lines of text are divided in complex scenes; adding a vertical anchor to detect vertical text; learning spatial features and sequence features in the image with a bidirectional LSTM layer; and using regular expressions to find the corresponding meaning of each character in the bill image.
As a preferable scheme of the OCR-based multi-bill automatic recognition method of the present invention: the character segmentation comprises the following steps: cutting out single characters by a non-uniform image segmentation method; obtaining the width of each character using a function, and selecting the group suitable for segmentation from several approximate classifications; and using a CNN algorithm model to recognize the classified group of characters.
As a preferable scheme of the OCR-based multi-bill automatic recognition method of the present invention: the CTPN algorithm model comprises the following steps: using the first 5 convolutional stages of VGG16 to obtain a feature map of size W × H × C; extracting features on the feature map with a 3 × 3 sliding window; predicting the candidate target regions defined by multiple anchors using the extracted features; inputting the extracted features into a bidirectional LSTM layer, which outputs W × 256 results; feeding the result into a 512-dimensional fully connected layer; and finally obtaining the recognition output through classification or regression.
As a preferable scheme of the OCR-based multi-bill automatic recognition method of the present invention: the output comprises the height of the proposal box and the y-axis coordinate of its center, the class information of the k anchors, and the horizontal offset of the proposal box; the class information indicates whether the box contains a character.
As a preferable scheme of the OCR-based multi-bill automatic recognition method of the present invention: the image preprocessing module further comprises the following steps: obtaining the secondary image through uniform sizing and alignment; setting a global threshold T for the secondary image; dividing the image data into two parts by T, namely the group of pixels greater than T and the group of pixels less than T; and setting the pixel values of the group greater than T to white and those of the group less than T to black.
Another technical problem solved by the invention is: providing a system on which the above method is implemented, which keeps the recognition difference small when different samples are recognized.
In order to solve the technical problems, the invention provides the following technical scheme: an OCR-based multi-bill automatic identification system comprising an image acquisition module, an image preprocessing module, a denoising module and a bill identification module; the image acquisition module is used for acquiring a bill image to be identified; the image preprocessing module is used for processing the acquired image to obtain a secondary image; the denoising module is used for denoising the secondary image to obtain a standard image; and the bill identification module is used for detecting and recognizing the standard image to generate an identification result.
The invention has the beneficial effects that: the OCR-based multi-bill automatic recognition method reduces the recognition difference when a plurality of different bills are present in one image.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise. Wherein:
fig. 1 is a schematic flow chart of the CTPN algorithm according to the first embodiment of the present invention;
fig. 2 is a schematic diagram of the CNN algorithm model according to the first embodiment of the present invention;
fig. 3 is a schematic diagram of the bidirectional LSTM structure model according to the first embodiment of the present invention;
fig. 4 is a schematic structural diagram of the overall principle of an OCR-based multi-bill automatic recognition system according to the second embodiment of the present invention;
fig. 5 is a schematic diagram of an actual recognition effect according to the second embodiment of the present invention;
fig. 6 is a schematic diagram of an actual recognition effect on multiple bills according to the second embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, specific embodiments accompanied with figures are described in detail below, and it is apparent that the described embodiments are a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present invention, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
Furthermore, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
The present invention will be described in detail with reference to the drawings, wherein the cross-sectional views illustrating the structure of the device are not enlarged partially in general scale for convenience of illustration, and the drawings are only exemplary and should not be construed as limiting the scope of the present invention. In addition, the three-dimensional dimensions of length, width and depth should be included in the actual fabrication.
Meanwhile, in the description of the present invention, it should be noted that the terms "upper, lower, inner and outer" and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of describing the present invention and simplifying the description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation and operate, and thus, cannot be construed as limiting the present invention. Furthermore, the terms first, second, or third are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected and connected" in the present invention are to be understood broadly, unless otherwise explicitly specified or limited, for example: can be fixedly connected, detachably connected or integrally connected; they may be mechanically, electrically, or directly connected, or indirectly connected through intervening media, or may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Example 1
This embodiment employs OCR (optical character recognition), which refers to the process in which an electronic device (e.g., a scanner or a digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then translates the shapes into computer text by a character recognition method. For printed characters, it is a technology that optically converts the characters in a paper document into an image file of a black-and-white dot matrix, and then converts the characters in the image into a text format through recognition software for further editing and processing by word-processing software. The main indicators for measuring the performance of an OCR system are: the rejection rate, the misrecognition rate, the recognition speed, user-interface friendliness, product stability, usability and feasibility; how to tune the system or use auxiliary information to improve recognition accuracy is a central concern.
Referring to the illustrations of fig. 1 to 3: OCR uses optical technology and computer technology to read characters printed or written on paper and convert them into a form that both a computer and a person can understand. This embodiment provides an OCR-based multi-bill automatic recognition method, which specifically comprises the following steps,
s1: acquiring an OCR bill sample;
s2: the image acquisition module 100 acquires a bill image to be identified;
s3: the bill image is input into an image preprocessing module 200 to be processed to obtain a secondary image; the image pre-processing module 200 in this step comprises the following pre-processing steps,
rotating or perspective zooming the bill image;
aligning the characters in the bill image along the horizontal and vertical directions after rotating or perspective zooming;
the aligned image is cropped to obtain a secondary image.
In order to reduce the recognition difference between different images, the method also comprises the following steps,
obtaining a secondary image through uniform size and alignment;
setting a global threshold value T for the secondary image;
dividing data of the image into two parts by T, wherein the two parts comprise pixel groups larger than T and pixel groups smaller than T;
setting the pixel values of the pixel group larger than T to white and the pixel values of the pixel group smaller than T to black.
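As a minimal sketch (not the patent's implementation), the global-threshold binarization described above can be expressed in a few lines of NumPy; the threshold T is assumed here to be given externally:

```python
import numpy as np

def binarize(image: np.ndarray, T: int) -> np.ndarray:
    """Split the image data into two pixel groups by the global threshold T:
    pixels greater than T become white (255), the rest become black (0)."""
    out = np.zeros_like(image)            # pixel group <= T -> black
    out[image > T] = 255                  # pixel group  > T -> white
    return out

# Tiny synthetic "secondary image": dark text strokes on a light background.
img = np.array([[200, 30, 210],
                [ 40, 220, 35],
                [205, 45, 215]], dtype=np.uint8)
binary = binarize(img, T=128)
print(binary.tolist())  # strokes map to 0, background to 255
```

In practice T would be tuned per document class, or replaced by an adaptive choice such as Otsu's thresholding.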
S4: the denoising module 300 denoises the secondary image to obtain a standard image; the denoising module 300 includes the following steps,
performing decolorizing processing on the secondary image;
adjusting histogram information of the secondary image;
reserving light pixels in the light area and dark pixels in the dark area;
a standard image of the high contrast sample is obtained.
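A hedged sketch of these denoising-module steps, assuming the decolorizing is a luminosity grayscale conversion and the histogram adjustment is a linear contrast stretch (the patent does not fix either choice):

```python
import numpy as np

def decolorize(rgb: np.ndarray) -> np.ndarray:
    # Luminosity grayscale: weights approximate human brightness perception.
    return rgb @ np.array([0.299, 0.587, 0.114])

def stretch_contrast(gray: np.ndarray) -> np.ndarray:
    # Linear histogram stretch: light pixels stay at the light end and dark
    # pixels at the dark end, but the full 0..255 range is now used.
    lo, hi = gray.min(), gray.max()
    return ((gray - lo) / (hi - lo) * 255).astype(np.uint8)

# A 2x2 low-contrast RGB patch (all channels equal for simplicity).
rgb = np.array([[[100, 100, 100], [180, 180, 180]],
                [[120, 120, 120], [160, 160, 160]]], dtype=np.float64)
std = stretch_contrast(decolorize(rgb))
```

The result `std` plays the role of the high-contrast standard image passed on to the bill recognition module.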
S5: the standard image is input into the bill recognition module 400 for detection and recognition.
The ticket identification module 400 includes the following identification processing steps,
analyzing a structure of a standard image containing characters to be recognized;
denoising and correcting the object to be detected by using a threshold value;
performing row-column segmentation on the text information;
and introducing the divided character image into a recognition model for processing to obtain character information in the original image.
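The row-column segmentation step is often implemented with projection profiles; the sketch below (an assumed technique, since the patent does not name a specific algorithm) finds text rows of a binary image by summing text pixels along each row:

```python
import numpy as np

def segment_rows(binary: np.ndarray) -> list:
    """Return (start, end) row-index ranges that contain text pixels.

    `binary` holds 1 for text (black) pixels and 0 for background."""
    profile = binary.sum(axis=1)           # horizontal projection profile
    rows, start = [], None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                      # entering a text band
        elif count == 0 and start is not None:
            rows.append((start, i))        # leaving a text band
            start = None
    if start is not None:
        rows.append((start, len(profile)))
    return rows

# Two "text lines" separated by a blank row.
page = np.array([[1, 1, 0],
                 [0, 1, 1],
                 [0, 0, 0],
                 [1, 0, 1]])
print(segment_rows(page))  # [(0, 2), (3, 4)]
```

Applying the same function to the transpose of each detected band yields the column (character) boundaries within that row.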
Furthermore, the identification model adopts a CTPN algorithm model and comprises the following identification steps,
detecting the unit blocks into which horizontal lines of text are divided in a complex scene;
adding a vertical Anchor to detect vertical characters;
learning spatial features and sequence features in the image by using a bidirectional LSTM layer;
the regular expression is used to find the corresponding meaning of each character in the bill image.
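The final step, mapping recognized text back to bill fields with regular expressions, can be sketched as follows; the field patterns (invoice number, date, amount) are hypothetical examples for illustration, not the patent's actual expressions:

```python
import re

# Hypothetical field patterns for a recognized invoice text line.
FIELD_PATTERNS = {
    "invoice_no": re.compile(r"No\.?\s*(\d{8})"),
    "date": re.compile(r"(\d{4}-\d{2}-\d{2})"),
    "amount": re.compile(r"[¥￥]\s*(\d+(?:\.\d{2})?)"),
}

def extract_fields(text: str) -> dict:
    """Map each matched pattern to the meaning of its field."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        m = pattern.search(text)
        if m:
            fields[name] = m.group(1)
    return fields

line = "Invoice No.12345678 2019-11-28 Total ¥ 150.00"
print(extract_fields(line))
```

Unmatched patterns are simply absent from the result, so downstream code can flag missing fields for manual review.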
The text segmentation in this embodiment comprises the following steps,
cutting a single character by an image non-uniform segmentation method;
obtaining the width of each character by using a function, and selecting a group suitable for segmentation from a plurality of approximate classifications;
and using a CNN algorithm model to identify and recognize the classified group of characters.
The CTPN algorithm model includes the following steps,
using the first 5 convolutional stages of VGG16 to obtain a feature map of size W × H × C;
extracting features on the feature map with a 3 × 3 sliding window;
predicting a target to-be-selected area defined by a plurality of anchors by utilizing the extracted features;
inputting the extracted features into a bidirectional LSTM layer, which outputs W × 256 results;
inputting the result to a 512-dimensional full connection layer;
and finally, obtaining the recognized output through classification or regression.
Wherein the output comprises the height of the proposal box and the y-axis coordinate of its center, the class information of the k anchors, and the horizontal offset of the proposal box; the class information indicates whether the box contains a character.
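The tensor shapes in the CTPN steps above can be made concrete with a shape-only mock; random matrices stand in for the VGG16, BiLSTM and fully-connected weights, so only the W × H × C → 256 → 512 → per-anchor output dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Feature map produced by the first 5 conv stages of VGG16 (C = 512).
H, W, C = 14, 14, 512
feature_map = rng.standard_normal((H, W, C))

# 3x3 sliding window at each position -> a 3*3*C feature vector.
padded = np.pad(feature_map, ((1, 1), (1, 1), (0, 0)))
windows = np.stack([padded[i:i + 3, j:j + 3, :].ravel()
                    for i in range(H) for j in range(W)]).reshape(H, W, 9 * C)

# Bidirectional LSTM stand-in: 128 dims per direction -> 256 total,
# i.e. W x 256 results per feature-map row.
blstm_out = windows @ rng.standard_normal((9 * C, 256)) / np.sqrt(9 * C)

# 512-dimensional fully connected layer, then k = 10 anchors per position.
fc_out = blstm_out @ rng.standard_normal((256, 512))
k = 10
heights_y = fc_out @ rng.standard_normal((512, 2 * k))  # height + y-center
scores = fc_out @ rng.standard_normal((512, 2 * k))     # text / non-text class
x_offsets = fc_out @ rng.standard_normal((512, k))      # horizontal offset
```

In the real model the W × 256 outputs come from running a bidirectional LSTM along each feature-map row; the mock only reproduces the dimensions, not the learned sequence features.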
Scene one:
the technical effects adopted in the method are verified and explained, different methods selected in the embodiment and the method are adopted for comparison and test, and the test results are compared by means of scientific demonstration to verify the real effect of the method. The traditional technical scheme has the defect of insufficient accuracy in identification.
In order to verify that the method has higher identification precision compared with other methods.
In this embodiment, the ocr algorithm tesseract of the google open source and the method are adopted to respectively identify different bills, so as to compare the bills.
In the embodiment, 50 high-iron tickets and 50 value-added tax invoices are used as test samples to test the performances of the two methods. tesseract is mainly tested using google open source code. The method herein is tested in the python programming language.
The results are shown in table 1 below.
Table 1: and (6) testing results.
| | Misrecognition rate, 50 high-speed rail tickets | Misrecognition rate, 50 VAT invoices |
| --- | --- | --- |
| Tesseract | 10% | 16% |
| Method herein | 4% | 8% |

From the table above it can be seen that the method proposed herein is superior to Tesseract on both sample types.
It should be recognized that embodiments of the present invention can be realized and implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The methods may be implemented in a computer program using standard programming techniques, including a non-transitory computer-readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner, according to the methods and figures described in the detailed description. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Further, the operations of processes described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described herein (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) collectively executed on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method may be implemented in any type of computing platform operatively connected to a suitable interface, including but not limited to a personal computer, mini computer, mainframe, workstation, networked or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and the like. Aspects of the invention may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optically read and/or write storage medium, RAM, ROM, or the like, such that it may be read by a programmable computer, which when read by the storage medium or device, is operative to configure and operate the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described herein includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques described herein. A computer program can be applied to input data to perform the functions described herein to transform the input data to generate output data that is stored to non-volatile memory. The output information may also be applied to one or more output devices, such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including particular visual depictions of physical and tangible objects produced on a display.
Example 2
Referring to the illustrations of fig. 4 to 6, an OCR-based multi-bill automatic identification system comprises an image acquisition module 100, an image preprocessing module 200, a denoising module 300 and a bill identification module 400. The image acquisition module 100 is used for acquiring a bill image to be identified; the image preprocessing module 200 is used for processing the acquired image to obtain a secondary image; the denoising module 300 is used for denoising the secondary image to obtain a standard image; and the bill identification module 400 is used for detecting and recognizing the standard image and generating a recognition result. The image acquisition module 100 is a camera-based capture device that acquires the front-end image. The image preprocessing module 200, the denoising module 300 and the bill identification module 400 are software modules running on a computer processor, implementing the corresponding processing and recognition functions through program code.
Referring to fig. 6, a schematic diagram of an image containing a plurality of bills is shown. The segmentation and field recognition of several invoices on one picture can be seen clearly from the recognition result in fig. 6: the method provided by this embodiment identifies the edge of each invoice and successfully recognizes the effective fields.
As used in this application, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of example, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer readable media having various data structures thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal).
It should be noted that the above-mentioned embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.
Claims (10)
1. An OCR-based multi-bill automatic identification method, characterized in that it comprises the following steps,
acquiring an OCR bill sample;
the image acquisition module (100) acquires a bill image to be identified;
the bill image is input into an image preprocessing module (200) to be processed to obtain a secondary image;
a denoising module (300) is used for denoising the secondary image to obtain a standard image;
and the standard image is input into a bill recognition module (400) for detection and recognition.
2. An OCR-based multi-bill automatic recognition method according to claim 1, characterized in that: the image pre-processing module (200) comprises the following pre-processing steps,
rotating or perspective zooming the bill image;
aligning the characters in the bill image along the horizontal and vertical directions after rotating or perspective zooming;
and clipping the aligned image to obtain the secondary image.
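The geometry behind these preprocessing steps can be sketched with plain coordinate arithmetic. This is an illustrative sketch only, assuming a known skew angle; the claim does not specify how rotation or perspective correction is computed:

```python
import math

def deskew_point(x, y, angle_deg, cx=0.0, cy=0.0):
    """Rotate a point about (cx, cy) by -angle_deg so that text
    detected at skew angle_deg becomes axis-aligned (horizontal)."""
    a = math.radians(-angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(a) - dy * math.sin(a),
            cy + dx * math.sin(a) + dy * math.cos(a))

def crop_box(points):
    """Axis-aligned bounding box of the deskewed text corners,
    used to crop the aligned image down to the secondary image."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

In practice a full implementation would estimate the skew angle from detected text lines and apply the inverse transform to the whole raster, but the per-point math is as above.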
3. The OCR-based multi-bill automatic identification method according to claim 1 or 2, characterized in that the denoising module (300) performs the following steps:
converting the secondary image to grayscale;
adjusting the histogram of the secondary image;
preserving light pixels in light areas and dark pixels in dark areas;
and obtaining the standard image as a high-contrast sample.
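The decoloring and contrast steps above can be sketched as follows. The luma weights and the `low`/`high` cut-offs are illustrative assumptions; the claim only requires that light areas stay light, dark areas stay dark, and the result be high-contrast:

```python
def to_gray(r, g, b):
    """Luma-weighted decoloring (ITU-R BT.601 weights)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def stretch_contrast(img, low=64, high=192):
    """Histogram stretch: pixels at or below `low` go to black,
    at or above `high` go to white, and the midrange is rescaled
    linearly, yielding the high-contrast 'standard image'."""
    def f(p):
        if p <= low:
            return 0
        if p >= high:
            return 255
        return round((p - low) * 255 / (high - low))
    return [[f(p) for p in row] for row in img]
```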
4. The OCR-based multi-bill automatic identification method according to claim 3, characterized in that the bill identification module (400) performs the following recognition steps:
analyzing the structure of the standard image containing the characters to be recognized;
denoising and correcting the region to be detected using a threshold;
performing row and column segmentation on the text information;
and feeding the segmented character images into a recognition model to obtain the character information in the original image.
5. The OCR-based multi-bill automatic identification method according to any one of claims 1, 2 or 4, characterized in that the recognition model adopts the CTPN algorithm and comprises the following steps:
detecting the unit blocks into which horizontal rows of characters are divided in complex scenes;
adding a vertical anchor mechanism to detect vertically oriented characters;
learning the spatial and sequential features in the image with a bidirectional LSTM layer;
and using regular expressions to map each recognized character string in the bill image to its corresponding field meaning.
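The regular-expression mapping step can be sketched as below. The field names and patterns are hypothetical examples; the claim states only that regular expressions assign meaning to recognized text, without fixing any bill layout:

```python
import re

# Illustrative field patterns (not taken from the patent): invoice
# number, ISO-style date, and a currency amount.
FIELD_PATTERNS = {
    "invoice_no": re.compile(r"No\.?\s*(\d{8})"),
    "date":       re.compile(r"(\d{4}-\d{2}-\d{2})"),
    "amount":     re.compile(r"[¥￥]\s*([\d,]+\.\d{2})"),
}

def extract_fields(ocr_text):
    """Map each matched substring of the OCR output to its field."""
    found = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            found[field] = match.group(1)
    return found
```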
6. The OCR-based multi-bill automatic identification method according to claim 5, characterized in that the text segmentation comprises the following steps:
cutting out single characters by a non-uniform image segmentation method;
obtaining the width of each character by a function and selecting, from several approximate classifications, the group best suited for segmentation;
and recognizing the classified group of characters with a CNN model.
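A common way to cut single characters non-uniformly is a vertical projection profile, sketched below. This is an assumption about the unspecified segmentation function; the claim does not name the method:

```python
def segment_columns(binary_img):
    """Non-uniform character segmentation by vertical projection:
    a column with any ink (1) lies inside a character, a blank
    column separates characters, so cut widths differ per glyph."""
    profile = [sum(col) for col in zip(*binary_img)]  # ink per column
    spans, start = [], None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x                  # a character begins
        elif not ink and start is not None:
            spans.append((start, x))   # the character ends
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans
```

The resulting per-character widths are exactly the quantity the claim feeds into the width-based grouping step.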
7. The OCR-based multi-bill automatic identification method according to claim 6, characterized in that the CTPN algorithm comprises the following steps:
obtaining a feature map of size W × H × C with the first five convolutional stages (Conv1–Conv5) of VGG16;
extracting features on the feature map with 3 × 3 sliding windows;
predicting the candidate target regions defined by a plurality of anchors from the extracted features;
feeding the extracted features into a bidirectional LSTM layer, which outputs W × 256 results;
inputting the results to a 512-dimensional fully connected layer;
and finally obtaining the recognition output by classification or regression.
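The tensor shapes implied by these steps can be walked through without any deep-learning framework. The value k = 10 anchors per position comes from the original CTPN design and is an assumption here, as the claims do not fix it:

```python
def ctpn_shapes(img_h, img_w, k=10):
    """Shape walkthrough of the CTPN-style pipeline in claim 7.
    VGG16's five conv stages downsample by a factor of 16; each
    feature-map row feeds a bidirectional LSTM (2 x 128 = 256)."""
    H, W, C = img_h // 16, img_w // 16, 512      # conv5 feature map
    return {
        "feature_map": (H, W, C),
        "sliding_window": (3, 3, C),             # 3x3 window per position
        "bilstm_out": (H, W, 256),               # bidirectional LSTM output
        "fc_out": (H, W, 512),                   # 512-d fully connected
        "heads": {
            "vertical_coords": 2 * k,            # box height + y-center
            "scores": 2 * k,                     # text / non-text
            "side_refinement": k,                # horizontal offset
        },
    }
```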
8. The OCR-based multi-bill automatic identification method according to claim 7, characterized in that the output comprises the height and y-axis center coordinate of each candidate box, the category information of the k anchors, and the horizontal offset of the candidate box; the category information indicates whether the region contains a character.
9. The OCR-based multi-bill automatic identification method according to claim 8, characterized in that the image preprocessing module (200) further performs the following steps:
obtaining the secondary image through size normalization and alignment;
setting a global threshold T for the secondary image;
dividing the image data into two parts by T: the group of pixels greater than T and the group of pixels less than T;
and setting the pixel values of the group greater than T to white and the pixel values of the group less than T to black.
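The global-threshold binarization described above is direct to sketch. The mean-based choice of T is an illustrative fallback, since the claim does not say how T is selected (Otsu's method would be the usual refinement):

```python
def binarize(img, T=128):
    """Global thresholding per claim 9: pixels greater than T
    become white (255), the rest become black (0)."""
    return [[255 if p > T else 0 for p in row] for row in img]

def mean_threshold(img):
    """A simple way to pick T when none is given: the mean gray
    level of the image (an assumption, not the patent's method)."""
    pixels = [p for row in img for p in row]
    return sum(pixels) // len(pixels)
```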
10. An OCR-based multi-bill automatic identification system, characterized in that it comprises an image acquisition module (100), an image preprocessing module (200), a denoising module (300) and a bill identification module (400);
the image acquisition module (100) is used for acquiring a bill image to be identified;
the image preprocessing module (200) is used for processing the acquired image to obtain a secondary image;
the denoising module (300) is used for denoising the secondary image to obtain a standard image;
the bill identification module (400) is used for detecting and identifying the standard image to generate an identification result.
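The four-module data flow of claim 10 can be sketched as a simple staged pipeline. Each stage is a callable placeholder, since the claim fixes only the interfaces (bill image → secondary image → standard image → recognition result), not their implementations:

```python
class BillOCRPipeline:
    """Sketch of the claim-10 system: four modules chained in order
    (acquisition, preprocessing, denoising, recognition)."""
    def __init__(self, acquire, preprocess, denoise, recognize):
        self.stages = [acquire, preprocess, denoise, recognize]

    def run(self, source):
        data = source
        for stage in self.stages:   # each module consumes the last output
            data = stage(data)
        return data
```

A concrete system would plug in the preprocessing, denoising, and CTPN-based recognition steps described in claims 2–9 as the four stage callables.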
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911192294.1A CN111008635A (en) | 2019-11-28 | 2019-11-28 | OCR-based multi-bill automatic identification method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111008635A true CN111008635A (en) | 2020-04-14 |
Family
ID=70112157
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911192294.1A Pending CN111008635A (en) | 2019-11-28 | 2019-11-28 | OCR-based multi-bill automatic identification method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111008635A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108717543A (en) * | 2018-05-14 | 2018-10-30 | 北京市商汤科技开发有限公司 | A kind of invoice recognition methods and device, computer storage media |
CN110363199A (en) * | 2019-07-16 | 2019-10-22 | 济南浪潮高新科技投资发展有限公司 | Certificate image text recognition method and system based on deep learning |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111931664A (en) * | 2020-08-12 | 2020-11-13 | 腾讯科技(深圳)有限公司 | Mixed note image processing method and device, computer equipment and storage medium |
CN111931664B (en) * | 2020-08-12 | 2024-01-12 | 腾讯科技(深圳)有限公司 | Mixed-pasting bill image processing method and device, computer equipment and storage medium |
CN112883954A (en) * | 2021-02-22 | 2021-06-01 | 的卢技术有限公司 | OCR bill recognition method, device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110766014B (en) | Bill information positioning method, system and computer readable storage medium | |
CN103065134B (en) | A kind of fingerprint identification device and method with information | |
US8611662B2 (en) | Text detection using multi-layer connected components with histograms | |
CN110781885A (en) | Text detection method, device, medium and electronic equipment based on image processing | |
CN103034848B (en) | A kind of recognition methods of form types | |
US9679354B2 (en) | Duplicate check image resolution | |
US11657644B2 (en) | Automatic ruler detection | |
CN111259891B (en) | Method, device, equipment and medium for identifying identity card in natural scene | |
CN111626249B (en) | Method and device for identifying geometric figure in topic image and computer storage medium | |
US9396389B2 (en) | Techniques for detecting user-entered check marks | |
CN114549993A (en) | Method, system and device for scoring line segment image in experiment and readable storage medium | |
CN111008635A (en) | OCR-based multi-bill automatic identification method and system | |
CN111199240A (en) | Training method of bank card identification model, and bank card identification method and device | |
JP2017521011A (en) | Symbol optical detection method | |
CN110222660B (en) | Signature authentication method and system based on dynamic and static feature fusion | |
US9378428B2 (en) | Incomplete patterns | |
US20230048495A1 (en) | Method and platform of generating document, electronic device and storage medium | |
CN114663899A (en) | Financial bill processing method, device, equipment and medium | |
CN113780116A (en) | Invoice classification method and device, computer equipment and storage medium | |
CN112861861A (en) | Method and device for identifying nixie tube text and electronic equipment | |
Bhatt et al. | Text Extraction & Recognition from Visiting Cards | |
CN111612045A (en) | Universal method for acquiring target detection data set | |
CN115471846B (en) | Image correction method and device, electronic equipment and readable storage medium | |
WO2024001051A1 (en) | Spatial omics single cell data acquisition method and apparatus, and electronic device | |
CN113537026B (en) | Method, device, equipment and medium for detecting graphic elements in building plan |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | ||
Address after: 11th Floor, Building A1, Huizhi Science and Technology Park, No. 8 Hengtai Road, Nanjing Economic and Technological Development Zone, Jiangsu Province, 211000
Applicant after: DILU TECHNOLOGY Co.,Ltd.
Address before: Building C4, No.55 Liyuan South Road, Moling Street, Nanjing, Jiangsu Province
Applicant before: DILU TECHNOLOGY Co.,Ltd.