CN108268641A - Invoice information recognition methods and invoice information identification device, equipment and storage medium - Google Patents



Publication number
CN108268641A
Authority
CN
China
Prior art keywords
invoice
image
default
information
identification information
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810051333.5A
Other languages
Chinese (zh)
Other versions
CN108268641B (en)
Inventor
裴海鹏
范立波
张瑜
葛召
康嘉鑫
李蓓
杨笑宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ele Cloud Information Technology Co ltd
Original Assignee
Ele Cloud Information Technology Co ltd
Application filed by Ele Cloud Information Technology Co ltd
Priority to CN201810051333.5A
Publication of CN108268641A
Application granted
Publication of CN108268641B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words

Abstract

The present invention provides an invoice information recognition method, comprising: receiving an invoice image to be recognized; judging whether the invoice image contains a preset identification code; if so, obtaining the invoice identification information contained in the preset identification code; if not, obtaining the invoice identification information contained in a preset region of the invoice image; and querying an invoice database according to the invoice identification information to obtain the full invoice information corresponding to the invoice image. Correspondingly, the invention also provides an invoice information identification device, a computer device and a computer-readable storage medium. The technical solution makes invoice information recognition more targeted, which improves recognition efficiency and the user experience.

Description

Invoice information recognition methods and invoice information identification device, equipment and storage medium
Technical field
The present invention relates to the field of information recognition technology, and in particular to an invoice information recognition method, an invoice information identification device, a computer device and a computer-readable storage medium.
Background
At present, existing invoice recognition methods are generally the template matching method and the template feature matching method. Template matching is one of the classical classification techniques: a standard template is defined for each class of image to be recognized, the image to be recognized is compared with the standard templates one by one, and the class of the image is decided according to the degree of similarity. Template matching has a high discrimination rate: a match is declared when the digits in the standard template agree with the digits in the image to be recognized. However, its results degrade when noise is present, and also when the font or typeface of the digits differs or changes. Moreover, during matching the method compares every pixel value of the image to be recognized with the standard template, so it is computationally intensive, slow and inefficient. Choosing a suitable measure for comparing the similarity between two images is therefore critical.
The template feature matching method, by contrast, does not directly compare each pixel value of the image to be recognized with the standard template. Instead, it extracts corresponding features from both and discriminates by comparing the similarity between the extracted features. Template feature matching is a common and widely used classification technique; compared with ordinary template matching, it recognizes faster and handles noise better. Specifically, the method first extracts features from the original image and does not compare pixel by pixel, so the amount of computation is smaller and the computation time is markedly shorter. Because image features are what is compared, adaptability to deformation is also improved. Selecting and extracting suitable features that reflect the structural characteristics of characters well is critical in template feature matching.
In addition, existing invoice recognition schemes only recognize the invoice face from the picture using one of these methods and do not make use of the structured data of the back-end platform, so their recognition efficiency is relatively low.
Summary of the invention
The purpose of the present invention is to provide an invoice information recognition scheme that overcomes, at least to some extent, one or more problems caused by the limitations and defects of the related art.
Other features and advantages of the present invention will become apparent from the following detailed description, or will be learned in part through practice of the invention.
According to a first aspect of the invention, an invoice information recognition method is provided, comprising the following steps:
Receiving an invoice image to be recognized;
Judging whether the invoice image contains a preset identification code;
If so, obtaining the invoice identification information contained in the preset identification code;
If not, obtaining the invoice identification information contained in a preset region of the invoice image;
Querying an invoice database according to the invoice identification information to obtain the full invoice information corresponding to the invoice image.
In some embodiments of the invention, based on the foregoing scheme, the step of judging whether the invoice image contains the preset identification code specifically includes:
Performing image preprocessing on the invoice image to obtain a target color image;
Converting the target color image into a corresponding first grayscale image;
Extracting all rectangular regions contained in the first grayscale image;
Calculating the edge strength value of each of the rectangular regions;
When the edge strength value is greater than a preset threshold, determining that the invoice image contains the preset identification code; otherwise, determining that the invoice image does not contain the preset identification code.
In some embodiments of the invention, based on the foregoing scheme, the step of obtaining the invoice identification information contained in the preset region of the invoice image specifically includes:
Cropping a first target rectangular region from the preset region;
Taking a preset corner vertex of the first target rectangular region as a reference point, searching for the black-pixel region contained in the first target rectangular region;
Marking and generating a second target rectangular region according to the start point and end point of the black-pixel region;
Extracting all black character regions contained in the second target rectangular region;
Performing character recognition on each black character region with a trained preset hybrid deep learning network to obtain the invoice identification information.
In some embodiments of the invention, based on the foregoing scheme, the step of performing character recognition on each black character region with the trained preset hybrid deep learning network to obtain the invoice identification information specifically includes:
Converting each black character region into a corresponding second grayscale image;
Taking the gray values of the second grayscale image as a target feature vector;
Inputting the target feature vector into the preset hybrid deep learning network and performing character recognition according to the correspondence between gray values and character classes, to obtain the invoice identification information.
In some embodiments of the invention, based on the foregoing scheme, before the step of judging whether the invoice image contains the preset identification code, the method further includes: detecting whether the image pixels of the received invoice image are within a preset pixel range.
According to a second aspect of the invention, an invoice information identification device is provided, including:
A receiving module, for receiving an invoice image to be recognized;
A judgment module, for judging whether the invoice image contains a preset identification code;
A first processing module, for obtaining the invoice identification information contained in the preset identification code when the judgment module judges that the invoice image contains the preset identification code;
A second processing module, for obtaining the invoice identification information contained in a preset region of the invoice image when the judgment module judges that the invoice image does not contain the preset identification code;
An acquisition module, for querying an invoice database according to the invoice identification information to obtain the full invoice information corresponding to the invoice image.
In some embodiments of the invention, based on the foregoing scheme, the judgment module specifically includes:
A processing submodule, for performing image preprocessing on the invoice image to obtain a target color image;
A conversion submodule, for converting the target color image into a corresponding first grayscale image;
A first extraction submodule, for extracting all rectangular regions contained in the first grayscale image;
A calculation submodule, for calculating the edge strength value of each of the rectangular regions;
A first determination submodule, for determining that the invoice image contains the preset identification code when the edge strength value is greater than a preset threshold, and otherwise determining that the invoice image does not contain the preset identification code.
In some embodiments of the invention, based on the foregoing scheme, the second processing module specifically includes:
A cropping submodule, for cropping a first target rectangular region from the preset region;
A second determination submodule, for taking a preset corner vertex of the first target rectangular region as a reference point and searching for the black-pixel region contained in the first target rectangular region;
A generation submodule, for marking and generating a second target rectangular region according to the start point and end point of the black-pixel region;
A second extraction submodule, for extracting all black character regions contained in the second target rectangular region;
A recognition submodule, for performing character recognition on each black character region with a trained preset hybrid deep learning network to obtain the invoice identification information.
In some embodiments of the invention, based on the foregoing scheme, the recognition submodule is specifically used for:
Converting each black character region into a corresponding second grayscale image;
Taking the gray values of the second grayscale image as a target feature vector;
Inputting the target feature vector into the preset hybrid deep learning network and performing character recognition according to the correspondence between gray values and character classes, to obtain the invoice identification information.
In some embodiments of the invention, based on the foregoing scheme, the invoice information identification device further includes: a detection module, for detecting, before the judgment module judges whether the invoice image contains the preset identification code, whether the image pixels of the invoice image received by the receiving module are within a preset pixel range.
According to a third aspect of the invention, a computer device is provided, including:
A processor;
A memory for storing instructions executable by the processor, wherein the processor, when executing the executable instructions stored in the memory, implements the steps of the method of any embodiment of the first aspect above.
According to a fourth aspect of the invention, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the steps of the method of any embodiment of the first aspect above.
In the technical solution provided in some embodiments of the invention, a recognition region of the invoice image is framed, and targeted invoice information recognition is performed on the framed region. The required invoice information can thus be recognized quickly without recognizing the entire face of the invoice image, which effectively improves the efficiency of invoice information recognition and thereby the user experience.
Further, on the basis of recognizing the preset identification code of the invoice, invoice-face recognition is optimized by retrieving the invoice's structured data, which improves the efficiency of invoice information recognition.
Description of the drawings
The accompanying drawings, which are incorporated in and form part of this specification, show embodiments consistent with the invention and, together with the specification, serve to explain its principles. The drawings described below are evidently only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 diagrammatically illustrates the flow chart of invoice information recognition methods according to an embodiment of the invention;
Fig. 2 diagrammatically illustrates a flow chart of a method for judging whether an invoice image contains a preset identification code according to an embodiment of the invention;
Fig. 3 diagrammatically illustrates a flow chart of a method for obtaining the invoice identification information contained in an invoice image according to an embodiment of the invention;
Fig. 4 diagrammatically illustrates the flow chart of invoice information recognition methods according to another embodiment of the present invention;
Fig. 5 diagrammatically illustrates the block diagram of invoice information identification device according to one embodiment of present invention;
Fig. 6 diagrammatically illustrates the block diagram of judgment module shown in fig. 5;
Fig. 7 diagrammatically illustrates the block diagram of Second processing module shown in fig. 5;
Fig. 8 diagrammatically illustrates the block diagram of computer equipment according to an embodiment of the invention.
Detailed description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the invention will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art.
In addition, the described features, structures or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are given to provide a full understanding of the embodiments of the invention. Those skilled in the art will appreciate, however, that the technical solution of the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps and so on. In other cases, well-known methods, devices, implementations or operations are not shown or described in detail, to avoid obscuring aspects of the invention.
The block diagrams shown in the drawings are only functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow charts shown in the drawings are merely illustrative; they need not include all contents and operations/steps, nor must they be performed in the order described. For example, some operations/steps can be decomposed and some can be merged or partly merged, so the actual order of execution may change according to the actual situation.
Fig. 1 diagrammatically illustrates the flow chart of invoice information recognition methods according to an embodiment of the invention.
With reference to Fig. 1, invoice information recognition methods according to an embodiment of the invention includes the following steps:
Step S10, receive an invoice image to be recognized.
It is understood that the invoice image can be obtained by capturing a paper invoice, or a copy of an electronic invoice (such as a PDF electronic invoice), with an image capture device such as a camera or mobile phone.
Step S12, judge whether the invoice image contains a preset identification code; if so, perform step S14, otherwise perform step S16.
It is understood that, after the invoice image is obtained, it is first detected whether the invoice image carries a preset identification code from which the invoice information can be obtained quickly; the preset identification code is preferably a QR code.
An exemplary embodiment according to the present invention, as shown in Fig. 2, step S12 can specifically include:
Step S120, perform image preprocessing on the invoice image to obtain a target color image.
It is understood that the image preprocessing of the invoice image may specifically include normalizing the invoice image, rotating the image so that the invoice is displayed upright, enhancing the colors of the image, and so on.
Step S122, convert the target color image into a corresponding first grayscale image.
Step S124, extract all rectangular regions contained in the first grayscale image.
Step S126, calculate the edge strength value of each of the rectangular regions.
Step S128, when the edge strength value is greater than a preset threshold, determine that the invoice image contains the preset identification code; otherwise determine that it does not.
Step S14, obtain the invoice identification information contained in the preset identification code; that is, when the invoice image contains a preset identification code, the invoice identification information is stored in that code.
For step S14, this exemplary embodiment illustrates one scheme for obtaining the invoice identification information contained in the invoice image: obtaining it by code recognition technology.
Step S16, obtain the invoice identification information contained in the preset region of the invoice image; that is, when the invoice image does not contain a preset identification code, the invoice identification information is located in a preset region of the invoice.
An exemplary embodiment according to the present invention, as shown in figure 3, step S16 can specifically include:
Step S160, crop a first target rectangular region from the preset region.
Step S162, taking a preset corner vertex of the first target rectangular region as a reference point, search for the black-pixel region contained in the first target rectangular region.
Step S164, mark and generate a second target rectangular region according to the start and end points of the black-pixel region.
Step S166, extract all black character regions contained in the second target rectangular region.
Step S168, perform character recognition on each black character region with the trained preset hybrid deep learning network to obtain the invoice identification information.
Further, according to an exemplary embodiment of the invention, step S168 may specifically include: converting each black character region into a corresponding second grayscale image; taking the gray values of the second grayscale image as a target feature vector; and inputting the target feature vector into the preset hybrid deep learning network and performing character recognition according to the correspondence between gray values and character classes, to obtain the invoice identification information.
For step S16, this exemplary embodiment illustrates another scheme for obtaining the invoice identification information contained in the invoice image: obtaining it by image recognition technology, such as pattern recognition.
For steps S14 and S16, the invention thus provides two exemplary schemes for obtaining the invoice identification information contained in the invoice image, wherein the invoice identification information includes one or more of the invoice code, invoice number, date of issue, amount excluding tax, randomly generated confidential information, and check code.
Step S18, query the invoice database according to the invoice identification information to obtain the full invoice information corresponding to the invoice image.
In this embodiment of the invention, a recognition region of the invoice image is framed, and targeted invoice information recognition is performed on the framed region. The required invoice information can thus be recognized quickly without recognizing the entire face of the invoice image, which effectively improves the efficiency of invoice information recognition and thereby the user experience.
The full invoice information is the structured data stored in the invoice database and may specifically include: the invoice type, code, number, date of issue, buyer name, buyer taxpayer identification number, seller name, seller taxpayer identification number, amount, tax amount, and total of price and tax.
Further, in some embodiments of the invention, based on the foregoing scheme, before step S12 is performed the invoice information recognition method further includes: detecting whether the image pixels of the received invoice image are within a preset pixel range.
Further, in some embodiments of the invention, based on the foregoing scheme, once step S18 has been performed and the full invoice information obtained, it can be displayed or printed.
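The control flow of steps S10 to S18 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `recognize_invoice` and the three callables `decode_code`, `ocr_region` and `query_db` are hypothetical stand-ins for the code decoder, the preset-region character recognizer and the invoice database query, none of which the text specifies at the code level.

```python
from typing import Callable, Optional


def recognize_invoice(
    image: object,
    decode_code: Callable[[object], Optional[dict]],
    ocr_region: Callable[[object], Optional[dict]],
    query_db: Callable[[dict], Optional[dict]],
) -> Optional[dict]:
    """Sketch of steps S10-S18; all three callables are hypothetical."""
    # S12/S14: try the preset identification code first.
    # decode_code is assumed to return None when no code is found.
    ident = decode_code(image)
    if ident is None:
        # S16: fall back to character recognition on the preset region.
        ident = ocr_region(image)
    if ident is None:
        # Neither path produced identification information.
        return None
    # S18: look up the full invoice information in the invoice database.
    return query_db(ident)
```

Passing the three stages in as callables keeps the branch between the code path and the preset-region path visible without committing to any particular decoder or recognizer.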
Fig. 4 diagrammatically illustrates the flow chart of invoice information recognition methods according to another embodiment of the present invention.
Step S402, receive an invoice picture stream and preprocess it to obtain a target picture.
The invoice picture stream can be obtained by an image capture device such as a camera, a mobile phone or an image code scanner.
Specifically, a color image I1 of the invoice is obtained from the invoice picture stream and normalized to form a color image I2; image rotation may additionally be applied so that color image I2 is displayed upright, or processing such as cropping may be performed.
Further, in this embodiment, bilinear interpolation can preferably be used to normalize color image I1 to an image of size L × H, where L and H are the width and height of the scaled image in pixels; their values can be set according to the practical application. For example, an original color image of size 4160 × 3120 can be scaled to 4000 × 3000.
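As an illustration of the normalization step, the following sketch resizes a single-channel image with bilinear interpolation in plain Python. It rests on assumptions the text does not state: the image is a 2D list of gray values (a color image would be processed one channel at a time), and source coordinates are mapped so that the four corner pixels are preserved. The function name `resize_bilinear` is hypothetical.

```python
def resize_bilinear(img, new_w, new_h):
    """Resize a 2D list of gray values to new_w x new_h by bilinear interpolation."""
    old_h, old_w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        # Map the output row back to a fractional source row.
        sy = y * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, old_h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            # Map the output column back to a fractional source column.
            sx = x * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, old_w - 1)
            fx = sx - x0
            # Interpolate horizontally on the two neighbouring rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

For the example in the text, the call would be `resize_bilinear(channel, 4000, 3000)` on each channel of the 4160 × 3120 image; a production system would more likely use an image library's resize routine.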
Further, in this embodiment, a guided filtering algorithm can also be used to apply image enhancement preprocessing to color image I2, forming color image I3.
Step S404, judge whether the picture pixels of the target picture meet the condition; if so, perform step S406, otherwise end the flow.
Specifically, for example, a target picture of 1280 × 720 pixels is considered a qualified picture; other pixel thresholds can of course be set according to the actual situation.
Step S406, recognize the upper-left region of the target picture.
Step S408, detect whether the upper-left region of the target picture has a QR code mark; if so, perform step S410, otherwise perform step S412.
Specifically, based on the above scheme, the target picture can first be converted into a grayscale image, and all rectangular regions in the target picture are then detected by polygon fitting. The Canny operator is further used to detect the edges within each rectangular region, and the mean edge magnitude within each rectangular region is calculated and taken as the edge strength value of that region.
Further, if the edge strength value is greater than a preset strength value, the rectangular region is a QR code mark; otherwise it is not. All rectangular regions are traversed in this way, and the preset strength value is set according to the practical application.
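The edge-strength test can be sketched as below. Note the substitution: the text uses the Canny operator, whereas this sketch uses the mean Sobel gradient magnitude as a simpler stand-in that needs no image library. The rectangle format `(x, y, w, h)`, the function names and the threshold value are assumptions for illustration, and the rectangles are taken as already detected.

```python
def edge_strength(gray, rect):
    """Mean Sobel gradient magnitude inside rect = (x, y, w, h) of a 2D gray image."""
    x0, y0, w, h = rect
    total, count = 0.0, 0
    # Skip the outermost image border so every pixel has a full 3x3 neighbourhood.
    for y in range(max(y0, 1), min(y0 + h, len(gray) - 1)):
        for x in range(max(x0, 1), min(x0 + w, len(gray[0]) - 1)):
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            total += (gx * gx + gy * gy) ** 0.5
            count += 1
    return total / count if count else 0.0


def find_code_regions(gray, rects, threshold):
    """Keep the rectangles whose edge strength exceeds the preset strength value."""
    return [r for r in rects if edge_strength(gray, r) > threshold]
```

The idea carried over from the text is that a QR code is densely packed with black-to-white transitions, so its mean edge response is much higher than that of a blank or lightly printed rectangle.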
Step S410, recognize the QR code and scan and decode it to obtain the QR code information (i.e. the invoice identification information) contained in the target picture, such as the invoice type, invoice number, invoice code, date of issue, amount excluding tax, check code and randomly generated confidential information; then perform step S418.
Step S412, recognize the upper-right region of the target picture.
Step S414, correct the upper-right region of the target picture using a pattern recognition algorithm.
Specifically, a rectangular region R1 of size r1 × r2 is cropped from the upper right of the target picture. Taking the upper-right corner vertex of this region as the reference point, the black-pixel region is searched leftwards or downwards, its start and end points are recorded, and a new rectangular region R2 is marked.
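One way to read the search for R2 is as a bounding-box computation over the dark pixels of R1, scanned from the upper-right corner. The sketch below makes that reading concrete under stated assumptions: R1 is a 2D list of gray values, a pixel counts as black when its value is below a hypothetical threshold of 128, and the function name `black_bounding_box` is invented for illustration.

```python
def black_bounding_box(gray, dark_threshold=128):
    """Return (left, top, right, bottom) of all dark pixels, or None if there are none."""
    left = top = right = bottom = None
    for y in range(len(gray)):                      # search downwards
        for x in range(len(gray[0]) - 1, -1, -1):   # search leftwards from the right edge
            if gray[y][x] < dark_threshold:
                left = x if left is None else min(left, x)
                right = x if right is None else max(right, x)
                top = y if top is None else min(top, y)
                bottom = y if bottom is None else max(bottom, y)
    if left is None:
        return None
    return left, top, right, bottom
```

The returned corners play the role of the recorded start and end points; cropping R1 to them yields R2.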
Further, each black character region in rectangular region R2 is segmented by the projection method and normalized; the sizes r1 and r2 can be set according to practical experience.
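The projection method can be sketched as a vertical projection profile: count the dark pixels in each column and cut wherever the count drops to zero. The following is an illustrative version only; the dark threshold of 128 and the function name `split_by_projection` are assumptions, and the subsequent size normalization of each segment is omitted.

```python
def split_by_projection(gray, dark_threshold=128):
    """Return [(start_col, end_col), ...] for runs of columns that contain dark pixels."""
    width = len(gray[0])
    # Vertical projection: number of dark pixels in each column.
    profile = [sum(1 for row in gray if row[x] < dark_threshold) for x in range(width)]
    segments, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                       # a character begins
        elif count == 0 and start is not None:
            segments.append((start, x - 1))  # a blank column ends the character
            start = None
    if start is not None:
        segments.append((start, width - 1))  # character touching the right edge
    return segments
```

Each returned column range corresponds to one candidate character region; touching characters would need a more careful split than this simple zero-gap rule.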
Further, the trained hybrid deep learning network is used to perform character recognition on the black character regions.
The hybrid deep learning network is a deep learning network model that combines a deep belief network (DBN) with a deep Boltzmann machine (DBM). Its construction includes the following: the DBN and DBM are combined to build a 6-layer deep learning network; layer 1 is the input layer, whose input is a vector of fixed dimension (length); layer 6 is the prediction layer (or label layer), which outputs the prediction result using a logistic function; layers 2 and 3 are fully connected as an undirected graph composed of RBMs (restricted Boltzmann machines); layers 4 and 5 are connected as a directed graph composed of RBMs; and the layers are connected by weight vectors.
The training method of the hybrid deep learning network includes the following: build a sample set of n character samples (English letters and digits) of size M × N, and label the class of each character, where n > 10000; the larger n is, the better the network learns. Training uses "unsupervised pre-training + supervised fine-tuning": the deep network is pre-trained layer by layer with the different models to obtain better-performing weights, and the network is then fine-tuned by back-propagation using these weights as initial values.
Further, the constructed hybrid deep learning network is trained using the above sample set and training method.
Specifically, when character recognition is performed on a black character region, the region is converted into a grayscale image, its gray values are input as a feature vector to the trained hybrid deep learning network, and the character is recognized according to the correspondence between gray values and character classes. These steps are repeated for each black character region, and the results are output in the reverse of the order in which they were recognized.
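The per-region recognition loop can be sketched as follows. The trained network itself is not reproduced here: `classify` is a hypothetical callable standing in for it, taking a feature vector and returning one character. Scaling the gray values to the range 0 to 1 is also an assumption; the text only says that the gray values serve as the feature vector.

```python
def to_feature_vector(gray):
    """Flatten a 2D gray image (values 0-255) into a list of floats in [0, 1]."""
    return [pixel / 255.0 for row in gray for pixel in row]


def recognize_regions(regions, classify):
    """Classify each character region, then output the characters in reverse order.

    `classify` is a hypothetical stand-in for the trained hybrid network.
    """
    chars = [classify(to_feature_vector(region)) for region in regions]
    # The text outputs results in the reverse of the recognition order.
    return "".join(reversed(chars))
```

The reversal plausibly reflects a right-to-left search from the upper-right corner, so reversing restores the reading order, although the text does not state the reason.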
Step S416, the upper right side region recognition detected whether in Target Photo go out invoice identification information, such as invoice class Type, invoice codes, invoice number, date of making out an invoice, purchaser's title, purchaser's Taxpayer Identification Number, pin side's title, pin side taxpayer know Alias, the amount of money, the amount of tax to be paid, valency tax such as add up at the contents, if performing step 418, otherwise terminate flow.
Step S418, connect to the invoice database and query the structured data according to the recognized invoice identification information.
The invoice identification information used for the full ticket structured data query includes the invoice number, invoice code, invoicing date, amount excluding tax, check code, and the like.
Step S420, detect whether data was found in the invoice database; if so, perform step S422, otherwise perform step S424.
Step S422, return the obtained full ticket information of the invoice, and display or print it; further, the result may be passed to the enterprise information management system or exported as structured data.
Further, operations such as entering the structured data and storing and managing the invoice image may also be performed.
Step S424, return the recognized invoice identification information.
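Steps S416 to S424 amount to a lookup with a fallback, which can be sketched as below. The dictionary-based database, the key made of invoice code and invoice number, and the field names are hypothetical; the patent only states that the identification information is used to query an invoice database.

```python
def resolve_invoice(identification, invoice_db):
    """Return full ticket data when the database has it, otherwise fall back
    to the recognized identification information."""
    if not identification:
        # S416: nothing was recognized, so the flow ends.
        return None
    key = (identification.get("invoice_code"),
           identification.get("invoice_number"))
    full_info = invoice_db.get(key)          # S418 and S420: query and check
    if full_info is not None:
        return {"source": "database", "data": full_info}        # S422
    return {"source": "recognition", "data": identification}    # S424
```

Tagging the result with its source lets a caller distinguish verified database records from raw recognition output.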
The above embodiment of the present invention integrates QR code recognition, image pattern recognition, a structured data query service and a big data service. By organically combining QR code recognition, pattern recognition and structured data query, the scheme of this embodiment can, when recognizing a general invoice, transmit picture stream data online over the Internet at any time, quickly and accurately recognize the structured data of the full invoice face, and return that structured data, providing the user with convenient and reliable structured invoice data.
Fig. 5 diagrammatically illustrates the block diagram of invoice information identification device according to an embodiment of the invention.
Referring to Fig. 5, the invoice information identification device 50 according to an embodiment of the invention includes: a receiving module 502, a judgment module 504, a first processing module 506, a second processing module 508 and an acquisition module 510.
The receiving module 502 is used to receive the invoice image to be identified; the judgment module 504 is used to judge whether the invoice image contains a default ID; the first processing module 506 is used to obtain the invoice identification information contained in the default ID when the judgment module 504 judges that the invoice image contains the default ID; the second processing module 508 is used to obtain the invoice identification information contained in a preset area of the invoice image when the judgment module 504 judges that the invoice image does not contain the default ID; and the acquisition module 510 is used to query the invoice database according to the invoice identification information to obtain the full ticket information of the invoice corresponding to the invoice image.
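The division of work among modules 504 to 510 can be sketched as a small pipeline with injected callables. The class name `InvoiceRecognizer` and its field names are hypothetical; each callable stands in for one module and its internals are deliberately left open.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class InvoiceRecognizer:
    has_code: Callable[[Any], bool]                    # judgment module 504
    decode_code: Callable[[Any], dict]                 # first processing module 506
    recognize_region: Callable[[Any], dict]            # second processing module 508
    query_full_info: Callable[[dict], Optional[dict]]  # acquisition module 510

    def process(self, image: Any) -> Optional[dict]:
        """Take a received image (module 502) through the two-branch flow."""
        if self.has_code(image):
            info = self.decode_code(image)
        else:
            info = self.recognize_region(image)
        return self.query_full_info(info)
```

Injecting the modules as callables mirrors the remark in the description that module boundaries are not mandatory: functions can be merged or split without changing `process`.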
By frame-selecting the recognition region of the invoice image and performing targeted invoice information recognition on the frame-selected region, the required invoice information can be recognized quickly without recognizing the entire face of the invoice image. This effectively improves the efficiency of invoice information recognition and thereby the user experience.
In some embodiments of the invention, based on the foregoing scheme, the judgment module 504 specifically includes: a processing submodule 5040, a conversion submodule 5042, a first extraction submodule 5044, a calculation submodule 5046 and a first determination submodule 5048, see Fig. 6.
The processing submodule 5040 is used to perform image preprocessing on the invoice image to obtain a target color image; the conversion submodule 5042 is used to convert the target color image into a corresponding first gray level image; the first extraction submodule 5044 is used to extract all rectangular regions contained in the first gray level image; the calculation submodule 5046 is used to calculate the edge intensity value of each rectangular region among all the rectangular regions; and the first determination submodule 5048 is used to determine that the invoice image contains the default ID when the edge intensity value is greater than a preset threshold, and otherwise to determine that the invoice image does not contain the default ID.
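The edge-intensity test performed by submodules 5046 and 5048 can be sketched as follows. The definition of edge intensity used here (mean absolute difference between horizontally and vertically adjacent gray values), the `(x, y, width, height)` rectangle format and the function names are assumptions; this section of the patent does not define how the value is computed.

```python
def edge_strength(gray, rect):
    """Mean absolute gray-value difference between adjacent pixels in rect."""
    x0, y0, w, h = rect
    total, count = 0.0, 0
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            if x + 1 < x0 + w:      # horizontal neighbor inside the rectangle
                total += abs(gray[y][x + 1] - gray[y][x])
                count += 1
            if y + 1 < y0 + h:      # vertical neighbor inside the rectangle
                total += abs(gray[y + 1][x] - gray[y][x])
                count += 1
    return total / count if count else 0.0

def contains_code(gray, rects, threshold):
    """True when any candidate rectangle exceeds the preset threshold."""
    return any(edge_strength(gray, r) > threshold for r in rects)
```

A densely patterned region such as a QR code alternates between dark and light modules and therefore scores high, while blank paper scores close to zero.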
In some embodiments of the invention, based on the foregoing scheme, the second processing module 508 specifically includes: an interception submodule 5100, a second determination submodule 5102, a generation submodule 5104, a second extraction submodule 5106 and a recognition submodule 5108, see Fig. 7.
The interception submodule 5100 is used to intercept a first target rectangular region in the preset area; the second determination submodule 5102 is used to search, with the default corner vertex of the first target rectangular region as the reference point, to determine the black pixel region contained in the first target rectangular region; the generation submodule 5104 is used to generate a second target rectangular region according to the start and end marks of the black pixel region; the second extraction submodule 5106 is used to extract all black font regions contained in the second target rectangular region; and the recognition submodule 5108 is used to perform character recognition on each black font region according to the trained default interacting deep learning network, to obtain the invoice identification information.
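The work of submodules 5102 to 5106 can be sketched as a bounding-box search followed by character segmentation. This is a simplified sketch under assumptions: the whole region is scanned rather than searching outward from a corner vertex (the resulting box is the same), black pixels are defined by a hypothetical gray threshold of 128, and characters are split at columns without black pixels, a technique the patent does not name.

```python
def black_bounding_box(gray, black_threshold=128):
    """Return (x_min, y_min, x_max, y_max) of all black pixels, or None."""
    coords = [(x, y) for y, row in enumerate(gray)
              for x, p in enumerate(row) if p < black_threshold]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))

def split_characters(gray, box, black_threshold=128):
    """Split the box into column ranges (x_start, x_end) separated by
    columns that contain no black pixel."""
    x0, y0, x1, y1 = box
    segments, start = [], None
    for x in range(x0, x1 + 1):
        has_black = any(gray[y][x] < black_threshold for y in range(y0, y1 + 1))
        if has_black and start is None:
            start = x
        elif not has_black and start is not None:
            segments.append((start, x - 1))
            start = None
    if start is not None:
        segments.append((start, x1))
    return segments
```

Each returned column range, combined with the vertical extent of the box, delimits one black font region that can be passed to the character recognizer.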
In some embodiments of the invention, based on the foregoing scheme, the recognition submodule 5108 is specifically used to: convert each black font region into a corresponding second gray level image; determine the gray values of the second gray level image as a target feature vector; and input the target feature vector into the default interacting deep learning network and perform character recognition according to the correspondence between gray values and character classes, to obtain the invoice identification information.
In some embodiments of the invention, based on the foregoing scheme, the invoice information identification device 50 further includes: a detection module 512, see Fig. 5.
The detection module 512 is used to detect, before the judgment module 504 judges whether the invoice image contains the default ID, whether the image pixels of the invoice image received by the receiving module 502 are within a preset pixel range.
Fig. 8 diagrammatically illustrates the block diagram of computer equipment according to an embodiment of the invention.
With reference to Fig. 8, the computer equipment 80 according to an embodiment of the invention includes a processor 802 and a memory 804. The memory 804 stores a computer program that can run on the processor 802, and the memory 804 and the processor 802 may be connected by a bus. When executing the computer program stored in the memory 804, the processor 802 implements the steps of the invoice information recognition method described in the above embodiments.
The steps in the methods of the embodiments of the present invention may be reordered, merged, or deleted according to actual needs.
It should be noted that although several modules or units of the equipment for performing actions are mentioned in the above detailed description, this division is not mandatory. In fact, according to the embodiments of the present invention, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
From the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present invention may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash disk, or a removable hard disk) or on a network, and which includes instructions that cause a computing device (such as a personal computer, a server, a touch terminal, or a network device) to perform the method according to the embodiments of the present invention.
Other embodiments of the present invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present invention that follow its general principles and include common knowledge or conventional techniques in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of the invention are indicated by the following claims.
It should be understood that the invention is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.

Claims (8)

1. An invoice information recognition method, characterized by comprising:
receiving an invoice image to be identified;
judging whether the invoice image contains a default ID;
if so, obtaining the invoice identification information contained in the default ID;
if not, obtaining the invoice identification information contained in a preset area of the invoice image;
querying an invoice database according to the invoice identification information to obtain the full ticket information of the invoice corresponding to the invoice image.
2. The invoice information recognition method according to claim 1, characterized in that the step of judging whether the invoice image contains a default ID specifically comprises:
performing image preprocessing on the invoice image to obtain a target color image;
converting the target color image into a corresponding first gray level image;
extracting all rectangular regions contained in the first gray level image;
calculating the edge intensity value of each rectangular region among all the rectangular regions;
when the edge intensity value is greater than a preset threshold, determining that the invoice image contains the default ID, and otherwise determining that the invoice image does not contain the default ID.
3. The invoice information recognition method according to claim 1, characterized in that the step of, if not, obtaining the invoice identification information contained in the preset area of the invoice image specifically comprises:
intercepting a first target rectangular region in the preset area;
searching, with the default corner vertex of the first target rectangular region as the reference point, to determine the black pixel region contained in the first target rectangular region;
generating a second target rectangular region according to the start and end marks of the black pixel region;
extracting all black font regions contained in the second target rectangular region;
performing character recognition on each black font region according to a trained default interacting deep learning network, to obtain the invoice identification information.
4. The invoice information recognition method according to claim 3, characterized in that the step of performing character recognition on each black font region according to the trained default interacting deep learning network, to obtain the invoice identification information, specifically comprises:
converting each black font region into a corresponding second gray level image;
determining the gray values of the second gray level image as a target feature vector;
inputting the target feature vector into the default interacting deep learning network, and performing character recognition according to the correspondence between gray values and character classes, to obtain the invoice identification information.
5. The invoice information recognition method according to any one of claims 1 to 4, characterized in that, before the step of judging whether the invoice image contains a default ID, the method further comprises: detecting whether the image pixels of the received invoice image are within a preset pixel range.
6. An invoice information identification device, characterized by comprising:
a receiving module, for receiving an invoice image to be identified;
a judgment module, for judging whether the invoice image contains a default ID;
a first processing module, for obtaining the invoice identification information contained in the default ID when the judgment module judges that the invoice image contains the default ID;
a second processing module, for obtaining the invoice identification information contained in the invoice image when the judgment module judges that the invoice image does not contain the default ID;
an acquisition module, for querying an invoice database according to the invoice identification information to obtain the full ticket information of the invoice corresponding to the invoice image.
7. The invoice information identification device according to claim 6, characterized in that the judgment module specifically comprises:
a processing submodule, for performing image preprocessing on the invoice image to obtain a target color image;
a conversion submodule, for converting the target color image into a corresponding first gray level image;
a first extraction submodule, for extracting all rectangular regions contained in the first gray level image;
a calculation submodule, for calculating the edge intensity value of each rectangular region among all the rectangular regions;
a first determination submodule, for determining that the invoice image contains the default ID when the edge intensity value is greater than a preset threshold, and otherwise determining that the invoice image does not contain the default ID.
8. The invoice information identification device according to claim 6, characterized in that the second processing module specifically comprises:
an interception submodule, for intercepting a first target rectangular region in the invoice image;
a second determination submodule, for searching, with the default corner vertex of the first target rectangular region as the reference point, to determine the black pixel region contained in the first target rectangular region;
a generation submodule, for generating a second target rectangular region according to the start and end marks of the black pixel region;
a second extraction submodule, for extracting all black font regions contained in the second target rectangular region;
a recognition submodule, for performing character recognition on each black font region according to a trained default interacting deep learning network, to obtain the invoice identification information.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810051333.5A CN108268641B (en) 2018-01-18 2018-01-18 Invoice information identification method, invoice information identification device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810051333.5A CN108268641B (en) 2018-01-18 2018-01-18 Invoice information identification method, invoice information identification device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN108268641A true CN108268641A (en) 2018-07-10
CN108268641B CN108268641B (en) 2020-11-13

Family

ID=62775914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810051333.5A Active CN108268641B (en) 2018-01-18 2018-01-18 Invoice information identification method, invoice information identification device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN108268641B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008167314A (en) * 2006-12-28 2008-07-17 Ricoh Co Ltd Image processing apparatus
CN105450411A (en) * 2014-08-14 2016-03-30 阿里巴巴集团控股有限公司 Method, device and system for utilizing card characteristics to perform identity verification
CN105678292A (en) * 2015-12-30 2016-06-15 成都数联铭品科技有限公司 Complex optical text sequence identification system based on convolution and recurrent neural network
CN106228675A (en) * 2016-07-22 2016-12-14 金蝶软件(中国)有限公司 The method and apparatus identifying true from false of bills
CN106960306A (en) * 2017-02-23 2017-07-18 杭州仟金顶卓筑信息科技有限公司 Architectural engineering material management system remote acknowledgement signature data entry method
CN107067006A (en) * 2017-04-20 2017-08-18 金电联行(北京)信息技术有限公司 A kind of method for recognizing verification code and system for serving data acquisition
CN107145814A (en) * 2017-04-19 2017-09-08 畅捷通信息技术股份有限公司 invoice input method, invoice input device and terminal
CN107480681A (en) * 2017-08-02 2017-12-15 四川长虹电器股份有限公司 High concurrent bank slip recognition System and method for based on deep learning

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325491A (en) * 2018-08-16 2019-02-12 腾讯科技(深圳)有限公司 Identification code recognition methods, device, computer equipment and storage medium
US11494577B2 (en) 2018-08-16 2022-11-08 Tencent Technology (Shenzhen) Company Limited Method, apparatus, and storage medium for identifying identification code
CN111104844A (en) * 2019-10-12 2020-05-05 中国平安财产保险股份有限公司 Multi-invoice information input method and device, electronic equipment and storage medium
CN111104844B (en) * 2019-10-12 2023-11-14 中国平安财产保险股份有限公司 Multi-invoice information input method and device, electronic equipment and storage medium
CN110751088A (en) * 2019-10-17 2020-02-04 深圳金蝶账无忧网络科技有限公司 Data processing method and related equipment
CN111784423A (en) * 2020-07-31 2020-10-16 广东电网有限责任公司梅州供电局 Invoice matching method and device, electronic equipment and storage medium
CN111784423B (en) * 2020-07-31 2023-08-25 广东电网有限责任公司梅州供电局 Invoice matching method and device, electronic equipment and storage medium
CN113033565A (en) * 2021-03-10 2021-06-25 大象慧云信息技术有限公司 Electronic invoice data processing method and system
CN113033565B (en) * 2021-03-10 2021-11-19 大象慧云信息技术有限公司 Electronic invoice data processing method and system

Also Published As

Publication number Publication date
CN108268641B (en) 2020-11-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant

PE01 Entry into force of the registration of the contract for pledge of patent right
Denomination of invention: Invoice information recognition method, invoice information recognition device, equipment and storage medium
Effective date of registration: 20210709
Granted publication date: 20201113
Pledgee: Haidian Beijing science and technology enterprise financing Company limited by guarantee
Pledgor: ELE-CLOUD INFORMATION TECHNOLOGY Co.,Ltd.
Registration number: Y2021990000594

PC01 Cancellation of the registration of the contract for pledge of patent right
Date of cancellation: 20220913
Granted publication date: 20201113
Pledgee: Haidian Beijing science and technology enterprise financing Company limited by guarantee
Pledgor: ELE-CLOUD INFORMATION TECHNOLOGY CO.,LTD.
Registration number: Y2021990000594

PE01 Entry into force of the registration of the contract for pledge of patent right
Denomination of invention: Invoice information identification method and invoice information identification device, equipment and storage medium
Effective date of registration: 20220913
Granted publication date: 20201113
Pledgee: Haidian Beijing science and technology enterprise financing Company limited by guarantee
Pledgor: ELE-CLOUD INFORMATION TECHNOLOGY CO.,LTD.
Registration number: Y2022990000625

PC01 Cancellation of the registration of the contract for pledge of patent right
Granted publication date: 20201113
Pledgee: Haidian Beijing science and technology enterprise financing Company limited by guarantee
Pledgor: ELE-CLOUD INFORMATION TECHNOLOGY CO.,LTD.
Registration number: Y2022990000625