CN108717545B - Bill identification method and system based on mobile phone photographing - Google Patents

Bill identification method and system based on mobile phone photographing

Info

Publication number
CN108717545B
Authority
CN
China
Prior art keywords
invoice
bill
key information
type
list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810482124.6A
Other languages
Chinese (zh)
Other versions
CN108717545A (en)
Inventor
李小英
王卓静
张帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dajingfang Network Technology Co.,Ltd.
Original Assignee
Beijing Dazhangfang Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dazhangfang Network Technology Co ltd
Priority to CN201810482124.6A
Publication of CN108717545A
Application granted
Publication of CN108717545B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/273 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion removing elements interfering with the pattern to be recognised

Abstract

The invention provides a bill identification method based on mobile phone photographing, which comprises the following steps: S1, after learning various types of bills, the intelligent recognition system in the mobile phone stores the key information of each type of bill and establishes a bill key information database; S2, mixed bills of various types are photographed with a mobile phone to produce electronic images, which are uploaded to the intelligent recognition system to obtain keywords, and the intelligent recognition system automatically recognizes and corrects tilted and rotated pictures; S3, the information scanned from the obtained electronic image is compared with the stored key information or keywords to obtain the bill type; S4, bills that cannot be identified, or that fail the check by the tax bureau, undergo image processing and then secondary identification. The invention requires neither manual entry nor manual sorting of bill types, which greatly improves efficiency and accuracy, saves cost and time, and frees up manpower.

Description

Bill identification method and system based on mobile phone photographing
Technical Field
The invention relates to the technical field of bill identification methods, in particular to a bill identification method and a bill identification system based on mobile phone photographing.
Background
With the implementation of the reform of China's tax system structure, value-added tax has become the most important turnover tax in China, and its scope of collection has expanded from most of the secondary industry, as originally covered, to most of the secondary and tertiary industries.
At present, the collection and administration of value-added tax is becoming stricter and the number of value-added tax invoices has increased greatly, while manual entry is slow, authenticity checking is time-consuming, efficiency is low, and the error rate is high. Other kinds of bills, such as bank receipts, machine-issued invoices, train tickets and quota invoices, have the same problem, because they are traditionally entered by hand. After completing the authentication and deduction of bills, enterprise financial staff must also scan the bills, enter the data and proofread manually. With the traditional manual entry mode, users must invest substantial labor and time, which raises operating costs; entry speed is hard to increase and the error rate is hard to reduce, which hinders improvements in the timeliness of business processing and in enterprise service quality.
However, identifying only a single type of bill does not match practical use, because enterprises usually enter many kinds of bills for payment, such as value-added tax invoices, machine-issued invoices, quota invoices, train tickets and bank bills. It is therefore necessary to use modern information technology to develop a system that identifies mixed scanned bills.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a bill identification method and a bill identification system based on mobile phone photographing, which identify mixed scans of various types of bills with a very high recognition rate, save labor and time costs, and improve efficiency.
Specifically, the invention provides a bill identification method based on mobile phone photographing, which comprises the following steps:
S1, after learning various types of bills, the intelligent recognition system stores the key information of each type of bill, distinguishes the key information that differs between bill types, and defines keywords for bank bills, machine-issued invoices, train tickets and quota invoices; a bill key information database is established through continuous learning and storage during bill scanning, wherein the bill key information database comprises a recognition sequence list, a keyword list, a key information list and a corresponding bill type list, the keyword list, the key information list and the corresponding bill type list are in one-to-one correspondence, and the bill key information database is described in the following table:
Figure BDA0001665789950000021
S2, a clear electronic image is generated by mobile phone photographing and uploaded to the intelligent recognition system; the intelligent recognition system automatically performs edge detection on the uploaded electronic image, removes the parts of the image irrelevant to the bill and retains the bill itself, and automatically recognizes and corrects tilted and rotated images; because phones of different brands produce images of inconsistent sizes, the intelligent recognition system adjusts the electronic image to a set optimal size, and because photographs may be too bright or too dark, it adjusts the electronic image to a set optimal brightness;
S3, the information scanned from the obtained electronic image is compared with the stored key information or keywords to obtain the bill type, the comparison being performed in the order given by the recognition sequence list; if the bill type is a value-added tax invoice, a check is performed: if the check succeeds, the check result is returned to the intelligent identification terminal for display, and if the check fails, the invoice is classified into the check-error class; if the bill is of a type other than a value-added tax invoice, its type is returned directly to the intelligent identification terminal for display; and if the type cannot be identified, the invoice is classified into the unrecognizable class and the identification result is returned;
S4, invoices that cannot be identified or that fail the check undergo image processing and then secondary recognition, wherein the image processing method is chosen according to the specific reason the invoice could not be identified, and specifically comprises locating the position of the key information and, according to pixel coordinates, cutting the image into blocks, eliminating red seals, removing lines, or applying machine learning training for incomplete digits;
S5, after secondary recognition of the invoices in the unrecognizable class or the check-error class, steps S1-S3 are repeated to obtain the final bill type and the key information corresponding to that bill type.
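The two-pass control flow of steps S3 to S5 can be sketched as follows. This is only an illustrative sketch: the function name `recognize_with_retry`, the callables `recognize` and `enhance`, and the two string labels are assumptions introduced here and do not appear in the patent.

```python
from typing import Callable

# Hypothetical labels for the two failure classes named in S3.
UNRECOGNIZABLE = "unrecognizable"
CHECK_ERROR = "check_error"


def recognize_with_retry(
    image: object,
    recognize: Callable[[object], str],
    enhance: Callable[[object, str], object],
) -> str:
    """Run recognition once; on failure, process the image and try once more."""
    result = recognize(image)
    if result not in (UNRECOGNIZABLE, CHECK_ERROR):
        return result  # first pass succeeded, nothing more to do
    # S4: the image processing applied depends on why the first pass failed.
    processed = enhance(image, result)
    # S5: repeat recognition on the processed image and return its outcome.
    return recognize(processed)
```

Passing the failure class to `enhance` reflects the statement that the image processing method is chosen according to the reason for the failure; how that choice is made is left open here.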
Preferably, step S3 specifically includes the following steps:
S31, key information is extracted directly from the obtained electronic image; if it can be extracted directly, the scanned key information is compared with the key information columns for value-added tax general invoices, roll invoices, value-added tax electronic general invoices, motor vehicle sales unified invoices and value-added tax special invoices in the key information list stored in the bill key information database; if the invoice belongs to one of these five types, a check is performed: if the check succeeds, the invoice type and its corresponding key information are returned, and if the check fails, the invoice is classified into the check-error class and the invoice type and corresponding key information are returned; if the invoice does not belong to any of these five types, keywords are extracted, the key information corresponding to the extracted keywords is obtained, and the method proceeds to step S32;
S32, the extracted keywords are compared with the bank bill keyword column in the keyword list stored in the bill key information database; if the invoice is a bank bill, the key information contained in the keywords is identified, and the bill type and corresponding key information are returned; otherwise the method proceeds to step S33;
S33, the extracted keywords are compared with the machine-issued invoice keyword column in the keyword list stored in the bill key information database; if the invoice is a machine-issued invoice, the key information contained in the keywords is identified, and the bill type and corresponding key information are returned; otherwise the method proceeds to step S34;
S34, the extracted keywords are compared with the train ticket keyword column in the keyword list stored in the bill key information database; if the invoice is a train ticket, the key information contained in the keywords is identified, and the bill type and corresponding key information are returned; otherwise the method proceeds to step S35;
S35, the extracted keywords are compared with the quota invoice keyword column in the keyword list stored in the bill key information database; if the invoice is a quota invoice, the key information contained in the keywords is identified, and the bill type and corresponding key information are returned; otherwise the method proceeds to step S36;
S36, if the invoice type cannot be identified, the invoice is classified into the unrecognizable class and the identification result is returned.
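The ordered comparison of steps S31 to S36 can be illustrated with a minimal sketch. All type labels, the `classify` function and the `check` callable are hypothetical names introduced here; the "bank" keyword is a placeholder, since the actual bank bill keywords are only given in the table image, while the other keywords echo the specific examples of the description.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical labels for the five value-added tax invoice types listed in S31.
VAT_TYPES = {
    "vat_general", "roll", "vat_electronic_general",
    "motor_vehicle_sales", "vat_special",
}

# Keyword columns consulted in the order S32 -> S35.
KEYWORD_ORDER: List[Tuple[str, List[str]]] = [
    ("bank_bill", ["bank"]),  # placeholder keyword, an assumption
    ("machine_invoice", ["machine invoice"]),
    ("train_ticket", ["railway", "12306", "hard seat", "soft seat"]),
    ("quota_invoice", ["quota invoice"]),
]


def classify(
    vat_type: Optional[str],
    text: str,
    check: Callable[[str], bool],
) -> Dict[str, str]:
    """Classify one bill following the S31-S36 order.

    vat_type: type matched from directly extracted key information, or None.
    text: recognized text used for keyword matching.
    check: stand-in for the tax-authority check; returns True on success.
    """
    if vat_type in VAT_TYPES:  # S31: VAT invoices are checked
        status = "ok" if check(vat_type) else "check_error"
        return {"type": vat_type, "status": status}
    lowered = text.lower()
    for bill_type, keywords in KEYWORD_ORDER:  # S32-S35: first match wins
        if any(k in lowered for k in keywords):
            return {"type": bill_type, "status": "ok"}
    return {"type": "unrecognizable", "status": "unrecognizable"}  # S36
```

Keeping the keyword columns in one ordered list means a new bill type can be supported by appending an entry rather than adding another branch.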
Preferably, the machine learning training for incomplete digits specifically learns digits that are easily misrecognized, including the pairs 6 and 8, 1 and 0, 5 and 9, and 2 and 0.
Preferably, learning the digits that are easily misrecognized specifically comprises the following steps:
preprocessing: locating the region-of-interest (ROI) sub-image of the image and normalizing its size;
feature extraction: converting the image into a feature vector;
classification and recognition: classifying with the k-nearest neighbor method and completing recognition according to the classification result, so that the easily misrecognized digits are identified accurately.
Preferably, the specific steps of feature extraction are as follows: the picture is opened, denoised and converted to grayscale; a threshold is then set and the binarized values are stored in a 32 x 32 array in which each element is one pixel value; finally, the 1024 (32 x 32) values are converted into a (1, 1024) vector.
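The binarize-and-flatten step can be sketched in plain Python as below. The function name `to_feature_vector`, the default threshold of 128, the nearest-neighbour sampling used to reach 32 x 32, and the convention that dark pixels map to 1 are all assumptions made for illustration; noise reduction and graying are assumed to have been done beforehand.

```python
from typing import List, Sequence


def to_feature_vector(
    gray: Sequence[Sequence[int]],
    threshold: int = 128,
    size: int = 32,
) -> List[List[int]]:
    """Binarize a grayscale digit image and flatten it to a 1 x (size*size) vector.

    gray: rows of 0-255 intensities, already denoised and cropped to the digit.
    Pixels darker than the threshold are treated as ink and map to 1.
    """
    height, width = len(gray), len(gray[0])
    vector: List[int] = []
    for r in range(size):
        src_r = r * height // size  # nearest-neighbour row sampling
        for c in range(size):
            src_c = c * width // size  # nearest-neighbour column sampling
            vector.append(1 if gray[src_r][src_c] < threshold else 0)
    return [vector]  # shape (1, 1024) when size == 32
```

For a 64 x 64 input whose left half is black and right half is white, the resulting vector has 1024 entries, of which 512 are 1.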
Preferably, the value-added tax invoice in S3 is checked by sending the key information to the national value-added tax invoice checking platform of the State Taxation Administration to verify authenticity.
Preferably, the bill identification system based on mobile phone mixed scanning comprises a scanning device, an identification terminal and an intelligent identification system, wherein the scanning device and the identification terminal are each in communication connection with the intelligent identification system, and the intelligent identification system comprises:
a picture processing unit for processing pictures;
a key information extraction unit for extracting the key information of a picture according to a related algorithm;
an identification unit for identifying the bill according to the key information to obtain the bill type;
a checking unit for checking value-added tax invoices;
and a communication unit for communicating with the intelligent terminal.
Preferably, the system further comprises a machine learning unit for machine learning training on incomplete digits, specifically learning digits that are easily misrecognized, including the pairs 6 and 8, 1 and 0, 5 and 9, and 2 and 0.
Preferably, the scanning device is a mobile phone, and the intelligent identification system is a mobile phone APP.
Compared with the prior art, the invention has the following beneficial effects:
The intelligent identification system adopted by the invention identifies bills from mobile phone photographs without manual entry or manual sorting of bill types, and enterprise financial staff no longer need to scan bills, enter data or proofread manually after completing the authentication and deduction of bills, which greatly improves efficiency and accuracy, saves cost and time, and frees up manpower.
First, the greatest advance over the prior art is that the mobile phone can photograph and identify many kinds of bills rather than only a single kind, so the range of identified types is richer and more intelligent, time cost is saved and efficiency is improved.
Second, recognition accuracy is greatly improved: the first recognition pass recognizes the whole bill, and the intelligent recognition system automatically recognizes and corrects tilted and rotated pictures; for bills that are recognized incorrectly, the intelligent recognition system processes the image, locates the key information, cuts the image into blocks according to pixel coordinates, eliminates red seals, removes lines, applies machine learning training for incomplete digits, and performs secondary recognition, thereby improving recognition accuracy.
The intelligent identification system adopted by the invention enables bills to be photographed and identified by mobile phone without returning to the company for reimbursement, which is convenient for staff on business trips and for daily office purchasing; it provides reliable data for financial reimbursement, allows invoice compliance and authenticity to be checked anytime and anywhere, saves cost and time, improves efficiency and frees up manpower.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
Exemplary embodiments, features and aspects of the present invention will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The invention relates to a bill identification method based on mobile phone photographing, which comprises the following steps:
S1, after learning various types of bills, the intelligent recognition system stores the key information of each type of bill, distinguishes the key information that differs between bill types, and defines keywords for bank bills, machine-issued invoices, train tickets and quota invoices; a bill key information database is established through continuous learning and storage during bill scanning, wherein the bill key information database comprises a recognition sequence list, a keyword list, a key information list and a corresponding bill type list, and the keyword list, the key information list and the corresponding bill type list are in one-to-one correspondence.
Specifically, the bill key information database is described in the following table:
Figure BDA0001665789950000051
Figure BDA0001665789950000061
The specific learning process is to scan a large number of bills, distinguish their key information, and associate that key information with the actual bill types. Keywords are defined for certain specific bills, such as bank bills, machine-issued invoices, train tickets and quota invoices; the keywords are defined during the learning process and are mapped to the corresponding key information. In other words, the keywords defined for these bills carry the required key information, so once a keyword is scanned, the key information it carries can be obtained. The database is learned from a large number of scans; in practical application, the lists can also be defined directly and embedded in the database, and further bill types can be added to the database.
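One possible in-memory layout for such a database is sketched below. The class and field names are hypothetical, the recognition order values are illustrative only (the actual table is available only as an image), and the three sample entries are taken from the specific examples of the description.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BillTypeEntry:
    order: int             # position in the recognition sequence list
    bill_type: str         # entry in the bill type list
    keywords: List[str]    # entry in the keyword list (empty if none defined)
    key_fields: List[str]  # entry in the key information list


class BillKeyInfoDatabase:
    """One-to-one mapping between keywords, key information and bill type."""

    def __init__(self) -> None:
        self._entries: Dict[str, BillTypeEntry] = {}

    def add(self, entry: BillTypeEntry) -> None:
        # New bill types can be added directly, as the description allows.
        self._entries[entry.bill_type] = entry

    def in_recognition_order(self) -> List[BillTypeEntry]:
        return sorted(self._entries.values(), key=lambda e: e.order)

    def find_by_keyword(self, text: str) -> Optional[BillTypeEntry]:
        lowered = text.lower()
        for entry in self.in_recognition_order():
            if any(k.lower() in lowered for k in entry.keywords):
                return entry
        return None


def build_example_database() -> BillKeyInfoDatabase:
    """Populate the database with entries based on the specific examples."""
    db = BillKeyInfoDatabase()
    db.add(BillTypeEntry(1, "vat_special_invoice", [],
                         ["invoice code", "invoice number", "date", "amount"]))
    db.add(BillTypeEntry(2, "train_ticket", ["railway", "12306"],
                         ["origin", "destination", "date", "amount"]))
    db.add(BillTypeEntry(3, "quota_invoice", ["quota invoice"], ["amount"]))
    return db
```

Looking up a scanned keyword returns the whole entry, so the key information fields to extract and the bill type are obtained together, which mirrors the one-to-one correspondence between the lists.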
S2, mixed bills of various types are photographed with a mobile phone to produce electronic images, which are uploaded to the intelligent recognition system to obtain keywords, and the intelligent recognition system automatically recognizes and corrects tilted and rotated images. The electronic image may be a black-and-white image or a color image.
S3, the information scanned from the obtained electronic image is compared with the stored key information or keywords to obtain the bill type, the comparison being performed in the order given by the recognition sequence list; if the bill type is the first or second invoice type in the recognition sequence list (both of which are value-added tax invoices and are referred to below simply as value-added tax invoices), a check is performed: if the check succeeds, the check result is returned to the intelligent identification terminal for display, and if the check fails, the invoice is classified into the check-error class; if the bill is of a type other than a value-added tax invoice, its type is returned directly to the intelligent identification terminal for display; and if the type cannot be identified, the invoice is classified into the unrecognizable class and the identification result is returned.
The obtained electronic image is scanned to obtain information, namely the previously defined keywords or key information. The two-dimensional code of the scanned invoice is located and its stored content is parsed to obtain the information encoded in it; the information is then compared in the corresponding order to determine the invoice type.
Preferably, step S3 specifically includes the following steps:
S31, key information is extracted directly from the obtained electronic image; if it can be extracted directly, the scanned key information is compared with the key information columns for value-added tax general invoices, roll invoices, value-added tax electronic general invoices, motor vehicle sales unified invoices and value-added tax special invoices in the key information list stored in the bill key information database; if the invoice belongs to one of these five types, a check is performed: if the check succeeds, the invoice type and its corresponding key information are returned, and if the check fails, the invoice is classified into the check-error class and the invoice type and corresponding key information are returned; if the invoice does not belong to any of these five types, keywords are extracted, the key information corresponding to the extracted keywords is obtained, and the method proceeds to step S32;
S32, the extracted keywords are compared with the bank bill keyword column in the keyword list stored in the bill key information database; if the invoice is a bank bill, the key information contained in the keywords is identified, and the bill type and corresponding key information are returned; otherwise the method proceeds to step S33;
S33, the extracted keywords are compared with the machine-issued invoice keyword column in the keyword list stored in the bill key information database; if the invoice is a machine-issued invoice, the key information contained in the keywords is identified, and the bill type and corresponding key information are returned; otherwise the method proceeds to step S34;
S34, the extracted keywords are compared with the train ticket keyword column in the keyword list stored in the bill key information database; if the invoice is a train ticket, the key information contained in the keywords is identified, and the bill type and corresponding key information are returned; otherwise the method proceeds to step S35;
S35, the extracted keywords are compared with the quota invoice keyword column in the keyword list stored in the bill key information database; if the invoice is a quota invoice, the key information contained in the keywords is identified, and the bill type and corresponding key information are returned; otherwise the method proceeds to step S36;
S36, if the invoice type cannot be identified, the invoice is classified into the unrecognizable class and the identification result is returned.
S4, invoices that cannot be identified, or that fail the check by the tax administration, undergo image processing and then secondary identification, wherein the image processing method is chosen according to the specific reason the invoice could not be identified, and specifically comprises locating the position of the key information and, according to pixel coordinates, cutting the image into blocks, eliminating red seals, removing lines, or applying machine learning training for incomplete digits.
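The patent does not say how red seals are eliminated. One simple approach, shown here purely as an assumed illustration, is to replace pixels whose red channel clearly dominates green and blue with white, so that dark printed text underneath a seal is left for recognition. The function name `remove_red_seal` and the `margin` value are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def remove_red_seal(
    pixels: Sequence[Sequence[Pixel]],
    margin: int = 60,
) -> List[List[Pixel]]:
    """Replace strongly red pixels with white, leaving other pixels untouched.

    pixels: rows of (r, g, b) tuples with 0-255 channels.
    margin: how far red must exceed both green and blue (an assumed value).
    """
    white: Pixel = (255, 255, 255)
    return [
        [
            white if (r - g > margin and r - b > margin) else (r, g, b)
            for (r, g, b) in row
        ]
        for row in pixels
    ]
```

Near-black text pixels have roughly equal channels and therefore never satisfy the dominance test, which is why they survive the filter.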
Preferably, the machine learning training for incomplete digits specifically learns digits that are easily misrecognized, including the pairs 6 and 8, 1 and 0, 5 and 9, and 2 and 0.
Preferably, learning the digits that are easily misrecognized specifically comprises the following steps:
preprocessing: locating the region-of-interest (ROI) sub-image of the image and normalizing its size;
feature extraction: converting the image into a feature vector;
classification and recognition: classifying with the k-nearest neighbor method and completing recognition according to the classification result, so that the easily misrecognized digits are identified accurately.
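The classification step can be illustrated with a small k-nearest neighbor sketch over binary feature vectors. The function name `knn_classify`, the default k of 3 and the use of squared Euclidean distance are assumptions; the patent names the method but gives no parameters.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def knn_classify(
    sample: Sequence[int],
    training: Sequence[Tuple[Sequence[int], str]],
    k: int = 3,
) -> str:
    """Return the majority label among the k training vectors nearest to sample."""
    # Squared Euclidean distance; for 0/1 vectors this equals the Hamming distance.
    scored: List[Tuple[int, str]] = sorted(
        (sum((a - b) ** 2 for a, b in zip(sample, vec)), label)
        for vec, label in training
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

In use, `training` would hold labelled vectors for the confusable digits (for example many samples of 6 and of 8), and `sample` would be the vector of the digit cut from the invoice.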
Preferably, the specific steps of feature extraction are as follows: the picture is opened, denoised and converted to grayscale; a threshold is then set and the binarized values are stored in a 32 x 32 array in which each element is one pixel value; finally, the 1024 (32 x 32) values are converted into a (1, 1024) vector.
Preferably, the value-added tax invoice in S3 is checked by sending the key information to the national value-added tax invoice checking platform of the State Taxation Administration to verify authenticity.
Preferably, the bill identification system based on mobile phone mixed scanning comprises a scanning device, an identification terminal and an intelligent identification system, wherein the scanning device and the identification terminal are each in communication connection with the intelligent identification system, and the intelligent identification system comprises:
a picture processing unit for processing pictures;
a key information extraction unit for extracting the key information of a picture according to a related algorithm;
an identification unit for identifying the bill according to the key information to obtain the bill type;
a checking unit for checking value-added tax invoices;
and a communication unit for communicating with the intelligent terminal.
Preferably, the system further comprises a machine learning unit for machine learning training on incomplete digits, specifically learning digits that are easily misrecognized, including the pairs 6 and 8, 1 and 0, 5 and 9, and 2 and 0.
Preferably, the scanning device is a mobile phone. A clear color electronic image is generated by mobile phone photographing and uploaded to the intelligent recognition system, which automatically performs edge detection on the uploaded image, removes the parts irrelevant to the bill and retains the bill itself, and automatically recognizes and corrects tilted and rotated pictures. Because phones of different brands and models produce images of inconsistent sizes, the intelligent recognition system adjusts the image to the optimal size; because photographs may be too bright or too dark, it adjusts the brightness to the optimal value. Key information is then extracted through intelligent recognition processing, the check is performed, and the check result is returned to the mobile phone for display. When recognition fails because of picture problems, the intelligent recognition system processes the image: it locates the key information, cuts the image into blocks according to pixel coordinates, eliminates red seals, removes lines, applies machine learning training for incomplete digits, and performs secondary recognition.
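The size and brightness adjustments described above can be sketched on a grayscale image held as nested lists. The function names, the nearest-neighbour resampling, the linear gain, and the target mean of 128 are all assumptions for illustration; the patent only states that a set optimal size and brightness are used. Edge detection and tilt correction are not covered by this sketch.

```python
from typing import List, Sequence


def resize_nearest(
    gray: Sequence[Sequence[int]], out_h: int, out_w: int
) -> List[List[int]]:
    """Resize a grayscale image to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(gray), len(gray[0])
    return [
        [gray[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def normalize_brightness(
    gray: Sequence[Sequence[int]], target_mean: float = 128.0
) -> List[List[int]]:
    """Scale intensities so the mean brightness moves to target_mean."""
    count = len(gray) * len(gray[0])
    mean = sum(sum(row) for row in gray) / count
    if mean == 0:
        return [list(row) for row in gray]  # all black: nothing to scale
    gain = target_mean / mean
    # Clamp to 255 so bright pixels do not overflow after scaling.
    return [[min(255, round(p * gain)) for p in row] for row in gray]
```

Applying both functions to every upload gives the later recognition steps images of one size and comparable brightness regardless of the phone that took them.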
Specific example 1
Taking a value-added tax special invoice as an example, the key information of the value-added tax special invoice obtained by scanning is as follows: invoice code: 5XXX1XX1XX, invoice number: XXXX5XX4, date: 20171027, amount: 88288.29.
Specific example 2
Taking a value-added tax general invoice as an example, the key information of the general invoice obtained by scanning is as follows: invoice code: 5XXX17XXX0, invoice number: 0XXX4XX8, date: 20171017, check code: 551000.
Specific example 3
Taking a value-added tax electronic general invoice as an example, the key information of the general invoice obtained by scanning is as follows: invoice code: 01 XXXXXXX 0111, invoice number: 17XXXX54, date: 20171017, check code: 3XXXX 7.
Specific example 4
Taking a bank bill as an example, the key information of the bank bill obtained by scanning is as follows: bank name: Agricultural Bank of China, document name: enterprise online banking fee, payee: Chongqing XXXX, Inc., payer: Sichuan XXXXXXX, Ltd., date: 20180206, amount: 10.00, remarks: enterprise online banking transaction fee.
Specific example 5
Taking a passenger car machine-issued invoice as an example, the keyword of the machine-issued invoice is: machine invoice; the key information is: amount: 195.00.
Specific example 6
Taking a train ticket as an example, the keywords of the train ticket are: railway, 12306, hard seat, soft seat, business seat, first class seat, second class seat, soft sleeper, hard sleeper; the key information is: origin: Beijing, destination: Zhengzhou, date: 20170818, amount: 93.00.
Specific example 7
Taking a quota invoice as an example, the keyword of the quota invoice is: quota invoice; the key information is: amount: 100.00.
compared with the prior art, the invention has the following beneficial effects:
the intelligent identification system adopted by the invention can realize the shooting identification of the bill by the mobile phone without manual input or bill type arrangement, and enterprise financial staff do not need to perform works such as bill scanning, data entry, manual proofreading and the like after finishing the authentication deduction of the bill, thereby greatly improving the efficiency and accuracy, saving the cost and time and liberating the manpower.
First, the greatest advance of the invention over the prior art is that a mobile phone can photograph and identify many kinds of bills rather than only a single kind, so the range of recognizable types is richer, recognition is more intelligent, time is saved and efficiency is improved.
Secondly, the recognition accuracy is greatly improved. The first recognition pass recognizes the whole bill, and the intelligent recognition system automatically detects and corrects inclined and rotated pictures. For bills that the first pass gets wrong, the intelligent recognition system processes the image: it locks the positions of the key information, cuts the bill into blocks according to pixel coordinates, eliminates the red seal, removes lines, and performs machine learning training on incomplete digits, and then performs a secondary recognition, thereby improving the recognition accuracy.
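As an illustration of the red seal elimination step mentioned above, the following is a minimal sketch in pure Python. The patent does not specify how the seal is removed; the rule used here (replace strongly red pixels with white), the function name `remove_red_seal`, and the thresholds `red_min` and `margin` are all assumptions made for the example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0-255


def remove_red_seal(image: List[List[Pixel]],
                    red_min: int = 150,
                    margin: int = 60) -> List[List[Pixel]]:
    """Return a copy of `image` with strongly red pixels replaced by white.

    A pixel counts as "seal red" when its red channel is at least `red_min`
    and exceeds both green and blue by at least `margin`. Both thresholds
    are illustrative assumptions, not values taken from the patent.
    """
    white: Pixel = (255, 255, 255)
    cleaned = []
    for row in image:
        new_row = []
        for r, g, b in row:
            is_seal = r >= red_min and r - g >= margin and r - b >= margin
            new_row.append(white if is_seal else (r, g, b))
        cleaned.append(new_row)
    return cleaned
```

Dark printed text (low values in all three channels) is left untouched by this rule, so digits that lie under a seal remain available for the secondary recognition pass.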
Finally, the intelligent identification system adopted by the invention photographs and identifies bills on a mobile phone, so staff need not return to the company to submit bills for reimbursement. This is convenient for employees on business trips and for daily office purchases, provides reliable data for financial reimbursement, and allows the compliance and authenticity of an invoice to be checked anytime and anywhere, which saves cost and time, improves efficiency and frees up manpower.
Finally, it should be noted that: the above-mentioned embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A bill identification method based on mobile phone photographing, characterized in that it comprises the following steps:
S1, an intelligent recognition system arranged in the mobile phone automatically recognizes, intelligently analyzes and learns various types of bills and then stores the key information of each type; it recognizes the different key information of each type of bill and defines keywords for bank bills, machine invoices, train tickets and quota invoices; through continuous training and storage during bill recognition, a bill key information database is established, wherein the bill key information database comprises an identification sequence list, a keyword list, a key information list and a corresponding bill type list, the keyword list, the key information list and the corresponding bill type list are in one-to-one correspondence, and the bill key information database is as follows:
the identification sequence list is divided into a first class, a second class, a third class, a fourth class, a fifth class and a sixth class;
when the identification sequence is the first class, the bill types are the value-added tax common invoice, the roll invoice and the value-added tax electronic common invoice, the keyword list is empty, and the key information list is the invoice code, invoice number, date and check verification code;
when the identification sequence is the second class, the bill types are the motor vehicle sales unified invoice and the value-added tax special invoice, the keyword list is empty, and the key information list is the invoice code, invoice number, date and amount;
when the identification sequence is the third class, the bill type is the bank bill, the keyword list is enterprise online banking fee, collection and account-entry notice, customer collection and account entry, settlement account payment voucher, transfer remittance fee and cash payment slip, and the key information list is the bank name, bill name, payee name, payer name, date, amount and remark information;
when the identification sequence is the fourth class, the bill type is the machine invoice, the keyword list is machine invoice, and the key information list is the amount;
when the identification sequence is the fifth class, the bill type is the train ticket, the keyword list is railway, 12306, hard seat, soft seat, business seat, first-class seat, second-class seat, hard sleeper and soft sleeper, and the key information list is the departure place, destination, date and amount;
when the identification sequence is the sixth class, the bill type is the quota invoice, the keyword list is quota invoice, and the key information list is the amount;
S2, a clear electronic layout image is generated by mobile phone photographing and uploaded to the intelligent recognition system; the intelligent recognition system automatically performs intelligent edge detection on the uploaded electronic layout image, removes the parts of the image irrelevant to the bill and retains the bill portion; it automatically recognizes and corrects inclined and rotated images; because mobile phones of different brands and models produce images of inconsistent sizes, the intelligent recognition system adjusts the electronic layout image to a set optimal size; and because images may be too bright or too dark when photographed, the intelligent recognition system adjusts the electronic layout image to a set optimal brightness;
S3, the obtained electronic layout image is scanned, and the scanned information is compared with the stored key information or keywords to obtain the bill type of the bill, wherein the comparison is performed in the order of the identification sequence list; if the bill type belongs to the first or second class in the identification sequence list, checking is performed, and if the checking succeeds, the checking result is returned to the intelligent identification terminal for display, and if the checking fails, the invoice is classified into a checking error class; if the bill type is other than the first and second classes, the bill type is returned directly to the intelligent identification terminal for display; and if the bill type cannot be identified, the bill is classified into an unidentifiable class and the identification result is returned; the step specifically comprises the following steps:
S31, key information is extracted directly from the obtained electronic layout image; if the key information can be extracted directly, the scanned key information is compared with the key information columns of the value-added tax common invoice, roll invoice, value-added tax electronic common invoice, motor vehicle sales unified invoice and value-added tax special invoice in the key information list stored in the bill key information database; if the invoice belongs to one of the value-added tax common invoice, the roll invoice, the value-added tax electronic common invoice, the motor vehicle sales unified invoice or the value-added tax special invoice, checking is performed, and if the checking succeeds, the invoice type and the key information corresponding to the invoice type are returned, and if the checking fails, the invoice is classified into the checking error class and the invoice type and the corresponding key information are returned; if the invoice does not belong to any of these invoice types, keywords are extracted, the key information corresponding to the extracted keywords is acquired, and the method proceeds to step S32;
S32, the extracted keywords are compared with the keyword column of the bank bill in the keyword list stored in the bill key information database; if the bill belongs to the bank bill, the key information corresponding to the keywords is identified according to the keywords, and the bill type and the corresponding key information are returned; if the bill does not belong to the bank bill, the method proceeds to step S33;
S33, the extracted keywords are compared with the keyword column of the machine invoice in the keyword list stored in the bill key information database; if the bill belongs to the machine invoice, the key information corresponding to the keywords is identified according to the keywords, and the bill type and the corresponding key information are returned; if the bill does not belong to the machine invoice, the method proceeds to step S34;
S34, the extracted keywords are compared with the keyword column of the train ticket in the keyword list stored in the bill key information database; if the bill belongs to the train ticket, the key information corresponding to the keywords is identified according to the keywords, and the bill type and the corresponding key information are returned; if the bill does not belong to the train ticket, the method proceeds to step S35;
S35, the extracted keywords are compared with the keyword column of the quota invoice in the keyword list stored in the bill key information database; if the bill belongs to the quota invoice, the key information corresponding to the keywords is identified according to the keywords, and the bill type and the corresponding key information are returned; if the bill does not belong to the quota invoice, the method proceeds to step S36;
S36, if the bill type cannot be identified, the bill is classified into the unidentifiable class and the identification result is returned;
S4, bills that cannot be identified or that fail checking are subjected to image processing and then recognized a second time, wherein the image processing method is determined according to the specific reason for the failure, and the image processing specifically comprises locking the position of the key information and then cutting into blocks according to pixel coordinates, eliminating the red seal, removing lines, or performing machine learning training on incomplete digits;
S5, after the secondary recognition of bills in the unidentifiable class or the checking error class, steps S1-S3 are repeated to obtain the final bill type and the key information corresponding to the bill type.
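The bill key information database of S1 and the comparison order of S31 to S36 can be sketched as follows. This is a hedged illustration only: the dictionary layout, the function name `classify_bill`, the `needs_check` flag, and the use of English keyword strings in place of the original Chinese keywords are assumptions made for the example. The patent also does not say how the individual bill types inside the first and second classes are told apart, so the sketch returns the whole class.

```python
from typing import Dict, List

# Identification order, bill types, keywords and key-information fields,
# transcribed from the table in claim 1. The layout is hypothetical.
BILL_DATABASE: List[dict] = [
    {"order": 1,
     "types": ["VAT common invoice", "roll invoice",
               "VAT electronic common invoice"],
     "keywords": [],
     "fields": ["invoice code", "invoice number", "date",
                "check verification code"]},
    {"order": 2,
     "types": ["motor vehicle sales unified invoice", "VAT special invoice"],
     "keywords": [],
     "fields": ["invoice code", "invoice number", "date", "amount"]},
    {"order": 3,
     "types": ["bank bill"],
     "keywords": ["enterprise online banking fee",
                  "collection and account-entry notice",
                  "customer collection and account entry",
                  "settlement account payment voucher",
                  "transfer remittance fee", "cash payment slip"],
     "fields": ["bank name", "bill name", "payee name", "payer name",
                "date", "amount", "remark information"]},
    {"order": 4, "types": ["machine invoice"],
     "keywords": ["machine invoice"], "fields": ["amount"]},
    {"order": 5, "types": ["train ticket"],
     "keywords": ["railway", "12306", "hard seat", "soft seat",
                  "business seat", "first-class seat", "second-class seat",
                  "hard sleeper", "soft sleeper"],
     "fields": ["departure place", "destination", "date", "amount"]},
    {"order": 6, "types": ["quota invoice"],
     "keywords": ["quota invoice"], "fields": ["amount"]},
]


def classify_bill(key_info: Dict[str, str], text: str) -> dict:
    """Walk the database in identification order and return the first match."""
    # S31: classes with no keywords are matched on the extracted field names.
    for entry in BILL_DATABASE:
        if not entry["keywords"] and set(entry["fields"]) <= set(key_info):
            return {"order": entry["order"], "types": entry["types"],
                    "needs_check": True}
    # S32 to S35: the remaining classes are matched on keywords in the text.
    lowered = text.lower()
    for entry in BILL_DATABASE:
        if any(keyword in lowered for keyword in entry["keywords"]):
            return {"order": entry["order"], "types": entry["types"],
                    "needs_check": False}
    # S36: nothing matched, so the bill goes to the unidentifiable class.
    return {"order": None, "types": ["unidentifiable"], "needs_check": False}
```

A result with `needs_check` set to true corresponds to the first and second classes, which the method sends on for checking. A result in the unidentifiable class would be handed to the image processing and secondary recognition of S4 and S5.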
2. The bill identification method based on mobile phone photographing as claimed in claim 1, wherein: the machine learning training on incomplete digits specifically learns digits that are easily misrecognized, including 6 and 8, 1 and 0, 5 and 9, and 2 and 0.
3. The bill identification method based on mobile phone photographing as claimed in claim 2, wherein: learning the digits that are easily misrecognized specifically includes the following steps:
preprocessing: locating the region-of-interest (ROI) sub-image of the image and normalizing its size;
feature extraction: converting the image into a feature vector;
classification and identification: classifying with the k-nearest neighbor classification method, and finally completing identification according to the classification result, so that the digits that are easily misrecognized are identified accurately.
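The k-nearest neighbor classification step can be sketched as below. The claim names the method but not its parameters, so the squared Euclidean distance, the default `k = 3`, and the function name `knn_classify` are assumptions for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def knn_classify(sample: Sequence[float],
                 training: List[Tuple[Sequence[float], str]],
                 k: int = 3) -> str:
    """Label `sample` by majority vote among its k nearest training vectors."""

    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        # Squared Euclidean distance; the square root is not needed for ranking.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Sort the labelled training vectors by distance and keep the k closest.
    neighbours = sorted(training,
                        key=lambda item: squared_distance(sample, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

In the setting of the claim, each training vector would be the feature vector of a labelled digit image, for instance many examples of "6" and "8", and `sample` would be the feature vector of the incomplete digit being re-recognized.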
4. The bill identification method based on mobile phone photographing as claimed in claim 3, wherein: the specific steps of the feature extraction are as follows: opening the image, performing noise reduction, converting the image to grayscale, setting a threshold and binarizing, storing the binary image in a 32 × 32 array in which each element is a pixel value, and converting the 1024 (32 × 32) values into a (1, 1024) vector.
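The last two steps of this feature extraction (thresholding, then flattening the 32 × 32 array into a (1, 1024) vector) can be sketched as follows. The sketch assumes the image has already been denoised, converted to grayscale and resized to 32 × 32; the threshold value of 128, the choice to encode dark pixels as 1, and the function name `to_feature_vector` are assumptions, since the claim does not fix them.

```python
from typing import List


def to_feature_vector(gray: List[List[int]],
                      threshold: int = 128) -> List[List[int]]:
    """Binarize a 32x32 grayscale image and flatten it to shape (1, 1024).

    Pixels darker than `threshold` (assumed to be ink) become 1 and all
    other pixels become 0.
    """
    if len(gray) != 32 or any(len(row) != 32 for row in gray):
        raise ValueError("expected a 32x32 grayscale image")
    # Row-major flattening: pixel (row, col) lands at index row * 32 + col.
    flat = [1 if pixel < threshold else 0 for row in gray for pixel in row]
    return [flat]  # one row of 1024 values
```

The single inner list of the result is the kind of vector that a k-nearest neighbor classifier could compare against labelled digit vectors.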
5. The bill identification method based on mobile phone photographing as claimed in claim 1, wherein: the checking in S3 is performed by sending the key information to the national value-added tax invoice checking platform of the State Taxation Administration to verify authenticity.
6. A bill identification system used for the bill identification method according to claim 1, characterized in that it comprises a scanning device, an identification terminal and an intelligent identification system, wherein the scanning device and the identification terminal are each in communication connection with the intelligent identification system,
the intelligent identification system comprises a picture processing unit, which is used for processing pictures;
a key information extraction unit, which is used for extracting the key information of the picture according to the keywords;
an identification unit, which is used for identifying the bill according to the key information to obtain the bill type;
a checking unit, which is used for checking the value-added tax invoice;
and a communication unit, which is used for communicating with the identification terminal.
7. The bill identification system of claim 6, wherein: the system further comprises a machine learning unit, which is used for performing machine learning training on incomplete digits, in particular learning digits that are easily misrecognized, including 6 and 8, 1 and 0, 5 and 9, and 2 and 0.
8. The bill identification system of claim 7, wherein: learning the digits that are easily misrecognized specifically includes the following steps:
preprocessing: locating the region-of-interest (ROI) sub-image of the image and normalizing its size;
feature extraction: converting the image into a feature vector;
classification and identification: classifying with the k-nearest neighbor classification method, and finally completing identification according to the classification result, so that the digits that are easily misrecognized are identified accurately.
9. The bill identification system of claim 6, wherein: the scanning device is a mobile phone, and the intelligent identification system is a mobile phone APP.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810482124.6A CN108717545B (en) 2018-05-18 2018-05-18 Bill identification method and system based on mobile phone photographing

Publications (2)

Publication Number Publication Date
CN108717545A CN108717545A (en) 2018-10-30
CN108717545B true CN108717545B (en) 2020-12-18

Family

ID=63900021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810482124.6A Active CN108717545B (en) 2018-05-18 2018-05-18 Bill identification method and system based on mobile phone photographing

Country Status (1)

Country Link
CN (1) CN108717545B (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11030450B2 (en) * 2018-05-31 2021-06-08 Vatbox, Ltd. System and method for determining originality of computer-generated images
CN111275035B (en) * 2018-12-04 2023-10-31 北京嘀嘀无限科技发展有限公司 Method and system for identifying background information
CN109472919A (en) * 2018-12-28 2019-03-15 远光软件股份有限公司 A kind of bill takes over method and associated terminal and storage device
CN109726783A (en) * 2018-12-28 2019-05-07 大象慧云信息技术有限公司 A kind of invoice acquisition management system and method based on OCR image recognition technology
CN111931473A (en) * 2019-05-13 2020-11-13 阿里巴巴集团控股有限公司 Bill processing method and device
CN110147838B (en) * 2019-05-20 2021-07-02 苏州微创关节医疗科技有限公司 Product specification inputting and detecting method and system
CN111178345A (en) * 2019-05-20 2020-05-19 京东方科技集团股份有限公司 Bill analysis method, bill analysis device, computer equipment and medium
CN110619056A (en) * 2019-06-19 2019-12-27 深圳壹账通智能科技有限公司 Invoice input method, device, equipment and computer storage medium
CN110334640A (en) * 2019-06-28 2019-10-15 苏宁云计算有限公司 A kind of ticket processing method and system
CN110427853B (en) * 2019-07-24 2022-11-01 北京一诺前景财税科技有限公司 Intelligent bill information extraction processing method
CN110675234A (en) * 2019-08-23 2020-01-10 国信电子票据平台信息服务有限公司 Electronic newspaper bill generation method and electronic equipment
CN110675546B (en) * 2019-09-06 2022-07-08 深圳壹账通智能科技有限公司 Invoice picture identification and verification method, system, equipment and readable storage medium
CN111104853A (en) * 2019-11-11 2020-05-05 中国建设银行股份有限公司 Image information input method and device, electronic equipment and storage medium
CN111199222B (en) * 2019-12-30 2023-08-25 航天信息软件技术有限公司 Bill management method and electronic equipment
CN112135002A (en) * 2020-07-31 2020-12-25 钱微 Bill filling system for financial management and working method thereof
CN112541461A (en) * 2020-12-21 2021-03-23 四川新网银行股份有限公司 Automatic auditing method and device for consumption credentials without fixed format template
CN112699860B (en) * 2021-03-24 2021-06-22 成都新希望金融信息有限公司 Method for automatically extracting and sorting effective information in personal tax APP operation video
CN113240503A (en) * 2021-04-08 2021-08-10 福建升腾资讯有限公司 Reimbursement invoice management method, device and medium based on intelligent equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150339739A1 (en) * 2012-04-26 2015-11-26 Chengdu Santai Holding Group Co., Ltd. Corporate bill selling system with anti-counterfeiting verification

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750541B (en) * 2011-04-22 2015-07-08 北京文通科技有限公司 Document image classifying distinguishing method and device
CN102208092A (en) * 2011-05-25 2011-10-05 重庆市电力公司永川供电局 Financial bill reimbursement automatic processing method
CN104050450A (en) * 2014-06-16 2014-09-17 西安通瑞新材料开发有限公司 Vehicle license plate recognition method based on video
CN105809814A (en) * 2014-12-30 2016-07-27 航天信息股份有限公司 Invoice certification system supporting multiple invoice types and method
CN105046553A (en) * 2015-07-09 2015-11-11 胡昭 Cloud intelligent invoice recognition inspection system and method based on mobile phone
CN105654072B (en) * 2016-03-24 2019-03-01 哈尔滨工业大学 A kind of text of low resolution medical treatment bill images automatically extracts and identifying system and method
CN107977665A (en) * 2017-12-15 2018-05-01 北京科摩仕捷科技有限公司 The recognition methods of key message and computing device in a kind of invoice

Also Published As

Publication number Publication date
CN108717545A (en) 2018-10-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 501-018, floor 5, No. 15, wanquanzhuang Road, Haidian District, Beijing 100089

Patentee after: Dajingfang Network Technology Co.,Ltd.

Address before: 100000 405, No. 15, wanquanzhuang Road, Haidian District, Beijing

Patentee before: BEIJING DAZHANGFANG NETWORK TECHNOLOGY Co.,Ltd.